(Un)ordinary type-fu in Haskell

· 7 min read

I'm progressing with my learning of Haskell. After grasping the basic concepts, the time has come to face more challenging material.

There are several "patterns" at the intermediate level, and all of them make heavy use of types. The tenets are to reduce typing and to build more general abstractions. They come at the cost of upfront learning time and are genuinely hard to grasp.

This time I tackled two topics: the structural pattern and then generics. I managed to combine the two in one post, which I am sharing today.

Semantic search for dynamically built queries in Java and CodeQL

· 7 min read

I recently faced a challenge: searching for SQL queries in a large codebase. Basic grep, or even IntelliJ search, falls short here because of performance issues:

  • queries are long and dynamically appended
  • codebase is large
  • string searching is not performant enough.

The answer to this task is buried in the early history of static analysis tools. The first tools used basic regexes, but that quickly turned out to be inefficient. Then, step by step, more focus was put on parsing source files into Abstract Syntax Trees, which allow more freedom in writing queries. Finally, the Data Flow approach was added alongside Taint Analysis, shaping the security tooling landscape we have today.

Semantic searching has 2 advantages:

  • searching bare tokens is orders of magnitude faster than searching strings, and in turn searching Abstract Syntax Trees is an order of magnitude faster than searching tokens
  • semantic search offers more precision in designing the queries, which only reinforces the first point.

CodeQL is one such tool: it knows the syntax of major languages (including Java) and caters for performant search of large codebases. I decided to have fun with it over the weekend and push it to its limits, as searching for dynamic queries is hard enough. I will show how to set up the project and write some queries for a toy source file.

Let's get started.

Writing a simple parser with Megaparsec in Haskell

· 11 min read

There is a widespread opinion that pure functional languages are rock solid and well suited for critical systems. For example, Facebook uses Haskell in its anti-spam filters, several financial companies use it for derivative modelling, and there is also some documented usage in compilers.

I dipped my toes into Haskell a long time ago, but didn't really get it. This time, my particular use case was that I wanted a parser for a toy language with minimal effort. Parser combinators like Parsec or Megaparsec are known for a purely declarative approach to modelling grammars.

After 2 weeks of playing with the language I must say that there is something strangely addictive in writing pure functional code. Reading it is hard, writing it even harder, but when it starts to work there is a lot of satisfaction. I don't know, maybe I wasn't feeling confident about it before, but I finally started to like it.

Introducing concurrency solver

· 4 min read

Lately at work most of the staff has been puzzled by a mysterious bug. In short, there is a state machine that processes movements in batches. But sometimes one particular movement is duplicated and nobody knows why...

I wish I could brag that I solved it myself, but that is not the case. Still, it inspired me to dig a little into the theory of how distributed systems and concurrency are reasoned about and visualized.

Time space diagrams

I read about them in some book long ago and spent some time trying to find the correct name. It's a pretty niche concept, but in my opinion unjustly so. They are very good for visualizing not only distributed systems but also concurrency.

Let me show you.

My exploration of WASM/WASI

· 4 min read

Assembler was developed in 1947, wow! That makes 78 years of computing development, in which we saw higher-level programming languages, virtual machine programming languages (write once, run everywhere), virtual machines, the cloud and so on. A lot of knowledge has accumulated over that time, which you can see in the size of the artifacts deployed to the cloud.

Over the years we have seen several attempts to get back to the roots. WebAssembly is one of them: an instruction format for a virtual machine, designed to be a portable compilation target for any language willing to support it. It runs at nearly native speed, is secure and sandboxed, and is language agnostic. Originally written for browsers, it is beginning to gain traction as a microservice runtime, which, by the way, is the topic of this post.

Christmas with Quantum Mechanics

· 4 min read

Christmas is a wonderful time, although not without challenges. A particular challenge for me is to keep my mind fresh and not sleep all the time. I like the atmosphere of going to holy masses, eating good food (with a couple of drinks) and so on, but as always I felt a little bit lazy.

So to exercise my brain a little I found this course in quantum computation. In short, it was great fun. I was constantly on the edge of my cognitive possibilities, but the material was made crystal clear. I developed a particular method: finish one module at a time in the afternoon and go over the reading part again in the mornings. I had a little bit of a crisis on the first day of Christmas but managed to carry on half consciously.

And finally, on returning home, I passed the test with a 75% score.

Thoughts on observability

· 4 min read

Everything is complicated, even those things that seem flat in their bleakness.

Debugging a microservices application based on scarce information is one of those things that I wouldn't wish on anyone. But that is how it is on my current project, so management started to put some measures in motion.

I researched the topic a bit at work and a bit on my own, and I have something to share - OpenTelemetry is the future. But it is still a work in progress.

In this post I will tell you everything I learned.

Reflections after writing simple Spring Boot library

· 4 min read

Sometimes learning from adversity is better than trying to avoid it. Taking it into careful consideration provides valuable lessons that will support you in the future.

I appreciate my job for one particular thing: it provides a steady stream of difficult problems that challenge my intellect. Recently I tried to wrap my head around the problem of how to write tests for a semi-large Spring Boot codebase (with no tests whatsoever) and refactor it.

I started from the assumption that when you don't have any legacy tests at hand, you write them first. How can you know you don't break functionality without running the tests? But the code was very unfriendly, and writing them would require writing mocks.

So I thought - why not automate stuff a little bit:

  • instrument given beans with reflection
  • dump args and results to json
  • load json directly in tests instead of writing mocks in plain Java
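
The capture step in the list above can be sketched roughly as follows. This is a minimal illustration, not the library from the post: it assumes the bean is reachable through an interface (JDK dynamic proxies only work on interfaces), `PriceService` and `recording` are hypothetical names, and the captured calls are kept in memory as text rather than serialized to JSON, which would typically be done with a JSON library.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecordingProxy {
    // One captured call: method name, arguments and result rendered as text.
    record Call(String method, String args, String result) {}

    // Hypothetical interface standing in for a real Spring bean.
    interface PriceService {
        int priceFor(String item, int quantity);
    }

    // Wraps target in a JDK dynamic proxy that appends every call to the log.
    // Exceptions thrown by the target are not unwrapped in this sketch.
    @SuppressWarnings("unchecked")
    static <T> T recording(T target, Class<T> iface, List<Call> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object result = method.invoke(target, args);
            log.add(new Call(method.getName(), Arrays.toString(args), String.valueOf(result)));
            return result;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        List<Call> log = new ArrayList<>();
        PriceService real = (item, quantity) -> quantity * 10;
        PriceService wrapped = recording(real, PriceService.class, log);
        wrapped.priceFor("apple", 3);
        // The log now holds the data a test could later replay instead of a mock.
        log.forEach(System.out::println);
    }
}
```

A test could then load such recorded calls and answer with the stored result whenever the same method and arguments come in, instead of hand-writing mock behaviour.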

1BRC Challenge

· 12 min read

One thing that recently made me nerd out like hell was the 1 billion row challenge. Citing the original site:

Your mission, should you decide to accept it, is deceptively simple: write a Java program for retrieving temperature measurement values from a text file and calculating the min, mean, and max temperature per weather station. There’s just one caveat: the file has 1,000,000,000 rows!

I worked on it after hours, and one week after taking on the challenge there are several conclusions worth writing about.
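
For orientation, the task quoted above boils down to a per-station aggregation. The following is a naive baseline sketch, not any of the optimized solutions: it assumes each row has the form `station;temperature`, and the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StationStats {
    // Running aggregate for a single weather station.
    static final class Stats {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0.0;
        long count = 0;

        void add(double value) {
            min = Math.min(min, value);
            max = Math.max(max, value);
            sum += value;
            count++;
        }

        double mean() {
            return sum / count;
        }
    }

    // Assumption: every line looks like "Hamburg;12.0".
    static Map<String, Stats> aggregate(Iterable<String> lines) {
        Map<String, Stats> byStation = new TreeMap<>(); // sorted by station name
        for (String line : lines) {
            int sep = line.lastIndexOf(';');
            String station = line.substring(0, sep);
            double value = Double.parseDouble(line.substring(sep + 1));
            byStation.computeIfAbsent(station, k -> new Stats()).add(value);
        }
        return byStation;
    }

    public static void main(String[] args) {
        Map<String, Stats> result =
            aggregate(List.of("Hamburg;12.0", "Hamburg;8.0", "Oslo;-3.5"));
        result.forEach((station, s) ->
            System.out.println(station + " min=" + s.min + " mean=" + s.mean() + " max=" + s.max));
    }
}
```

The logic itself is trivial; the difficulty of the challenge lies in doing this over a billion rows, where reading, parsing and map lookups dominate the runtime.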

Topology graphs are important (and fun)

· 3 min read

Recently I watched a presentation from Microsoft about Radius. They developed the tool to foster collaboration between devops and developers. In a nutshell, devops create "recipes" in Bicep or Terraform and developers use them to deploy the application. Seems cool.