From 64 hours to 300 seconds! That's the sort of runtime improvement I was able to achieve by using arrays as the primary data structure & reducing the time complexity from n^3*mC1 to n*m in a clustering algorithm I improved. Now, this post is a bit lengthy & I do recommend going through the entire post to understand…
Data Enrichment – Designing & Optimizing a Real Time Stream Joining Pipeline
In my previous article, I wrote about the paradigm shift I underwent while adapting to the idiosyncrasies of real-time systems compared to batch processing. Picking up from where I left off, this article focuses on applying all those concepts to build a real-time data enrichment pipeline that performs join & lookup in…