Finish Slime #35
Polars, Bad Graphs, Unstructured Data
Data Engineering, Analytics. No ML, no AI. The weekly dose of the data content you actually want to read!
Want to share anything with me? Hit me up on Twitter @sbalnojan or LinkedIn.
Great Recent Stuff
Sometimes, the best decision is to avoid getting into something. That’s also true for a bunch of data projects; you should treat every single one as you would any investment decision.
This is just super fun to read! So, dear friend, please don’t make a bad graph.
Handling and getting information out of unstructured data is always a hassle - be sure to skim through the Airbnb approach.
This is a rare retrospective; the DEW newsletter only does this once a year. The author is in the perfect situation to do so!
You don’t need complex orchestrators to manage data pipelines. A lot of the time, cron is enough, and here’s how you use cron at scale.
Spark has had a tight grip on Delta Lake for almost a decade; maybe Polars can start to loosen it.
This is food for thought - Benoit makes a good point about the shortcomings of SQL and the rise of new semantic languages like Malloy.