Finish Slime #35

Polars, Bad Graphs, Unstructured Data

Data Engineering, Analytics. No ML, no AI. The weekly dose of the data content you actually want to read!

Want to share anything with me? Hit me up on Twitter @sbalnojan or LinkedIn.

Great Recent Stuff

Sometimes, the best decision is to avoid getting into something. That’s also true for a bunch of data projects; you should treat every single one like any other investment decision.

This is just super fun to read! So dear friend, please don’t make a bad graph.

Getting information out of unstructured data is always a hassle, so be sure to skim through the Airbnb approach.

This is a rare retrospective; the DEW newsletter only does this once a year. The author is in the perfect situation to do so!

You don’t need complex orchestrators to manage data pipelines. A lot of the time, cron is enough, and here’s how you use cron at scale.

Spark has had a tight grip on Delta Lake for almost a decade; maybe Polars can start to loosen that grip.

This is food for thought: Benoit makes a good point about the shortcomings of SQL and the rise of new semantic languages like Malloy.
