Finish Slime #14
ByConity, dbt, Streaming, Dagster, GX, Data Modeling, No ETL.
Data Engineering, Analytics. No ML, no AI. The weekly dose of the data content you actually want to read!
Want to share anything with me? Hit me up on Twitter @sbalnojan or LinkedIn.
Tools
ByConity is an open-source cloud data warehouse that adds virtual warehouses on top of cloud storage. It's inspired by Snowflake and built on the core of ClickHouse. It supports both batch and streaming data ingestion and runs queries on large-scale data.
Tutorials & Showcases
The team at Tempus is a big dbt user, with over 1,000 models in production. This article covers their approach to reducing the number of tests by working out which tests really improve quality and which don't.
Noah Kennedy | 25 Jan 2023
You should analyze any streaming pipeline using three questions: Is there a bottleneck? Is the performance optimal? Will it continue to scale? This article explains how to answer these questions on whatever streaming platform you're on.
Rakesh Kumar | 6 June 2023
Great Expectations is an exciting data quality tool; this article explains how to use it with Airflow to automate data quality checks in your team.
Charles Verleyen | 9 May 2023
Resources
Nick Schrock shares his vision for the next decade of Dagster, an open-source data orchestrator. The key insights: complexity is the big challenge of data; data engineering is a discipline, not a job title; and we need to move from silos to layers and away from data workflows.
Nick Schrock | 24 May 2023
This is a lovely, short four-reasons piece on the weaknesses of ETL and why they are becoming a problem now.
Christian Lauer | 13 Apr 2023
Max has argued for years that data modeling is a forgotten art, yet essential to modern data teams, and that it has never gotten an upgrade. Entity-Centric Data Modeling is that upgrade: a new approach to modeling data.
Maxime Beauchemin | 06 April 2023
In this article, Vin shares the five phases of data models: Transparency > Novelty > Experimentation > Data-generating systems > Advanced data models. It's an interesting perspective on data in use.
Vin Vashishta | 29 May 2023