Finish Slime #14

ByConity, dbt, Streaming, Dagster, GX, Data Modeling, No ETL.

Data Engineering, Analytics. No ML, no AI. The weekly dose of the data content you actually want to read!

Want to share anything with me? Hit me up on Twitter @sbalnojan or LinkedIn.

Tools

ByConity is an open-source cloud data warehouse that adds virtual warehouses to cloud storage. It’s inspired by Snowflake and built from the core of ClickHouse. It supports batch and streaming data ingestion and runs queries on large-scale data.

Tutorials & Showcases

The team at Tempus is a big dbt user, with over 1,000 models in production. This article covers their approach to reducing the number of tests by getting a better idea of which tests really help improve quality and which don’t.

Noah Kennedy | 25 Jan 2023

You should analyze any streaming pipeline using three questions: Is there a bottleneck? Is the performance optimal? Will it continue to scale? This article explains how to answer these questions on whatever streaming platform you’re on.

Rakesh Kumar | 6 June 2023

Great Expectations is an exciting data quality tool; this article explains how to use it in an Airflow setting to automate data quality checks in your team.

Charles Verleyen | 9 May 2023

Resources

Nick Schrock shares his vision for the next decade of Dagster, an open-source data orchestrator. The key insights: complexity is the big challenge of data; data engineering is a discipline, not a job title; and we need to move from silos to layers and away from data workflows.

Nick Schrock | 24 May 2023

This is a lovely short four-reasons piece on the weaknesses of ETL and why they are becoming problematic now.

Christianlauer | 13 Apr 2023

Max has argued for years that data modeling is a forgotten art, yet essential to modern data teams, and that it has never gotten an upgrade. Entity-Centric Data Modeling is this upgrade, a new approach to modeling data.

Maxime Beauchemin | 06 April 2023

In this article, Vin shares the five phases of data models: Transparency > Novelty > Experimentation > Data-generating systems > Advanced data models. It’s an interesting perspective on data in use.

Vin Vashishta | 29 May 2023
