Finish Slime - #5

Airflow 2.6.0, Data Mesh, Data Quality Survey, Kestra & Data Catalogs

Data Engineering, Analytics. No ML, no AI. The weekly dose of the data content you actually want to read!

News

Apache Airflow 2.6.0 is here. Highlights include notifiers for reusable DAG and task callbacks, a trigger form in the UI generated from DAG-level params, a continuous timetable, and a long list of bug fixes and performance improvements. If you're a data engineer running Airflow, it's worth planning the upgrade to 2.6.0.

Speaking of data orchestrators…

Tutorials & Showcases

Traditional data warehouse and data lake architectures often struggle to keep up with the needs of modern enterprises. Data mesh is an architectural approach built on decentralizing data ownership and control, so that teams can work independently, with greater autonomy and agility. If you're looking for a more flexible, scalable, and resilient data architecture, it's worth considering.

Legacy Customer Data Platforms (CDPs) are becoming outdated. They're expensive, inflexible, and don't offer the data ownership and control that businesses need. Composable CDPs are the new wave. They're modular and cloud-based, giving businesses more flexibility and control over their data.

Data quality is still a massive problem for data teams. The time it takes to resolve data quality issues is increasing, hurting revenue. Data teams must find ways to improve data quality without spending too much time on it.

Tools & Resources

Kestra is a lightweight, versatile workflow orchestration tool for automating and managing complex pipelines. It's a compelling alternative to Apache Airflow because it's declarative: you define workflows in human-readable YAML files rather than in code.
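To illustrate the declarative idea, here is a minimal Python sketch. It is not Kestra's flow schema or engine: the `FLOW` structure, the task types (`emit`, `sort`, `sum`), and the `run_flow` helper are all hypothetical, made up for this example. The point is only that the workflow itself is plain data, the shape a YAML file would parse into, while a small generic runner decides how to execute it.

```python
# Hypothetical sketch of a declarative workflow; NOT Kestra's schema or API.
from typing import Callable

# The workflow is plain data, the shape a YAML file would parse into.
FLOW = {
    "id": "daily_report",
    "tasks": [
        {"id": "extract", "type": "emit", "value": [3, 1, 2]},
        {"id": "sort", "type": "sort", "input": "extract"},
        {"id": "total", "type": "sum", "input": "sort"},
    ],
}

# Each task type maps to a handler that receives the task spec
# and the outputs of the tasks that ran before it.
HANDLERS: dict[str, Callable[[dict, dict], object]] = {
    "emit": lambda task, outputs: task["value"],
    "sort": lambda task, outputs: sorted(outputs[task["input"]]),
    "sum": lambda task, outputs: sum(outputs[task["input"]]),
}


def run_flow(flow: dict) -> dict:
    """Run tasks in the order listed and return outputs keyed by task id."""
    outputs: dict = {}
    for task in flow["tasks"]:
        handler = HANDLERS[task["type"]]
        outputs[task["id"]] = handler(task, outputs)
    return outputs


results = run_flow(FLOW)
```

Changing the pipeline means editing the `FLOW` data, not the runner, which is the trade-off a declarative orchestrator makes: less flexibility than arbitrary code, in exchange for definitions that are easy to read, diff, and review.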

Open source data catalogs are a great way to organize and manage your data. They can help you find the data you need, understand its quality, and share it with others. Popular open source data catalogs include Amundsen and DataHub. They differ in features and architecture, so you can choose the one that best meets your needs.