Finish Slime - #1

Analytics engineering, data policies, Flink on Kubernetes, experimentation platforms, data orchestration and the future of data engineering.

Simple. So you don’t slip. In data engineering. No opinions. Just great content, shared.

 

News

The dbt manifesto has changed analytics inside modern data teams. Software engineering techniques for data people, like modularity, documentation, CI/CD, source control, and testing, are becoming mainstream. The next level awaits.

Reddit is updating its policy to stop its data from being used to train machine learning models without its permission. The company is introducing premium access for anyone who wants to use the data that way.

Tutorials & Showcases

Instacart handles over two trillion events a year through its low-latency data pipelines to gain deeper business insight. To leverage real-time events for business expansion, it adopted Apache Flink on Kubernetes; let’s see how.

Resources

SEEK uses AI to match job seekers with job opportunities and has a dedicated team, Artificial Intelligence & Platform Services (AIPS), working on these solutions.

The data engineering landscape is vast. The expectations placed on the data engineering role are too broad for most experienced data engineers to handle. Splitting it into several distinct roles could absorb this overload.

Orchestrators are weird beasts; the need for data orchestration is a different matter. While some may proclaim orchestrators dead, that need isn’t going anywhere.