Finish Slime #9

Salaries, Trino, Presto, Table Formats, Apache Doris and Data Reliability.

Data Engineering, Analytics. No ML, no AI. The weekly dose of the data content you actually want to read!

Want to share anything with me? Hit me up on Twitter @sbalnojan or LinkedIn.

We got some slimy news today!

  • All subscribers can now access our exclusive gift (the data engineering reading guide); see below.

  • Once you refer one new subscriber, you get this fantastic 14-page guide: Must Know, Need to Know, and Nice to Know Skills of the Data Engineer - The Data Engineering Roadmap.

Data Tools

Apache Doris is an easy-to-use, high-performance, real-time analytical database based on MPP architecture. It offers several features that make it well-suited for analytical workloads.

Presto and Trino are distributed SQL engines designed for querying large datasets. They originated from the same project and are similar in many ways, but the two engines also have some critical differences.

Data Practices

Data reliability is essential for businesses to ensure their data is accurate and consistent. Here are six common signs it’s time to start investing in it.

Business-critical data is data that is essential for the day-to-day operations of a business. There are three main types: transactional data, master data, and analytical data. Let’s get into the details.

Resources

Key stats on data engineering salaries were collected from 260 companies and 500 jobs. London and Dublin lead the way with the highest compensation; top tech companies like Amazon, Meta, and Netflix generally pay better.

Tutorials

Several open table formats can be used for transactional data lakes on AWS: Apache Hudi, Apache Iceberg, and Delta Lake. Let’s choose the right one.
