Does your data team want to start using a data transformation tool, or are you looking for a dbt alternative? Great, you’re in the right place.

Dbt led a small revolution inside the data transformation world, pushing forward ELT as a pattern and enabling thousands of analysts to turn into analytics engineers.

The analytics engineering revolution makes dbt seem huge and the default choice for transforming data. But that’s a fallacy.

  1. There have always been good alternatives to dbt, just with a slightly different target audience.

  2. Multiple great tools grew in the shadows of dbt that are now ready for prime time.

So today, we have alternatives to dbt that suit your workflow better. We have alternatives for every type of data hero.

Data transformation vs. everything else

TL;DR: What you need is a data transformation AND processing tool. (And dbt does both!)

Let’s get an understanding of the task at hand first. Dbt in the past propagated this picture:

But as dbt already realized, the correct picture for what dbt is used for looks like this:

It turns raw data into more. Not just inside a data warehouse (there are dbt-to-data-lake adapters via Spark), and not just “renaming & casting”. It’s the complete processing of data.

What dbt does, and what we all need, is a way of doing data transformation AND processing. If you look into the Unified Data Infrastructure below, we’re trying to find a solution for data processing and transformation that does not necessarily include a query engine.

The two great things about dbt

TL;DR: Your tool should have versioning support AND either SQL or Python support.

Dbt introduced two small but important innovations to many people’s data workflows:

  1. templated (!) SQL

  2. Versionable transformation + processing code

Templated SQL is great because SQL has a low barrier to entry, making dbt accessible even to analysts. Dbt has since expanded to include Python models. “Templated” means you can do slightly more than with regular SQL, including writing reusable code and basic control patterns such as loops and conditionals.
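To make “basic control patterns” concrete, here is a minimal sketch of what a loop in a template buys you. Dbt itself uses Jinja for this; the sketch below uses plain Python string building as a stand-in so it runs without extra packages. The `payments` table, its columns, and the `render_pivot_sql` helper are made up for illustration.

```python
def render_pivot_sql(table: str, methods: list[str]) -> str:
    """Build one SUM(CASE ...) column per payment method (hypothetical schema)."""
    # Values are interpolated directly, so only pass trusted, hard-coded names here.
    columns = ",\n".join(
        f"    SUM(CASE WHEN payment_method = '{m}' THEN amount ELSE 0 END) AS {m}_amount"
        for m in methods
    )
    return (
        "SELECT\n"
        "    order_id,\n"
        f"{columns}\n"
        f"FROM {table}\n"
        "GROUP BY order_id"
    )


# One function call replaces a hand-written column per payment method.
sql = render_pivot_sql("payments", ["card", "cash"])
print(sql)
```

Adding a third payment method means adding one list entry instead of copying another `SUM(CASE ...)` line, which is the kind of reuse a template loop gives you.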

Versionable code is excellent for auditability, but the real value lies in its consequences. Once your data transformation code is versioned, it becomes testable, deployable to different environments, and much more.

Dbt alternatives

This isn’t just a plain old list of random tools. This list includes only tools that have the following:

  1. SQL support or Python support

  2. Allow for versioned code

Here they are:

#1 SQLMesh

  • Website: https://sqlmesh.com/

  • Free/OS: yes

  • SQL support: yes, templated

  • Python support: yes

  • Versionable code: yes

  • Focus: SQLMesh focuses more on data engineers and on providing them with a great development environment; it makes versioning & testing first-class citizens.

  • Comment from us: New and completely underrated tool!

#2 Databricks

  • Website: https://www.databricks.com/

  • Free/OS: No (free trial available)

  • SQL support: yes, templated

  • Python support: yes

  • Versionable code: yes

  • Focus: Databricks has a unified platform making it great for whole departments to use.

  • Comment from us: The Databricks focus was previously mostly on data scientists, ML engineers, and data engineers with heavy technical skills. But that has changed. Today, Databricks has everything and more (including a dashboarding tool out of the box).

#3 Datameer (Snowflake only)

  • Website: https://www.datameer.com/

  • Free/OS: No

  • SQL support: yes, no template support

  • Python support: No

  • Versionable code: yes (inside the tool)

  • Focus: Both less technical users (offers a GUI) and SQL-heavy users. Snowflake only.

#4 DIY Python

  • Website: None

  • Free/OS: yes

  • SQL support: (yes) see #5

  • Python support: yes

  • Versionable code: yes!

  • Focus: Experienced Python engineers.

  • Comment from us: The benefits of DIY Python all come from using the full power of a proper programming language. It means writing and reusing modules, testing, and versioning. This can be deployed using orchestrators like Apache Airflow, but it doesn’t have to be. Sometimes cron will do the job just fine.
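As a rough idea of what DIY Python can look like, here is a minimal sketch of a “rename & cast” step plus an aggregation, written as small functions that can live in a module, be versioned, and be unit tested. The field names (`id`, `amount`, `created`) and both function names are assumptions made up for this example.

```python
from datetime import date


def clean_order(raw: dict) -> dict:
    """Rename and cast one raw order record (hypothetical field names)."""
    return {
        "order_id": int(raw["id"]),
        "amount_eur": round(float(raw["amount"]), 2),
        "ordered_on": date.fromisoformat(raw["created"]),
    }


def daily_revenue(orders: list[dict]) -> dict:
    """Aggregate cleaned orders into revenue per day."""
    totals: dict = {}
    for order in orders:
        day = order["ordered_on"]
        # Round after each addition to keep float noise out of the totals.
        totals[day] = round(totals.get(day, 0.0) + order["amount_eur"], 2)
    return totals


# Raw input as it might arrive from a CSV export: everything is a string.
raw_orders = [
    {"id": "1", "amount": "19.99", "created": "2024-01-01"},
    {"id": "2", "amount": "5.01", "created": "2024-01-01"},
    {"id": "3", "amount": "7.50", "created": "2024-01-02"},
]
revenue = daily_revenue([clean_order(r) for r in raw_orders])
```

Because each step is an ordinary function, a test is just a call with a handful of records, and the same script can be scheduled by Airflow or by cron.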

#5 SQL Heavy: Parametrized SQL Statements

  • Website: None

  • Free/OS: yes

  • SQL support: yes

  • Python support: see #4

  • Versionable code: yes!

  • Focus: Experienced SQL engineers.

  • Comment from us: Parametrized SQL statements are usually wrapped in Python code. They are either hand-built or use Jinja, just like dbt does. The benefits are in having reusable code, testing small snippets like a CTE, and the like. This can be deployed using orchestrators like Apache Airflow, but it doesn’t have to be. Sometimes cron will do the job just fine.
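Here is a minimal sketch of the hand-built variant, assuming the standard library’s `string.Template` for the table name and the database driver’s bound parameters for values. It tests a single CTE against an in-memory SQLite database. The `orders` table, the `large_orders` CTE, and the function name are hypothetical.

```python
import sqlite3
from string import Template

# A reusable CTE snippet: $table is filled in by Python,
# :min_amount is bound by the database driver at execution time.
LARGE_ORDERS_CTE = Template(
    "large_orders AS (SELECT id, amount FROM $table WHERE amount >= :min_amount)"
)


def large_order_count_sql(table: str) -> str:
    """Wrap the CTE in a full statement (table name must come from trusted code)."""
    cte = LARGE_ORDERS_CTE.substitute(table=table)
    return f"WITH {cte} SELECT COUNT(*) FROM large_orders"


# Test the snippet against a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 5.0), (2, 50.0), (3, 500.0)]
)
count = conn.execute(
    large_order_count_sql("orders"), {"min_amount": 50}
).fetchone()[0]
```

Splitting identifiers (templated) from values (bound parameters) keeps the snippet reusable across tables while leaving value escaping to the driver.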

#6 Notebooks/Apache Spark/Pandas

  • Website: https://jupyter.org/ 

  • Free/OS: yes (depends on deployment & orchestration, paid options available)

  • SQL support: yes

  • Python support: yes

  • Versionable code: yes

#7 AWS Glue (or GCP, or Azure alt.)

#8 Keboola

  • Website: https://www.keboola.com/

  • Free/OS: no

  • SQL support: yes (incl. dbt runner)

  • Python support: yes (& R, Julia)

  • Versionable code: yes

  • Focus: Complete package for E(t)LT, including orchestration.

#9 SDF

  • Website: https://www.sdf.com/

  • Free/OS: no (still in beta?)

  • SQL support: yes

  • Python support: no

  • Versionable code: yes

  • Focus: SQL development for enterprises. A good comparison article is “dbt vs sdf vs SQLMesh”.

Want to add your dbt alternative? Just fill out the basic info and ping me on Twitter/LinkedIn.