Migrating from Airflow to Dagster, or integrating Dagster into your existing workflow orchestration stack, can be accomplished in many ways. The Pick your own journey guide offers a variety of suggestions for migrating your Airflow pipelines to Dagster or building a platform where both tools co-exist.
While Airflow and Dagster have some significant differences, there are many concepts that overlap. Use this cheatsheet to understand how Airflow concepts map to Dagster.
Want a look at this in code? Check out the Learning Dagster from Airflow guide.
| Airflow concept | Dagster concept | Notes |
|---|---|---|
| Directed Acyclic Graphs (DAG) | Jobs | |
| Task | Ops | |
| Datasets | Assets | Dagster assets are more powerful and mature than datasets and include support for things like partitioning. |
| Connections/Variables | | |
| DagBags | Code locations | Multiple isolated code locations with different system and Python dependencies can exist within the same Dagster instance. |
| DAG runs | Job runs | |
| depends_on_past | An asset can depend on earlier partitions of itself. When this is the case, backfills and auto-materialize will only materialize later partitions after earlier partitions have completed. | |
| Executors | Executors | |
| Hooks | Resources | Dagster resources contain a superset of the functionality of hooks and have much stronger composition guarantees. |
| Instances | Instances | |
| Operators | None | Dagster uses normal Python functions instead of framework-specific operator classes. For off-the-shelf functionality with third-party tools, Dagster provides integration libraries. |
| Pools | Run coordinators | |
| Plugins/Providers | Integrations | |
| Schedulers | Schedules | |
| Sensors | Sensors | |
| SubDAGs/TaskGroups | Dagster provides rich, searchable metadata and tagging support beyond what’s offered by Airflow. | |
| task_concurrency | Asset/op-level concurrency limits | |
| Trigger | Dagster UI Launchpad | Triggering and configuring ad-hoc runs is easier in Dagster, which allows them to be initiated through the Dagster UI, the GraphQL API, or the CLI. |
| XComs | I/O managers | I/O managers are more powerful than XComs and allow passing large datasets between jobs. |