Welcome to Dagster!

Dagster is the data orchestration platform built for productivity.

Get started with Dagster in just three quick steps:

  1. Install Dagster
  2. Define assets
  3. Materialize the assets

Step 1: Install Dagster

Dagster requires Python 3.6+. Refer to the Installation documentation for more info.

To install Dagster into an existing Python environment, run:

pip install dagster

This installs the latest stable version of the core Dagster packages in your current Python environment.
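
To confirm that the install worked, one option is to look up the installed version from Python. This is a minimal sketch that relies only on the standard library's importlib.metadata module, which assumes Python 3.8 or newer. The helper name installed_version is made up for this example and is not part of Dagster.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("dagster")
    if found is None:
        print("dagster is not installed in this environment")
    else:
        print(f"dagster {found} is installed")
```

If the script reports that dagster is missing, the pip command above probably ran against a different Python environment than the one running the script.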


Step 2: Define assets

To get started, we'll define two simple data assets:

  • A cereals asset that represents a CSV dataset about breakfast cereals, and
  • A nabisco_cereals asset, which is a downstream dependency of cereals and only contains cereals manufactured by Nabisco

Copy the following code and save it in a file named cereal.py in your working directory:

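import csv
import requests
from dagster import asset


@asset
def cereals():
    """Breakfast cereal data, downloaded as CSV and parsed into a list of dicts"""
    response = requests.get("https://docs.dagster.io/assets/cereal.csv")
    lines = response.text.split("\n")
    # Each row becomes a dict keyed by the CSV header names
    return [row for row in csv.DictReader(lines)]


@asset
def nabisco_cereals(cereals):
    """Cereals manufactured by Nabisco"""
    # The parameter name matches the upstream asset, so Dagster passes in its value
    return [row for row in cereals if row["mfr"] == "N"]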
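
To see what the two assets compute without a network call or a Dagster run, the same parsing and filtering can be sketched in plain Python. The inline CSV below is an invented three-row sample, not the real dataset, and the helper names parse_cereals and filter_nabisco are hypothetical. Only the mfr column and the "N" code come from the code above.

```python
import csv

# Invented sample rows for illustration; the real file has more rows and columns.
SAMPLE_CSV = "name,mfr\nBran Flakes Example,N\nCorn Example,K\nWheat Example,N"


def parse_cereals(text):
    """Mirror the body of `cereals`: split into lines, then parse each row into a dict."""
    lines = text.split("\n")
    return [row for row in csv.DictReader(lines)]


def filter_nabisco(rows):
    """Mirror the body of `nabisco_cereals`: keep rows whose manufacturer code is "N"."""
    return [row for row in rows if row["mfr"] == "N"]


if __name__ == "__main__":
    rows = parse_cereals(SAMPLE_CSV)
    for row in filter_nabisco(rows):
        print(row["name"])
```

Keeping the logic in plain functions like these is one way to try it out before wrapping it in @asset.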

Step 3: Materialize the assets

Next, you'll materialize the assets. Materialization computes an asset's contents and writes them to persistent storage. By default, the contents are written to a pickle file on the local filesystem.
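
As a rough illustration of what writing an asset's contents to a pickle file involves, the sketch below pickles a list of rows to a file and reads it back using only the standard library. It is not Dagster's storage code. The helper names, the file name, and the sample row are placeholders chosen for this example.

```python
import pickle
import tempfile
from pathlib import Path


def save_rows(rows, path):
    """Serialize `rows` to `path` with pickle."""
    with open(path, "wb") as f:
        pickle.dump(rows, f)


def load_rows(path):
    """Read back whatever object was pickled at `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    rows = [{"name": "Wheat Example", "mfr": "N"}]  # placeholder data
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "nabisco_cereals.pickle"  # hypothetical file name
        save_rows(rows, target)
        # The object read back should be equal to the one that was written
        print(load_rows(target) == rows)
```

Because the stored value is an ordinary pickled Python object, a downstream step can load it and keep working with the same list of dicts.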

There are a few ways to materialize an asset:

Using Dagit

Dagit is a web-based interface for viewing and interacting with Dagster objects.

  1. To install Dagit, run:

    pip install dagit
    
  2. To launch Dagit, run:

    dagit -f cereal.py
    

    You should see output similar to:

    Serving dagit on http://127.0.0.1:3000 in process 70635
    
  3. Navigate to http://localhost:3000 in your web browser to view your assets.

  4. Click the Materialize All button to launch a run that materializes the assets.

Using the Dagster Python API

You can also use the Dagster Python API to materialize the assets from a script.

Add the following lines to cereal.py to execute a run within the Python process:

from dagster import materialize

if __name__ == "__main__":
    materialize([cereals, nabisco_cereals])

Now you can run:

python cereal.py

What's next?

Congrats! You just created and materialized your first Dagster assets. Here are some places to go from here:

  • Learn about Dagster with hands-on examples using our tutorials
  • Get the most out of Dagster by familiarizing yourself with its core concepts
  • Accomplish common tasks using our step-by-step guides
  • Deploy Dagster to your platform of choice with our deployment guides

If you get stuck or have any other questions, we'd love to hear from you on Slack.