# Dagster
> Dagster is a data orchestrator built for data engineers, with integrated lineage, observability, a declarative programming model, and best-in-class testability. It is designed for developing and maintaining data assets, such as tables, data sets, machine learning models, and reports. With Dagster, you declare—as Python functions—the data assets that you want to build. Dagster then helps you run your functions at the right time and keep your assets up-to-date.
Dagster's design allows it to model and manage the flow of data and the execution of compute tasks across various systems, which can include tasks such as data ingestion, transformation, and analysis. Unlike traditional task-centric orchestrators, Dagster's core abstractions of 'ops', 'assets', and 'resources' facilitate code-native pipeline definitions with an asset-first approach that focuses on the data products you want to create.
**Key Features:**
- **Asset-centric orchestration**: Model data assets (tables, ML models, reports) rather than just tasks
- **Software engineering best practices**: Built to be used at every stage of the data development lifecycle - local development, unit tests, integration tests, staging environments, all the way up to production
- **Integrated observability**: Built-in lineage tracking, data quality monitoring, and operational metadata
- **Python-native**: Python-native data orchestrator for complex, modern data pipelines
- **Flexible deployment**: From local development to production clusters
- **Rich integrations**: Works with dbt, Snowflake, Spark, Databricks, and other modern data tools
**Core Concepts:**
- **Assets**: The fundamental concept in Dagster; they represent the tangible outputs of your data pipelines and are ultimately the end products your stakeholders care about
- **Ops**: Individual units of computation that can be composed into jobs
- **Jobs**: Collections of ops that define how to compute a set of outputs, and the unit of execution you launch and schedule