Changelog#

1.6.11 (core) / 0.22.11 (libraries)#

Bugfixes#

  • Fixed an issue where dagster dev or the Dagster UI would display an error when loading jobs created with op or asset selections.

1.6.10 (core) / 0.22.10 (libraries)#

New#

  • Latency improvements to the scheduler when running many simultaneous schedules.

Bugfixes#

  • The performance of loading the Definitions snapshot from a code server when large @multi_assets are in use has been drastically improved.
  • The snowflake quickstart example project now renames the “by” column to avoid reserved snowflake names. Thanks @jcampbell!
  • The existing group name (if any) for an asset is now retained if the_asset.with_attributes is called without providing a group name. Previously, the existing group name was erroneously dropped. Thanks @ion-elgreco!
  • [dagster-dbt] Fixed an issue where Dagster events could not be streamed from dbt source freshness.
  • [dagster university] Removed redundant use of MetadataValue in Essentials course. Thanks @stianthaulow!
  • [ui] Increased the max number of plots on the asset plots page to 100.

Breaking Changes#

  • The tag_keys argument on DagsterInstance.get_run_tags is no longer optional. This has been done to remove an easy way of accidentally executing an extremely expensive database operation.

Dagster Cloud#

  • The maximum number of concurrent runs across all branch deployments is now configurable. This setting can now be set via GraphQL or the CLI.
  • [ui] In Insights, fixed display of table rows with zero change in value from the previous time period.
  • [ui] Added deployment-level Insights.
  • [ui] Fixed an issue causing void invoices to show up as “overdue” on the billing page.
  • [experimental] Branch deployments can now indicate the new and modified assets in the branch deployment as compared to the main deployment. To enable this feature, turn on the “Enable experimental branch deployment asset graph diffing” user setting.

1.6.9 (core) / 0.22.9 (libraries)#

New#

  • [ui] When viewing logs for a run, the date for a single log row is now shown in the tooltip on the timestamp. This helps when viewing a run that takes place over more than one date.
  • Added suggestions to the error message when selecting asset keys that do not exist as an upstream asset or in an AssetSelection.
  • Improved error messages when trying to materialize a subset of a multi-asset which cannot be subset.
  • [dagster-snowflake] dagster-snowflake now requires snowflake-connector-python>=3.4.0
  • [embedded-elt] @sling_assets accepts an optional name parameter for the underlying op
  • [dagster-openai] dagster-openai library is now available.
  • [dagster-dbt] Added a new setting on DagsterDbtTranslatorSettings called enable_duplicate_source_asset_keys that allows users to set duplicate asset keys for their dbt sources. Thanks @hello-world-bfree!
  • Log messages in the Dagster daemon for unloadable sensors and schedules have been removed.
  • [ui] Search now uses a cache that persists across pageloads which should greatly improve search performance for very large orgs.
  • [ui] Groups/code locations in the asset graph’s sidebar are now sorted alphabetically.

Bugfixes#

  • Fixed issue where the input/output schemas of configurable IOManagers could be ignored when providing explicit input / output run config.
  • Fixed an issue where enum values could not properly have a default value set in a ConfigurableResource.
  • Fixed an issue where graph-backed assets would sometimes lose user-provided descriptions due to a bug in internal copying.
  • [auto-materialize] Fixed an issue introduced in 1.6.7 where updates to ExternalAssets would be ignored when using AutoMaterializePolicies which depended on parent updates.
  • [asset checks] Fixed a bug with asset checks in step launchers.
  • [embedded-elt] Fixed a bug when creating a SlingConnectionResource where a blank keyword argument would be emitted as an environment variable.
  • [dagster-dbt] Fixed a bug where emitting events from dbt source freshness would cause an error.
  • [ui] Fixed a bug where using the “Terminate all runs” button with filters selected would not apply the filters to the action.
  • [ui] Fixed an issue where typing a search query into the search box before the search data was fetched would yield “No results” even after the data was fetched.

Community Contributions#

  • [docs] fixed typo in embedded-elt.mdx (thanks @cameronmartin)!
  • [dagster-databricks] log the url for the run of a databricks job (thanks @smats0n)!
  • Fix missing partition property (thanks christeefy)!
  • Add op_tags to @observable_source_asset decorator (thanks @maxfirman)!
  • [docs] typo in MultiPartitionMapping docs (thanks @dschafer)
  • Allow github actions to checkout branch from forked repo for docs changes (ci fix) (thanks hainenber)!

Experimental#

  • [asset checks] UI performance of asset checks related pages has been improved.
  • [dagster-dbt] The class DbtArtifacts has been added for managing the behavior of rebuilding the manifest during development but expecting a pre-built one in production.

Documentation#

  • Added example of writing compute logs to AWS S3 when customizing agent configuration.
  • "Hello, Dagster" is now "Dagster Quickstart" with the option to use a Github Codespace to explore Dagster.
  • Improved guides and reference to better cover running multiple isolated agents with separate queues on ECS.

Dagster Cloud#

  • Microsoft Teams is now supported for alerts. Documentation
  • A send sample alert button now exists on both the alert policies page and in the alert policies editor to make it easier to debug and configure alerts without having to wait for an event to kick them off.

1.6.8 (core) / 0.22.8 (libraries)#

Bugfixes#

  • [dagster-embedded-elt] Fixed a bug in the SlingConnectionResource that raised an error when connecting to a database.

Experimental#

  • [asset checks] graph_multi_assets with check_specs now support subsetting.

1.6.7 (core) / 0.22.7 (libraries)#

New#

  • Added a new run_retries.retry_on_op_or_asset_failures setting that can be set to false to make run retries only occur when there is an unexpected failure that crashes the run, allowing run-level retries to co-exist more naturally with op or asset retries. See the docs for more information.
  • dagster dev now sets the environment variable DAGSTER_IS_DEV_CLI allowing subprocesses to know that they were launched in a development context.
  • [ui] The Asset Checks page has been updated to show more information on the page itself rather than in a dialog.
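A subprocess can use the DAGSTER_IS_DEV_CLI variable mentioned above to detect that it was started under dagster dev. The following is a minimal sketch: the helper name launched_from_dagster_dev is hypothetical, and because the changelog does not state what value the variable is set to, the check only tests for its presence.

```python
import os
from typing import Mapping, Optional


def launched_from_dagster_dev(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when DAGSTER_IS_DEV_CLI is present in the environment.

    Hypothetical helper: only the variable's presence is checked, since its
    value is not documented in the changelog entry.
    """
    # Fall back to the real process environment; a mapping can be passed for testing.
    environ = os.environ if env is None else env
    return "DAGSTER_IS_DEV_CLI" in environ


mode = "development" if launched_from_dagster_dev() else "non-development"
print(f"Running in a {mode} context")
```

Passing the environment as an argument keeps the check easy to exercise without mutating os.environ.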

Bugfixes#

  • [ui] Fixed an issue where the UI disallowed creating a dynamic partition if its name contained the “|” pipe character.
  • AssetSpec previously dropped the metadata and code_version fields, resulting in them not being attached to the corresponding asset. This has been fixed.

Experimental#

  • The new @multi_observable_source_asset decorator enables defining a set of assets that can be observed together with the same function.
  • [dagster-embedded-elt] New Asset Decorator @sling_assets and Resource SlingConnectionResource have been added for the dagster-embedded-elt.sling package. Deprecated build_sling_asset, SlingSourceConnection and SlingTargetConnection.
  • Added support for op-concurrency aware run dequeuing for the QueuedRunCoordinator.

Documentation#

  • Fixed reference documentation for isolated agents in ECS.
  • Corrected an example in the Airbyte Cloud documentation.
  • Added API links to OSS Helm deployment guide.
  • Fixed in-line pragmas showing up in the documentation.

Dagster Cloud#

  • Alerts now support Microsoft Teams.
  • [ECS] Fixed an issue where code locations could be left undeleted.
  • [ECS] ECS agents now support setting multiple replicas per code server.
  • [Insights] You can now toggle the visibility of a row in the chart by clicking on the dot for the row in the table.
  • [Users] Added a new column “Licensed role” that shows the user's most permissive role.

1.6.6 (core) / 0.22.6 (libraries)#

New#

  • Dagster officially supports Python 3.12.
  • dagster-polars has been added as an integration. Thanks @danielgafni!
  • [dagster-dbt] @dbt_assets now supports loading projects with semantic models.
  • [dagster-dbt] @dbt_assets now supports loading projects with model versions.
  • [dagster-dbt] get_asset_key_for_model now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok!
  • [dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
  • [UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.

Bugfixes#

  • Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
  • Fixed an issue with the type annotations on the @asset decorator causing a false positive in Pyright strict mode. Thanks @tylershunt!
  • [ui] On the asset graph, nodes are slightly wider allowing more text to be displayed, and group names are no longer truncated.
  • [ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
  • [dagster-k8s] Fixed an issue where setting the security_context field on the k8s_job_executor didn't correctly set the security context on the launched step pods. Thanks @krgn!

Experimental#

  • Observable source assets can now yield ObserveResults with no data_version.
  • You can now include FreshnessPolicys on observable source assets. These assets will be considered “Overdue” when the latest value for the “dagster/data_time” metadata value is older than what’s allowed by the freshness policy.
  • [ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.
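The "Overdue" rule for observable source assets described above can be illustrated in plain Python. This is a sketch of the comparison only, not Dagster's implementation: the function is_overdue is hypothetical, it assumes a policy expressed solely as a maximum lag in minutes (mirroring FreshnessPolicy's maximum_lag_minutes), and it ignores cron-based policies.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(data_time: datetime, maximum_lag_minutes: float, now: datetime) -> bool:
    """Hypothetical check: the asset is overdue when its data time is older than the allowed lag."""
    return now - data_time > timedelta(minutes=maximum_lag_minutes)


now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2024, 3, 1, 11, 30, tzinfo=timezone.utc)  # observed data is 30 minutes old
stale = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)    # observed data is 3 hours old

print(is_overdue(fresh, maximum_lag_minutes=60, now=now))  # False
print(is_overdue(stale, maximum_lag_minutes=60, now=now))  # True
```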

Documentation#

  • Updated docs to reflect newly-added support for Python 3.12.

Dagster Cloud#

  • [kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling Kubernetes services if the agent was interrupted while it was being terminated.

1.6.5 (core) / 0.22.5 (libraries)#

New#

  • Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
  • [dagster-k8s] Include k8s pod debug info in run worker failure messages.
  • [dagster-dbt] Events emitted by DbtCliResource now include metadata from the dbt adapter response. This includes fields like rows_affected, query_id from the Snowflake adapter, or bytes_processed from the BigQuery adapter.
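Lexicographical ordering of partition keys, as mentioned in the first item above, compares keys as strings rather than as dates or numbers. The sketch below assumes plain string comparison as performed by Python's sorted; the helper submission_order is hypothetical and is not Dagster's own code.

```python
from typing import Iterable, List


def submission_order(partition_keys: Iterable[str]) -> List[str]:
    """Hypothetical helper: order partition keys by plain string comparison."""
    return sorted(partition_keys)


# Zero-padded date keys sort chronologically under string comparison.
print(submission_order(["2024-01-10", "2024-01-02", "2023-12-31"]))
# ['2023-12-31', '2024-01-02', '2024-01-10']

# Unpadded numeric keys do not sort numerically: "10" comes before "2".
print(submission_order(["9", "10", "2"]))
# ['10', '2', '9']
```

If numeric ordering matters for unpadded keys, zero-padding them (for example "02", "09", "10") makes the string order match the numeric order.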

Bugfixes#

  • A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
  • [dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the k8s_job_executor.
  • [instigator-tick-logs] Fixed an issue where invoking context.log.exception in a sensor or schedule did not properly capture exception information.
  • [asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
  • [dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.

Experimental#

  • @observable_source_asset-decorated functions can now return an ObserveResult. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.
  • [auto-materialize] A new AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron class allows you to construct AutoMaterializePolicys which wait for all parents to be updated after the latest tick of a given cron schedule.
  • [Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.

Documentation#

  • Fixed an error in our asset checks docs. Thanks @vaharoni!
  • Fixed an error in our Dagster Pipes Kubernetes docs. Thanks @cameronmartin!
  • Fixed an issue on the Hello Dagster! guide that prevented it from loading.
  • Add specific capabilities of the Airflow integration to the Airflow integration page.
  • Re-arranged sections in the I/O manager concept page to make info about using I/O versus resources more prominent.

0.15.1#

New#

  • When Dagster loads an event from the event log of a type that it doesn’t recognize (for example, because it was created by a newer version of Dagster) it will now return a placeholder event rather than raising an exception.
  • AssetsDefinition.from_graph() now accepts a group_name parameter. All assets created by from_graph are assigned to this group.
  • You can define an asset from an op via a new utility method AssetsDefinition.from_op. Dagster will infer asset inputs and outputs from the ins/outs defined on the @op in the same way as @graphs.
  • A default executor definition can be defined on a repository using the default_executor_def argument. The default executor definition will be used for all op/asset jobs that don’t explicitly define their own executor.
  • JobDefinition.run_request_for_partition now accepts a tags argument (Thanks @jburnich!)
  • In Dagit, the graph canvas now has a dotted background to help it stand out from the rest of the UI.
  • @multi_asset now accepts a resource_defs argument. The provided resources can be either used on the context, or satisfy the io manager requirements of the outs on the asset.
  • In Dagit, show execution timezone on cron strings, and use 12-hour or 24-hour time format depending on the user’s locale.
  • In Dagit, when viewing a run and selecting a specific step in the Gantt chart, the compute log selection state will now update to that step as well.
  • define_asset_job and to_job can now accept a partitions_def argument and a config argument at the same time, as long as the value for the config argument is a hardcoded config dictionary (not a PartitionedConfig or ConfigMapping).

Bugfixes#

  • Fixed an issue where entering a string in the launchpad that is valid YAML but invalid JSON would render incorrectly in Dagit.
  • Fixed an issue where steps using the k8s_job_executor and docker_executor would sometimes return the same event lines twice in the command-line output for the step.
  • Fixed type annotations on the @op decorator (Thanks Milos Tomic!)
  • Fixed an issue where job backfills were not displayed correctly on the Partition view in Dagit.
  • UnresolvedAssetJobDefinition now supports the run_request_for_partition method.
  • Fixed an issue in Dagit where the Instance Overview page would briefly flash a loading state while loading fresh data.

Breaking Changes#

  • Runs that were executed in newer versions of Dagster may produce errors when their event logs are loaded in older versions of Dagit, due to new event types that were recently added. Going forward, Dagit has been made more resilient to handling new events.

Deprecations#

  • Updated deprecation warnings to clarify that the deprecated metadata APIs will be removed in 0.16.0, not 0.15.0.

Experimental#

  • If two assets are in the same group and the upstream asset has a multi-segment asset key, the downstream asset doesn’t need to specify the full asset key when declaring its dependency on the upstream asset - just the last segment.

Documentation#

  • Added dedicated sections for op, graph, and job Concept docs in the sidenav
  • Moved graph documentation from the jobs docs into its own page
  • Added documentation for assigning asset groups and viewing them in Dagit
  • Added apidoc for AssetOut and AssetIn
  • Fixed a typo on the Run Configuration concept page (Thanks Wenshuai Hou!)
  • Updated screenshots in the software-defined assets tutorial to match the new Dagit UI
  • Fixed a typo in the Defining an asset section of the software-defined assets tutorial (Thanks Daniel Kim!)

0.15.0 "Cool for the Summer"#

Major Changes#

  • Software-defined assets are now marked fully stable and are ready for prime time - we recommend using them whenever your goal using Dagster is to build and maintain data assets.

  • You can now organize software defined assets into groups by providing a group_name on your asset definition. These assets will be grouped together in Dagit.

  • Software-defined assets now accept configuration, similar to ops. E.g.

    from dagster import asset
    
    @asset(config_schema={"iterations": int})
    def my_asset(context):
        for i in range(context.op_config["iterations"]):
            ...
    
  • Asset definitions can now be created from graphs via AssetsDefinition.from_graph:

    @graph(out={"asset_one": GraphOut(), "asset_two": GraphOut()})
    def my_graph(input_asset):
        ...
    
    graph_asset = AssetsDefinition.from_graph(my_graph)
    
  • execute_in_process and GraphDefinition.to_job now both accept an input_values argument, so you can pass arbitrary Python objects to the root inputs of your graphs and jobs.

  • Ops that return Outputs and DynamicOutputs now work well with Python type annotations. You no longer need to sacrifice static type checking just because you want to include metadata on an output. E.g.

    from dagster import Output, op
    
    @op
    def my_op() -> Output[int]:
        return Output(5, metadata={"a": "b"})
    
  • You can now automatically re-execute runs from failure. This is analogous to op-level retries, except at the job level.

  • You can now supply arbitrary structured metadata on jobs, which will be displayed in Dagit.

  • The partitions and backfills pages in Dagit have been redesigned to be faster and show the status of all partitions, instead of just the last 30 or so.

  • The left navigation pane in Dagit is now grouped by repository, which makes it easier to work with when you have large numbers of jobs, especially when jobs in different repositories have the same name.

  • The Asset Details page for a software-defined asset now includes a Lineage tab, which makes it easy to see all the assets that are upstream or downstream of an asset.

Breaking Changes and Deprecations#

Software-defined assets#

This release marks the official transition of software-defined assets from experimental to stable. We made some final changes to incorporate feedback and make the APIs as consistent as possible:

  • Support for adding tags to asset materializations, which was previously marked as experimental, has been removed.
  • Some of the properties of the previously-experimental AssetsDefinition class have been renamed: group_names is now group_names_by_key, asset_keys_by_input_name is now keys_by_input_name, asset_keys_by_output_name is now keys_by_output_name, asset_key is now key, and asset_keys is now keys.
  • Removed the previously-experimental IO manager fs_asset_io_manager in favor of merging its functionality with fs_io_manager. fs_io_manager is now the default IO manager for asset jobs, and will store asset outputs in a directory named with the asset key. Similarly, removed adls2_pickle_asset_io_manager, gcs_pickle_asset_io_manager, and s3_pickle_asset_io_manager. Instead, adls2_pickle_io_manager, gcs_pickle_io_manager, and s3_pickle_io_manager now support software-defined assets.
  • (deprecation) The namespace argument on the @asset decorator and AssetIn has been deprecated. Users should use key_prefix instead.
  • (deprecation) AssetGroup has been deprecated. Users should instead place assets directly on repositories, optionally attaching resources using with_resources. Asset jobs should be defined using define_asset_job (replacing AssetGroup.build_job), and arbitrary sets of assets can be materialized using the standalone function materialize (replacing AssetGroup.materialize).
  • (deprecation) The outs property of the previously-experimental @multi_asset decorator now prefers a dictionary whose values are AssetOut objects instead of a dictionary whose values are Out objects. The latter still works, but is deprecated.
  • The previously-experimental property on OpExecutionContext called output_asset_partition_key is now deprecated in favor of asset_partition_key_for_output

Event records#

  • The get_event_records method on DagsterInstance now requires a non-None argument event_records_filter. Passing a None value for the event_records_filter argument will now raise an exception where previously it generated a deprecation warning.
  • Removed methods events_for_asset_key and get_asset_events, which have been deprecated since 0.12.0.

Extension libraries#

  • [dagster-dbt] (breaks previously-experimental API) When using load_assets_from_dbt_project or load_assets_from_dbt_manifest, the AssetKeys generated for dbt sources are now the union of the source name and the table name, and the AssetKeys generated for models are now the union of the configured schema name for a given model (if any), and the model name. To revert to the old behavior: dbt_assets = load_assets_from_dbt_project(..., node_info_to_asset_key=lambda node_info: AssetKey(node_info["name"])).
  • [dagster-k8s] In the Dagster Helm chart, user code deployment configuration (like secrets, configmaps, or volumes) is now automatically included in any runs launched from that code. Previously, this behavior was opt-in. In most cases, this will not be a breaking change, but in less common cases where a user code deployment was running in a different kubernetes namespace or using a different service account, this could result in missing secrets or configmaps in a launched run that previously worked. You can return to the previous behavior where config on the user code deployment was not applied to any runs by setting the includeConfigInLaunchedRuns.enabled field to false for the user code deployment. See the Kubernetes Deployment docs for more details.
  • [dagster-snowflake] dagster-snowflake has dropped support for python 3.6. The library it is currently built on, snowflake-connector-python, dropped 3.6 support in their recent 2.7.5 release.

Other#

  • The prior_attempts_count parameter is now removed from step-launching APIs. This parameter was not being used, as the information it held was stored elsewhere in all cases. It can safely be removed from invocations without changing behavior.
  • The FileCache class has been removed.
  • Previously, when schedules/sensors targeted jobs with the same name as other jobs in the repo, the jobs on the sensor/schedule would silently overwrite the other jobs. Now, this will cause an error.

New since 0.14.20#

  • A new define_asset_job function allows you to define a selection of assets that should be executed together. The selection can be a simple string, or an AssetSelection object. This selection will be resolved into a set of assets once placed on the repository.

    from dagster import repository, define_asset_job, AssetSelection
    
    string_selection_job = define_asset_job(
        name="foo_job", selection="*foo"
    )
    object_selection_job = define_asset_job(
        name="bar_job", selection=AssetSelection.groups("some_group")
    )
    
    @repository
    def my_repo():
        return [
            *my_list_of_assets,
            string_selection_job,
            object_selection_job,
        ]
    
  • [dagster-dbt] Assets loaded with load_assets_from_dbt_project and load_assets_from_dbt_manifest will now be sorted into groups based on the subdirectory of the project that each model resides in.

  • @asset and @multi_asset are no longer considered experimental.

  • Adds new utility methods load_assets_from_modules, assets_from_current_module, assets_from_package_module, and assets_from_package_name to fetch and return a list of assets from within the specified python modules.

  • Resources and io managers can now be provided directly on assets and source assets.

    from dagster import asset, SourceAsset, resource, io_manager
    
    @resource
    def foo_resource():
        pass
    
    @asset(resource_defs={"foo": foo_resource})
    def the_resource(context):
        foo = context.resources.foo
    
    @io_manager
    def the_manager():
        ...
    
    @asset(io_manager_def=the_manager)
    def the_asset():
        ...
    

    Note that assets provided to a job must not have conflicting resources for the same key. For a given job, all resource definitions must match by reference equality for a given key.

  • A materialize_to_memory method which will load the materializations of a provided list of assets into memory:

    from dagster import asset, materialize_to_memory
    
    @asset
    def the_asset():
        return 5
    
    result = materialize_to_memory([the_asset])
    output = result.output_for_node("the_asset")
    
  • A with_resources method, which allows resources to be added to multiple assets / source assets at once:

    from dagster import asset, with_resources, resource
    
    @asset(required_resource_keys={"foo"})
    def requires_foo(context):
        ...
    
    @asset(required_resource_keys={"foo"})
    def also_requires_foo(context):
        ...
    
    @resource
    def foo_resource():
        ...
    
    requires_foo, also_requires_foo = with_resources(
        [requires_foo, also_requires_foo],
        {"foo": foo_resource},
    )
    
  • You can now include asset definitions directly on repositories. A default_executor_def property has been added to the repository, which will be used on any materializations of assets provided directly to the repository.

    from dagster import asset, repository, multiprocess_executor
    
    @asset
    def my_asset():
        ...
    
    @repository(default_executor_def=multiprocess_executor)
    def repo():
        return [my_asset]
    
  • The run_storage, event_log_storage, and schedule_storage configuration sections of the dagster.yaml can now be replaced by a unified storage configuration section. This should avoid duplicate configuration blocks within your dagster.yaml. For example, instead of:

    # dagster.yaml
    run_storage:
      module: dagster_postgres.run_storage
      class: PostgresRunStorage
      config:
        postgres_url: { PG_DB_CONN_STRING }
    event_log_storage:
      module: dagster_postgres.event_log
      class: PostgresEventLogStorage
      config:
        postgres_url: { PG_DB_CONN_STRING }
    schedule_storage:
      module: dagster_postgres.schedule_storage
      class: PostgresScheduleStorage
      config:
        postgres_url: { PG_DB_CONN_STRING }
    

    You can now write:

    storage:
      postgres:
        postgres_url: { PG_DB_CONN_STRING }
    
  • All assets where a group_name is not provided are now part of a group called default.

  • The group_name parameter value for @asset is now restricted to only allow letters, numbers and underscore.

  • You can now set policies to automatically retry Job runs. This is analogous to op-level retries, except at the job level. By default the retries pick up from failure, meaning only failed ops and their dependents are executed.

  • [dagit] The new repository-grouped left navigation is fully launched, and is no longer behind a feature flag.

  • [dagit] The left navigation can now be collapsed even when the viewport window is wide. Previously, the navigation was collapsible only for small viewports, but kept in a fixed, visible state for wide viewports. This visible/collapsed state for wide viewports is now tracked in localStorage, so your preference will persist across sessions.

  • [dagit] Queued runs can now be terminated from the Run page.

  • [dagit] The log filter on a Run page now shows counts for each filter type, and the filters have higher contrast and a switch to indicate when they are on or off.

  • [dagit] The partitions and backfill pages have been redesigned to focus on easily viewing the last run state by partition. These redesigned pages were previously gated behind a feature flag — they are now loaded by default.

  • [dagster-k8s] Overriding labels in the K8sRunLauncher will now apply to both the Kubernetes job and the Kubernetes pod created for each run, instead of just the Kubernetes pod.
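The group_name restriction listed above (letters, numbers, and underscores only) can be checked ahead of time with a small validator. This is a sketch under assumptions: the function is_valid_group_name and its regular expression are illustrative and may differ from the exact rule Dagster enforces, for example for non-ASCII letters.

```python
import re

# Assumed pattern: one or more ASCII letters, digits, or underscores.
_GROUP_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_group_name(name: str) -> bool:
    """Hypothetical pre-check for a group_name value."""
    return _GROUP_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_group_name("marketing_2024"))  # True
print(is_valid_group_name("marketing-team"))  # False: contains a hyphen
print(is_valid_group_name(""))                # False: empty string
```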

Bugfixes#

  • [dagster-dbt] In some cases, if Dagster attempted to rematerialize a dbt asset, but dbt failed to start execution, asset materialization events would still be emitted. This has been fixed.
  • [dagit] On the Instance Overview page, the popover showing details of overlapping batches of runs is now scrollable.
  • [dagit] When viewing Instance Overview, reloading a repository via controls in the left navigation could lead to an error that would crash the page due to a bug in client-side cache state. This has been fixed.
  • [dagit] When scrolling through a list of runs, scrolling would sometimes get stuck on certain tags, specifically those with content overflowing the width of the tag. This has been fixed.
  • [dagit] While viewing a job page, the left navigation item corresponding to that job will be highlighted, and the navigation pane will scroll to bring it into view.
  • [dagit] Fixed a bug where the “Scaffold config” button was always enabled.

Community Contributions#

  • You can now provide dagster-mlflow configuration parameters as environment variables, thanks @chasleslr!

Documentation#

  • Added a guide that helps users who are familiar with ops and graphs understand how and when to use software-defined assets.
  • Updated and reorganized docs to document software-defined assets changes since 0.14.0.
  • The Deploying in Docker example now includes an example of using the docker_executor to run each step of a job in a different Docker container.
  • Descriptions for the top-level fields of Dagit GraphQL queries, mutations, and subscriptions have been added.

0.14.20#

New#

  • [dagster-aws] Added an env_vars field to the EcsRunLauncher that allows you to configure environment variables in the ECS task for launched runs.
  • [dagster-k8s] The env_vars field on K8sRunLauncher and k8s_job_executor can now accept input of the form ENV_VAR_NAME=ENV_VAR_VALUE, and will set the value of ENV_VAR_NAME to ENV_VAR_VALUE. Previously, it only accepted input of the form ENV_VAR_NAME, and the environment variable had to be available in the pod launching the job.
  • [dagster-k8s] setting ‘includeConfigInLaunchedRuns’ on a user code deployment will now also include any image pull secrets from the user code deployment in the pod for the launched runs.
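The two env_vars input forms described above (NAME=VALUE, and a bare NAME whose value is read from the launching environment) can be sketched as follows. This is an illustration of the semantics only, not the dagster-k8s implementation: the function resolve_env_vars is hypothetical, and skipping a bare name that is missing from the environment is an assumption.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def resolve_env_vars(
    entries: Iterable[str], environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Hypothetical resolver for env_vars entries."""
    environ = os.environ if environ is None else environ
    resolved: Dict[str, str] = {}
    for entry in entries:
        # Split on the first "=" only, so values may themselves contain "=".
        name, separator, value = entry.partition("=")
        if separator:
            resolved[name] = value          # NAME=VALUE form: use the literal value
        elif name in environ:
            resolved[name] = environ[name]  # bare NAME form: copy from the environment
        # Assumption: a bare NAME that is not set in the environment is skipped.
    return resolved


launcher_env = {"AWS_REGION": "us-east-1"}
print(resolve_env_vars(["LOG_LEVEL=debug", "AWS_REGION", "MISSING"], launcher_env))
# {'LOG_LEVEL': 'debug', 'AWS_REGION': 'us-east-1'}
```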

Bugfixes#

  • A recent change had made it so that, when IOManager.load_input was called to load an asset that was not being materialized as part of the run, the provided context would not include the metadata for that asset. context.upstream_output.metadata now correctly returns the metadata on the upstream asset.
  • Fixed an issue where using generic type aliases introduced in Python 3.9 (like list[str]) as the type of an input would raise an exception.
  • [dagster-k8s] Fixed an issue where upgrading the Helm chart version without upgrading your user code deployment version would result in a “Received unexpected config entry "scheme" at path root:postgres_db” error.

0.14.19#

New#

  • Metadata can now be added to jobs (via the metadata parameter) and viewed in dagit. You can use it to track code owners, link to docs, or add other useful information.
  • In the Dagit launchpad, the panel below the config editor now shows more detailed information about the state of the config, including error state and whether the config requires further scaffolding or the removal of extra config.
  • FileCache is now marked for deprecation in 0.15.0.
  • In Dagit, the asset catalog now shows the last materialization for each asset and links to the latest run.
  • Assets can now have a config_schema. If you attempt to materialize an asset with a config schema in Dagit, you'll be able to enter the required config via a modal.

Bugfixes#

  • [helm] Fixed an issue where string floats and integers were not properly templated as image tags.
  • [dagster-k8s] Fixed an issue when using the k8s_job_executor where ops with long names sometimes failed to create a pod due to a validation error with the label names automatically generated by Dagster.
  • [dagster-aws] Fixed an issue where ECS tasks with large container contexts would sometimes fail to launch because their request to the ECS RunTask API was too large.
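
As background for the k8s_job_executor fix above: Kubernetes limits label values to 63 characters, restricts them to alphanumerics, "-", "_", and ".", and requires them to start and end with an alphanumeric character. The sketch below shows one way a long or irregular name could be coerced to fit. sanitize_label_value is a hypothetical helper, and the changelog does not say how Dagster itself generates its labels.

```python
import re

MAX_LABEL_LENGTH = 63  # Kubernetes limit for a label value


def sanitize_label_value(raw: str) -> str:
    """Hypothetical helper: coerce a string into a valid Kubernetes label value."""
    # Replace every character outside the allowed set with an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", raw)
    # Truncate first, then trim non-alphanumeric characters from both ends,
    # so the result still starts and ends with an alphanumeric character.
    cleaned = cleaned[:MAX_LABEL_LENGTH]
    return cleaned.strip("_.-")
```

Note that truncation can make two distinct long names collide; a real implementation would likely need a disambiguating suffix such as a hash, which this sketch omits.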

Breaking Changes#

  • fs_asset_io_manager has been removed in favor of merging its functionality with fs_io_manager. fs_io_manager is now the default IO manager for asset jobs, and will store asset outputs in a directory named with the asset key.

Community Contributions#

  • Fixed a bug that broke the k8s_job_executor’s max_concurrent configuration. Thanks @fahadkh!
  • Fixed a bug that caused the fs_io_manager to incorrectly handle assets associated with upstream assets. Thanks @aroig!

Documentation#

  • [helm] Add documentation for code server image pull secrets in the main chart.
  • The Dagster README has been revamped with documentation and community links.

0.14.17#

New#

  • Added a pin to protobuf version 3 due to a backwards-incompatible change in the protobuf version 4 release.
  • [helm] The name of the Dagit deployment can now be overridden in the Dagster Helm chart.
  • [dagit] The left navigation now shows jobs as expandable lists grouped by repository. You can opt out of this change using the feature flag in User Settings.
  • [dagit] In the left navigation, when a job has more than one schedule or sensor, clicking the schedule/sensor icon will now display a dialog containing the full list of schedules and sensors for that job.
  • [dagit] Assets on the runs page are now shown in more scenarios.
  • [dagster-dbt] dbt assets now support subsetting! In Dagit, you can launch a dbt command that refreshes only the selected models, and when you’re building jobs using AssetGroup.build_job(), you can define selections that pick out subsets of the loaded dbt project.
  • [dagster-dbt][experimental] The load_assets_from_dbt_manifest function now supports an experimental select parameter. This allows you to use dbt selection syntax to select from an existing manifest.json file, rather than having Dagster re-compile the project on demand.
  • For software-defined assets, OpExecutionContext now exposes an asset_key_for_output method, which returns the asset key that one of the op’s outputs corresponds to.
  • The Backfills tab in Dagit loads much faster when there have been backfills that produced large numbers of runs.
  • Added the ability to run the Dagster Daemon as a Python module, by running python -m dagster.daemon.
  • The non_argument_deps parameter for the asset and multi_asset decorators can now be a set of strings in addition to a set of AssetKey.

Bugfixes#

  • [dagit] In cases where Dagit is unable to make successful WebSocket connections, run logs could become stuck in a loading state. Dagit will now time out on the WebSocket connection attempt after a brief period of time. This allows run logs to fall back to HTTP requests and move past the loading state.
  • In version 0.14.16, launching an asset materialization run with source assets would error with an InvalidSubsetError. This is now fixed.
  • Empty strings are no longer allowed as AssetKeys.
  • Fixed an issue where schedules built from partitioned job config always ran at midnight, ignoring any hour or minute offset that was specified on the config.
  • Fixed an issue where if the scheduler was interrupted and resumed in the middle of running a schedule tick that produced multiple RunRequests, it would show the same run ID multiple times on the list of runs for the schedule tick.
  • Fixed an issue where Dagit would raise a GraphQL error when a non-dictionary YAML string was entered into the Launchpad.
  • Fixed an issue where Dagster gRPC servers would sometimes raise an exception when loading repositories with many partition sets.
  • Fixed an issue where the snowflake_io_manager would sometimes raise an error with pandas 1.4 or later installed.
  • Fixed an issue where re-executing an entire set of dynamic steps together with their upstream step resulted in DagsterExecutionStepNotFoundError.
  • [dagit] Added loading indicator for job-scoped partition backfills.
  • Fixed an issue that made it impossible to have graph-backed assets with upstream SourceAssets.
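
To make the schedule offset fix above concrete, the sketch below shows how an hour and minute offset map onto a standard five-field cron string for a daily schedule. daily_cron_schedule is a hypothetical helper written for illustration, not a Dagster function.

```python
def daily_cron_schedule(hour_offset: int = 0, minute_offset: int = 0) -> str:
    """Hypothetical helper: build a daily cron string from hour/minute offsets."""
    if not 0 <= hour_offset <= 23:
        raise ValueError("hour_offset must be between 0 and 23")
    if not 0 <= minute_offset <= 59:
        raise ValueError("minute_offset must be between 0 and 59")
    # Cron field order: minute, hour, day of month, month, day of week.
    return f"{minute_offset} {hour_offset} * * *"
```

With no offsets the result is a midnight schedule, which is the behavior the bug produced regardless of the configured values; an offset of 3 hours and 15 minutes should instead yield a run at 03:15 each day.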

Community Contributions#

  • AssetIn can now accept a string that will be coerced to an AssetKey. Thanks @aroig!
  • Runtime type checks improved for some asset-related functions. Thanks @aroig!
  • Docs grammar fixes. Thanks @dwinston!
  • Dataproc ops for dagster-gcp now have user-configurable timeout length. Thanks @3cham!