Dagster supports versioning code and memoizing previously computed outputs based on those versions. This page lists the APIs related to versioning and memoization.
dagster.VersionStrategy
Abstract class for defining a strategy to version solids and resources.
When subclassing, get_solid_version must be implemented; get_resource_version is optional.
get_solid_version receives a SolidVersionContext, and get_resource_version receives a ResourceVersionContext. From that context, each method synthesizes a unique string called a version, which is tagged to the outputs of that solid in the pipeline. Providing a VersionStrategy instance to a job enables memoization on that job, so that only steps whose outputs do not have an up-to-date version will run.
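The memoization behavior described above can be pictured with a toy model in plain Python. This is a sketch of the idea only, not Dagster's implementation: `code_version`, `run_memoized`, and the `store` dictionary are hypothetical names, and hashing a function's bytecode is just one assumed way to derive a version string.

```python
import hashlib


def code_version(fn):
    """Derive a deterministic version string from a function's compiled code.

    Assumption: hashing bytecode and constants is one possible strategy;
    a real strategy might hash source text or use hand-written versions.
    """
    code = fn.__code__
    digest = hashlib.sha256()
    digest.update(code.co_code)
    digest.update(repr(code.co_consts).encode("utf-8"))
    return digest.hexdigest()[:12]


def run_memoized(steps, store):
    """Run only the steps whose current version has no stored output.

    steps: list of (step_name, zero-argument function) pairs.
    store: dict mapping (step_name, version) to a previously computed output.
    Returns the names of the steps that were actually executed.
    """
    executed = []
    for name, fn in steps:
        key = (name, code_version(fn))
        if key not in store:  # no up-to-date output, so the step must run
            store[key] = fn()
            executed.append(name)
    return executed


def load_numbers():
    return [1, 2, 3]


def count_rows():
    return 3


store = {}
first = run_memoized([("load", load_numbers), ("count", count_rows)], store)
second = run_memoized([("load", load_numbers), ("count", count_rows)], store)


def count_rows_v2():
    return 4  # changed logic yields a different version string


third = run_memoized([("load", load_numbers), ("count", count_rows_v2)], store)
```

In this model the first run executes both steps, the second run executes nothing, and the third run re-executes only the step whose code changed. In Dagster itself, the version computation would live in the get_solid_version method of a VersionStrategy subclass rather than in a free function.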
dagster.MemoizableIOManager
Base class for IO managers enabled to work with memoized execution. Users should implement the load_input and handle_output methods described in the IOManager API, and the has_output method, which returns a boolean representing whether a data object can be found.
has_output(context)
The user-defined method that returns whether data exists given the metadata.
Parameters: context (OutputContext) – The context of the step performing this check.
Returns: True if there is data present that matches the provided context. False otherwise.
See also: dagster.IOManager.
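The three methods named above can be sketched with a dictionary-backed stand-in written in plain Python. It deliberately does not import Dagster: `FakeOutputContext`, `FakeInputContext`, and `InMemoryMemoizableIOManager` are hypothetical names, and the assumption that a context exposes `step_key`, `name`, and `version` fields should be checked against the OutputContext API before relying on it.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class FakeOutputContext:
    """Hypothetical stand-in for OutputContext; the fields are assumptions."""
    step_key: str
    name: str
    version: str


@dataclass(frozen=True)
class FakeInputContext:
    """Hypothetical stand-in for InputContext, pointing at the upstream output."""
    upstream_output: FakeOutputContext


class InMemoryMemoizableIOManager:
    """Toy IO manager keeping outputs in a dict keyed by step, output, version."""

    def __init__(self) -> None:
        self._values: Dict[Tuple[str, str, str], Any] = {}

    @staticmethod
    def _key(context: FakeOutputContext) -> Tuple[str, str, str]:
        return (context.step_key, context.name, context.version)

    def handle_output(self, context: FakeOutputContext, obj: Any) -> None:
        # Store the object under the version it was produced with.
        self._values[self._key(context)] = obj

    def load_input(self, context: FakeInputContext) -> Any:
        # Read back whatever the upstream step stored.
        return self._values[self._key(context.upstream_output)]

    def has_output(self, context: FakeOutputContext) -> bool:
        # True only if data exists for this exact step, output, and version.
        return self._key(context) in self._values


manager = InMemoryMemoizableIOManager()
v1 = FakeOutputContext(step_key="clean_data", name="result", version="abc123")
v2 = FakeOutputContext(step_key="clean_data", name="result", version="def456")

before = manager.has_output(v1)   # nothing stored yet
manager.handle_output(v1, [1, 2, 3])
after = manager.has_output(v1)    # data now exists for version abc123
stale = manager.has_output(v2)    # a different version has no stored data
loaded = manager.load_input(FakeInputContext(upstream_output=v1))
```

Including the version in the storage key is the design choice that makes the check meaningful: a stored output only counts as present when it was produced by the current version of the step.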
dagster.MEMOIZED_RUN_TAG
Provide this tag to a run to toggle memoization on or off. {MEMOIZED_RUN_TAG: "true"} toggles memoization on, while {MEMOIZED_RUN_TAG: "false"} toggles memoization off.