Source¶
commands
¶
bundle
¶
command(metadata, default_key, overwrite, resource_generator, env, conf_source, pipeline, params)
¶
Databricks Asset Bundle commands
Source code in src/kedro_databricks/commands/bundle.py
deploy
¶
command(ctx, env, bundle, default_key, resource_generator, conf_source, pipeline, params, runtime_params, databricks_args)
¶
Deploy the Databricks Asset Bundle.
This function deploys the Databricks Asset Bundle in the current project directory. It also creates a Databricks configuration file and a Databricks target configuration file.
Source code in src/kedro_databricks/commands/deploy.py
destroy
¶
command(metadata, env, databricks_args)
¶
Databricks Asset Bundle Destroy commands
Source code in src/kedro_databricks/commands/destroy.py
init
¶
command(metadata, catalog, schema, default_key, env, conf_source, resource_generator, regex_prefix, overwrite, databricks_args)
¶
Initialize a Kedro project for Databricks Asset Bundles.
Source code in src/kedro_databricks/commands/init.py
run
¶
command(metadata, env, pipeline, databricks_args)
¶
Databricks Asset Bundle Run commands
Source code in src/kedro_databricks/commands/run.py
config
¶
Config
¶
Bases: BaseSettings
conf_source = Field(default='conf')
class-attribute
instance-attribute
¶
Path of a directory where project configuration is stored
default_env = Field(default='dev')
class-attribute
instance-attribute
¶
Default target environment for kedro-databricks commands.
init_catalog = Field(default='workspace')
class-attribute
instance-attribute
¶
Default catalog for kedro databricks init
init_schema = Field(default='default')
class-attribute
instance-attribute
¶
Default schema for kedro databricks init
regex_prefix = Field(default='re:')
class-attribute
instance-attribute
¶
Prefix to use for discovering regex workflow or task overrides. Must end in ':'
workflow_default_key = Field(default='default')
class-attribute
instance-attribute
¶
Default key to use for overrides in kedro databricks bundle
workflow_generator = Field(default='node')
class-attribute
instance-attribute
¶
Default generator to use for generating Databricks Asset Bundle resources
constants
¶
JOB_KEY_ORDER = ['name', 'description', 'parameters', 'environments', 'job_clusters', 'tasks', 'access_control_list', 'budget_policy_id', 'continuous', 'deployment', 'edit_mode', 'email_notifications', 'format', 'git_source', 'health', 'max_concurrent_runs', 'notification_settings', 'performance_target', 'queue', 'run_as', 'schedule', 'tags', 'timeout_seconds', 'trigger', 'webhook_notifications']
module-attribute
¶
Order of keys in the job configuration for Databricks jobs.
KEDRO_VERSION = Version(metadata.version('kedro'))
module-attribute
¶
Kedro version used to build this plugin.
MAX_TASK_KEY_LENGTH = 100
module-attribute
¶
Maximum number of characters in a task key in Databricks jobs.
MINIMUM_DATABRICKS_VERSION = [0, 205, 0]
module-attribute
¶
Minimum Databricks version required for this plugin.
TASK_KEY_ORDER = ['task_key', 'description', 'depends_on', 'environment_key', 'job_cluster_key', 'existing_cluster_id', 'libraries', 'new_cluster', 'compute', 'disable_auto_optimization', 'health', 'run_if', 'max_retries', 'min_retry_interval_millis', 'retry_on_timeout', 'timeout_seconds', 'notification_settings', 'email_notifications', 'webhook_notifications', 'alert_task', 'clean_rooms_notebook_task', 'condition_task', 'dashboard_task', 'dbt_task', 'for_each_task', 'notebook_task', 'pipeline_task', 'power_bi_task', 'python_wheel_task', 'run_job_task', 'spark_jar_task', 'spark_python_task', 'spark_submit_task', 'sql_task']
module-attribute
¶
Order of keys in the task configuration for Databricks jobs.
plugin
¶
commands()
¶
databricks_commands()
¶
Databricks Asset Bundle commands
These commands manage Databricks Asset Bundles in a Kedro project. They allow you to initialize, bundle, deploy, run, and destroy Databricks Asset Bundles.
Source code in src/kedro_databricks/plugin.py
resource_generator
¶
Resource generators for building Databricks bundle resources.
This package exposes concrete generators that transform Kedro pipelines into
Databricks Asset Bundle resources (jobs). Select the appropriate
generator via RESOURCE_GENERATORS to produce resources at node or pipeline
granularity.
AbstractResourceGenerator(session, metadata, conf_source='conf', params=None)
¶
Bases: ABC
Generates Databricks resources for the given pipelines.
Finds all pipelines in the project and generates Databricks asset bundle resources for each according to the Databricks REST API
Source code in src/kedro_databricks/resource_generator/abstract_resource_generator.py
generate_jobs(pipeline_name=None)
¶
Generate Databricks resources for the given pipelines.
Finds all pipelines in the project and generates Databricks asset bundle resources for each according to the Databricks REST API
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_name | str \| None | The name of the pipeline for which Databricks asset bundle resources should be generated. If None, generates all pipelines. | None |
Returns:
| Type | Description |
|---|---|
| dict[str, dict[str, Any]] | A dictionary of pipeline names and their Databricks resources |
Source code in src/kedro_databricks/resource_generator/abstract_resource_generator.py
NodeResourceGenerator(session, metadata, conf_source='conf', params=None)
¶
Bases: AbstractResourceGenerator
Generate a job with one Databricks task per Kedro node.
Source code in src/kedro_databricks/resource_generator/node_resource_generator.py
PipelineResourceGenerator(session, metadata, conf_source='conf', params=None)
¶
Bases: AbstractResourceGenerator
Generate a job with a single task for the whole pipeline.
Source code in src/kedro_databricks/resource_generator/abstract_resource_generator.py
abstract_resource_generator
¶
Base interfaces and helpers for Databricks resource generation.
This module defines the abstract generator responsible for converting Kedro pipelines into Databricks jobs according to the Databricks REST API. Concrete implementations specify how tasks are laid out (e.g., per-node or per-pipeline).
AbstractResourceGenerator(session, metadata, conf_source='conf', params=None)
¶
Bases: ABC
Generates Databricks resources for the given pipelines.
Finds all pipelines in the project and generates Databricks asset bundle resources for each according to the Databricks REST API
Source code in src/kedro_databricks/resource_generator/abstract_resource_generator.py
generate_jobs(pipeline_name=None)
¶
Generate Databricks resources for the given pipelines.
Finds all pipelines in the project and generates Databricks asset bundle resources for each according to the Databricks REST API
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_name | str \| None | The name of the pipeline for which Databricks asset bundle resources should be generated. If None, generates all pipelines. | None |
Returns:
| Type | Description |
|---|---|
| dict[str, dict[str, Any]] | A dictionary of pipeline names and their Databricks resources |
Source code in src/kedro_databricks/resource_generator/abstract_resource_generator.py
node_resource_generator
¶
Node-level Databricks resource generator.
Creates a Databricks job where each Kedro node becomes an individual task with appropriate dependencies derived from the pipeline graph.
NodeResourceGenerator(session, metadata, conf_source='conf', params=None)
¶
Bases: AbstractResourceGenerator
Generate a job with one Databricks task per Kedro node.
Source code in src/kedro_databricks/resource_generator/node_resource_generator.py
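The node-level layout described above can be pictured with a small, library-free sketch. It takes a plain mapping of node names to their upstream nodes (a stand-in for the Kedro pipeline graph) and emits one task per node with a `depends_on` list. The function name `build_tasks` and the input shape are assumptions made for illustration; this is not the generator's actual code.

```python
from typing import Dict, List


def build_tasks(node_dependencies: Dict[str, List[str]]) -> List[dict]:
    """Turn {node: [upstream nodes]} into a list of task dictionaries.

    Hypothetical helper: the real generator works on Kedro objects.
    """
    tasks = []
    for name in sorted(node_dependencies):
        task = {"task_key": name}
        parents = sorted(node_dependencies[name])
        if parents:
            # Each upstream node becomes a task-key reference.
            task["depends_on"] = [{"task_key": parent} for parent in parents]
        tasks.append(task)
    return tasks


graph = {"split": [], "train": ["split"], "report": ["train", "split"]}
for task in build_tasks(graph):
    print(task)
```

A pipeline-level generator would instead emit a single task for the whole graph, so no `depends_on` entries are needed.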
pipeline_resource_generator
¶
Pipeline-level Databricks resource generator.
Creates a Databricks job with a single task that runs an entire Kedro pipeline in one go.
PipelineResourceGenerator(session, metadata, conf_source='conf', params=None)
¶
Bases: AbstractResourceGenerator
Generate a job with a single task for the whole pipeline.
Source code in src/kedro_databricks/resource_generator/abstract_resource_generator.py
resource_overrider
¶
Resource overriders for templating Databricks bundle resources.
This package exposes concrete overriders that modify Databricks Asset Bundle resources through overrides.
Select the appropriate overrider via RESOURCE_OVERRIDERS to produce the desired behavior.
DefaultResourceOverrider
¶
Bases: AbstractResourceOverrider
A default resource overrider that performs no overrides.
override(resource_key, resource, overrides, default_key=config.workflow_default_key)
¶
Return the overrides unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_key | str | The key identifying the resource. | required |
| default_key | str | The default key for overrides. | workflow_default_key |
| resource | dict[str, Any] | The original resource dictionary. | required |
| overrides | dict[str, Any] | The overrides to apply. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | resource with overrides applied (unchanged). |
Source code in src/kedro_databricks/resource_overrider/default_resource_overrider.py
JobsResourceOverrider
¶
Bases: AbstractResourceOverrider
Override a Databricks jobs resource with the default key.
override(resource_key, resource, overrides, default_key=config.workflow_default_key)
¶
Override the resources in a Databricks bundle.
This function applies the given overrides to the resources in a Databricks bundle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_key | str | the key identifying the resource | required |
| resource | Dict | the Databricks jobs to override | required |
| overrides | Dict | the overrides to apply | required |
| default_key | str | the default key to use for overrides | workflow_default_key |
Raises:
| Type | Description |
|---|---|
| ValueError | if the job or overrides are not dictionaries |
| ValueError | if the key in overrides is not found in OVERRIDE_KEY_MAP |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | the Databricks bundle with the overrides applied |
Source code in src/kedro_databricks/resource_overrider/jobs_resource_overrider.py
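To make the role of the default key more concrete, here is a deliberately shallow sketch. It assumes the overrides dictionary holds one entry under the default key (applied to every job) and optional entries keyed by resource name (applied on top). That layout, the function name `apply_overrides`, and the shallow merge are all assumptions for illustration; the real overrider's merge rules, including OVERRIDE_KEY_MAP, are not shown on this page.

```python
from typing import Any, Dict


def apply_overrides(
    resource_key: str,
    resource: Dict[str, Any],
    overrides: Dict[str, Any],
    default_key: str = "default",
) -> Dict[str, Any]:
    """Shallow merge: resource < default overrides < resource-specific overrides."""
    if not isinstance(resource, dict) or not isinstance(overrides, dict):
        # Mirrors the documented ValueError for non-dictionary inputs.
        raise ValueError("resource and overrides must be dictionaries")
    merged = dict(resource)
    merged.update(overrides.get(default_key, {}))
    merged.update(overrides.get(resource_key, {}))
    return merged


job = {"name": "etl", "max_concurrent_runs": 1}
overrides = {
    "default": {"timeout_seconds": 3600, "max_concurrent_runs": 2},
    "etl": {"max_concurrent_runs": 5},
}
print(apply_overrides("etl", job, overrides))
```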
abstract_resource_overrider
¶
default_resource_overrider
¶
DefaultResourceOverrider
¶
Bases: AbstractResourceOverrider
A default resource overrider that performs no overrides.
override(resource_key, resource, overrides, default_key=config.workflow_default_key)
¶
Return the overrides unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_key | str | The key identifying the resource. | required |
| default_key | str | The default key for overrides. | workflow_default_key |
| resource | dict[str, Any] | The original resource dictionary. | required |
| overrides | dict[str, Any] | The overrides to apply. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | resource with overrides applied (unchanged). |
Source code in src/kedro_databricks/resource_overrider/default_resource_overrider.py
jobs_resource_overrider
¶
JobsResourceOverrider
¶
Bases: AbstractResourceOverrider
Override a Databricks jobs resource with the default key.
override(resource_key, resource, overrides, default_key=config.workflow_default_key)
¶
Override the resources in a Databricks bundle.
This function applies the given overrides to the resources in a Databricks bundle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_key | str | the key identifying the resource | required |
| resource | Dict | the Databricks jobs to override | required |
| overrides | Dict | the overrides to apply | required |
| default_key | str | the default key to use for overrides | workflow_default_key |
Raises:
| Type | Description |
|---|---|
| ValueError | if the job or overrides are not dictionaries |
| ValueError | if the key in overrides is not found in OVERRIDE_KEY_MAP |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | the Databricks bundle with the overrides applied |
Source code in src/kedro_databricks/resource_overrider/jobs_resource_overrider.py
utilities
¶
common
¶
get_arg_value(args, arg_name)
¶
Get the value of an argument from a list of arguments.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args | List[str] | list of arguments | required |
| arg_name | str | name of the argument to get the value for | required |
Returns:
| Type | Description |
|---|---|
| str \| None | value of the argument or None if not found |
Source code in src/kedro_databricks/utilities/common.py
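A minimal sketch of this kind of lookup follows. It assumes two argument styles, `--flag value` and `--flag=value`; whether the plugin's helper accepts both is not stated here, so treat the second branch as an assumption.

```python
from typing import Optional, Sequence


def get_arg_value(args: Sequence[str], arg_name: str) -> Optional[str]:
    """Return the value that follows arg_name, or None if it is absent."""
    for index, arg in enumerate(args):
        if arg == arg_name:
            # "--flag value": the value is the next token, if any.
            return args[index + 1] if index + 1 < len(args) else None
        if arg.startswith(arg_name + "="):
            # "--flag=value" (assumed to be supported).
            return arg.split("=", 1)[1]
    return None


print(get_arg_value(["--target", "dev"], "--target"))  # prints dev
```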
get_entry_point(project_name)
¶
Get the entry point for a project.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| project_name | str | name of the project | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | entry point for the project |
Source code in src/kedro_databricks/utilities/common.py
get_regex_values(lookup_key, values, regex_prefix='re:')
¶
Get values matching a regex
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lookup_key | str | key to match regexes to | required |
| values | dict[str, Any] | The overrides to apply. | required |
| regex_prefix | str | prefix that identifies a key as a regex | 're:' |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | values where lookup key matches the regex |
Source code in src/kedro_databricks/utilities/common.py
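The prefix-based matching can be sketched as below. The sketch assumes that keys starting with the prefix are treated as regular expressions, matched from the start of the lookup key with `re.match`, and that matches are returned under their original keys. Those three choices are assumptions, not confirmed behaviour of the plugin.

```python
import re
from typing import Any, Dict


def get_regex_values(
    lookup_key: str, values: Dict[str, Any], regex_prefix: str = "re:"
) -> Dict[str, Any]:
    """Collect entries whose prefixed key is a regex matching lookup_key."""
    matches = {}
    for key, value in values.items():
        if not key.startswith(regex_prefix):
            continue  # plain keys are not regex overrides
        pattern = key[len(regex_prefix):]
        if re.match(pattern, lookup_key):
            matches[key] = value
    return matches


overrides = {
    "default": {"max_retries": 0},
    "re:^train_.*": {"max_retries": 2},
    "re:^report_.*": {"max_retries": 1},
}
print(get_regex_values("train_model", overrides))
```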
get_value_from_dotpath(ddict, dotpath)
¶
Get a value from a dictionary using a dotpath.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ddict | Any | dictionary to get the value from | required |
| dotpath | str | dotpath to get the value for | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Any | Any | value from the dictionary or None if not found |
Source code in src/kedro_databricks/utilities/common.py
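As an illustration of dotpath lookup, the sketch below walks a nested dictionary one segment at a time and returns None as soon as a segment is missing. It only handles dictionaries; whether the plugin's helper also supports list indices is unknown, so that case is left out.

```python
from typing import Any


def get_value_from_dotpath(ddict: Any, dotpath: str) -> Any:
    """Follow 'a.b.c' through nested dicts; None if any step is missing."""
    current = ddict
    for key in dotpath.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


task = {"new_cluster": {"num_workers": 2}}
print(get_value_from_dotpath(task, "new_cluster.num_workers"))  # prints 2
```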
remove_nulls(value)
¶
Remove None values from a dictionary or list.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | Dict[Any, Any] \| List[Dict[Any, Any]] | dictionary or list to remove None values from | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] \| list[Any] | dictionary or list with None values removed |
Source code in src/kedro_databricks/utilities/common.py
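One way to picture the clean-up is a recursive filter over dictionaries and lists, sketched below. The recursion into nested containers and the decision to keep containers that end up empty are assumptions; the plugin's helper may differ on both points.

```python
from typing import Any


def remove_nulls(value: Any) -> Any:
    """Recursively drop None entries from dicts and lists."""
    if isinstance(value, dict):
        return {k: remove_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [remove_nulls(v) for v in value if v is not None]
    return value


job = {"name": "etl", "schedule": None, "tasks": [{"task_key": "a", "libraries": None}]}
print(remove_nulls(job))
```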
require_databricks_run_script(_version=KEDRO_VERSION)
¶
Check if the current Kedro version is less than 0.19.8.
Kedro 0.19.8 introduced a new run_script method that is required for
running tasks on Databricks. This method is not available in earlier
versions of Kedro. This function checks if the current Kedro version is
less than 0.19.8.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | whether the current Kedro version is less than 0.19.8 |
Source code in src/kedro_databricks/utilities/common.py
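The version check itself is a simple comparison, sketched here with plain integer tuples in place of the Version object held in KEDRO_VERSION. The tuple-based signature is an assumption made to keep the example dependency-free.

```python
from typing import Tuple

RUN_SCRIPT_INTRODUCED = (0, 19, 8)  # Kedro release named in the docstring


def require_databricks_run_script(version: Tuple[int, int, int]) -> bool:
    """True when the given Kedro version predates 0.19.8."""
    return version < RUN_SCRIPT_INTRODUCED


print(require_databricks_run_script((0, 19, 7)))  # prints True
print(require_databricks_run_script((0, 19, 8)))  # prints False
```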
sanitize_name(node)
¶
Sanitize the node name to be used as a task key in Databricks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node | Node \| str | Kedro node object or node name | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | sanitized task key |
Source code in src/kedro_databricks/utilities/common.py
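The exact sanitization rules are not listed on this page. The sketch below shows one plausible approach under stated assumptions: every character other than letters, digits, underscores, and dashes is replaced with an underscore, and the result is truncated to MAX_TASK_KEY_LENGTH. It accepts only a string, not a Kedro Node.

```python
import re

MAX_TASK_KEY_LENGTH = 100  # value listed in the constants module


def sanitize_name(name: str) -> str:
    """Replace disallowed characters and cap the length (assumed rules)."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", name)
    return cleaned[:MAX_TASK_KEY_LENGTH]


print(sanitize_name("split data: train/test"))  # prints split_data__train_test
```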
sort_dict(d, key_order=None)
¶
Recursively sort the keys of a dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| d | Dict[Any, Any] | dictionary to sort | required |
| key_order | List[str] | list of keys to sort by | None |
Returns:
| Type | Description |
|---|---|
| dict[Any, Any] | dictionary with ordered values |
Source code in src/kedro_databricks/utilities/common.py
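A key-order-aware recursive sort could look like the sketch below. It assumes that keys named in `key_order` come first, in list order, that all remaining keys follow alphabetically, and that the same order is applied to nested dictionaries and lists. How the plugin treats unlisted keys and nested values is not documented here, so these are assumptions.

```python
from typing import Any, Dict, List, Optional


def sort_dict(d: Dict[Any, Any], key_order: Optional[List[str]] = None) -> Dict[Any, Any]:
    """Return a copy of d with keys ordered by key_order, then alphabetically."""
    order = {key: index for index, key in enumerate(key_order or [])}

    def rank(key: Any):
        # Listed keys sort by their position; unlisted keys sort after them by name.
        return (order.get(key, len(order)), str(key))

    def sort_value(value: Any) -> Any:
        if isinstance(value, dict):
            return sort_dict(value, key_order)
        if isinstance(value, list):
            return [sort_value(item) for item in value]
        return value

    return {key: sort_value(d[key]) for key in sorted(d, key=rank)}


task = {"python_wheel_task": {}, "depends_on": [], "task_key": "a"}
print(list(sort_dict(task, ["task_key", "depends_on", "python_wheel_task"])))
```

Passing a list such as TASK_KEY_ORDER or JOB_KEY_ORDER as `key_order` would yield output whose keys follow the order given in the constants module.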
version_to_str(version)
¶
Convert a version list to a string.
Requires version to follow semantic versioning (3 parts: major, minor, patch).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| version | List[int] | version list to convert | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | version string |
Source code in src/kedro_databricks/utilities/common.py
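For illustration, the conversion amounts to joining the three parts with dots. The sketch below is a stand-in written for this page; raising ValueError for a list that does not have three parts is an assumption.

```python
from typing import List


def version_to_str(version: List[int]) -> str:
    """Join a [major, minor, patch] list into 'major.minor.patch'."""
    if len(version) != 3:
        # Assumption: the real helper may report this differently.
        raise ValueError(f"Expected 3 version parts, got {len(version)}")
    return ".".join(str(part) for part in version)


# MINIMUM_DATABRICKS_VERSION is listed as [0, 205, 0] in the constants module.
print(version_to_str([0, 205, 0]))  # prints 0.205.0
```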
databricks_cli
¶
DatabricksCli(metadata, env=config.default_env, additional_args=None)
¶
Databricks CLI command collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| additional_args | list[str] \| None | Additional arguments to be passed to the | None |
Source code in src/kedro_databricks/utilities/databricks_cli.py
logger
¶
Logger configuration for the kedro-databricks package.
This module sets up a logger for the kedro-databricks package, allowing for structured logging
and easier debugging. The logger is configured to log messages at the INFO level by default,
but this can be overridden by setting the LOG_LEVEL environment variable.
get_logger(name)
¶
Get a logger with the specified name.
This function retrieves a logger instance with the given name, which is a child of the root logger for the kedro-databricks package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The name of the logger. | required |
Returns:
| Type | Description |
|---|---|
| Logger | The logger instance. |
Source code in src/kedro_databricks/utilities/logger.py
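The pattern described for this module (a package-level logger whose level comes from LOG_LEVEL, with child loggers per module) can be sketched with the standard library alone. The root logger name `kedro_databricks` and the fallback to INFO for unrecognised level names are assumptions.

```python
import logging
import os

ROOT_NAME = "kedro_databricks"  # assumed name of the package's root logger


def get_logger(name: str) -> logging.Logger:
    """Return a child of the package logger, honouring LOG_LEVEL."""
    root = logging.getLogger(ROOT_NAME)
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    # Unknown level names fall back to INFO (assumption).
    level = getattr(logging, level_name, logging.INFO)
    root.setLevel(level if isinstance(level, int) else logging.INFO)
    return root.getChild(name)


log = get_logger("bundle")
print(log.name)  # prints kedro_databricks.bundle
```

Because child loggers inherit their effective level from the package logger, setting LOG_LEVEL=DEBUG before the first call is enough to make every module's logger verbose.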
plugin
¶
resolver_generics
¶
Generic, composable resource resolvers.
This module defines a small set of primitives to resolve arbitrary "resources" from string identifiers. It includes:
- RegistryResourceResolver: resolves from a given mapping/registry.
- ModuleResourceResolver: resolves a dotted path (module.attr) and optionally validates the resulting attribute.
- CompositeResourceResolver: chains multiple resolvers and aggregates their errors for better diagnostics.
These utilities allow flexible lookups (e.g., built-in names via a registry, or user-provided dotted paths) while surfacing clear error messages.
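The following sketch re-creates the three resolvers in simplified form to show how they compose. It borrows the class names from this module but omits the generics and dataclass machinery; constructor signatures, error messages, and the use of `"; "` as separator are assumptions, and none of this is the plugin's source.

```python
import importlib
import json
from typing import Any, Callable, Dict, List, Optional


class ResourceResolverError(Exception):
    """Base class for the sketch's resolver errors."""


class ResourceNotFoundError(ResourceResolverError):
    pass


class ResourceImportError(ResourceResolverError):
    pass


class ResourceInvalidError(ResourceResolverError):
    pass


class RegistryResourceResolver:
    def __init__(self, registry: Dict[str, Any]) -> None:
        self.registry = registry

    def resolve(self, value: str) -> Any:
        if value not in self.registry:
            raise ResourceNotFoundError(f"'{value}' is not a registered name")
        return self.registry[value]


class ModuleResourceResolver:
    def __init__(self, validate_fn: Optional[Callable[[Any], bool]] = None) -> None:
        self.validate_fn = validate_fn

    def resolve(self, value: str) -> Any:
        module_path, _, attr = value.rpartition(".")
        if not module_path or not attr:
            raise ResourceImportError(f"'{value}' is not of the form module.attribute")
        try:
            module = importlib.import_module(module_path)
        except ImportError as exc:
            raise ResourceImportError(f"cannot import '{module_path}'") from exc
        if not hasattr(module, attr):
            raise ResourceNotFoundError(f"'{module_path}' has no attribute '{attr}'")
        resource = getattr(module, attr)
        if self.validate_fn is not None and not self.validate_fn(resource):
            raise ResourceInvalidError(f"'{value}' failed validation")
        return resource


class CompositeResourceResolver:
    def __init__(self, resolvers: List[Any]) -> None:
        self.resolvers = resolvers

    def resolve(self, value: str) -> Any:
        errors = []
        for resolver in self.resolvers:
            try:
                return resolver.resolve(value)  # first success wins
            except ResourceResolverError as exc:
                errors.append(str(exc))
        # Every resolver failed: report all reasons together.
        raise ResourceNotFoundError("; ".join(errors))


resolver = CompositeResourceResolver(
    [
        RegistryResourceResolver({"node": "node-generator"}),
        ModuleResourceResolver(validate_fn=callable),
    ]
)
print(resolver.resolve("node"))  # prints node-generator
print(resolver.resolve("json.dumps") is json.dumps)  # prints True
```

Placing the registry first lets short built-in names resolve cheaply, while dotted paths fall through to the module resolver.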
CompositeResourceResolver(resolvers)
dataclass
¶
Bases: Generic[ResourceType], ResourceResolver[ResourceType]
resolve(value)
¶
Try multiple resolvers in order, returning the first success.
If all resolvers fail, a combined ResourceNotFoundError is raised
containing the individual failure messages for easier debugging.
Source code in src/kedro_databricks/utilities/resolver_generics.py
ModuleResourceResolver(validate_fn=None)
dataclass
¶
Bases: Generic[ResourceType], ResourceResolver[ResourceType]
resolve(value)
¶
Resolve a dotted path (module.attr) to a Python attribute.
If validate_fn is provided, the resolved attribute must satisfy it;
otherwise a ResourceInvalidError is raised.
Raises:
| Type | Description |
|---|---|
| ResourceImportError | If the value is not of the form |
| ResourceNotFoundError | If the module is found but the attribute is missing. |
| ResourceInvalidError | If |
Source code in src/kedro_databricks/utilities/resolver_generics.py
RegistryResourceResolver(registry, default=None)
dataclass
¶
Bases: Generic[ResourceType], ResourceResolver[ResourceType]
resolve(value)
¶
Resolve using a static registry mapping.
Raises ResourceNotFoundError when the key is missing.
Source code in src/kedro_databricks/utilities/resolver_generics.py
ResourceImportError
¶
Bases: ResourceResolverError
Raised when a resource cannot be imported.
Typical causes include:
- The value is not a dotted path of the form module.attribute.
- The specified module cannot be imported.
This error is commonly emitted by ModuleResourceResolver when the
path format is invalid or the module import fails.
for_value(value, message=None)
classmethod
¶
Construct an error for a single value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | str | The unresolved identifier. | required |
| message | str \| None | Optional details about why resolution failed. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| ResourceImportError | ResourceImportError | A descriptive error instance. |
Source code in src/kedro_databricks/utilities/resolver_generics.py
ResourceInvalidError
¶
Bases: ResourceResolverError
Raised when a resource is invalid.
Emitted when a resolved value fails a post-resolution validation (e.g.,
does not satisfy a provided validate_fn). This is distinct from
ResourceNotFoundError, which indicates the resource could not be
located at all.
for_value(value, message=None)
classmethod
¶
Construct an error for a single value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | str | The unresolved identifier. | required |
| message | str \| None | Optional details about why resolution failed. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| ResourceInvalidError | ResourceInvalidError | A descriptive error instance. |
Source code in src/kedro_databricks/utilities/resolver_generics.py
ResourceNotFoundError
¶
Bases: ResourceResolverError
Raised when a resource cannot be resolved.
Provides helpers to produce consistent, user-friendly error messages for single- and multi-resolver failures.
for_value(value, message=None)
classmethod
¶
Construct an error for a single value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | str | The unresolved identifier. | required |
| message | str \| None | Optional details about why resolution failed. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| ResourceNotFoundError | ResourceNotFoundError | A descriptive error instance. |
Source code in src/kedro_databricks/utilities/resolver_generics.py
ResourceResolver
¶
Bases: ABC, Generic[ResourceType]
Abstract protocol for resolving a value into a resource of type ResourceType.
resolve(value)
abstractmethod
¶
Resolve a string identifier into a resource.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | str | The identifier to resolve. | required |
Returns:
| Type | Description |
|---|---|
| ResourceType | The resolved resource. |
Raises:
| Type | Description |
|---|---|
| ResourceNotFoundError | If the resource cannot be found. |
Source code in src/kedro_databricks/utilities/resolver_generics.py
ResourceResolverError
¶
Bases: Exception
Base error for all resource resolver failures.
This is the common ancestor for all resolver exceptions and is also used by composite resolvers to aggregate multiple failures into a single exception with a readable, concatenated message.
for_errors(errors)
classmethod
¶
Construct an aggregated resolver error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| errors | Sequence[ResourceResolverError] | The list of individual | required |
Returns:
| Name | Type | Description |
|---|---|---|
| ResourceResolverError | ResourceResolverError | A single error whose message concatenates the messages of all provided errors, separated by semicolons. |