mixins

Lightweight mixin classes for tasks.

Classes:

ArrayFunctionClassMixin(*args, **kwargs)

ArrayFunctionInstanceMixin(*args, **kwargs)

CalibratorClassMixin(*args, **kwargs)

Mixin to include and access a single Calibrator class.

CalibratorMixin(*args, **kwargs)

Mixin to include and access a single Calibrator instance.

CalibratorClassesMixin(*args, **kwargs)

Mixin to include and access multiple Calibrator classes.

CalibratorsMixin(*args, **kwargs)

Mixin to include multiple Calibrator instances into tasks.

SelectorClassMixin(*args, **kwargs)

Mixin to include and access a single Selector class.

SelectorMixin(*args, **kwargs)

Mixin to include and access a single Selector instance.

ReducerClassMixin(*args, **kwargs)

Mixin to include and access a single Reducer class.

ReducerMixin(*args, **kwargs)

Mixin to include and access a single Reducer instance.

ProducerClassMixin(*args, **kwargs)

Mixin to include and access a single Producer class.

ProducerMixin(*args, **kwargs)

Mixin to include and access a single Producer instance.

ProducerClassesMixin(*args, **kwargs)

Mixin to include and access multiple Producer classes.

ProducersMixin(*args, **kwargs)

Mixin to include multiple Producer instances into tasks.

MLModelMixinBase(*args, **kwargs)

Base mixin to include a machine learning application into tasks.

MLModelTrainingMixin(*args, **kwargs)

A mixin class for training machine learning models.

MLModelMixin(*args, **kwargs)

A mixin for tasks that require a single machine learning model, e.g. for evaluation.

PreparationProducerMixin(*args, **kwargs)

MLModelDataMixin(*args, **kwargs)

MLModelsMixin(*args, **kwargs)

HistProducerClassMixin(*args, **kwargs)

Mixin to include and access a single HistProducer class.

HistProducerMixin(*args, **kwargs)

Mixin to include and access a single HistProducer instance.

InferenceModelClassMixin(*args, **kwargs)

InferenceModelMixin(*args, **kwargs)

CategoriesMixin(*args, **kwargs)

VariablesMixin(*args, **kwargs)

DatasetsProcessesMixin(*args, **kwargs)

ShiftSourcesMixin(*args, **kwargs)

DatasetShiftSourcesMixin(*args, **kwargs)

ChunkedIOMixin(*args, **kwargs)

HistHookMixin(*args, **kwargs)

class ArrayFunctionClassMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Methods:

array_function_cls_repr(array_function_name)

Central definition of how to obtain the representation of an array function from its name.

Attributes:

array_function_cls_repr(array_function_name)[source]#

Central definition of how to obtain the representation of an array function from its name.

Parameters:

array_function_name – Name of the array function (NOTE: change to class?).

Return type:

str

Returns:

String representation of the array function.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ArrayFunctionInstanceMixin(*args, **kwargs)[source]#

Bases: DatasetTask

Methods:

array_function_inst_repr(array_function_inst)

Attributes:

array_function_inst_repr(array_function_inst)[source]#
Return type:

None

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'local_shift', 'user'}#
exclude_params_remote_workflow = {'known_shifts', 'local_shift'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'local_shift', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class CalibratorClassMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionClassMixin

Mixin to include and access a single Calibrator class.

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

get_config_lookup_keys(inst_or_params)

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

calibrator = <luigi.parameter.Parameter object>#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.
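One common use of such a pre-initialization hook is to fill in values that were left unset before any instance is built. The sketch below illustrates only that general idea; the helper `resolve_pre_init` and the `defaults` mapping are hypothetical, and whether columnflow resolves defaults at this exact step is an assumption.

```python
def resolve_pre_init(params, defaults):
    """Return a copy of params in which unset entries are filled from defaults."""
    resolved = dict(params)
    for name, value in defaults.items():
        # only touch parameters that are missing or explicitly unset
        if resolved.get(name) is None:
            resolved[name] = value
    return resolved


# hypothetical parameter values; "calibrator" is unset and receives a default
params = resolve_pre_init(
    {"calibrator": None, "version": "v1"},
    {"calibrator": "default"},
)
```

The input dictionary is copied rather than modified, which mirrors the documented contract of returning an updated dictionary.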

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict[str, Any]
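As a rough illustration of what "jointly defined" means here, the sketch below intersects the parameter names of a requested class with the values held by another task instance. The classes `UpstreamTask` and `DownstreamTask` and the helper `joint_params` are hypothetical stand-ins and not part of columnflow or law.

```python
class UpstreamTask:
    # hypothetical parameter names of the task class being requested
    param_names = {"version", "calibrator"}


class DownstreamTask:
    # hypothetical parameter names of the requesting task
    param_names = {"version", "calibrator", "selector"}

    def __init__(self, **values):
        self.values = values


def joint_params(cls, inst, **overrides):
    """Collect the values of parameters that both cls and inst define."""
    shared = cls.param_names & inst.param_names
    params = {name: inst.values[name] for name in shared if name in inst.values}
    # explicitly passed keyword arguments take precedence
    params.update(overrides)
    return params


inst = DownstreamTask(version="v1", calibrator="default", selector="loose")
params = joint_params(UpstreamTask, inst)
```

In this sketch `selector` is dropped because the requested class does not define it, while keyword arguments can still override any shared value.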

property calibrator_repr: str#

Return a string representation of the calibrator class.

store_parts()[source]#
Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.
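To make the idea of translating parts into a directory path concrete, here is a minimal sketch that joins the ordered values of a plain dictionary. The helper `parts_to_path`, the base directory and all part names and values are hypothetical; the actual path layout used by columnflow is not shown in this reference.

```python
import os


def parts_to_path(base, parts):
    """Join the ordered values of a parts dictionary into a directory path."""
    return os.path.join(base, *(str(value) for value in parts.values()))


# a plain dict preserves insertion order, so the order of parts fixes the layout
parts = {
    "task_family": "CalibrateEvents",
    "calibrator": "calib__default",
    "version": "v1",
}
path = parts_to_path("/store", parts)
```

Because the order of the parts determines the directory hierarchy, a mixin that contributes its own part has to insert it at a well-defined position, which is presumably why an insertable dictionary type is returned.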

classmethod get_config_lookup_keys(inst_or_params)[source]#

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

Parameters:

inst_or_params (CalibratorClassMixin | dict[str, Any]) – The task instance or its parameters.

Return type:

law.util.InsertableDict

Returns:

A dictionary with keys that can be used for nested lookup.
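A nested lookup of this kind can be pictured as descending into a nested dictionary, one lookup key at a time. The following sketch assumes a hypothetical helper `nested_lookup` and made-up key names and values; it is not the lookup routine used by columnflow.

```python
def nested_lookup(mapping, keys, default=None):
    """Follow each matching key into a nested dict and return the leaf value."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict):
            # a leaf value was reached before all keys were consumed
            break
        if key in current:
            current = current[key]
    # if no leaf was reached, fall back to the default
    return default if isinstance(current, dict) else current


# hypothetical default versions, nested by config name and calibrator name
versions = {"run3": {"default_calib": "v2"}, "run2": "v1"}
lookup_keys = {"config": "run3", "calibrator": "default_calib"}
version = nested_lookup(versions, lookup_keys.values())
```

Keys that do not match at a given level are skipped in this sketch, so a coarse entry such as `"run2": "v1"` applies regardless of the calibrator.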

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class CalibratorMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionInstanceMixin, CalibratorClassMixin

Mixin to include and access a single Calibrator instance.

Attributes:

Methods:

get_calibrator_dict(params)

build_calibrator_inst(calibrator[, params])

Instantiate and return the Calibrator instance.

resolve_instances(params, shifts)

Build the array function instances.

get_known_shifts(params, shifts)

Updates the set of known shifts implemented by this and upstream tasks.

teardown_calibrator_inst()

find_keep_columns(collection)

Finds the columns to keep based on the collection.

calibrator_inst = <columnflow.tasks.framework.parameters.DerivableInstParameter object>#
exclude_params_index = {'calibrator_inst', 'known_shifts', 'local_shift', 'user'}#
exclude_params_repr = {'calibrator_inst', 'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_sandbox = {'calibrator_inst', 'known_shifts', 'local_shift', 'log_file', 'sandbox'}#
exclude_params_remote_workflow = {'calibrator_inst', 'known_shifts', 'local_shift'}#
invokes_calibrator = False#
classmethod get_calibrator_dict(params)[source]#
Return type:

dict[str, Any]

classmethod build_calibrator_inst(calibrator, params=None)[source]#

Instantiate and return the Calibrator instance.

Parameters:
  • calibrator (str) – Name of the calibrator class to instantiate.

  • params (dict[str, Any] | None, default: None) – Arguments forwarded to the calibrator constructor.

Raises:

RuntimeError – If the calibrator class is not exposed.

Return type:

Calibrator

Returns:

The calibrator instance.
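The documented behavior (look up a class by name, refuse classes that are not exposed, forward the parameters to the constructor) can be sketched with a small registry. Everything below is a hypothetical stand-in: the `Calibrator` class, its `registry` and `exposed` attributes and the helper `build_inst` are assumptions for illustration, not the columnflow implementation.

```python
class Calibrator:
    """Hypothetical stand-in for a calibrator base class."""

    registry = {}
    exposed = False

    def __init__(self, **params):
        self.params = params

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # register every subclass under its class name
        Calibrator.registry[cls.__name__] = cls


class public_calib(Calibrator):
    exposed = True


class internal_calib(Calibrator):
    exposed = False


def build_inst(name, params=None):
    """Instantiate the registered class called name, if it is exposed."""
    cls = Calibrator.registry[name]
    if not cls.exposed:
        raise RuntimeError(f"cannot use unexposed calibrator '{name}'")
    return cls(**(params or {}))


inst = build_inst("public_calib", {"dataset": "data_a"})
```

Requesting `internal_calib` through `build_inst` raises a `RuntimeError` in this sketch, matching the error condition listed above.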

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overridden by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.
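Since this class defines both a `calibrator` parameter and a derived `calibrator_inst` parameter, one plausible reading is that the instance is built from the name and stored back into the parameters. The sketch below shows that pattern only; the function `resolve_instances_sketch` and the `build` callable are hypothetical, and the real method additionally receives the shifts object.

```python
def resolve_instances_sketch(params, build):
    """Return a copy of params with a derived instance added if it is missing."""
    resolved = dict(params)
    if resolved.get("calibrator_inst") is None and resolved.get("calibrator"):
        # build is a hypothetical factory that turns a name into an instance
        resolved["calibrator_inst"] = build(resolved["calibrator"])
    return resolved


params = {"calibrator": "default", "version": "v1"}
resolved = resolve_instances_sketch(params, build=lambda name: {"cls_name": name})
```

An instance that is already present is left untouched, so calling the function repeatedly does not rebuild it.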

classmethod get_known_shifts(params, shifts)[source]#

Updates the set of known shifts implemented by this and upstream tasks.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – TaskShifts object to adjust.

Return type:

None
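The return type of None suggests that the shifts object is adjusted in place rather than returned. A minimal sketch of that pattern follows; the `Shifts` dataclass with its `local` and `upstream` sets, the helper `add_known_shifts` and the shift names are assumptions and do not reproduce the actual TaskShifts class.

```python
from dataclasses import dataclass, field


@dataclass
class Shifts:
    """Hypothetical stand-in holding local and upstream shift names."""

    local: set = field(default_factory=set)
    upstream: set = field(default_factory=set)


def add_known_shifts(params, shifts):
    """Add the shifts of the instance stored in params to the local set, in place."""
    inst = params.get("calibrator_inst")
    if inst is not None:
        shifts.local |= set(inst["shifts"])


shifts = Shifts(upstream={"jec_up"})
add_known_shifts({"calibrator_inst": {"shifts": ["jer_up", "jer_down"]}}, shifts)
```

After the call, the caller's `shifts` object carries the additional names, and nothing needs to be returned.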

teardown_calibrator_inst()[source]#
Return type:

None

property calibrator_repr: str#

Return a string representation of the calibrator instance.

find_keep_columns(collection)[source]#

Finds the columns to keep based on the collection.

Parameters:

collection (ColumnCollection) – The collection of columns.

Return type:

set[Route]

Returns:

Set of columns to keep.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class CalibratorClassesMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionClassMixin

Mixin to include and access multiple Calibrator classes.

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

get_config_lookup_keys(inst_or_params)

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

calibrators = <law.parameter.CSVParameter object>#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (InsertableDict[str, Any]) – Dictionary of task parameters.

Return type:

InsertableDict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict[str, Any]

property calibrators_repr: str#

Return a string representation of the calibrators.

store_parts()[source]#
Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

classmethod get_config_lookup_keys(inst_or_params)[source]#

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

Parameters:

inst_or_params (CalibratorClassesMixin | dict[str, Any]) – The task instance or its parameters.

Return type:

law.util.InsertableDict

Returns:

A dictionary with keys that can be used for nested lookup.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class CalibratorsMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionInstanceMixin, CalibratorClassesMixin

Mixin to include multiple Calibrator instances into tasks.

Attributes:

Methods:

get_calibrator_dict(params)

build_calibrator_insts(calibrators[, params])

Instantiate and return multiple Calibrator instances.

resolve_instances(params, shifts)

Build the array function instances.

get_known_shifts(params, shifts)

Updates the set of known shifts implemented by this and upstream tasks.

find_keep_columns(collection)

Finds the columns to keep based on the collection.

calibrator_insts = <columnflow.tasks.framework.parameters.DerivableInstsParameter object>#
exclude_params_index = {'calibrator_insts', 'known_shifts', 'local_shift', 'user'}#
exclude_params_repr = {'calibrator_insts', 'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_sandbox = {'calibrator_insts', 'known_shifts', 'local_shift', 'log_file', 'sandbox'}#
exclude_params_remote_workflow = {'calibrator_insts', 'known_shifts', 'local_shift'}#
classmethod get_calibrator_dict(params)[source]#
Return type:

dict[str, Any]

classmethod build_calibrator_insts(calibrators, params=None)[source]#

Instantiate and return multiple Calibrator instances.

Parameters:
  • calibrators (Iterable[str]) – Names of the calibrator classes to instantiate.

  • params (dict[str, Any] | None, default: None) – Arguments forwarded to the calibrator constructors.

Raises:

RuntimeError – If any calibrator class is not exposed.

Return type:

list[Calibrator]

Returns:

The list of calibrator instances.

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overridden by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod get_known_shifts(params, shifts)[source]#

Updates the set of known shifts implemented by this and upstream tasks.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – TaskShifts object to adjust.

Return type:

None

property calibrators_repr: str#

Return a string representation of the calibrators.

find_keep_columns(collection)[source]#

Finds the columns to keep based on the collection.

Parameters:

collection (ColumnCollection) – The collection of columns.

Return type:

set[Route]

Returns:

Set of columns to keep.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class SelectorClassMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionClassMixin

Mixin to include and access a single Selector class.

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

get_config_lookup_keys(inst_or_params)

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

selector = <luigi.parameter.Parameter object>#
selector_steps = <law.parameter.CSVParameter object>#
selector_steps_order_sensitive = False#
exclude_params_repr_empty = {'selector_steps'}#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict[str, Any]

property selector_repr: str#

Return a string representation of the selector class.

store_parts()[source]#
Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

classmethod get_config_lookup_keys(inst_or_params)[source]#

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

Parameters:

inst_or_params (SelectorClassMixin | dict[str, Any]) – The task instance or its parameters.

Return type:

law.util.InsertableDict

Returns:

A dictionary with keys that can be used for nested lookup.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class SelectorMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionInstanceMixin, SelectorClassMixin

Mixin to include and access a single Selector instance.

Attributes:

Methods:

get_selector_dict(params)

build_selector_inst(selector[, params])

Instantiate and return the Selector instance.

resolve_instances(params, shifts)

Build the array function instances.

get_known_shifts(params, shifts)

Updates the set of known shifts implemented by this and upstream tasks.

teardown_selector_inst()

find_keep_columns(collection)

Finds the columns to keep based on the collection.

selector_inst = <columnflow.tasks.framework.parameters.DerivableInstParameter object>#
exclude_params_index = {'known_shifts', 'local_shift', 'selector_inst', 'user'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'selector_inst', 'user'}#
exclude_params_sandbox = {'known_shifts', 'local_shift', 'log_file', 'sandbox', 'selector_inst'}#
exclude_params_remote_workflow = {'known_shifts', 'local_shift', 'selector_inst'}#
invokes_selector = False#
classmethod get_selector_dict(params)[source]#
Return type:

dict[str, Any]

classmethod build_selector_inst(selector, params=None)[source]#

Instantiate and return the Selector instance.

Parameters:
  • selector (str) – Name of the selector class to instantiate.

  • params (dict[str, Any] | None, default: None) – Arguments forwarded to the selector constructor.

Raises:

RuntimeError – If the selector class is not exposed.

Return type:

Selector

Returns:

The selector instance.

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overridden by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod get_known_shifts(params, shifts)[source]#

Updates the set of known shifts implemented by this and upstream tasks.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – TaskShifts object to adjust.

Return type:

None

teardown_selector_inst()[source]#
Return type:

None

property selector_repr: str#

Return a string representation of the selector instance.

find_keep_columns(collection)[source]#

Finds the columns to keep based on the collection.

Parameters:

collection (ColumnCollection) – The collection of columns.

Return type:

set[Route]

Returns:

Set of columns to keep.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {'selector_steps'}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ReducerClassMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionClassMixin

Mixin to include and access a single Reducer class.

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

get_config_lookup_keys(inst_or_params)

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

reducer = <luigi.parameter.Parameter object>#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict[str, Any]

property reducer_repr: str#

Return a string representation of the reducer class.

store_parts()[source]#
Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

classmethod get_config_lookup_keys(inst_or_params)[source]#

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

Parameters:

inst_or_params (ReducerClassMixin | dict[str, Any]) – The task instance or its parameters.

Return type:

law.util.InsertableDict

Returns:

A dictionary with keys that can be used for nested lookup.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ReducerMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionInstanceMixin, ReducerClassMixin

Mixin to include and access a single Reducer instance.

Attributes:

Methods:

get_reducer_dict(params)

build_reducer_inst(reducer[, params])

Instantiate and return the Reducer instance.

resolve_instances(params, shifts)

Build the array function instances.

get_known_shifts(params, shifts)

Updates the set of known shifts implemented by this and upstream tasks.

teardown_reducer_inst()

find_keep_columns(collection)

Finds the columns to keep based on the collection.

reducer_inst = <columnflow.tasks.framework.parameters.DerivableInstParameter object>#
exclude_params_index = {'known_shifts', 'local_shift', 'reducer_inst', 'user'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'reducer_inst', 'user'}#
exclude_params_sandbox = {'known_shifts', 'local_shift', 'log_file', 'reducer_inst', 'sandbox'}#
exclude_params_remote_workflow = {'known_shifts', 'local_shift', 'reducer_inst'}#
invokes_reducer = False#
classmethod get_reducer_dict(params)[source]#
Return type:

dict[str, Any]

classmethod build_reducer_inst(reducer, params=None)[source]#

Instantiate and return the Reducer instance.

Parameters:
  • reducer (str) – Name of the reducer class to instantiate.

  • params (dict[str, Any] | None, default: None) – Arguments forwarded to the reducer constructor.

Raises:

RuntimeError – If the reducer class is not exposed.

Return type:

Reducer

Returns:

The reducer instance.

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overridden by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod get_known_shifts(params, shifts)[source]#

Updates the set of known shifts implemented by this and upstream tasks.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – TaskShifts object to adjust.

Return type:

None

teardown_reducer_inst()[source]#
Return type:

None

property reducer_repr: str#

Return a string representation of the reducer instance.

find_keep_columns(collection)[source]#

Finds the columns to keep based on the collection.

Parameters:

collection (ColumnCollection) – The collection of columns.

Return type:

set[Route]

Returns:

Set of columns to keep.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ProducerClassMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionClassMixin

Mixin to include and access a single Producer class.

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

get_config_lookup_keys(inst_or_params)

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

producer = <luigi.parameter.Parameter object>#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict[str, Any]

property producer_repr: str#

Return a string representation of the producer class.

store_parts()[source]#
Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

classmethod get_config_lookup_keys(inst_or_params)[source]#

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

Parameters:

inst_or_params (ProducerClassMixin | dict[str, Any]) – The task instance or its parameters.

Return type:

law.util.InsertableDict

Returns:

A dictionary with keys that can be used for nested lookup.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ProducerMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionInstanceMixin, ProducerClassMixin

Mixin to include and access a single Producer instance.

Attributes:

Methods:

get_producer_dict(params)

build_producer_inst(producer[, params])

Instantiate and return the Producer instance.

resolve_instances(params, shifts)

Build the array function instances.

get_known_shifts(params, shifts)

Updates the set of known shifts implemented by this and upstream tasks.

teardown_producer_inst()

find_keep_columns(collection)

Finds the columns to keep based on the collection.

producer_inst = <columnflow.tasks.framework.parameters.DerivableInstParameter object>#
exclude_params_index = {'known_shifts', 'local_shift', 'producer_inst', 'user'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'producer_inst', 'user'}#
exclude_params_sandbox = {'known_shifts', 'local_shift', 'log_file', 'producer_inst', 'sandbox'}#
exclude_params_remote_workflow = {'known_shifts', 'local_shift', 'producer_inst'}#
invokes_producer = False#
classmethod get_producer_dict(params)[source]#
Return type:

dict[str, Any]

classmethod build_producer_inst(producer, params=None)[source]#

Instantiate and return the Producer instance.

Parameters:
  • producer (str) – Name of the producer class to instantiate.

  • params (dict[str, Any] | None, default: None) – Arguments forwarded to the producer constructor.

Raises:

RuntimeError – If the producer class is not exposed.

Return type:

Producer

Returns:

The producer instance.

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overridden by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod get_known_shifts(params, shifts)[source]#

Updates the set of known shifts implemented by this and upstream tasks.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – TaskShifts object to adjust.

Return type:

None

teardown_producer_inst()[source]#
Return type:

None

property producer_repr: str#

Return a string representation of the producer instance.

find_keep_columns(collection)[source]#

Finds the columns to keep based on the collection.

Parameters:

collection (ColumnCollection) – The collection of columns.

Return type:

set[Route]

Returns:

Set of columns to keep.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ProducerClassesMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionClassMixin

Mixin to include and access multiple Producer classes.

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

get_config_lookup_keys(inst_or_params)

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

producers = <law.parameter.CSVParameter object>#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (InsertableDict[str, Any]) – Dictionary of task parameters.

Return type:

InsertableDict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict[str, Any]

property producers_repr: str#

Return a string representation of the producers.

store_parts()[source]#
Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

classmethod get_config_lookup_keys(inst_or_params)[source]#

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

Parameters:

inst_or_params (ProducerClassesMixin | dict[str, Any]) – The task instance or its parameters.

Return type:

law.util.InsertableDict

Returns:

A dictionary with keys that can be used for nested lookup.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ProducersMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionInstanceMixin, ProducerClassesMixin

Mixin to include multiple Producer instances into tasks.

Attributes:

Methods:

get_producer_dict(params)

build_producer_insts(producers[, params])

Instantiate and return multiple Producer instances.

resolve_instances(params, shifts)

Build the array function instances.

get_known_shifts(params, shifts)

Updates the set of known shifts implemented by this and upstream tasks.

find_keep_columns(collection)

Finds the columns to keep based on the collection.

producer_insts = <columnflow.tasks.framework.parameters.DerivableInstsParameter object>#
exclude_params_index = {'known_shifts', 'local_shift', 'producer_insts', 'user'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'producer_insts', 'user'}#
exclude_params_sandbox = {'known_shifts', 'local_shift', 'log_file', 'producer_insts', 'sandbox'}#
exclude_params_remote_workflow = {'known_shifts', 'local_shift', 'producer_insts'}#
classmethod get_producer_dict(params)[source]#
Return type:

dict[str, Any]

classmethod build_producer_insts(producers, params=None)[source]#

Instantiate and return multiple Producer instances.

Parameters:
  • producers (Iterable[str]) – Names of the producer classes to instantiate.

  • params (dict[str, Any] | None, default: None) – Arguments forwarded to the producer constructors.

Raises:

RuntimeError – If any producer class is not exposed.

Return type:

list[Producer]

Returns:

The list of producer instances.
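
The behavior documented above (one instance per requested name, with a RuntimeError as soon as one class is not exposed) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: FakeProducer, HiddenProducer, the REGISTRY dict and the exposed flag are hypothetical stand-ins, and the real lookup in columnflow may work differently.

```python
from __future__ import annotations

from typing import Any, Iterable


class FakeProducer:
    """Hypothetical stand-in for a producer class."""

    # assumption: a class-level flag decides whether tasks may instantiate the class
    exposed = True

    def __init__(self, **params: Any) -> None:
        self.params = params


class HiddenProducer(FakeProducer):
    exposed = False


# hypothetical name -> class registry, not part of columnflow
REGISTRY = {"features": FakeProducer, "helper": HiddenProducer}


def build_producer_insts(
    producers: Iterable[str],
    params: dict[str, Any] | None = None,
) -> list[FakeProducer]:
    """Instantiate one producer per name, refusing classes that are not exposed."""
    insts = []
    for name in producers:
        cls = REGISTRY[name]
        if not cls.exposed:
            raise RuntimeError(f"producer '{name}' is not exposed")
        # forward the same arguments to every constructor
        insts.append(cls(**(params or {})))
    return insts


insts = build_producer_insts(["features"], {"threshold": 1})
```

Requesting "helper" in this sketch raises RuntimeError before any list is returned, which mirrors the "If any producer class is not exposed" clause.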

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overwritten by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod get_known_shifts(params, shifts)[source]#

Updates the set of known shifts implemented by this and upstream tasks.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – TaskShifts object to adjust.

Return type:

None

property producers_repr: str#

Return a string representation of the producers.

find_keep_columns(collection)[source]#

Finds the columns to keep based on the collection.

Parameters:

collection (ColumnCollection) – The collection of columns.

Return type:

set[Route]

Returns:

Set of columns to keep.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class MLModelMixinBase(*args, **kwargs)[source]#

Bases: ConfigTask

Base mixin to include a machine learning application into tasks.

Inheriting from this mixin allows a task to instantiate and access an MLModel instance with name ml_model, which is an input parameter for this task.

Attributes:

Methods:

req_params(inst, **kwargs)

Get the required parameters for the task, preferring the --ml-model set on task-level via CLI.

get_ml_model_inst(ml_model, analysis_inst[, ...])

Get requested ml_model instance.

events_used_in_training(config_inst, ...)

Evaluate whether the events for the combination of dataset_inst and shift_inst shall be used in the training.

ml_model = <luigi.parameter.Parameter object>#
ml_model_settings = <columnflow.tasks.framework.parameters.SettingsParameter object>#
ml_model_inst = <columnflow.tasks.framework.parameters.DerivableInstParameter object>#
exclude_params_index = {'known_shifts', 'ml_model_inst', 'user'}#
exclude_params_repr = {'known_shifts', 'ml_model_inst', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'ml_model_inst', 'sandbox'}#
exclude_params_remote_workflow = {'known_shifts', 'ml_model_inst'}#
exclude_params_repr_empty = {'ml_model'}#
property ml_model_repr: str#

Returns a string representation of the ML model instance.

classmethod req_params(inst, **kwargs)[source]#

Get the required parameters for the task, preferring the --ml-model set on task-level via CLI.

This method first checks if the --ml-model parameter is set at the task-level via the command line. If it is, this parameter is preferred and added to the ‘_prefer_cli’ key in the kwargs dictionary. The method then calls the ‘req_params’ method of the superclass with the updated kwargs.

Parameters:
  • inst (Task) – The current task instance.

  • kwargs – Additional keyword arguments that may contain parameters for the task.

Return type:

dict[str, Any]

Returns:

A dictionary of parameters required for the task.
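
The two steps described above (mark ml_model as preferred, then delegate to the superclass) can be sketched with plain classes. This is not the columnflow implementation: Base, WithMLModel and the cli_params argument are hypothetical, and how the real method detects that --ml-model was given on the command line is not shown in this reference.

```python
class Base:
    """Hypothetical stand-in for the superclass that consumes ``_prefer_cli``."""

    @classmethod
    def req_params(cls, inst, **kwargs):
        # simply report which parameter names were marked as preferred
        return {"prefer_cli": sorted(kwargs.get("_prefer_cli") or ())}


class WithMLModel(Base):
    @classmethod
    def req_params(cls, inst, cli_params=frozenset(), **kwargs):
        # copy existing preferences so the caller's object is not mutated
        prefer_cli = set(kwargs.get("_prefer_cli") or ())
        # assumption: ``cli_params`` holds the names of parameters set via CLI
        if "ml_model" in cli_params:
            prefer_cli.add("ml_model")
        kwargs["_prefer_cli"] = prefer_cli
        return super().req_params(inst, **kwargs)


with_cli = WithMLModel.req_params(None, cli_params={"ml_model"})
without_cli = WithMLModel.req_params(None, _prefer_cli={"version"})
```

Existing entries in _prefer_cli are kept in the sketch, so a caller that already prefers other parameters does not lose them.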

classmethod get_ml_model_inst(ml_model, analysis_inst, requested_configs=None, **kwargs)[source]#

Get requested ml_model instance.

This method retrieves the requested ml_model instance. If requested_configs are provided, they are used for the training of the ML application.

Parameters:
  • ml_model (str) – Name of MLModel to load.

  • analysis_inst (od.Analysis) – Analysis instance that is forwarded to the init function of the new MLModel subclass.

  • requested_configs (list[str] | None, default: None) – Configs needed for the training of the ML application.

  • kwargs – Additional keyword arguments to forward to the MLModel instance.

Return type:

MLModel

Returns:

MLModel instance.

events_used_in_training(config_inst, dataset_inst, shift_inst)[source]#

Evaluate whether the events for the combination of dataset_inst and shift_inst shall be used in the training.

This method checks if the dataset_inst is in the set of datasets of the current ml_model_inst based on the given config_inst. Additionally, the function checks that the shift_inst does not have the tag “disjoint_from_nominal”.

Parameters:
  • config_inst (Config) – The configuration instance.

  • dataset_inst (Dataset) – The dataset instance.

  • shift_inst (Shift) – The shift instance.

Return type:

bool

Returns:

True if the events shall be used in the training, False otherwise.
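
The two conditions described above can be written as a single boolean expression. The sketch below is a stand-alone illustration under stated assumptions: FakeDataset and FakeShift are hypothetical stand-ins for the Dataset and Shift objects, and the set of training datasets is passed in directly instead of being read from ml_model_inst for a given config_inst.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FakeDataset:
    """Hypothetical stand-in for a dataset object."""

    name: str


@dataclass(frozen=True)
class FakeShift:
    """Hypothetical stand-in for a shift object carrying tags."""

    name: str
    tags: frozenset = field(default_factory=frozenset)

    def has_tag(self, tag: str) -> bool:
        return tag in self.tags


def events_used_in_training(training_datasets, dataset_inst, shift_inst) -> bool:
    # condition 1: the dataset is one of the model's training datasets
    # condition 2: the shift is not tagged as disjoint from the nominal events
    return (
        dataset_inst in training_datasets
        and not shift_inst.has_tag("disjoint_from_nominal")
    )


tt = FakeDataset("tt")
data = FakeDataset("data")
nominal = FakeShift("nominal")
disjoint = FakeShift("tune_up", frozenset({"disjoint_from_nominal"}))

used = events_used_in_training({tt}, tt, nominal)
```

Here only the combination of a training dataset with an untagged shift yields True; the dataset and shift names are placeholders.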

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class MLModelTrainingMixin(*args, **kwargs)[source]#

Bases: MLModelMixinBase, CalibratorClassesMixin, SelectorClassMixin, ReducerClassMixin, ProducerClassesMixin

A mixin class for training machine learning models.

Attributes:

Methods:

resolve_instances(params, shifts)

Build the array function instances.

resolve_param_values_pre_init(params)

Resolve the parameter values for the given parameters.

store_parts()

Generate a dictionary of store parts for the current instance.

single_config = False#
classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overwritten by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod resolve_param_values_pre_init(params)[source]#

Resolve the parameter values for the given parameters.

This method retrieves the parameters and resolves the ML model instance and the configs. It also calls the model’s setup hook.

Parameters:

params (dict[str, Any]) – A dictionary of parameters that may contain the analysis instance and ML model.

Return type:

dict[str, Any]

Returns:

A dictionary containing the resolved parameters.

Raises:

Exception – If the ML model instance received configs to define training configs, but did not define any.

store_parts()[source]#

Generate a dictionary of store parts for the current instance. This method extends the base method to include the ML model parameter.

Return type:

InsertableDict[str, str]

Returns:

An InsertableDict containing the store parts.

config = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'ml_model_inst', 'user'}#
exclude_params_remote_workflow = {'known_shifts', 'ml_model_inst'}#
exclude_params_repr = {'known_shifts', 'ml_model_inst', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {'ml_model', 'selector_steps'}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'ml_model_inst', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class MLModelMixin(*args, **kwargs)[source]#

Bases: MLModelMixinBase

A mixin for tasks that require a single machine learning model, e.g. for evaluation.

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

store_parts()

Returns a law.util.InsertableDict whose values are used to create a store path.

find_keep_columns(collection)

Returns a set of Route objects describing columns that should be kept given a type of column collection.

ml_model = <luigi.parameter.Parameter object>#
allow_empty_ml_model = True#
exclude_params_repr_empty = {'ml_model'}#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

store_parts()[source]#

Returns a law.util.InsertableDict whose values are used to create a store path. For instance, the parts {"keyA": "a", "keyB": "b", 2: "c"} lead to the path “a/b/c”. The keys can be used by subclassing tasks to overwrite values.

Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.
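
The mapping from parts to a path can be shown with a few lines of Python. This is a minimal sketch that uses a built-in dict as a stand-in for law.util.InsertableDict; the helper name parts_to_path is hypothetical and not part of columnflow or law.

```python
def parts_to_path(parts: dict) -> str:
    """Join the values of ``parts`` in insertion order into a relative path."""
    return "/".join(str(value) for value in parts.values())


# the example from the description above
parts = {"keyA": "a", "keyB": "b", 2: "c"}
path = parts_to_path(parts)

# a subclass could overwrite a value by its key before the path is built
parts["keyB"] = "b_custom"
custom_path = parts_to_path(parts)
```

Only the values end up in the path; the keys serve as handles for replacing or repositioning individual parts.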

find_keep_columns(collection)[source]#

Returns a set of Route objects describing columns that should be kept given a type of column collection.

Parameters:

collection (ColumnCollection) – The type of column collection for which columns are returned.

Return type:

set[Route]

Returns:

A set of Route objects.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'ml_model_inst', 'user'}#
exclude_params_remote_workflow = {'known_shifts', 'ml_model_inst'}#
exclude_params_repr = {'known_shifts', 'ml_model_inst', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'ml_model_inst', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class PreparationProducerMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionInstanceMixin, MLModelMixin

Attributes:

Methods:

invokes_preparation_producer(params)

get_producer_dict(params)

build_producer_inst(producer[, params])

Instantiate and return the Producer instance.

teardown_preparation_producer_inst()

resolve_instances(params, shifts)

Build the array function instances.

preparation_producer_inst = <columnflow.tasks.framework.parameters.DerivableInstParameter object>#
exclude_params_index = {'known_shifts', 'local_shift', 'ml_model_inst', 'preparation_producer_inst', 'user'}#
exclude_params_repr = {'known_shifts', 'ml_model_inst', 'notify_custom', 'notify_mattermost', 'notify_slack', 'preparation_producer_inst', 'user'}#
exclude_params_sandbox = {'known_shifts', 'local_shift', 'log_file', 'ml_model_inst', 'preparation_producer_inst', 'sandbox'}#
exclude_params_remote_workflow = {'known_shifts', 'local_shift', 'ml_model_inst', 'preparation_producer_inst'}#
classmethod invokes_preparation_producer(params)[source]#
Return type:

bool

classmethod get_producer_dict(params)[source]#
Return type:

dict[str, Any]

classmethod build_producer_inst(producer, params=None)#

Instantiate and return the Producer instance.

Parameters:
  • producer (str) – Name of the producer class to instantiate.

  • params (dict[str, Any] | None, default: None) – Arguments forwarded to the producer constructor.

Raises:

RuntimeError – If the producer class is not exposed.

Return type:

Producer

Returns:

The producer instance.

teardown_preparation_producer_inst()[source]#
Return type:

None

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overwritten by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {'ml_model'}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class MLModelDataMixin(*args, **kwargs)[source]#

Bases: PreparationProducerMixin

Attributes:

Methods:

store_parts()

Returns a law.util.InsertableDict whose values are used to create a store path.

single_config = True#
allow_empty_ml_model = False#
store_parts()[source]#

Returns a law.util.InsertableDict whose values are used to create a store path. For instance, the parts {"keyA": "a", "keyB": "b", 2: "c"} lead to the path “a/b/c”. The keys can be used by subclassing tasks to overwrite values.

Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'local_shift', 'ml_model_inst', 'preparation_producer_inst', 'user'}#
exclude_params_remote_workflow = {'known_shifts', 'local_shift', 'ml_model_inst', 'preparation_producer_inst'}#
exclude_params_repr = {'known_shifts', 'ml_model_inst', 'notify_custom', 'notify_mattermost', 'notify_slack', 'preparation_producer_inst', 'user'}#
exclude_params_repr_empty = {'ml_model'}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'local_shift', 'log_file', 'ml_model_inst', 'preparation_producer_inst', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class MLModelsMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

Returns a law.util.InsertableDict whose values are used to create a store path.

find_keep_columns(collection)

Returns a set of Route objects describing columns that should be kept given a type of column collection.

ml_models = <law.parameter.CSVParameter object>#
exclude_params_repr_empty = {'ml_models'}#
allow_empty_ml_models = True#
property ml_models_repr: str#

Returns a string representation of the ML models.

classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict

property ml_model_insts: list[MLModel]#
store_parts()[source]#

Returns a law.util.InsertableDict whose values are used to create a store path. For instance, the parts {"keyA": "a", "keyB": "b", 2: "c"} lead to the path “a/b/c”. The keys can be used by subclassing tasks to overwrite values.

Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

find_keep_columns(collection)[source]#

Returns a set of Route objects describing columns that should be kept given a type of column collection.

Parameters:

collection (ColumnCollection) – The type of column collection for which columns are returned.

Return type:

set[Route]

Returns:

A set of Route objects.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class HistProducerClassMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionClassMixin

Mixin to include and access single HistProducer class.

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

get_config_lookup_keys(inst_or_params)

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

hist_producer = <luigi.parameter.Parameter object>#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict[str, Any]

property hist_producer_repr: str#

Return a string representation of the hist producer class.

store_parts()[source]#
Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

classmethod get_config_lookup_keys(inst_or_params)[source]#

Returns a dictionary with keys that can be used to look up state-specific values in a config or dictionary, such as default task versions or output locations.

Parameters:

inst_or_params (HistProducerClassMixin | dict[str, Any]) – The task instance or its parameters.

Return type:

law.util.InsertableDict

Returns:

A dictionary with keys that can be used for nested lookup.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class HistProducerMixin(*args, **kwargs)[source]#

Bases: ArrayFunctionInstanceMixin, HistProducerClassMixin

Mixin to include and access a single HistProducer instance.

Attributes:

Methods:

get_hist_producer_dict(params)

build_hist_producer_inst(hist_producer[, params])

Instantiate and return the HistProducer instance.

resolve_instances(params, shifts)

Build the array function instances.

get_known_shifts(params, shifts)

Updates the set of known shifts implemented by this and upstream tasks.

teardown_hist_producer_inst()

hist_producer_inst = <columnflow.tasks.framework.parameters.DerivableInstParameter object>#
exclude_params_index = {'hist_producer_inst', 'known_shifts', 'local_shift', 'user'}#
exclude_params_repr = {'hist_producer_inst', 'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_sandbox = {'hist_producer_inst', 'known_shifts', 'local_shift', 'log_file', 'sandbox'}#
exclude_params_remote_workflow = {'hist_producer_inst', 'known_shifts', 'local_shift'}#
invokes_hist_producer = False#
classmethod get_hist_producer_dict(params)[source]#
Return type:

dict[str, Any]

classmethod build_hist_producer_inst(hist_producer, params=None)[source]#

Instantiate and return the HistProducer instance.

Parameters:
  • hist_producer – Name of the hist producer class to instantiate.

  • params (dict[str, Any] | None, default: None) – Arguments forwarded to the hist producer constructor.

Raises:

RuntimeError – If the hist producer class is not exposed.

Return type:

Producer

Returns:

The hist producer instance.

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overwritten by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod get_known_shifts(params, shifts)[source]#

Updates the set of known shifts implemented by this and upstream tasks.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – TaskShifts object to adjust.

Return type:

None

teardown_hist_producer_inst()[source]#
Return type:

None

property hist_producer_repr: str#

Return a string representation of the hist producer instance.

configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class InferenceModelClassMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Attributes:

Methods:

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

req_params(inst, **kwargs)

Returns parameters that are jointly defined in this class and another task instance of some other class.

store_parts()

inference_model = <luigi.parameter.Parameter object>#
classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod req_params(inst, **kwargs)[source]#

Returns parameters that are jointly defined in this class and another task instance of some other class. The parameters are used when calling Task.req(self).

Return type:

dict

property inference_model_repr#
store_parts()[source]#
Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class InferenceModelMixin(*args, **kwargs)[source]#

Bases: InferenceModelClassMixin

Attributes:

Methods:

build_inference_model_inst(inference_model, ...)

Instantiate and return the InferenceModel instance.

resolve_param_values_post_init(params)

Resolve parameters after the array function instances have been initialized.

resolve_instances(params, shifts)

Build the array function instances.

inference_model_inst = <columnflow.tasks.framework.parameters.DerivableInstParameter object>#
exclude_params_index = {'inference_model_inst', 'known_shifts', 'user'}#
exclude_params_repr = {'inference_model_inst', 'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_sandbox = {'inference_model_inst', 'known_shifts', 'log_file', 'sandbox'}#
exclude_params_remote_workflow = {'inference_model_inst', 'known_shifts'}#
classmethod build_inference_model_inst(inference_model, config_insts, **kwargs)[source]#

Instantiate and return the InferenceModel instance.

Parameters:
  • inference_model (str) – Name of the inference model class to instantiate.

  • config_insts (list[Config]) – List of configuration objects that are passed to the inference model constructor.

  • kwargs – Additional keyword arguments forwarded to the inference model constructor.

Return type:

InferenceModel

Returns:

The inference model instance.

classmethod resolve_param_values_post_init(params)[source]#

Resolve parameters after the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as the ProducersMixin. For multi-config tasks, resolve_instances from the upstream task is called for each config instance. If the resolve_instances function needs to be called for other combinations of parameters (e.g. per dataset), it can be overwritten by the task class.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class CategoriesMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Attributes:

Methods:

resolve_param_values_post_init(params)

Resolve parameters after the array function instances have been initialized.

categories = <law.parameter.CSVParameter object>#
default_categories = None#
allow_empty_categories = False#
classmethod resolve_param_values_post_init(params)[source]#

Resolve parameters after the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

property categories_repr: str#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class VariablesMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Attributes:

Methods:

resolve_param_values_post_init(params)

Resolve parameters after the array function instances have been initialized.

split_multi_variable(variable)

Splits a multi-dimensional variable given in the format "var_a[-var_b[-...]]" into separate variable names using a delimiter ("-") and returns a tuple.

join_multi_variable(variables)

Joins the names of multiple variables using a delimiter ("-") into a single string that represents a multi-dimensional variable and returns it.

variables = <law.parameter.CSVParameter object>#
default_variables = None#
allow_empty_variables = False#
allow_missing_variables = False#
classmethod resolve_param_values_post_init(params)[source]#

Resolve parameters after the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod split_multi_variable(variable)[source]#

Splits a multi-dimensional variable given in the format "var_a[-var_b[-...]]" into separate variable names using a delimiter ("-") and returns a tuple.

Return type:

tuple[str]

classmethod join_multi_variable(variables)[source]#

Joins the names of multiple variables using a delimiter ("-") into a single string that represents a multi-dimensional variable and returns it.

Return type:

str
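
The split/join convention described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: the two functions below are plain module-level re-implementations written for illustration, and the variable names `jet1_pt` and `jet1_eta` are hypothetical.

```python
# Illustrative re-implementation of the "-" delimiter convention.
# Assumption: the real classmethods may validate input or behave
# differently in edge cases; only the documented format is shown here.
DELIMITER = "-"


def split_multi_variable(variable: str) -> tuple[str, ...]:
    """Split "var_a[-var_b[-...]]" into its individual variable names."""
    return tuple(variable.split(DELIMITER))


def join_multi_variable(variables) -> str:
    """Join variable names into one multi-dimensional variable string."""
    return DELIMITER.join(variables)


# Hypothetical variable names, used only for demonstration.
names = split_multi_variable("jet1_pt-jet1_eta")
joined = join_multi_variable(names)
```

In this sketch the two operations are inverses of each other, so a one-dimensional variable passes through unchanged as a one-element tuple.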

property variables_repr: str#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class DatasetsProcessesMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Attributes:

Methods:

modify_task_attributes()

Hook that is called by law's task register meta class right after subclass creation to update class-level attributes.

resolve_param_values_pre_init(params)

Resolve parameters before the array function instances have been initialized.

resolve_instances(params, shifts)

Build the array function instances.

get_known_shifts(params, shifts)

Updates the set of known shifts implemented by this and upstream tasks.

datasets = <law.parameter.CSVParameter object>#
datasets_multi = <law.parameter.MultiCSVParameter object>#
processes = <law.parameter.CSVParameter object>#
processes_multi = <law.parameter.MultiCSVParameter object>#
allow_empty_datasets = False#
allow_empty_processes = False#
classmethod modify_task_attributes()[source]#

Hook that is called by law’s task register meta class right after subclass creation to update class-level attributes.

Return type:

None

classmethod resolve_param_values_pre_init(params)[source]#

Resolve parameters before the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod resolve_instances(params, shifts)[source]#

Build the array function instances. For single-config/dataset tasks, resolve_instances is implemented by mixin classes such as ProducersMixin. For multi-config tasks, the upstream task's resolve_instances is called once per config instance. If resolve_instances needs to be called for other combinations of parameters (e.g. per dataset), the task class can override it.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – Collection of local and global shifts.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod get_known_shifts(params, shifts)[source]#

Updates the set of known shifts implemented by this and upstream tasks.

Parameters:
  • params (dict[str, Any]) – Dictionary of task parameters.

  • shifts (TaskShifts) – TaskShifts object to adjust.

Return type:

None

property datasets_repr: str#
property processes_repr: str#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ShiftSourcesMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Attributes:

Methods:

resolve_param_values_post_init(params)

Resolve parameters after the array function instances have been initialized.

expand_shift_sources(sources)

reduce_shifts(shifts)

store_parts()

Returns a law.util.InsertableDict whose values are used to create a store path.

shift_sources = <law.parameter.CSVParameter object>#
allow_empty_shift_sources = False#
classmethod resolve_param_values_post_init(params)[source]#

Resolve parameters after the array function instances have been initialized.

Parameters:

params (dict[str, Any]) – Dictionary of task parameters.

Return type:

dict[str, Any]

Returns:

Updated dictionary of task parameters.

classmethod expand_shift_sources(sources)[source]#

Return type:

list[str]

classmethod reduce_shifts(shifts)[source]#

Return type:

list[str]

property shift_sources_repr: str#
store_parts()[source]#

Returns a law.util.InsertableDict whose values are used to create a store path. For instance, the parts {"keyA": "a", "keyB": "b", 2: "c"} lead to the path "a/b/c". The keys can be used by subclassing tasks to overwrite values.

Return type:

InsertableDict

Returns:

Dictionary with parts that will be translated into an output directory path.
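
How the parts translate into a path can be sketched with a plain `dict` standing in for `law.util.InsertableDict`. This is an assumption made to keep the example dependency-free; the helper `parts_to_path` is hypothetical and not part of the documented API.

```python
# A plain dict stands in for law.util.InsertableDict here (assumption);
# both preserve insertion order, which is all this sketch relies on.
def parts_to_path(parts: dict) -> str:
    """Join the values of the parts mapping, in order, into a path."""
    return "/".join(str(value) for value in parts.values())


parts = {"keyA": "a", "keyB": "b", 2: "c"}
path = parts_to_path(parts)

# A subclass could overwrite a value by its key; the position is kept.
parts["keyB"] = "x"
overridden = parts_to_path(parts)
```

Only the values end up in the path. The keys act purely as handles for replacing individual parts.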

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class DatasetShiftSourcesMixin(*args, **kwargs)[source]#

Bases: ShiftSourcesMixin, DatasetTask

Attributes:

shift = None#
effective_shift = None#
allow_empty_shift = True#
allow_empty_shift_sources = True#
configs = None#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'local_shift', 'user'}#
exclude_params_remote_workflow = {'known_shifts', 'local_shift'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'local_shift', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'local_shift', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class ChunkedIOMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Attributes:

Methods:

raise_if_not_finite(ak_array)

Checks whether all values in array ak_array are finite.

raise_if_overlapping(ak_arrays)

Checks whether fields of ak_arrays overlap.

iter_chunked_io(*args, **kwargs)

check_finite_output = <luigi.parameter.BoolParameter object>#
check_overlapping_inputs = <luigi.parameter.BoolParameter object>#
exclude_params_req = {'check_finite_output', 'check_overlapping_inputs', 'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
default_chunk_size = 100000#
default_pool_size = 2#
classmethod raise_if_not_finite(ak_array)[source]#

Checks whether all values in array ak_array are finite.

The check is performed using the numpy.isfinite() function.

Parameters:

ak_array (Array) – Array with events to check.

Raises:

ValueError – If any value in ak_array is not finite.

Return type:

None
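
The following is a minimal, dependency-free sketch of such a finiteness check. It uses nested Python lists in place of an awkward array and `math.isfinite` in place of `numpy.isfinite`; both substitutions are assumptions made so the example runs with the standard library only, and the function is not the mixin's implementation.

```python
import math


def raise_if_not_finite(values) -> None:
    """Raise ValueError if any number in a nested list is inf or nan."""

    def flatten(obj):
        # Walk arbitrarily nested lists/tuples and yield the leaf values.
        if isinstance(obj, (list, tuple)):
            for item in obj:
                yield from flatten(item)
        else:
            yield obj

    for value in flatten(values):
        if not math.isfinite(value):
            raise ValueError(f"found non-finite value: {value}")


# Passes silently: all values are finite.
raise_if_not_finite([[1.0, 2.5], [], [3.0]])
```

Passing a structure that contains `float("inf")` or `float("nan")` would raise `ValueError` in this sketch, mirroring the documented behavior.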

classmethod raise_if_overlapping(ak_arrays)[source]#

Checks whether fields of ak_arrays overlap.

Parameters:

ak_arrays (Sequence[Array]) – Arrays with fields to check.

Raises:

ValueError – If at least one overlap is found.

Return type:

None
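
An overlap check of this kind can be sketched as follows. Plain dictionaries stand in for awkward arrays, with the dictionary keys playing the role of array fields. This is an assumption for illustration; the field names are hypothetical and the function is not the mixin's implementation.

```python
from collections import Counter


def raise_if_overlapping(records) -> None:
    """Raise ValueError if a field name occurs in more than one record."""
    # Count in how many inputs each field name appears.
    counts = Counter(field for record in records for field in record)
    overlapping = sorted(name for name, n in counts.items() if n > 1)
    if overlapping:
        raise ValueError(f"overlapping fields: {', '.join(overlapping)}")


# Hypothetical inputs with disjoint fields: no error is raised.
events = {"Jet": [1, 2], "Muon": [3]}
columns = {"weight": [0.5]}
raise_if_overlapping([events, columns])
```

Collecting all overlapping names before raising, rather than stopping at the first one, makes the error message more useful when several inputs collide.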

iter_chunked_io(*args, **kwargs)[source]#
exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
class HistHookMixin(*args, **kwargs)[source]#

Bases: ConfigTask

Attributes:

Methods:

invoke_hist_hooks(hists)

Invoke hooks to modify histograms before further processing such as plotting.

hist_hooks = <law.parameter.CSVParameter object>#
invoke_hist_hooks(hists)[source]#

Invoke hooks to modify histograms before further processing such as plotting.

Return type:

dict[Config, dict[Process, Any]]
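
As a rough illustration of the hook pattern, the sketch below applies a sequence of named hooks to a nested histogram mapping. Everything in it is an assumption: the registry, the hook signature, the string keys (in place of Config and Process objects) and the float values (in place of histograms) are hypothetical and not taken from the library.

```python
from typing import Any, Callable

# Simplified stand-in for dict[Config, dict[Process, Any]].
Hists = dict[str, dict[str, Any]]
Hook = Callable[[Hists], Hists]


def invoke_hooks(hists: Hists, hook_names: list[str], registry: dict[str, Hook]) -> Hists:
    """Apply the named hooks one after another and return the result."""
    for name in hook_names:
        if not name:
            # Skip empty names, e.g. from an empty CSV entry.
            continue
        hists = registry[name](hists)
    return hists


def double_yields(hists: Hists) -> Hists:
    """Hypothetical hook: scale every entry by a factor of two."""
    return {
        config: {process: 2 * value for process, value in processes.items()}
        for config, processes in hists.items()
    }


registry = {"double": double_yields}
result = invoke_hooks({"cfg": {"tt": 1.5, "dy": 2.0}}, ["double", ""], registry)
```

Each hook receives the output of the previous one, so the order of the hook names matters in this sketch.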

property hist_hooks_repr: str#

Return a string representation of the hist hooks.

exclude_index = False#
exclude_params_branch = {'user'}#
exclude_params_index = {'known_shifts', 'user'}#
exclude_params_remote_workflow = {'known_shifts'}#
exclude_params_repr = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_repr_empty = {}#
exclude_params_req = {'known_shifts', 'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#
exclude_params_req_get = {}#
exclude_params_req_set = {}#
exclude_params_sandbox = {'known_shifts', 'log_file', 'sandbox'}#
exclude_params_workflow = {'notify_custom', 'notify_mattermost', 'notify_slack', 'user'}#