columnflow.inference#

Basic objects for defining statistical inference models.

Classes:

ParameterType(value)

Parameter type flag.

ParameterTransformation(value)

Flags denoting transformations to be applied on parameters.

ParameterTransformations(transformations)

Container around a sequence of ParameterTransformation instances with a few convenience methods.

InferenceModel(config_inst)

Interface to statistical inference models with connections to config objects (such as order.Config or order.Dataset).

Functions:

inference_model([func, bases])

Decorator for creating a new InferenceModel subclass with additional, optional bases and attaching the decorated function to it as init_func.

class ParameterType(value)[source]#

Bases: Enum

Parameter type flag.

class ParameterTransformation(value)[source]#

Bases: Enum

Flags denoting transformations to be applied on parameters.

class ParameterTransformations(transformations)[source]#

Bases: tuple

Container around a sequence of ParameterTransformation instances with a few convenience methods.
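
As a rough illustration of how such enum flags and a tuple-based container can fit together, the following sketch uses stand-in classes. Only the values rate_gauss, shape and none appear on this page; the class names SketchParameterType, SketchTransformation and SketchTransformations, as well as the method has(), are hypothetical and not part of columnflow.

```python
from enum import Enum


class SketchParameterType(Enum):
    # only the two values shown in the yaml structure on this page
    rate_gauss = "rate_gauss"
    shape = "shape"


class SketchTransformation(Enum):
    # "none" is the default visible in the parameter_spec() signature
    none = "none"


class SketchTransformations(tuple):
    """Tuple subclass that converts its items into enum members (hypothetical)."""

    def __new__(cls, transformations):
        # accept both enum members and their string values
        return super().__new__(cls, (SketchTransformation(t) for t in transformations))

    def has(self, transformation):
        # hypothetical convenience method: membership test by member or value
        return SketchTransformation(transformation) in self


trafos = SketchTransformations(["none"])
param_type = SketchParameterType("shape")
```

Deriving the container from tuple keeps it immutable and hashable, while the constructor can still normalize string values into enum members.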

class InferenceModel(config_inst)[source]#

Bases: Derivable

Interface to statistical inference models with connections to config objects (such as order.Config or order.Dataset).

The internal structure describing a model looks as follows (in yaml style). It is accessible through the model attribute, and its top-level objects are also available as properties.

categories:
  - name: cat1
    config_category: 1e
    config_variable: ht
    config_data_datasets: [data_mu_a]
    data_from_processes: []
    mc_stats: 10
    processes:
      - name: HH
        config_process: hh
        is_signal: True
        config_mc_datasets: [hh_ggf]
        scale: 1.0
        parameters:
          - name: lumi
            type: rate_gauss
            effect: 1.02
            config_shift_source: null
          - name: pu
            type: rate_gauss
            effect: [0.97, 1.02]
            config_shift_source: null
          - name: pileup
            type: shape
            effect: 1.0
            config_shift_source: minbias_xs
      - name: tt
        is_signal: False
        config_process: ttbar
        config_mc_datasets: [tt_sl, tt_dl, tt_fh]
        scale: 1.0
        parameters:
          - name: lumi
            type: rate_gauss
            effect: 1.02
            config_shift_source: null

  - name: cat2
    ...

parameter_groups:
  - name: rates
    parameter_names: [lumi, pu]
  - ...
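
A minimal sketch of how this structure could be represented and navigated in Python, assuming a simple attribute-access dictionary as a stand-in for DotDict. The class AttrDict and the variable names are hypothetical; the values are copied from the yaml example above, shortened to a single category and process.

```python
class AttrDict(dict):
    """Hypothetical stand-in for DotDict: a dictionary with attribute access."""

    def __getattr__(self, attr):
        try:
            return self[attr]
        except KeyError:
            raise AttributeError(attr)


model = AttrDict(
    categories=[
        AttrDict(
            name="cat1",
            config_category="1e",
            config_variable="ht",
            config_data_datasets=["data_mu_a"],
            data_from_processes=[],
            mc_stats=10,
            processes=[
                AttrDict(
                    name="HH",
                    config_process="hh",
                    is_signal=True,
                    config_mc_datasets=["hh_ggf"],
                    scale=1.0,
                    parameters=[
                        AttrDict(name="lumi", type="rate_gauss", effect=1.02, config_shift_source=None),
                        AttrDict(name="pu", type="rate_gauss", effect=[0.97, 1.02], config_shift_source=None),
                    ],
                ),
            ],
        ),
    ],
    # key name follows the parameter_names argument of parameter_group_spec()
    parameter_groups=[AttrDict(name="rates", parameter_names=["lumi", "pu"])],
)

# top-level objects and nested entries are reachable through attribute access
signal_names = [p.name for c in model.categories for p in c.processes if p.is_signal]
```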
name#
type: str

The unique name of this model.

config_inst#
type: order.Config, None

Reference to the order.Config object.

config_callbacks#
type: list

A list of callables that are invoked after set_config() was called.

model#
type: DotDict

The internal data structure representing the model.

Classes:

YamlDumper(*args, **kwargs)

YAML dumper for statistical inference models with amended representers to serialize internal, structured objects as safe, standard objects.

Methods:

inference_model([func, bases])

Decorator for creating a new InferenceModel subclass with additional, optional bases and attaching the decorated function to it as init_func.

model_spec()

Returns a dictionary representing the top-level structure of the model.

category_spec(name[, config_category, ...])

Returns a dictionary representing a category (interchangeably called bin or channel in other tools), forwarding all arguments.

process_spec(name[, config_process, ...])

Returns a dictionary representing a process, forwarding all arguments.

parameter_spec(name, type[, ...])

Returns a dictionary representing a (nuisance) parameter, forwarding all arguments.

parameter_group_spec(name[, parameter_names])

Returns a dictionary representing a group of parameter names.

require_shapes_for_parameter(param_obj)

Returns True if varied shapes are needed for the parameter object param_obj, and False otherwise.

to_yaml([stream])

Writes the content of the model into a file-like object stream when given, and returns a string representation otherwise.

pprint()

Pretty-prints the content of the model in yaml-style.

get_categories([category, only_names])

Returns a list of categories whose names match category.

get_category(category[, only_name, silent])

Returns a single category whose name matches category.

has_category(category)

Returns True if a category whose name matches category exists, and False otherwise.

add_category(*args, **kwargs)

Adds a new category with all args and kwargs used to create the structured category dictionary via category_spec().

remove_category(category)

Removes one or more categories whose names match category.

get_processes([process, category, ...])

Returns a dictionary of processes whose names match process, mapped to the name of the category they belong to.

get_process(process[, category, only_name, ...])

Returns a single process whose name matches process, and optionally, whose category's name matches category.

has_process(process[, category])

Returns True if a process whose name matches process, and optionally whose category's name matches category, exists, and False otherwise.

add_process(*args[, category, silent])

Adds a new process to all categories whose names match category, with all args and kwargs used to create the structured process dictionary via process_spec().

remove_process(process[, category])

Removes one or more processes whose names match process, and optionally whose category's name matches category.

get_parameters([parameter, process, ...])

Returns a dictionary of parameters whose names match parameter, mapped twice to the name of the category and the name of the process they belong to.

get_parameter(parameter[, process, ...])

Returns a single parameter whose name matches parameter, and optionally, whose category's and process' names match category and process.

has_parameter(parameter[, process, category])

Returns True if a parameter whose name matches parameter, and optionally whose category's and process' names match category and process, exists, and False otherwise.

add_parameter(*args[, process, category])

Adds a new parameter to all categories and processes whose names match category and process, with all args and kwargs used to create the structured parameter dictionary via parameter_spec().

remove_parameter(parameter[, process, category])

Removes one or more parameters whose names match parameter, and optionally whose category's and process' name match category and process.

get_parameter_groups([group, only_names])

Returns a list of parameter groups whose names match group.

get_parameter_group(group[, only_name])

Returns a single parameter group whose name matches group.

has_parameter_group(group)

Returns True if a parameter group whose name matches group exists, and False otherwise.

add_parameter_group(*args, **kwargs)

Adds a new parameter group with all args and kwargs used to create the structured parameter group dictionary via parameter_group_spec().

remove_parameter_group(group)

Removes one or more parameter groups whose names match group.

add_parameter_to_group(parameter, group)

Adds a parameter named parameter to one or multiple parameter groups whose names match group.

remove_parameter_from_groups(parameter[, group])

Removes all parameters matching parameter from parameter groups whose names match group.

get_categories_with_process(process)

Returns a flat list of category names that contain processes matching process.

get_processes_with_parameter(parameter[, ...])

Returns a dictionary of names of processes that contain a parameter whose name matches parameter, mapped to category names.

get_categories_with_parameter(parameter[, ...])

Returns a dictionary of category names mapping to process names that contain parameters whose names match parameter.

get_groups_with_parameter(parameter)

Returns a list of names of parameter groups that contain a parameter whose name matches parameter, which can be a string, a pattern, or sequence of them.

cleanup()

Cleans the internal model structure by removing empty and dangling objects by calling remove_empty_categories(), remove_dangling_parameters_from_groups() and remove_empty_parameter_groups() in that order.

remove_empty_categories()

Removes all categories that contain no processes.

remove_dangling_parameters_from_groups()

Removes names of parameters from parameter groups that are not assigned to any process in any category.

remove_empty_parameter_groups()

Removes parameter groups that contain no parameter names.

iter_processes([process, category])

Generator that iteratively yields all processes whose names match process, optionally in all categories whose names match category.

iter_parameters([parameter, process, category])

Generator that iteratively yields all parameters whose names match parameter, optionally in all processes and categories whose names match process and category.

scale_process(scale[, process, category])

Sets the scale attribute of all processes whose names match process, optionally in all categories whose names match category, to scale.

class YamlDumper(*args, **kwargs)[source]#

Bases: SafeDumper

YAML dumper for statistical inference models with amended representers to serialize internal, structured objects as safe, standard objects.

classmethod inference_model(func=None, bases=(), **kwargs)[source]#

Decorator for creating a new InferenceModel subclass with additional, optional bases and attaching the decorated function to it as init_func. All additional kwargs are added as class members of the new subclasses.

Return type:

DerivableMeta | Callable

classmethod model_spec()[source]#

Returns a dictionary representing the top-level structure of the model.

  • categories: List of category_spec() objects.

  • parameter_groups: List of parameter_group_spec() objects.

Return type:

DotDict

classmethod category_spec(name, config_category=None, config_variable=None, config_data_datasets=None, data_from_processes=None, mc_stats=None)[source]#

Returns a dictionary representing a category (interchangeably called bin or channel in other tools), forwarding all arguments.

  • name: The name of the category in the model.

  • config_category: The name of the source category in the config to use.

  • config_variable: The name of the variable in the config to use.

  • config_data_datasets: List of names of datasets in the config to use for real data.

  • data_from_processes: Optional list of names of process_spec() objects that, when config_data_datasets is not defined, make up a fake data contribution.

  • mc_stats: Either None to disable MC stat uncertainties, or a float or tuple of floats to control the MC stat options.

Return type:

DotDict

classmethod process_spec(name, config_process=None, is_signal=False, config_mc_datasets=None, scale=1.0)[source]#

Returns a dictionary representing a process, forwarding all arguments.

  • name: The name of the process in the model.

  • is_signal: A boolean flag deciding whether this process describes signal.

  • config_process: The name of the source process in the config to use.

  • config_mc_datasets: List of names of MC datasets in the config to use.

  • scale: A float value to scale the process, defaulting to 1.0.

Return type:

DotDict
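
To make the forwarded arguments concrete, here is a minimal sketch of a function that builds such a dictionary from the documented arguments. The function name sketch_process_spec, the plain dict return value, the conversion of missing lists to empty lists, and the initially empty parameters list are assumptions for illustration, not columnflow's implementation.

```python
def sketch_process_spec(name, config_process=None, is_signal=False, config_mc_datasets=None, scale=1.0):
    """Build a plain dictionary describing a process (hypothetical helper)."""
    return {
        "name": str(name),
        "config_process": config_process,
        "is_signal": bool(is_signal),
        # assumption: a missing dataset list is stored as an empty list
        "config_mc_datasets": list(config_mc_datasets or []),
        "scale": float(scale),
        # assumption: parameters are attached later, so the list starts empty
        "parameters": [],
    }


hh = sketch_process_spec("HH", config_process="hh", is_signal=True, config_mc_datasets=["hh_ggf"])
```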

classmethod parameter_spec(name, type, transformations=(<ParameterTransformation.none: 'none'>, ), config_shift_source=None, effect=1.0)[source]#

Returns a dictionary representing a (nuisance) parameter, forwarding all arguments.

  • name: The name of the parameter in the model.

  • type: A ParameterType instance describing the type of this parameter.

  • transformations: A sequence of ParameterTransformation instances describing transformations to be applied to the effect of this parameter.

  • config_shift_source: The name of a systematic shift source in the config that this parameter corresponds to.

  • effect: An arbitrary object describing the effect of the parameter (e.g. float for symmetric rate effects, 2-tuple for down/up variation, etc).

Return type:

DotDict
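
The two effect forms mentioned above can be told apart with a simple type check. This is only a sketch: the helper describe_effect is hypothetical, and how columnflow interprets an effect internally is not specified here.

```python
def describe_effect(effect):
    """Classify an effect as symmetric or as a (down, up) pair (hypothetical helper)."""
    if isinstance(effect, (int, float)):
        return {"kind": "symmetric", "value": float(effect)}
    if isinstance(effect, (list, tuple)) and len(effect) == 2:
        down, up = (float(e) for e in effect)
        return {"kind": "asymmetric", "down": down, "up": up}
    # anything else is passed through unchanged, since effect may be an arbitrary object
    return {"kind": "other", "value": effect}


lumi = describe_effect(1.02)
pu = describe_effect((0.97, 1.02))
```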

classmethod parameter_group_spec(name, parameter_names=None)[source]#

Returns a dictionary representing a group of parameter names.

  • name: The name of the parameter group in the model.

  • parameter_names: Names of parameter objects this group contains.

Return type:

DotDict

classmethod require_shapes_for_parameter(param_obj)[source]#

Returns True if varied shapes are needed for the parameter object param_obj, and False otherwise.

Return type:

bool

to_yaml(stream=None)[source]#

Writes the content of the model into a file-like object stream when given, and returns a string representation otherwise.

Return type:

str | None

pprint()[source]#

Pretty-prints the content of the model in yaml-style.

Return type:

None

get_categories(category=None, only_names=False)[source]#

Returns a list of categories whose names match category. category can be a string, a pattern, or sequence of them. When only_names is True, only names of categories are returned rather than structured dictionaries.

Return type:

list[DotDict | str]
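
The "string, pattern, or sequence of them" matching used throughout these methods can be sketched with the standard library. This assumes shell-style wildcards only; the helper match_names is hypothetical and the library's actual matching rules (for instance, support for regular expressions) may differ.

```python
from fnmatch import fnmatchcase


def match_names(names, pattern=None):
    """Return the names matching a string, a wildcard pattern, or a sequence of them."""
    if pattern is None:
        # no filter given: everything matches
        return list(names)
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


names = ["cat1", "cat2", "control"]
wildcard = match_names(names, "cat*")
mixed = match_names(names, ["cat1", "con*"])
```

A plain string without wildcard characters behaves as an exact match here, which is why a single helper can serve both lookups by name and by pattern.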

get_category(category, only_name=False, silent=False)[source]#

Returns a single category whose name matches category. category can be a string, a pattern, or sequence of them. An exception is raised if no or more than one category is found, unless silent is True in which case None is returned. When only_name is True, only the name of the category is returned rather than a structured dictionary.

Return type:

DotDict | str

has_category(category)[source]#

Returns True if a category whose name matches category exists, and False otherwise. category can be a string, a pattern, or sequence of them.

Return type:

bool

add_category(*args, **kwargs)[source]#

Adds a new category with all args and kwargs used to create the structured category dictionary via category_spec(). If a category with the same name already exists, an exception is raised.

Return type:

None

remove_category(category)[source]#

Removes one or more categories whose names match category. Returns True if at least one category was removed, and False otherwise. category can be a string, a pattern, or sequence of them.

Return type:

bool

get_processes(process=None, category=None, only_names=False, flat=False)[source]#

Returns a dictionary of processes whose names match process, mapped to the name of the category they belong to. Categories can optionally be filtered through category. Both process and category can be a string, a pattern, or sequence of them.

When only_names is True, only names of processes are returned rather than structured dictionaries. When flat is True, a flat, unique list of process names is returned.

Return type:

dict[str, DotDict | str] | list[str]
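
The different return shapes controlled by only_names and flat can be illustrated on plain dictionaries. This is a sketch without any name filtering; the function sketch_get_processes, the per-category lists and the first-seen ordering of the flat list are assumptions.

```python
def sketch_get_processes(categories, only_names=False, flat=False):
    """Collect processes per category from a list of category dictionaries."""
    if flat:
        # flat, unique list of process names, keeping first-seen order
        seen = []
        for cat in categories:
            for proc in cat["processes"]:
                if proc["name"] not in seen:
                    seen.append(proc["name"])
        return seen
    # mapping of category name to process objects, or to their names only
    return {
        cat["name"]: [proc["name"] if only_names else proc for proc in cat["processes"]]
        for cat in categories
    }


categories = [
    {"name": "cat1", "processes": [{"name": "HH"}, {"name": "tt"}]},
    {"name": "cat2", "processes": [{"name": "tt"}]},
]
by_category = sketch_get_processes(categories, only_names=True)
flat_names = sketch_get_processes(categories, flat=True)
```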

get_process(process, category=None, only_name=False, silent=False)[source]#

Returns a single process whose name matches process, and optionally, whose category’s name matches category. Both process and category can be a string, a pattern, or sequence of them.

An exception is raised if no or more than one process is found, unless silent is True in which case None is returned. When only_name is True, only the name of the process is returned rather than a structured dictionary.

Return type:

DotDict | str

has_process(process, category=None)[source]#

Returns True if a process whose name matches process, and optionally whose category’s name matches category, exists, and False otherwise. Both process and category can be a string, a pattern, or sequence of them.

Return type:

bool

add_process(*args, category=None, silent=False, **kwargs)[source]#

Adds a new process to all categories whose names match category, with all args and kwargs used to create the structured process dictionary via process_spec(). category can be a string, a pattern, or sequence of them.

If a process with the same name already exists in one of the categories, an exception is raised unless silent is True.

Return type:

None

remove_process(process, category=None)[source]#

Removes one or more processes whose names match process, and optionally whose category’s name matches category. Both process and category can be a string, a pattern, or sequence of them. Returns True if at least one process was removed, and False otherwise.

Return type:

bool

get_parameters(parameter=None, process=None, category=None, only_names=False, flat=False)[source]#

Returns a dictionary of parameters whose names match parameter, mapped twice to the name of the category and the name of the process they belong to. Categories and processes can optionally be filtered through category and process. All three, parameter, process and category can be a string, a pattern, or sequence of them.

When only_names is True, only names of parameters are returned rather than structured dictionaries. When flat is True, a flat, unique list of parameter names is returned.

Return type:

dict[str, dict[str, DotDict | str]] | list[str]

get_parameter(parameter, process=None, category=None, only_name=False, silent=False)[source]#

Returns a single parameter whose name matches parameter, and optionally, whose category’s and process’ names match category and process. All three, parameter, process and category can be a string, a pattern, or sequence of them.

An exception is raised if no or more than one parameter is found, unless silent is True in which case None is returned. When only_name is True, only the name of the parameter is returned rather than a structured dictionary.

Return type:

DotDict | str

has_parameter(parameter, process=None, category=None)[source]#

Returns True if a parameter whose name matches parameter, and optionally whose category’s and process’ names match category and process, exists, and False otherwise. All three, parameter, process and category can be a string, a pattern, or sequence of them.

Return type:

bool

add_parameter(*args, process=None, category=None, **kwargs)[source]#

Adds a new parameter to all categories and processes whose names match category and process, with all args and kwargs used to create the structured parameter dictionary via parameter_spec(). Both process and category can be a string, a pattern, or sequence of them.

If a parameter with the same name already exists in one of the processes throughout the categories, an exception is raised.

Return type:

DotDict

remove_parameter(parameter, process=None, category=None)[source]#

Removes one or more parameters whose names match parameter, and optionally whose category’s and process’ name match category and process. All three, parameter, process and category can be a string, a pattern, or sequence of them. Returns True if at least one parameter was removed, and False otherwise.

Return type:

bool

get_parameter_groups(group=None, only_names=False)[source]#

Returns a list of parameter groups whose names match group. group can be a string, a pattern, or sequence of them.

When only_names is True, only names of parameter groups are returned rather than structured dictionaries.

Return type:

list[DotDict | str]

get_parameter_group(group, only_name=False)[source]#

Returns a single parameter group whose name matches group. group can be a string, a pattern, or sequence of them.

An exception is raised in case no or more than one parameter group is found. When only_name is True, only the name of the parameter group is returned rather than a structured dictionary.

Return type:

DotDict | str

has_parameter_group(group)[source]#

Returns True if a parameter group whose name matches group exists, and False otherwise. group can be a string, a pattern, or sequence of them.

Return type:

bool

add_parameter_group(*args, **kwargs)[source]#

Adds a new parameter group with all args and kwargs used to create the structured parameter group dictionary via parameter_group_spec(). If a group with the same name already exists, an exception is raised.

Return type:

None

remove_parameter_group(group)[source]#

Removes one or more parameter groups whose names match group. group can be a string, a pattern, or sequence of them. Returns True if at least one group was removed, and False otherwise.

Return type:

bool

add_parameter_to_group(parameter, group)[source]#

Adds a parameter named parameter to one or multiple parameter groups whose names match group. group can be a string, a pattern, or sequence of them. When parameter is a pattern or regular expression, all previously added, matching parameters are added. Otherwise, parameter is added as is. If a parameter was added to at least one group, True is returned and False otherwise.

Return type:

bool
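
The distinction between a pattern and a literal name can be sketched as follows for a single group. The function sketch_add_parameter_to_group, the wildcard-character check used to detect a pattern, and the explicit known_parameters argument are assumptions; the library may detect patterns differently and also handles multiple groups.

```python
from fnmatch import fnmatchcase


def sketch_add_parameter_to_group(group, parameter, known_parameters):
    """Add parameter to group["parameter_names"], expanding wildcards against known names."""
    # assumption: a name counts as a pattern if it contains a wildcard character
    is_pattern = any(ch in parameter for ch in "*?[")
    if is_pattern:
        candidates = [n for n in known_parameters if fnmatchcase(n, parameter)]
    else:
        # literal names are added as is, even if no such parameter is known yet
        candidates = [parameter]

    added = False
    for name in candidates:
        if name not in group["parameter_names"]:
            group["parameter_names"].append(name)
            added = True
    return added


group = {"name": "rates", "parameter_names": ["lumi"]}
added_by_pattern = sketch_add_parameter_to_group(group, "p*", ["lumi", "pu", "pileup"])
added_again = sketch_add_parameter_to_group(group, "lumi", ["lumi", "pu", "pileup"])
```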

remove_parameter_from_groups(parameter, group=None)[source]#

Removes all parameters matching parameter from parameter groups whose names match group. Both parameter and group can be a string, a pattern, or sequence of them. Returns True if at least one parameter was removed, and False otherwise.

Return type:

bool

get_categories_with_process(process)[source]#

Returns a flat list of category names that contain processes matching process. process can be a string, a pattern, or sequence of them.

Return type:

list[str]

get_processes_with_parameter(parameter, category=None, flat=True)[source]#

Returns a dictionary of names of processes that contain a parameter whose name matches parameter, mapped to category names. Categories can optionally be filtered through category. Both parameter and category can be a string, a pattern, or sequence of them.

When flat is True, a flat, unique list of process names is returned.

Return type:

list[str] | dict[str, list[str]]

get_categories_with_parameter(parameter, process=None, flat=True)[source]#

Returns a dictionary of category names mapping to process names that contain parameters whose names match parameter. Processes can optionally be filtered through process. Both parameter and process can be a string, a pattern, or sequence of them.

When flat is True, a flat, unique list of category names is returned.

Return type:

list[str] | dict[str, list[str]]

get_groups_with_parameter(parameter)[source]#

Returns a list of names of parameter groups that contain a parameter whose name matches parameter, which can be a string, a pattern, or sequence of them.

Return type:

list[str]

cleanup()[source]#

Cleans the internal model structure by removing empty and dangling objects by calling remove_empty_categories(), remove_dangling_parameters_from_groups() and remove_empty_parameter_groups() in that order.

Return type:

None
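
The order of the three cleanup steps matters: a group can only turn out to be empty after dangling names have been removed from it. The following sketch runs the same sequence on plain dictionaries; the function sketch_cleanup and its return value are assumptions for illustration (the documented method returns None).

```python
def sketch_cleanup(model):
    """Apply the three cleanup steps, in order, to a plain-dict model."""
    # 1. drop categories without processes
    model["categories"] = [c for c in model["categories"] if c["processes"]]

    # 2. drop parameter names from groups that no remaining process uses
    used = {
        param["name"]
        for cat in model["categories"]
        for proc in cat["processes"]
        for param in proc["parameters"]
    }
    for group in model["parameter_groups"]:
        group["parameter_names"] = [n for n in group["parameter_names"] if n in used]

    # 3. drop groups that ended up without any parameter names
    model["parameter_groups"] = [g for g in model["parameter_groups"] if g["parameter_names"]]
    return model


model = {
    "categories": [
        {"name": "cat1", "processes": [{"name": "HH", "parameters": [{"name": "lumi"}]}]},
        {"name": "cat2", "processes": []},
    ],
    "parameter_groups": [
        {"name": "rates", "parameter_names": ["lumi", "pu"]},
        {"name": "unused", "parameter_names": ["jec"]},
    ],
}
cleaned = sketch_cleanup(model)
```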

remove_empty_categories()[source]#

Removes all categories that contain no processes.

Return type:

None

remove_dangling_parameters_from_groups()[source]#

Removes names of parameters from parameter groups that are not assigned to any process in any category.

Return type:

None

remove_empty_parameter_groups()[source]#

Removes parameter groups that contain no parameter names.

Return type:

None

iter_processes(process=None, category=None)[source]#

Generator that iteratively yields all processes whose names match process, optionally in all categories whose names match category. The yielded value is a 2-tuple containing the category name and the process object.

Return type:

Generator[tuple[DotDict, DotDict], None, None]

iter_parameters(parameter=None, process=None, category=None)[source]#

Generator that iteratively yields all parameters whose names match parameter, optionally in all processes and categories whose names match process and category. The yielded value is a 3-tuple containing the category name, the process name and the parameter object.

Return type:

Generator[tuple[DotDict, DotDict, DotDict], None, None]
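
The nested traversal behind such a generator can be sketched on plain dictionaries. The function sketch_iter_parameters is hypothetical and omits the name filters; it only shows the shape of the yielded 3-tuples as described in the text.

```python
def sketch_iter_parameters(categories):
    """Yield (category name, process name, parameter object) for every parameter."""
    for cat in categories:
        for proc in cat["processes"]:
            for param in proc["parameters"]:
                yield cat["name"], proc["name"], param


categories = [
    {
        "name": "cat1",
        "processes": [
            {"name": "HH", "parameters": [{"name": "lumi"}, {"name": "pu"}]},
            {"name": "tt", "parameters": [{"name": "lumi"}]},
        ],
    },
]
triples = [(cat, proc, param["name"]) for cat, proc, param in sketch_iter_parameters(categories)]
```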

scale_process(scale, process=None, category=None)[source]#

Sets the scale attribute of all processes whose names match process, optionally in all categories whose names match category, to scale. Returns True if at least one process was found and scaled, and False otherwise.

Return type:

bool
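
A minimal sketch of this behavior on plain dictionaries, assuming exact name comparison instead of pattern matching and no category filter. The function sketch_scale_process is hypothetical.

```python
def sketch_scale_process(categories, scale, process=None):
    """Set the scale of all processes named process (or of all processes if None)."""
    found = False
    for cat in categories:
        for proc in cat["processes"]:
            if process is None or proc["name"] == process:
                proc["scale"] = float(scale)
                found = True
    # report whether anything was scaled
    return found


categories = [
    {"name": "cat1", "processes": [{"name": "HH", "scale": 1.0}, {"name": "tt", "scale": 1.0}]},
    {"name": "cat2", "processes": [{"name": "HH", "scale": 1.0}]},
]
scaled = sketch_scale_process(categories, 2.5, process="HH")
missing = sketch_scale_process(categories, 2.5, process="dy")
```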

inference_model(func=None, bases=(), **kwargs)#

Decorator for creating a new InferenceModel subclass with additional, optional bases and attaching the decorated function to it as init_func. All additional kwargs are added as class members of the new subclasses.

Return type:

DerivableMeta | Callable
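
The general pattern of a decorator that turns a function into a subclass and attaches it as init_func can be sketched as follows. SketchModel, sketch_inference_model and the version keyword are hypothetical stand-ins; the sketch creates the subclass with type() and does not reproduce how columnflow derives classes.

```python
class SketchModel:
    """Hypothetical stand-in for InferenceModel."""

    init_func = None

    def __init__(self):
        self.categories = []
        # run the attached function, if any, to fill the model
        if callable(self.init_func):
            self.init_func()


def sketch_inference_model(func=None, bases=(), **kwargs):
    """Create a SketchModel subclass named after func, with func attached as init_func."""

    def decorator(func):
        # additional kwargs become class members of the new subclass
        cls_dict = dict(kwargs)
        cls_dict["init_func"] = func
        return type(func.__name__, (SketchModel,) + tuple(bases), cls_dict)

    # support both @sketch_inference_model and @sketch_inference_model(...)
    return decorator(func) if func is not None else decorator


@sketch_inference_model(version=1)
def example(self):
    self.categories.append("cat1")


model = example()
```

Because the function is stored as a class attribute, it is bound like a regular method, so it receives the model instance as self when the constructor calls it.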