external

`external`#

Tasks dealing with external data.

Classes:

`GetDatasetLFNs`(*args, **kwargs)	Task to get list of logical file names (LFNs).
`GetDatasetLFNsWrapper`(*args, **kwargs)
`BundleExternalFiles`(*args, **kwargs)	Task to collect external files.

class GetDatasetLFNs(*args, **kwargs)[source]#

Bases: DatasetTask, TransferLocalFile

Task to get list of logical file names (LFNs).

Attributes:

`replicas`
`validate`
`version`	Version parameter - deactivated for `GetDatasetLFNs`
`sandbox`	Defines sandbox for this task.
`exclude_index`
`exclude_params_index`
`exclude_params_remote_workflow`
`exclude_params_repr`
`exclude_params_repr_empty`
`exclude_params_req`
`exclude_params_req_get`
`exclude_params_req_set`
`exclude_params_sandbox`

Methods:

`resolve_param_values`(params)	Resolve the parameter values in params given on the command line and propagate them to this set of parameters.
`single_output`()	Creates a remote target file for the final .json file containing the list of LFNs.
`run`()	Run function for this task.
`get_dataset_lfns_dasgoclient`(dataset_inst, ...)	Get the LFN information with the `dasgoclient`.
`iter_nano_files`(task[, fs, lfn_indices, ...])	Generator function that reduces the boilerplate code for looping over the files referred to by lfn_indices. It relies on the LFNs obtained by this task, which must be complete for the function to succeed.

replicas = <luigi.parameter.IntParameter object>#

validate = <law.parameter.OptionalBoolParameter object>#

version = None#

Version parameter - deactivated for GetDatasetLFNs

classmethod resolve_param_values(params)[source]#

Resolve the parameter values in params given on the command line and propagate them to this set of parameters.

Parameters: params (DotDict) – Parameters provided at command line level.
Return type: DotDict
Returns: Updated parameter values.

property sandbox: str#

Defines sandbox for this task.

Returns: Path to shell script that sets up the requested sandbox.

single_output()[source]#

Creates a remote target file for the final .json file containing the list of LFNs.

Return type: FileSystemFileTarget
Returns: Law remote target with the initialized output name.

run()[source]#

Run function for this task.

Raises: ValueError – If the number of loaded LFNs does not match the number of LFNs specified in the dataset_info_inst.

get_dataset_lfns_dasgoclient(dataset_inst, shift_inst, dataset_key)[source]#

Get the LFN information with the dasgoclient.

Parameters:

dataset_inst (Dataset) – Current dataset instance, currently not used.
shift_inst (Shift) – Current shift instance, currently not used.
dataset_key (str) – DAS key identifier for the current dataset.

Raises:

Exception – If query with dasgoclient fails.

Return type:

list[str]

Returns:

The list of LFNs corresponding to the dataset with the identifier dataset_key.
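A minimal sketch of how such a query could look, assuming the `dasgoclient` executable is on the `PATH`, accepts a `file dataset=<key>` query, and prints one LFN per line. The helper names `build_das_command`, `parse_lfns` and `query_lfns`, the injectable `runner` argument, and the `.root` filter are assumptions made for illustration and not part of this task's API.

```python
import subprocess


def build_das_command(dataset_key):
    # assumption: a "file dataset=<key>" query lists the files of a dataset
    return ["dasgoclient", "-query", f"file dataset={dataset_key}"]


def parse_lfns(stdout):
    # assumption: one LFN per line; keep only lines that end in ".root"
    lines = (line.strip() for line in stdout.splitlines())
    return [line for line in lines if line.endswith(".root")]


def query_lfns(dataset_key, runner=subprocess.run):
    # "runner" is injectable so the logic can be exercised without dasgoclient
    result = runner(build_das_command(dataset_key), capture_output=True, text=True)
    if result.returncode != 0:
        raise Exception(f"dasgoclient query failed for dataset '{dataset_key}'")
    return parse_lfns(result.stdout)
```

Keeping the command construction and the output parsing in separate functions makes it possible to check both without network access.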

iter_nano_files(task, fs=None, lfn_indices=None, eager_lookup=1)[source]#

Generator function that reduces the boilerplate code for looping over the files referred to by lfn_indices. It relies on the LFNs obtained by this task, which must be complete for the function to succeed.

When lfn_indices are not given, task must be a branch of a DatasetTask workflow whose branch value is used instead.

Parameters:

task (AnalysisTask | DatasetTask) – Current task that needs to access the nanoAOD files
fs (str | Sequence[str] | None, default: None) – Name of the local or remote file system where the LFNs are located
lfn_indices (list[int] | None, default: None) – List of indices of LFNs that are processed by this task instance
eager_lookup (bool | int, default: 1) – Look at the next fs if stat takes too long

Raises:

TypeError – If task is not of type BaseWorkflow or not a task analyzing a single branch in the task tree
Exception – If current task is not complete as indicated with self.complete()
ValueError – If no fs is provided at call and none can be found in either the config instance or the law config.
Exception – If a given LFN cannot be found at any fs

Yield:

A file target that points to an LFN.

Return type:

None
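The lookup described above can be pictured as a loop over the requested indices that tries each file system in turn. The sketch below is a simplified illustration and not the actual implementation: `iter_files` is a hypothetical name, each file system is modeled as a plain set of available LFNs, and the `eager_lookup` timing behavior is not modeled.

```python
from typing import Iterator, Mapping, Sequence


def iter_files(
    lfns: Sequence[str],
    lfn_indices: Sequence[int],
    file_systems: Mapping[str, set],
) -> Iterator[tuple]:
    """Yield (fs_name, lfn) for each requested index, trying file systems in order."""
    for index in lfn_indices:
        lfn = lfns[index]
        for fs_name, available in file_systems.items():
            # stand-in for a "stat" call on a real file system
            if lfn in available:
                yield fs_name, lfn
                break
        else:
            # the inner loop finished without finding the LFN anywhere
            raise Exception(f"LFN '{lfn}' not found at any of {list(file_systems)}")
```

Because this is a generator, nothing is looked up until the caller starts iterating, and an LFN that is missing everywhere only raises once the iteration reaches it.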

exclude_index = False#

exclude_params_index = {'local_shift'}#

exclude_params_remote_workflow = {'local_shift'}#

exclude_params_repr = {}#

exclude_params_repr_empty = {}#

exclude_params_req = {'local_shift'}#

exclude_params_req_get = {}#

exclude_params_req_set = {}#

exclude_params_sandbox = {'local_shift', 'log_file', 'sandbox'}#

class GetDatasetLFNsWrapper(*args, **kwargs)#

Bases: AnalysisTask, WrapperTask

Attributes:

`configs`
`datasets`
`exclude_index`
`exclude_params_index`
`exclude_params_repr`
`exclude_params_repr_empty`
`exclude_params_req`
`exclude_params_req_get`
`exclude_params_req_set`
`exclude_params_sandbox`
`shifts`
`skip_configs`
`skip_datasets`
`skip_shifts`
`version`

Methods:

`requires`()	Collect requirements defined by the underlying `require_cls` of the `WrapperTask` depending on optional additional parameters.
`update_wrapper_params`(params)

configs = <law.parameter.CSVParameter object>#

datasets = <law.parameter.CSVParameter object>#

exclude_index = False#

exclude_params_index = {}#

exclude_params_repr = {}#

exclude_params_repr_empty = {}#

exclude_params_req = {}#

exclude_params_req_get = {}#

exclude_params_req_set = {}#

exclude_params_sandbox = {'log_file', 'sandbox'}#

requires() → Requirements#

Collect requirements defined by the underlying require_cls of the WrapperTask depending on optional additional parameters.

Return type: Requirements
Returns: Requirements for the WrapperTask instance.

shifts = <law.parameter.CSVParameter object>#

skip_configs = <law.parameter.CSVParameter object>#

skip_datasets = <law.parameter.CSVParameter object>#

skip_shifts = <law.parameter.CSVParameter object>#

update_wrapper_params(params)#

version = None#

class BundleExternalFiles(*args, **kwargs)[source]#

Bases: ConfigTask, TransferLocalFile

Task to collect external files.

This task is intended to download source files for other tasks, such as files containing corrections for objects, the “golden” json files, source files for the calculation of pileup weights, and others.

All information about the relevant external files is extracted from the given config_inst, which must contain the keyword external_files in its auxiliary information. For example:

# cfg is the current config instance
cfg.x.external_files = DotDict.wrap({
    # The following assumes that the zip files are reachable under the
    # url ``SOURCE_URL``
    # jet energy correction
    "jet_jerc": (f"{SOURCE_URL}/POG/JME/{year}{corr_postfix}_UL/jet_jerc.json.gz", "v1"),

    # tau energy correction and scale factors
    "tau_sf": (f"{SOURCE_URL}/POG/TAU/{year}{corr_postfix}_UL/tau.json.gz", "v1"),

    # electron scale factors
    "electron_sf": (f"{SOURCE_URL}/POG/EGM/{year}{corr_postfix}_UL/electron.json.gz", "v1"),
})

Each entry in this DotDict can be either the path to a source file or a tuple of the format (path/or/url/to/source/file, VERSION), which introduces a versioning mechanism for external files.
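Code that consumes these entries has to handle both forms. A minimal sketch, assuming a plain string carries no version: the helper `split_entry` and the example URL are hypothetical and only illustrate the two accepted shapes.

```python
from typing import Optional, Union

ExternalFile = Union[str, tuple]


def split_entry(entry: ExternalFile) -> tuple:
    """Return (location, version); version is None for plain string entries."""
    if isinstance(entry, str):
        version: Optional[str] = None
        return entry, version
    # tuple form: (path/or/url/to/source/file, VERSION)
    location, version = entry
    return location, version
```

For instance, `split_entry("/path/to/file.json")` yields the path with `None` as version, while `split_entry(("https://example.org/file.json.gz", "v1"))` yields the URL together with `"v1"`.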

Attributes:

`replicas`
`version`
`files_hash`	Create a hash based on all external files.
`file_names`	Create a unique basename for each external file.
`files`
`exclude_index`
`exclude_params_index`
`exclude_params_repr`
`exclude_params_repr_empty`
`exclude_params_req`
`exclude_params_req_get`
`exclude_params_req_set`
`exclude_params_sandbox`

Methods:

`create_unique_basename`(path)	Create a unique basename.
`get_files`([output])
`single_output`()
`run`()	The task run method, to be overridden in a subclass.

replicas = <luigi.parameter.IntParameter object>#

version = None#

classmethod create_unique_basename(path)[source]#

Create a unique basename.

Parameters: path (tuple[str] | str) – Path to create a unique basename for.
Return type: str
Returns: Unique basename.
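One way to obtain such a basename is to prefix the original file name with a short hash of the full entry. This is a sketch under that assumption and not necessarily what the class method does: the name `unique_basename`, the use of SHA-256, and the 10-character prefix are illustrative choices.

```python
import hashlib
import os


def unique_basename(path):
    # accept either a plain path or a tuple such as (path, version)
    parts = (path,) if isinstance(path, str) else tuple(path)
    # hash all parts so that files with the same name but a different
    # location or version end up with different basenames
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:10]
    return f"{digest}_{os.path.basename(parts[0])}"
```

Keeping the original file name as a suffix leaves the result readable, while the hash prefix avoids collisions between files that share a name.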

property files_hash: str#

Create a hash based on all external files.

Returns: Hash based on the flattened list of external files in the current config instance.
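The idea of hashing a flattened structure can be sketched as follows. The functions `flatten` and `files_hash`, the sorted key order, and the choice of SHA-256 are assumptions for illustration; the property itself may compute its hash differently.

```python
import hashlib


def flatten(obj):
    """Yield the leaf values of nested dicts, lists and tuples in a stable order."""
    if isinstance(obj, dict):
        # sort keys so that the result does not depend on insertion order
        for key in sorted(obj):
            yield from flatten(obj[key])
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            yield from flatten(item)
    else:
        yield obj


def files_hash(external_files):
    # join all leaves (paths and versions) and hash the resulting string
    joined = "\n".join(str(value) for value in flatten(external_files))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:10]
```

Since versions are leaves of the structure as well, bumping the version of a single entry changes the hash in this sketch.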

property file_names: DotDict#

Create a unique basename for each external file.

Returns: DotDict of the same shape as the external_files DotDict with unique basenames.

get_files(output=None)[source]#

property files#

single_output()[source]#

exclude_index = False#

exclude_params_index = {}#

exclude_params_repr = {}#

exclude_params_repr_empty = {}#

exclude_params_req = {}#

exclude_params_req_get = {}#

exclude_params_req_set = {}#

exclude_params_sandbox = {'log_file', 'sandbox'}#

run()[source]#

The task run method, to be overridden in a subclass.

See Task.run
