stats
Selector helpers for bookkeeping of selection and event weight statistics for aggregation over datasets.
Classes:
- class increment_stats(*args, **kwargs)
Bases:
Selector
Attributes:
Methods:
call_func(events, results, stats[, ...]): Unexposed selector that does not actually select objects but instead increments selection metrics in a given dictionary stats, given a chunk of events and the corresponding selection results.
setup_func(reqs, inputs, reader_targets)
- call_force = True
- call_func(events, results, stats, weight_map=None, group_map=None, group_combinations=None, **kwargs)
Unexposed selector that does not actually select objects but that instead increments selection metrics in a given dictionary stats given a chunk of events and the corresponding selection results.
A weight_map can be defined to configure the actual fields to be added. The key of each entry should start with either "num", to state that it refers to a plain number of events, or "sum", to state that the field describes the sum of a specific column (usually weights). Different types of values are accepted, depending on the type of "operation":
- "num": An event mask, or an Ellipsis to select all events.
- "sum": Either a column to sum over, or a 2-tuple containing the column to sum and an event mask to only sum over certain events.
Return type: tuple[ak.Array, SelectionResult]
Example:
    # weight map definition
    weight_map = {
        # "num" operations
        "num_events": Ellipsis,  # all events
        "num_events_selected": results.event,  # selected events only
        # "sum" operations
        "sum_mc_weight": events.mc_weight,  # weights of all events
        "sum_mc_weight_selected": (events.mc_weight, results.event),  # weights of selected events
    }

    # usage within an exposed selector
    # (where results are generated, and events and stats were passed by SelectEvents)
    self[increment_stats](events, results, stats, weight_map=weight_map, **kwargs)
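To make the two operation types concrete, the following is a minimal sketch of how a weight_map of this shape could be evaluated. It is not the columnflow implementation: the helper name `increment` is hypothetical, and plain Python lists stand in for the awkward-array columns and masks that events and results would provide.

```python
from collections import defaultdict


def increment(stats, weight_map, n_events):
    """Hypothetical helper, only illustrating the "num" / "sum" semantics."""
    for name, obj in weight_map.items():
        if name.startswith("num"):
            # "num": obj is an event mask, or Ellipsis to count all events
            mask = [True] * n_events if obj is Ellipsis else obj
            stats[name] += sum(1 for m in mask if m)
        elif name.startswith("sum"):
            # "sum": obj is a column, or a 2-tuple (column, mask);
            # columns are lists in this sketch, so a tuple is unambiguous
            if isinstance(obj, tuple):
                column, mask = obj
            else:
                column, mask = obj, [True] * n_events
            stats[name] += sum(w for w, m in zip(column, mask) if m)
        else:
            raise ValueError(f"key '{name}' must start with 'num' or 'sum'")
    return stats


# toy stand-ins for events.mc_weight and results.event
mc_weight = [0.5, 1.5, 2.0, 1.0]
selected = [True, False, True, True]

weight_map = {
    "num_events": Ellipsis,
    "num_events_selected": selected,
    "sum_mc_weight": mc_weight,
    "sum_mc_weight_selected": (mc_weight, selected),
}

stats = increment(defaultdict(float), weight_map, n_events=len(mc_weight))
```

Because the values are added to whatever is already stored in stats, calling the helper once per chunk accumulates the metrics over a whole dataset.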
Each sum of weights can also be extracted for each unique element in a so-called group, such as per process id or per jet multiplicity bin. For this purpose, a group_map can be defined, mapping the name of a group (e.g. "process" or "njet") to a dictionary with the fields
- "values", unique values to loop over,
- "mask_fn", a function that is supposed to return a mask given a single value, and
- "combinations_only" (optional), a boolean flag (False by default) that decides whether this group is not to be evaluated on its own, but only as part of a combination with other groups (see below).
Example:
    group_map = {
        "process": {
            "values": events.process_id,
            "mask_fn": (lambda v: events.process_id == v),
        },
        "njet": {
            "values": results.x.n_jets,
            "mask_fn": (lambda v: results.x.n_jets == v),
        },
    }
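As a rough illustration of how such a group_map could be consumed, the sketch below sums one weight column per unique group value. The function `sum_per_group` and the toy lists are assumptions made for this example and are not part of columnflow; only the "values", "mask_fn" and "combinations_only" fields and the "_per_<group>" naming are taken from the description here.

```python
# toy stand-ins for events.process_id, results.x.n_jets and events.mc_weight
process_id = [1, 1, 2, 2, 2]
n_jets = [2, 2, 2, 3, 3]
mc_weight = [1.0, 2.0, 0.5, 1.5, 3.0]

group_map = {
    "process": {
        "values": process_id,
        "mask_fn": lambda v: [p == v for p in process_id],
    },
    "njet": {
        "values": n_jets,
        "mask_fn": lambda v: [n == v for n in n_jets],
    },
}


def sum_per_group(name, column, group_map):
    """Hypothetical helper: one dictionary of sums per group."""
    out = {}
    for group_name, cfg in group_map.items():
        # groups flagged as combinations_only are not evaluated on their own
        if cfg.get("combinations_only", False):
            continue
        per_value = {}
        for v in sorted(set(cfg["values"])):
            mask = cfg["mask_fn"](v)
            per_value[v] = sum(w for w, m in zip(column, mask) if m)
        out[f"{name}_per_{group_name}"] = per_value
    return out


stats = sum_per_group("sum_mc_weight", mc_weight, group_map)
```

Each resulting field is a dictionary keyed by the unique values of the group, for instance one entry per process id.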
Based on the weight_map in the example above, this group_map will result in eight additional fields in stats, e.g. "sum_mc_weight_per_process", "sum_mc_weight_selected_per_process", "sum_mc_weight_per_njet", "sum_mc_weight_selected_per_njet", etc. (and likewise for the "num" fields). Each of these new fields will refer to a dictionary with keys corresponding to the unique values defined in the group_map above.
In addition, combinations of groups can be configured using group_combinations. It accepts a sequence of tuples whose elements should be names of groups defined in group_map. As the name suggests, combinations of all possible values between groups are evaluated and stored in a nested dictionary.
Example:
group_combinations = [("process", "njet")]
In this case, stats will obtain additional fields, such as "sum_mc_weight_per_process_and_njet" and "sum_mc_weight_selected_per_process_and_njet", referring to nested dictionaries whose structure depends on the exact order of group names per tuple.
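The nesting for group combinations can be pictured with the following sketch. It assumes that masks of the combined groups are AND-ed and that the nesting order follows the order of names in the tuple; `sum_per_combination` and the toy data are hypothetical and do not reproduce the actual columnflow code.

```python
from itertools import product

# toy stand-ins for events.process_id, results.x.n_jets and events.mc_weight
process_id = [1, 1, 2, 2, 2]
n_jets = [2, 2, 2, 3, 3]
mc_weight = [1.0, 2.0, 0.5, 1.5, 3.0]

group_map = {
    "process": {
        "values": process_id,
        "mask_fn": lambda v: [p == v for p in process_id],
    },
    "njet": {
        "values": n_jets,
        "mask_fn": lambda v: [n == v for n in n_jets],
    },
}
group_combinations = [("process", "njet")]


def sum_per_combination(name, column, group_map, combination):
    """Hypothetical helper: nested dictionary of sums for one combination."""
    unique_values = [sorted(set(group_map[g]["values"])) for g in combination]
    nested = {}
    for values in product(*unique_values):
        # combine the masks of all groups in the combination with a logical AND
        mask = [True] * len(column)
        for g, v in zip(combination, values):
            group_mask = group_map[g]["mask_fn"](v)
            mask = [a and b for a, b in zip(mask, group_mask)]
        # descend into the nested dictionary following the order of group names
        level = nested
        for v in values[:-1]:
            level = level.setdefault(v, {})
        level[values[-1]] = sum((w for w, m in zip(column, mask) if m), 0.0)
    return {f"{name}_per_{'_and_'.join(combination)}": nested}


stats = {}
for combination in group_combinations:
    stats.update(sum_per_combination("sum_mc_weight", mc_weight, group_map, combination))
```

With the tuple ("process", "njet"), the outer keys are process ids and the inner keys are jet multiplicities; reversing the tuple would swap the two levels.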
- data_only = False
- mc_only = False
- nominal_only = False
- shifts_only = None
- class increment_event_stats(*args, **kwargs)
Bases:
Selector
Attributes:
Methods:
call_func(events, results, stats, **kwargs): Simplified version of increment_stats that only increments the number of events and the number of selected events.
- call_force = True
- call_func(events, results, stats, **kwargs)
Simplified version of increment_stats that only increments the number of events and the number of selected events.
- Return type:
- data_only = False
- mc_only = False
- nominal_only = False
- produces = {<class 'columnflow.selection.stats.increment_stats'>}
- shifts_only = None
- uses = {<class 'columnflow.selection.stats.increment_stats'>}
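The counting performed by increment_event_stats, as described above, can be pictured as two plain counters that grow with every processed chunk. The sketch below is only an illustration: the helper `increment_event_counts` is hypothetical, and the key names "num_events" and "num_events_selected" are borrowed from the weight_map example of increment_stats as an assumption, not taken from the implementation.

```python
def increment_event_counts(stats, event_mask):
    """Hypothetical helper: count all events and the selected ones."""
    # every entry of the mask corresponds to one event in the chunk
    stats["num_events"] = stats.get("num_events", 0) + len(event_mask)
    # only entries that are True count as selected
    n_selected = sum(1 for m in event_mask if m)
    stats["num_events_selected"] = stats.get("num_events_selected", 0) + n_selected
    return stats


# two toy chunks, each represented by its event selection mask
stats = {}
for chunk_mask in ([True, False, True], [False, False, True, True]):
    increment_event_counts(stats, chunk_mask)
```

Since the same dictionary is updated for every chunk, the final counts describe the whole set of processed events.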