Radiative Inclusive HLT2 Efficiency and Multiplicity Analysis Pipeline


Repository containing the scripts used to run Moore, MooreAnalysis and DaVinci to obtain HLT2 efficiencies, extraselection multiplicities and event sizes for the radiative inclusive HLT2 lines. The flow is controlled with a makefile. The MC samples used are KstG, PhiG, K1G, LambdaG, XiG and OmegaG. The main commands are:

  • make all : Run everything.

  • make all_MA : Run the MooreAnalysis part only. This produces matched MCDecayTreeTuples and computes line efficiencies over reconstructible events.

  • make alltuples_MA : Produce matched ntuples with MooreAnalysis.

  • make all_Moore : Run the Moore+DaVinci part only. This produces mDSTs with Moore and then ntuples with DaVinci, and computes extraselection multiplicities and average event sizes in the mDSTs.

  • make HHGamma_multiplicities : Compute multiplicities using the extraselection from the HHGamma line only. Uses all MCs.

  • make HHGammaEE_multiplicities : Compute multiplicities using the extraselection from the HHGammaEE line only. Uses all MCs.

  • make HHHGamma_multiplicities : Compute multiplicities using the extraselection from the HHHGamma line only. Uses all MCs.

  • make HHHGammaEE_multiplicities : Compute multiplicities using the extraselection from the HHHGammaEE line only. Uses all MCs.

  • make allEvtSizes_Moore : Compute all event sizes, one per MC sample.

  • make allDSTs_Moore : Produce all mDSTs using Moore.

  • make alltuples_Moore : Produce all ntuples with DaVinci.
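
For example, a typical session using the targets above could look like this:

    # run only the MooreAnalysis part (matched MCDecayTreeTuples + line efficiencies)
    make all_MA

    # run only the Moore+DaVinci part (mDSTs, ntuples, multiplicities, event sizes)
    make all_Moore

    # or simply run the full chain
    make all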

Outputs

All outputs are saved in the output folder. Interesting results are tagged in the repository so that they can be accessed in the future. The previous tags, each with a short description, are listed here:

  • Loose_extra_cuts_2k : Loose extraselection cuts. Ran on 2000 events per MC sample.

  • Average_extra_cuts_2k : Average extraselection cuts, with a maximum of 3-4 extra particles of each type per event. Ran on 2000 events per MC sample.
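
Since these are ordinary git tags, the state corresponding to one of them can be recovered with a plain checkout, for example:

    # list the available tags and check one of them out
    git tag
    git checkout Average_extra_cuts_2k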

Dependencies

In order to run all the scripts provided, other software must be prepared beforehand. Here are instructions to get everything ready.

LHCb repositories

First, the LHCb software is needed; the makefile is prepared to run on my own compiled stack (which contains the latest changes on master). Follow lb-stack-setup to set up your own stack, or check out published versions using lb-dev. Then, change the paths at the beginning of the makefile. The packages needed are:

  • Moore : Repository for HLT2 lines.

  • MooreAnalysis : Helper repository to retrieve efficiencies and/or rates.

  • DaVinci : Repository to produce ntuples from mDSTs.
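
As a rough sketch (the lb-stack-setup documentation is the authoritative reference), once the stack is set up the three projects can be built from the stack directory with something like:

    # build the three LHCb projects inside an lb-stack-setup stack
    # (sketch only; follow the lb-stack-setup README for the up-to-date procedure)
    cd ../stack
    make Moore MooreAnalysis DaVinci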

aalfonso-Analysis-Tools/root

Personal repository with many ROOT scripts, which includes instructions on how to compile them. It is used to retrieve the multiplicity of the extraselections.

Setup

By default, the makefile assumes that the stack and aalfonso-Analysis-Tools/root are under ../stack and ../root. If that is your case, everything is already configured, as long as your stack contains MooreAnalysis, Moore and DaVinci.

Otherwise, you need to configure the first lines of the makefile to match your installation paths:

  • ANALYSIS_TOOLS_ROOT : Path to the aalfonso-Analysis-Tools/root folder.

  • UPGRADE_BANDWIDTH_STUDIES : Path to the Upgrade-bandwidth-studies folder.

  • MOOREANALYSIS : Path to the MooreAnalysis folder.

  • MOORE : Path to the Moore folder.

  • DAVINCI : Path to the DaVinci folder.

  • STACKDIR (optional): Path to your stack folder. If you compiled your own stack, configuring only this variable should accommodate all three LHCb packages.
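
Alternatively, instead of editing the makefile, you can reproduce the default layout with symlinks (a sketch; the paths on the right are placeholders for your actual installations):

    # make the default ../stack and ../root locations point to your installations
    ln -s /path/to/your/stack ../stack
    ln -s /path/to/aalfonso-Analysis-Tools/root ../root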

Snakemake

You can also use snakemake to run the desired targets. Targets share the same names as in the makefile; just substitute make target with snakemake -j 1 target. Additionally, you can specify other useful flags:

  • -n : dry run (show what would be executed without running anything)

  • -p : print the shell commands that are executed

  • -r : print the reason for each executed rule

  • -j N : run up to N jobs in parallel

To enter an environment on lxplus where snakemake is installed, type lb-conda default.
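
For example:

    # dry run of the MooreAnalysis chain, printing the shell commands and the reason for each rule
    snakemake -n -p -r -j 1 all_MA

    # actually run it, with up to 4 jobs in parallel
    snakemake -j 4 all_MA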

The workflow can be found in the report or in the DAGs:

  • All MA: all_MA

  • All Moore: all_Moore
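
If you want to regenerate the report or the DAGs yourself, the standard snakemake options should work (the DAG rendering assumes graphviz is available; output file names are just examples):

    # render the DAG of a target with graphviz
    snakemake -j 1 --dag all_MA | dot -Tpdf > dag_all_MA.pdf

    # build an HTML report of a completed run
    snakemake -j 1 --report report.html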

MC cheatsheet

We have prepared scripts to run over many interesting radiative MC samples. We have given each of them a short key name, which can be related to its event type number in the table below:

Key           Event Type  Decay descriptor
KstG          11102202    {[[B0]nos -> (K*(892)0 -> K+ pi-) gamma]cc, [[B0]os -> (K*(892)~0 -> K- pi+) gamma]cc}
PhiG          13102202    [B_s0 -> (phi(1020) -> K+ K-) gamma]cc
K1G           12203224    [B+ -> (K_1(1270)+ -> (X -> K+ pi- pi+)) gamma]cc
LambdaG       15102307    [Lambda_b0 -> (Lambda0 -> p+ pi-) gamma]cc
XiG           16103330    [Xi_b- -> (Xi- -> (Lambda0 -> p+ pi-) pi-) gamma]cc
OmegaG        16103332    [Xi_b- -> (Omega- -> (Lambda0 -> p+ pi-) K-) gamma]cc
PhiKstG       11104202    [B0 -> (phi(1020) -> K+ K-) (K*(892)0 -> K+ pi-) gamma]cc
PhiPhiG       13104212    [B_s0 -> (phi(1020) -> K+ K-) (phi(1020) -> K+ K-) gamma]cc
PhiKs0G       11104372    [Beauty -> (phi(1020) -> K+ K-) (KS0 -> pi+ pi-) gamma]cc
K1G_KpiPi0G   11202603    [B0 -> (K_1(1270)0 -> (X0 -> K+ pi- pi0)) gamma]cc
PhiPi0G       13102212    [B_s0 -> (phi(1020) -> K+ K-) pi0 gamma]cc
KstIsoG       12203303    [B+ -> (K*+ -> (K_S0 -> pi+ pi-) pi+) gamma]cc
LambdaPG      12103331    [B+ -> (anti-Lambda0 -> p~- pi+) p gamma]cc
PhiKG         12103202    [B+ -> (phi(1020) -> K+ K-) K+ gamma]cc
K1G_Cocktail  12203271    [B+ -> (K_1+ -> (X -> K+ pi- pi+)) gamma]cc
L1520G        15102203    [Lambda_b0 -> (Lambda(1520)0 -> p+ K-) gamma]cc
RhoG          11102222    [B0 -> (rho0 -> pi+ pi-) gamma]cc
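
These keys are what the Snakefile passes to the Gaudi options files through the DECAY variable. As a sketch (adapted from the Snakefile rule shown in the Code Snippets section; $MOOREANALYSIS stands for your MooreAnalysis path), the MooreAnalysis step for the KstG sample boils down to:

    # sketch of the MooreAnalysis step for the KstG key
    $MOOREANALYSIS/run --set=DECAY=KstG gaudirun.py options/Decay_options.py \
        options/2000_Evts.py MooreAnalysis_Scripts/AllLines.py Gaudi_inputs/KstG_input_PFNs.py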

Running on ganga

The ganga workflow is, for obvious reasons, not included in the makefile/snakemake flow. It is however possible to hook it in by placing the .mdst or .root files produced with ganga in their expected locations here, so that (snake)make will continue from that point on. The scripts to run ganga and their relevant output are all located in the ganga_Scripts folder. Step-by-step instructions can be found in the dedicated README.
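
As an illustration only (the expected file names and locations are defined by the makefile/Snakefile rules, so the paths below are assumptions), hooking a ganga output into the flow could look like:

    # illustrative only: check the corresponding rule for the real expected path and name
    mkdir -p output/KstG
    cp /path/to/ganga/job/output/KstG.mdst output/KstG/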

Code Snippets

from __future__ import division, print_function
import argparse
import os

import prettytable
import ROOT

# Each branch has a related branch whose name ends with this suffix
# I don't know what the relationship actually is
WEIRD_SUFFIX = '_R.'
# Ways we can order the table
ORDERING = ['name', 'size', 'ratio']

# SIGH
ROOT.PyConfig.IgnoreCommandLineOptions = True
# Suppress warnings about missing dictionaries for LHCb classes
ROOT.gErrorIgnoreLevel = ROOT.kError


def nbytes(branch):
    """Return a tuple of (uncompressed, compressed) sizes in bytes."""
    ubytes = branch.GetTotBytes("*")
    cbytes = branch.GetZipBytes()
    children = branch.GetListOfBranches()
    for child in children:
        tmp = nbytes(child)
        ubytes += tmp[0]
        cbytes += tmp[1]
    return ubytes, cbytes


def is_any_in(_list, _str):
    for l in _list:
        if l in _str:
            return True
    return False


def event_size(fname, order, pathname, bannednames):
    diskbytes = os.path.getsize(fname)
    f = ROOT.TFile(fname)
    totbytes = 0
    for key in f.GetListOfKeys():
        kname = key.GetName()
        tree = f.Get(kname)
        assert tree.Class().GetName() == 'TTree', tree
        nentries = tree.GetEntries() or 1
        tcbytes = tree.GetZipBytes()
        # Should we show statistics as an average over all events, or in total?
        show_average = kname == 'Event'

        table = prettytable.PrettyTable()
        table.field_names = [
            'Path', 'Uncompressed (B)', 'Compressed (B)', 'Ratio'
        ]
        table.align['Path'] = 'l'
        table.sortby = {
            'name': 'Path',
            'size': 'Uncompressed (B)',
            'ratio': 'Ratio'
        }[order]

        treebytes = 0
        matchbytes = 0
        for branch in tree.GetListOfBranches():
            bname = branch.GetName()
            if bname.endswith(WEIRD_SUFFIX):
                continue
            ubytes, cbytes = nbytes(branch)
            # Include the contribution from the suffixed branch
            # Remove the trailing dot and add the suffix
            branch_r = tree.GetBranch(bname[:-1] + WEIRD_SUFFIX)
            if branch_r:
                ubytes_r, cbytes_r = nbytes(branch_r)
                ubytes += ubytes_r
                cbytes += cbytes_r

            totbytes += cbytes
            treebytes += cbytes
            ratio = ubytes / cbytes

            tespath = bname.replace('_', '/').replace('.', '')
            if pathname in tespath and not is_any_in(bannednames, tespath):
                matchbytes += cbytes
            if show_average:
                ubytes = ubytes / nentries
                cbytes = cbytes / nentries
            if pathname in tespath and not is_any_in(bannednames, tespath):
                table.add_row([
                    tespath,
                    int(ubytes),
                    int(cbytes), '{0:.2f}'.format(ratio)
                ])
        # Check our bookkeeping
        assert tcbytes == treebytes
        units = 'B'
        if show_average:
            matchbytes /= nentries
            units += '/event'
        print('== {0} ({1:.0f} {2}) =='.format(kname, matchbytes, units))
        print(table)

    print('== Total ==')
    print('Disk size: {0:.0f} kB'.format(diskbytes / 1e3))
    print('Trees size: {0:.0f} kB'.format(totbytes / 1e3))


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('file', help='DST file to analyse')
    parser.add_argument(
        '--order',
        choices=ORDERING,
        default='name',
        help='How to order the printed branches')
    parser.add_argument(
        '--path',
        default='',
        help='Specify which branches must be shown and summed')
    parser.add_argument(
        '--banned', default=[], nargs='*', help='Ban branches containing this')
    args = parser.parse_args()
    event_size(args.file, args.order, args.path, args.banned)
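
The script above is the scripts/event_size.py called from the Snakefile below; it takes a (m)DST file plus optional --order, --path and --banned arguments. A typical invocation might be:

    # print per-branch sizes, largest branches first, keeping only paths containing "HHGamma"
    # (the input file name is just a placeholder)
    python scripts/event_size.py my_sample.mdst --order size --path HHGamma
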
shell:
    "lb-dirac dirac-bookkeeping-get-files --Prod={params.prodID} --OptionsFile={output}"
(From line 100 of master/Snakefile)

shell:
    "lb-dirac dirac-bookkeeping-genXMLCatalog --Options={input} --NewOptions={output}"
(From line 110 of master/Snakefile)

run:
    shell('mkdir -p output/{wildcards.MC}')
    ma_script = f'{MOOREANALYSIS}/run  --set=DECAY={wildcards.MC} gaudirun.py options/Decay_options.py'
    options   = f' options/2000_Evts.py MooreAnalysis_Scripts/AllLines.py Gaudi_inputs/{wildcards.MC}_input_PFNs.py'
    tee       = f' | tee output/{wildcards.MC}/AllLines_MA.out'
    try:
        shell("set +e")
        shell(ma_script+options+tee)
        shell("set -e")
    except:
        print("except: errors during MA_tuple")
(From line 135 of master/Snakefile)

shell:
    'mkdir -p output/{wildcards.MC} \n'
    '{MOOREANALYSIS}/run {MOOREANALYSIS}/HltEfficiencyChecker/scripts/hlt_line_efficiencies.py'
    ' {input} --level Hlt2 --reconstructible-children={params.reconstructibles}'
    ' | tee {output}'
(From line 160 of master/Snakefile)

shell:
    '{ANALYSIS_TOOLS_ROOT}/CutCorrelation.out {input} {output} MCDecayTreeTuple/MCDecayTree'
(From line 174 of master/Snakefile)

shell:
    "mkdir -p output/{wildcards.MC} \n"
    "{MOORE}/run --set=DECAY={wildcards.MC} gaudirun.py options/Decay_options.py "
    "options/2000_Evts.py Moore_Scripts/AllLines.py Gaudi_inputs/{wildcards.MC}_input_PFNs.py "
    "| tee output/{wildcards.MC}/AllLines_Moore.out \n"
    "rm -f test_catalog*.xml"
(From line 197 of master/Snakefile)

shell:
    "python scripts/event_size.py {input} | tee {output}"
(From line 213 of master/Snakefile)

shell:
    "{DAVINCI}/run --set=DECAY={wildcards.MC} gaudirun.py DaVinci_Scripts/Decay_options.py DaVinci_Scripts/AllLines.py"
(From line 224 of master/Snakefile)

shell:
    '{ANALYSIS_TOOLS_ROOT}/Multiplicity_Extrasel.out "{input.tuples}" '
    ' Cuts/{wildcards.extra}_cuts.txt {output} {wildcards.line}Tuple/DecayTree'
    ' {wildcards.line}_{wildcards.extra}Tuple/DecayTree '
(From line 240 of master/Snakefile)