Radiative Inclusive HLT2 Efficiency and Multiplicity Analysis Pipeline


Repository containing the scripts used to run Moore, MooreAnalysis and DaVinci to obtain HLT2 efficiencies, extraselection multiplicities and event sizes for the radiative inclusive HLT2 lines. The flow is controlled with a makefile. The MC samples used are KstG, PhiG, K1G, LambdaG, XiG and OmegaG. The main commands are:

  • make all : Run everything.

  • make all_MA : Run the MooreAnalysis part only. This produces matched MCDecayTreeTuples and computes line efficiencies over reconstructible events.

  • make alltuples_MA : Produce matched ntuples with MooreAnalysis.

  • make all_Moore : Run the Moore+DaVinci part only. This produces mDSTs with Moore and then ntuples with DaVinci, and computes extraselection multiplicities and average event sizes in the mDSTs.

  • make HHGamma_multiplicities : Compute multiplicities using the extraselection from the HHGamma line only. Uses all MCs.

  • make HHGammaEE_multiplicities : Compute multiplicities using the extraselection from the HHGammaEE line only. Uses all MCs.

  • make HHHGamma_multiplicities : Compute multiplicities using the extraselection from the HHHGamma line only. Uses all MCs.

  • make HHHGammaEE_multiplicities : Compute multiplicities using the extraselection from the HHHGammaEE line only. Uses all MCs.

  • make allEvtSizes_Moore : Compute all event sizes, one per MC sample.

  • make allDSTs_Moore : Produce all mDSTs using Moore.

  • make alltuples_Moore : Produce all ntuples with DaVinci.
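
For example, a typical session using the targets above could look like this:

    # run only the MooreAnalysis part (matched MCDecayTreeTuples + line efficiencies)
    make all_MA

    # run only the Moore+DaVinci part (mDSTs, ntuples, multiplicities, event sizes)
    make all_Moore

    # or simply run the full chain
    make all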

Outputs

All outputs are saved in the output folder. Interesting results are tagged in the repository so that they can be accessed in the future. The previous tags, each with a short description, are listed here:

  • Loose_extra_cuts_2k : Loose extraselection cuts. Ran on 2000 events per MC sample.

  • Average_extra_cuts_2k : Average extraselection cuts, with a maximum of 3-4 extra particles of each type per event. Ran on 2000 events per MC sample.
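
Since these are ordinary git tags, the state corresponding to one of them can be recovered with a plain checkout, for example:

    # list the available tags and check one of them out
    git tag
    git checkout Average_extra_cuts_2k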

Dependencies

In order to run all the scripts provided, other software must be prepared beforehand. Here are instructions to get everything ready.

LHCb repositories

First, the LHCb software is needed; the makefile is prepared to run on my own compiled stack (which contains the latest changes on master). Follow lb-stack-setup to set up your own stack, or check out published versions using lb-dev. Then, change the paths at the beginning of the makefile. The packages needed are:

  • Moore : Repository for HLT2 lines.

  • MooreAnalysis : Helper repository to retrieve efficiencies and/or rates.

  • DaVinci : Repository to produce ntuples from mDSTs.
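
As a rough sketch (the lb-stack-setup documentation is the authoritative reference), once the stack is set up the three projects can be built from the stack directory with something like:

    # build the three LHCb projects inside an lb-stack-setup stack
    # (sketch only; follow the lb-stack-setup README for the up-to-date procedure)
    cd ../stack
    make Moore MooreAnalysis DaVinci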

aalfonso-Analysis-Tools/root

Personal repository with many ROOT scripts, which includes instructions on how to compile them. It is used to retrieve the multiplicity of the extraselections.

Setup

By default, the makefile assumes that the stack and aalfonso-Analysis-Tools/root are under ../stack and ../root. If that is your case, everything is already configured, as long as your stack contains MooreAnalysis, Moore and DaVinci.

Otherwise, you need to configure the first lines of the makefile to match your installation paths:

  • ANALYSIS_TOOLS_ROOT : Path to the aalfonso-Analysis-Tools/root folder.

  • UPGRADE_BANDWIDTH_STUDIES : Path to the Upgrade-bandwidth-studies folder.

  • MOOREANALYSIS : Path to the MooreAnalysis folder.

  • MOORE : Path to the Moore folder.

  • DAVINCI : Path to the DaVinci folder.

  • STACKDIR (optional): Path to your stack folder. If you compiled your own stack, configuring only this variable should accommodate all three LHCb packages.
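
Alternatively, instead of editing the makefile, you can reproduce the default layout with symlinks (a sketch; the paths on the right are placeholders for your actual installations):

    # make the default ../stack and ../root locations point to your installations
    ln -s /path/to/your/stack ../stack
    ln -s /path/to/aalfonso-Analysis-Tools/root ../root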

Snakemake

You can also use snakemake to run the desired targets. Targets share the same names as in the makefile; just substitute make target with snakemake -j 1 target. Additionally, you can specify other useful flags:

  • -n : dry run (show what would be executed without running anything)

  • -p : print the shell commands that are executed

  • -r : print the reason for each executed rule

  • -j N : run up to N jobs in parallel

To enter an environment on lxplus where snakemake is installed, type lb-conda default.
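
For example:

    # dry run of the MooreAnalysis chain, printing the shell commands and the reason for each rule
    snakemake -n -p -r -j 1 all_MA

    # actually run it, with up to 4 jobs in parallel
    snakemake -j 4 all_MA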

The workflow can be found in the report or in the DAGs:

  • All MA: all_MA

  • All Moore: all_Moore
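
If you want to regenerate the report or the DAGs yourself, the standard snakemake options should work (the DAG rendering assumes graphviz is available; output file names are just examples):

    # render the DAG of a target with graphviz
    snakemake -j 1 --dag all_MA | dot -Tpdf > dag_all_MA.pdf

    # build an HTML report of a completed run
    snakemake -j 1 --report report.html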

MC cheatsheet

We have prepared scripts to run over many interesting radiative MC samples. We have given each of them a short key name, which can be related to its event type number in the table below:

Key           Event Type  Decay descriptor
KstG          11102202    {[[B0]nos -> (K*(892)0 -> K+ pi-) gamma]cc, [[B0]os -> (K*(892)~0 -> K- pi+) gamma]cc}
PhiG          13102202    [B_s0 -> (phi(1020) -> K+ K-) gamma]cc
K1G           12203224    [B+ -> (K_1(1270)+ -> (X -> K+ pi- pi+)) gamma]cc
LambdaG       15102307    [Lambda_b0 -> (Lambda0 -> p+ pi-) gamma]cc
XiG           16103330    [Xi_b- -> (Xi- -> (Lambda0 -> p+ pi-) pi-) gamma]cc
OmegaG        16103332    [Xi_b- -> (Omega- -> (Lambda0 -> p+ pi-) K-) gamma]cc
PhiKstG       11104202    [B0 -> (phi(1020) -> K+ K-) (K*(892)0 -> K+ pi-) gamma]cc
PhiPhiG       13104212    [B_s0 -> (phi(1020) -> K+ K-) (phi(1020) -> K+ K-) gamma]cc
PhiKs0G       11104372    [Beauty -> (phi(1020) -> K+ K-) (KS0 -> pi+ pi-) gamma]cc
K1G_KpiPi0G   11202603    [B0 -> (K_1(1270)0 -> (X0 -> K+ pi- pi0)) gamma]cc
PhiPi0G       13102212    [B_s0 -> (phi(1020) -> K+ K-) pi0 gamma]cc
KstIsoG       12203303    [B+ -> (K*+ -> (K_S0 -> pi+ pi-) pi+) gamma]cc
LambdaPG      12103331    [B+ -> (anti-Lambda0 -> p~- pi+) p gamma]cc
PhiKG         12103202    [B+ -> (phi(1020) -> K+ K-) K+ gamma]cc
K1G_Cocktail  12203271    [B+ -> (K_1+ -> (X -> K+ pi- pi+)) gamma]cc
L1520G        15102203    [Lambda_b0 -> (Lambda(1520)0 -> p+ K-) gamma]cc
RhoG          11102222    [B0 -> (rho0 -> pi+ pi-) gamma]cc
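
These keys are what the Snakefile passes to the Gaudi options files through the DECAY variable. As a sketch (adapted from the Snakefile rule shown in the Code Snippets section; $MOOREANALYSIS stands for your MooreAnalysis path), the MooreAnalysis step for the KstG sample boils down to:

    # sketch of the MooreAnalysis step for the KstG key
    $MOOREANALYSIS/run --set=DECAY=KstG gaudirun.py options/Decay_options.py \
        options/2000_Evts.py MooreAnalysis_Scripts/AllLines.py Gaudi_inputs/KstG_input_PFNs.py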

Running on ganga

The ganga workflow is, for obvious reasons, not included in the makefile/snakemake flow. It is however possible to hook it in by placing the .mdst or .root files produced with ganga in their expected locations here, so that (snake)make will continue from that point on. The scripts to run ganga and their relevant output are all located in the ganga_Scripts folder. Step-by-step instructions can be found in the dedicated README.
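
As an illustration only (the expected file names and locations are defined by the makefile/Snakefile rules, so the paths below are assumptions), hooking a ganga output into the flow could look like:

    # illustrative only: check the corresponding rule for the real expected path and name
    mkdir -p output/KstG
    cp /path/to/ganga/job/output/KstG.mdst output/KstG/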

Code Snippets

from __future__ import division, print_function
import argparse
import os

import prettytable
import ROOT

# Each branch has a related branch whose name ends with this suffix
# I don't know what the relationship actually is
WEIRD_SUFFIX = '_R.'
# Ways we can order the table
ORDERING = ['name', 'size', 'ratio']

# SIGH
ROOT.PyConfig.IgnoreCommandLineOptions = True
# Suppress warnings about missing dictionaries for LHCb classes
ROOT.gErrorIgnoreLevel = ROOT.kError


def nbytes(branch):
    """Return a tuple of (uncompressed, compressed) sizes in bytes."""
    ubytes = branch.GetTotBytes("*")
    cbytes = branch.GetZipBytes()
    children = branch.GetListOfBranches()
    for child in children:
        tmp = nbytes(child)
        ubytes += tmp[0]
        cbytes += tmp[1]
    return ubytes, cbytes


def is_any_in(_list, _str):
    for l in _list:
        if l in _str:
            return True
    return False


def event_size(fname, order, pathname, bannednames):
    diskbytes = os.path.getsize(fname)
    f = ROOT.TFile(fname)
    totbytes = 0
    for key in f.GetListOfKeys():
        kname = key.GetName()
        tree = f.Get(kname)
        assert tree.Class().GetName() == 'TTree', tree
        nentries = tree.GetEntries() or 1
        tcbytes = tree.GetZipBytes()
        # Should we show statistics as an average over all events, or in total?
        show_average = kname == 'Event'

        table = prettytable.PrettyTable()
        table.field_names = [
            'Path', 'Uncompressed (B)', 'Compressed (B)', 'Ratio'
        ]
        table.align['Path'] = 'l'
        table.sortby = {
            'name': 'Path',
            'size': 'Uncompressed (B)',
            'ratio': 'Ratio'
        }[order]

        treebytes = 0
        matchbytes = 0
        for branch in tree.GetListOfBranches():
            bname = branch.GetName()
            if bname.endswith(WEIRD_SUFFIX):
                continue
            ubytes, cbytes = nbytes(branch)
            # Include the contribution from the suffixed branch
            # Remove the trailing dot and add the suffix
            branch_r = tree.GetBranch(bname[:-1] + WEIRD_SUFFIX)
            if branch_r:
                ubytes_r, cbytes_r = nbytes(branch_r)
                ubytes += ubytes_r
                cbytes += cbytes_r

            totbytes += cbytes
            treebytes += cbytes
            ratio = ubytes / cbytes

            tespath = bname.replace('_', '/').replace('.', '')
            if pathname in tespath and not is_any_in(bannednames, tespath):
                matchbytes += cbytes
            if show_average:
                ubytes = ubytes / nentries
                cbytes = cbytes / nentries
            if pathname in tespath and not is_any_in(bannednames, tespath):
                table.add_row([
                    tespath,
                    int(ubytes),
                    int(cbytes), '{0:.2f}'.format(ratio)
                ])
        # Check our bookkeeping
        assert tcbytes == treebytes
        units = 'B'
        if show_average:
            matchbytes /= nentries
            units += '/event'
        print('== {0} ({1:.0f} {2}) =='.format(kname, matchbytes, units))
        print(table)

    print('== Total ==')
    print('Disk size: {0:.0f} kB'.format(diskbytes / 1e3))
    print('Trees size: {0:.0f} kB'.format(totbytes / 1e3))


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('file', help='DST file to analyse')
    parser.add_argument(
        '--order',
        choices=ORDERING,
        default='name',
        help='How to order the printed branches')
    parser.add_argument(
        '--path',
        default='',
        help='Specify which branches must be shown and summed')
    parser.add_argument(
        '--banned', default=[], nargs='*', help='Ban branches containing this')
    args = parser.parse_args()
    event_size(args.file, args.order, args.path, args.banned)
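
The script above is the scripts/event_size.py called from the Snakefile below; it takes a (m)DST file plus optional --order, --path and --banned arguments. A typical invocation might be:

    # print per-branch sizes, largest branches first, keeping only paths containing "HHGamma"
    # (the input file name is just a placeholder)
    python scripts/event_size.py my_sample.mdst --order size --path HHGamma
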
shell:
    "lb-dirac dirac-bookkeeping-get-files --Prod={params.prodID} --OptionsFile={output}"
(From line 100 of master/Snakefile)

shell:
    "lb-dirac dirac-bookkeeping-genXMLCatalog --Options={input} --NewOptions={output}"
(From line 110 of master/Snakefile)

run:
    shell('mkdir -p output/{wildcards.MC}')
    ma_script = f'{MOOREANALYSIS}/run  --set=DECAY={wildcards.MC} gaudirun.py options/Decay_options.py'
    options   = f' options/2000_Evts.py MooreAnalysis_Scripts/AllLines.py Gaudi_inputs/{wildcards.MC}_input_PFNs.py'
    tee       = f' | tee output/{wildcards.MC}/AllLines_MA.out'
    try:
        shell("set +e")
        shell(ma_script+options+tee)
        shell("set -e")
    except:
        print("except: errors during MA_tuple")
(From line 135 of master/Snakefile)

shell:
    'mkdir -p output/{wildcards.MC} \n'
    '{MOOREANALYSIS}/run {MOOREANALYSIS}/HltEfficiencyChecker/scripts/hlt_line_efficiencies.py'
    ' {input} --level Hlt2 --reconstructible-children={params.reconstructibles}'
    ' | tee {output}'
(From line 160 of master/Snakefile)

shell:
    '{ANALYSIS_TOOLS_ROOT}/CutCorrelation.out {input} {output} MCDecayTreeTuple/MCDecayTree'
(From line 174 of master/Snakefile)

shell:
    "mkdir -p output/{wildcards.MC} \n"
    "{MOORE}/run --set=DECAY={wildcards.MC} gaudirun.py options/Decay_options.py "
    "options/2000_Evts.py Moore_Scripts/AllLines.py Gaudi_inputs/{wildcards.MC}_input_PFNs.py "
    "| tee output/{wildcards.MC}/AllLines_Moore.out \n"
    "rm -f test_catalog*.xml"
(From line 197 of master/Snakefile)

shell:
    "python scripts/event_size.py {input} | tee {output}"
(From line 213 of master/Snakefile)

shell:
    "{DAVINCI}/run --set=DECAY={wildcards.MC} gaudirun.py DaVinci_Scripts/Decay_options.py DaVinci_Scripts/AllLines.py"
(From line 224 of master/Snakefile)

shell:
    '{ANALYSIS_TOOLS_ROOT}/Multiplicity_Extrasel.out "{input.tuples}" '
    ' Cuts/{wildcards.extra}_cuts.txt {output} {wildcards.line}Tuple/DecayTree'
    ' {wildcards.line}_{wildcards.extra}Tuple/DecayTree '
(From line 240 of master/Snakefile)