DOI: 10.1051/epjconf/201921406035
Open Access: Gold
This work has “Gold” OA status. This means it is published in an Open Access journal that is indexed by the DOAJ.

Pandas DataFrames for a FAST binned analysis at CMS

Benjamin Krikler, Olivier Davignon, Lukasz Kreczko, Jacob Linacre, Emmanuel Olaiya, Tai Sakuma

Frame (networking)
Computer science
Documentation
2019
Binned data frames are a generalisation of multi-dimensional histograms, represented in a tabular format with one category per row containing the labels, bin contents, uncertainties and so on. Pandas is an industry-standard tool that provides a data frame implementation complete with routines for data frame manipulation, persistency, visualisation, and easy access to “big data” scientific libraries and machine learning tools. FAST (the Faster Analysis Software Taskforce) has developed a generic approach for typical binned HEP analyses, driving the summary of ROOT Trees to multiple binned DataFrames with a YAML-based analysis description. Using Continuous Integration to run subsets of the analysis, we can monitor and test changes to the analysis itself, and deploy documentation automatically. This report describes this approach using examples from a public CMS tutorial and details the benefits over traditional methods.
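As an illustration of the binned data frame idea described in the abstract, the following is a minimal sketch in plain pandas. It is not the FAST tooling itself: the column names (`dataset`, `pt`, `weight`, `pt_low`), the bin edges, and the toy event values are all assumptions made for this example. Each output row is one category (a dataset and a bin) carrying its label, contents, and uncertainty.

```python
import numpy as np
import pandas as pd

# Toy event-level table; all names and values are hypothetical.
events = pd.DataFrame({
    "dataset": ["data", "data", "data", "mc", "mc", "mc", "mc"],
    "pt": [12.0, 15.0, 35.0, 18.0, 42.0, 44.0, 55.0],
    "weight": [1.0, 1.0, 1.0, 0.5, 2.0, 1.0, 0.5],
})

# Assumed bin edges; each bin is labelled by its lower edge.
pt_edges = [0.0, 20.0, 50.0, 100.0]
events["pt_low"] = pd.cut(
    events["pt"], bins=pt_edges, right=False, labels=pt_edges[:-1]
).astype(float)
events["weight_sq"] = events["weight"] ** 2

# One row per (dataset, bin): raw count, sum of weights, sum of squared weights.
binned = (
    events.groupby(["dataset", "pt_low"])
    .agg(
        n=("weight", "count"),
        sumw=("weight", "sum"),
        sumw2=("weight_sq", "sum"),
    )
    .reset_index()
)

# Statistical uncertainty on the weighted bin content.
binned["err"] = np.sqrt(binned["sumw2"])

print(binned)
```

Because the result is an ordinary DataFrame, the usual pandas routines apply to it directly, for example `binned.to_csv(...)` for persistency or `binned.pivot(index="pt_low", columns="dataset", values="sumw")` to compare datasets bin by bin. Bins with no events are simply absent from this table.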
“Pandas DataFrames for a FAST binned analysis at CMS” is a paper by Benjamin Krikler, Olivier Davignon, Lukasz Kreczko, Jacob Linacre, Emmanuel Olaiya, and Tai Sakuma, published in 2019. It has an Open Access status of “gold”.