
Marcel Rieger

Here are all the papers by Marcel Rieger that you can download and read on OA.mg.

DOI: 10.21468/scipostphys.7.1.014
2019
Cited 132 times
The Machine Learning landscape of top taggers
Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. Unlike most established methods, they rely on low-level input, for instance calorimeter output. While their network architectures are vastly different, their performance is comparatively similar. In general, we find that these new approaches are extremely powerful and great fun.
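A common figure of merit in such tagger comparisons is the background rejection at a fixed signal efficiency. The following minimal Python sketch, using made-up classifier scores rather than results from the paper, illustrates how this number is obtained from per-jet tagger outputs.

```python
import numpy as np

def background_rejection(sig_scores, bkg_scores, signal_eff=0.3):
    """Background rejection 1/eps_B at a fixed signal efficiency.

    Find the score threshold that keeps `signal_eff` of the signal jets
    and report the inverse of the background efficiency at that threshold.
    """
    threshold = np.quantile(sig_scores, 1.0 - signal_eff)
    eps_b = np.mean(bkg_scores >= threshold)
    return 1.0 / eps_b if eps_b > 0 else np.inf

# toy usage with illustrative classifier outputs
rng = np.random.default_rng(2)
sig = rng.normal(1.0, 1.0, 10000)   # "top jet" scores
bkg = rng.normal(-1.0, 1.0, 10000)  # "QCD jet" scores
print(round(background_rejection(sig, bkg, 0.3), 1))
```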
DOI: 10.3389/fdata.2020.598927
2021
Cited 41 times
Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics
Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1 $\mu\mathrm{s}$ on an FPGA. To do so, we consider a representative task associated with particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the $\mathtt{hls4ml}$ library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage.
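The following toy numpy sketch illustrates the distance-weighted aggregation idea behind such graph layers: each node collects features from its nearest neighbors in a coordinate space, weighted by a function that decays with distance. It illustrates the concept only; it is not the GarNet/GravNet implementation or the hls4ml firmware path described in the paper.

```python
import numpy as np

def distance_weighted_aggregation(features, coords, k=4):
    """Aggregate neighbor features with weights that decay with distance.

    Each node gathers features from its k nearest neighbors in a
    coordinate space, weighted by exp(-d^2).
    """
    n = features.shape[0]
    # pairwise squared distances in the coordinate space
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # exclude self-distance
    aggregated = np.zeros_like(features)
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        w = np.exp(-d2[i, nbrs])  # weights decay with distance
        aggregated[i] = (w[:, None] * features[nbrs]).sum(0) / (w.sum() + 1e-9)
    return aggregated

# toy usage: 8 "sensor hits" with 3 features and 2D coordinates
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 3))
coords = rng.normal(size=(8, 2))
print(distance_weighted_aggregation(feats, coords).shape)  # (8, 3)
```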
DOI: 10.1088/1748-0221/14/06/p06006
2019
Cited 44 times
Lorentz Boost Networks: autonomous physics-inspired feature engineering
We present a two-stage neural network architecture that enables a fully autonomous and comprehensive characterization of collision events by exclusively exploiting the four-momenta of final-state particles. We refer to the first stage of the architecture as the Lorentz Boost Network (LBN). The LBN allows the creation of particle combinations representing rest frames. It also enables the formation of further composite particles, which are then transformed into said rest frames by Lorentz transformation. The properties of the composite, transformed particles are compiled in the form of characteristic variables that serve as input for a subsequent network. This second network has to be configured for a specific analysis task, such as the separation of signal and background events. Using the example of the classification of tt̄H and tt̄+bb̄ events, we compare the separation power of the LBN approach with that of domain-unspecific deep neural networks (DNN). We observe leading performance with the LBN, even though we provide the DNNs with extensive additional input information beyond the particle four-momenta. Furthermore, we demonstrate that the LBN forms physically meaningful particle combinations and autonomously generates suitable characteristic variables.
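To make the two LBN building blocks concrete, here is a hedged numpy sketch of forming a composite particle from final-state four-momenta and boosting a four-vector into the composite's rest frame. The example values are illustrative, and the sketch does not reproduce the trainable combination weights of the actual network.

```python
import numpy as np

def boost_to_rest_frame(p4, system_p4):
    """Boost a four-vector p4 = (E, px, py, pz) into the rest frame of system_p4."""
    E_s, P_s = system_p4[0], system_p4[1:]
    m = np.sqrt(max(E_s**2 - P_s @ P_s, 0.0))  # invariant mass of the system
    beta = P_s / E_s
    b2 = beta @ beta
    if b2 == 0.0:
        return p4.copy()
    gamma = E_s / m
    E, p = p4[0], p4[1:]
    bp = beta @ p
    E_star = gamma * (E - bp)
    p_star = p + ((gamma - 1.0) * bp / b2 - gamma * E) * beta
    return np.concatenate([[E_star], p_star])

# toy usage: combine two final-state particles and view one in the composite rest frame
p1 = np.array([50.0, 10.0, 20.0, 30.0])
p2 = np.array([60.0, -5.0, 15.0, -25.0])
composite = p1 + p2
print(boost_to_rest_frame(p1, composite))
```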
DOI: 10.1088/1748-0221/12/08/p08020
2017
Cited 20 times
Jet-parton assignment in tt̄H events using deep learning
The direct measurement of the top quark-Higgs coupling is one of the important questions in understanding the Higgs boson. The coupling can be obtained through measurement of the top quark pair-associated Higgs boson production cross-section. Of the multiple challenges arising in this cross-section measurement, we investigate the reconstruction of the partons originating from the hard scattering process using the measured jets in simulated tt̄H events. The task corresponds to an assignment challenge of m objects (jets) to n other objects (partons), where m ≥ n. We compare several methods, with emphasis on a concept based on deep learning techniques, which yields the best results with more than 50% of correct jet-parton assignments.
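As a purely illustrative baseline for the m-to-n assignment problem described above, and not the deep-learning method of the paper, one can enumerate all injective jet-parton assignments and keep the highest-scoring one, where the score could come from any matching criterion or trained network:

```python
import itertools
import numpy as np

def best_assignment(score, n_jets, n_partons):
    """Enumerate all injective assignments of n_partons to n_jets and return
    the highest-scoring one. `score(jet_indices)` returns a number for a
    candidate assignment (e.g. a network output or a chi^2-like match)."""
    best, best_perm = -np.inf, None
    for perm in itertools.permutations(range(n_jets), n_partons):
        s = score(perm)
        if s > best:
            best, best_perm = s, perm
    return best_perm, best

# toy usage: 6 jets, 4 partons, score = sum of a random jet-parton affinity matrix
rng = np.random.default_rng(1)
affinity = rng.random((6, 4))  # rows: jets, columns: partons
perm, s = best_assignment(lambda p: sum(affinity[j, k] for k, j in enumerate(p)), 6, 4)
print(perm, round(s, 3))
```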
DOI: 10.1051/epjconf/202429505012
2024
End-to-End Analysis Automation over Distributed Resources with Luigi Analysis Workflows
In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual, complex workflows manually, frequently involving job submission in several stages and interaction with distributed storage systems by hand. This process is not only time-consuming and error-prone, but also leads to undocumented relations between particular workloads, rendering the steering of an analysis a serious challenge. This article presents the Luigi Analysis Workflow (Law) Python package which is based on the open-source pipelining tool Luigi, originally developed by Spotify. It establishes a generic design pattern for analyses of arbitrary scale and complexity, and shifts the focus from executing to defining the analysis logic. Law provides the building blocks to seamlessly integrate with interchangeable remote resources without, however, limiting itself to a specific choice of infrastructure. In particular, it introduces the concept of complete separation between analysis algorithms on the one hand, and run locations, storage locations, and software environments on the other hand. To cope with the sophisticated demands of end-to-end HEP analyses, Law supports job execution on WLCG infrastructure (ARC, gLite, CMS-CRAB) as well as on local computing clusters (HTCondor, Slurm, LSF), remote file access via various protocols using the Grid File Access Library (GFAL2), and an environment sandboxing mechanism with support for sub-shells and virtual environments, as well as Docker and Singularity containers. Moreover, the novel approach ultimately aims for analysis preservation out-of-the-box. Law is developed open-source and independent of any experiment or the language of the executed code, and its user base has grown steadily over the past years.
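Law builds on Luigi's task model, in which each workload declares its dependencies and outputs and the scheduler resolves the resulting graph. The sketch below uses plain Luigi with hypothetical task names to illustrate this pattern; the actual Law package adds remote targets, job submission, and sandboxing on top of it.

```python
import json
import luigi

class Selection(luigi.Task):
    """Select events from an input dataset (stand-in for a real selection step)."""
    dataset = luigi.Parameter(default="ttH")

    def output(self):
        return luigi.LocalTarget(f"selection_{self.dataset}.json")

    def run(self):
        with self.output().open("w") as f:
            json.dump({"dataset": self.dataset, "n_selected": 1234}, f)

class Histogram(luigi.Task):
    """Depends on Selection; Luigi resolves and runs the dependency first."""
    dataset = luigi.Parameter(default="ttH")

    def requires(self):
        return Selection(dataset=self.dataset)

    def output(self):
        return luigi.LocalTarget(f"histogram_{self.dataset}.json")

    def run(self):
        with self.input().open() as f:
            selected = json.load(f)
        with self.output().open("w") as f:
            json.dump({"entries": selected["n_selected"]}, f)

if __name__ == "__main__":
    # a single command triggers the whole dependency chain
    luigi.build([Histogram(dataset="ttH")], local_scheduler=True)
```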
DOI: 10.13031/2013.28768
1992
Cited 32 times
Plant Wilt Detection by Computer-vision Tracking of Leaf Tips
The vertical movement of the leaf tips of four plants was tracked simultaneously by a computer vision system. Fully expanded leaves of tomato plants were found to have linear vertical motions in response to both water stress level and carbon dioxide assimilation rate. Growing leaves had complex motions which were less useful for monitoring water stress level. The computer vision system detected the onset of wilt before physiological injury took place and triggered irrigation at predetermined leaf tip deflections.
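The triggering logic described above can be summarized in a few lines: monitor the vertical position of a leaf tip and fire once it has drooped by more than a preset deflection. The sketch below is a hypothetical reconstruction of that decision rule, not the original 1992 system.

```python
def check_wilt(tip_y_history, deflection_threshold_px=15):
    """Return True if the leaf tip has dropped below its reference position
    by more than the threshold (image y-axis pointing downwards)."""
    reference = min(tip_y_history)  # highest observed tip position
    current = tip_y_history[-1]
    return (current - reference) > deflection_threshold_px

# toy usage: tip positions in pixels sampled over time
positions = [100, 101, 103, 108, 118]
if check_wilt(positions):
    print("deflection exceeded threshold -> trigger irrigation")
```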
DOI: 10.1088/1748-0221/7/08/t08005
2012
Cited 14 times
A development environment for visual physics analysis
The Visual Physics Analysis (VISPA) project integrates different aspects of physics analyses into a graphical development environment. It addresses the typical development cycle of (re-)designing, executing and verifying an analysis. The project provides an extendable plug-in mechanism and includes plug-ins for designing the analysis flow, for running the analysis on batch systems, and for browsing the data content. The corresponding plug-ins are based on an object-oriented toolkit for modular data analysis. We introduce the main concepts of the project, describe the technical realization and demonstrate the functionality in example applications.
DOI: 10.1088/0143-0807/35/3/035018
2014
Cited 9 times
A field study of data analysis exercises in a bachelor physics course using the internet platform VISPA
Bachelor physics lectures on 'Particle Physics and Astrophysics' at RWTH Aachen University were recently complemented by exercises on data analysis and data interpretation. The students performed these exercises using the internet platform VISPA, which provides a development environment for physics data analyses. We describe the platform and its application within the physics course, and present the results of a student survey. The students' acceptance of the learning project was positive. The level of acceptance was related to their individual preference for learning with a computer. Furthermore, students with good programming skills favour working individually, while students who rate their own programming abilities as low favour working in teams. The students appreciated approaching actual research through the data analysis tasks.
DOI: 10.1088/1742-6596/898/7/072047
2017
Cited 6 times
Design and Execution of make-like, distributed Analyses based on Spotify’s Pipelining Package Luigi
In high-energy particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual workflows manually, which is time-consuming and often leads to undocumented relations between particular workloads. We present a generic analysis design pattern that copes with the sophisticated demands of end-to-end HEP analyses and provides a make-like execution system. It is based on the open-source pipelining package Luigi which was developed at Spotify and enables the definition of arbitrary workloads, so-called Tasks, and the dependencies between them in a lightweight and scalable structure. Further features are multi-user support, automated dependency resolution and error handling, central scheduling, and status visualization on the web. In addition to already built-in features for remote jobs and file systems like Hadoop and HDFS, we added support for WLCG infrastructure such as LSF and CREAM job submission, as well as remote file access through the Grid File Access Library. Furthermore, we implemented automated resubmission functionality, software sandboxing, and a command line interface with auto-completion for a convenient working environment. For the implementation of a $t\bar{t}H$ cross section measurement, we created a generic Python interface that provides programmatic access to all external information such as datasets, physics processes, statistical models, and additional files and values. In summary, the setup enables the execution of the entire analysis in a parallelized and distributed fashion with a single command.
DOI: 10.14361/9783839402863
2005
Cited 10 times
Grenzgänge
Recently, educational researchers have increasingly sought engagement with contemporary literary texts. This shows that opening the pedagogical discourse to contemporary novels not only stimulates its self-reflection, but also offers opportunities to gain new insights into its own subject matter. The experimental readings of contemporary literature therefore take the form of a search for traces: what do the novels of Imre Kertész, Zeruya Shalev, Uwe Timm, Paula Fox, and others reveal about present-day forms of childhood and youth, of upbringing, education, and socialization?
DOI: 10.1088/1742-6596/523/1/012021
2014
Cited 5 times
A Web-Based Development Environment for Collaborative Data Analysis
Visual Physics Analysis (VISPA) is a web-based development environment addressing high energy and astroparticle physics. It covers the entire analysis spectrum from the design and validation phase to the execution of analyses and the visualization of results. VISPA provides a graphical steering of the analysis flow, which consists of self-written, re-usable Python and C++ modules for more demanding tasks. All common operating systems are supported since a standard internet browser is the only software requirement for users. Even access via mobile and touch-compatible devices is possible. In this contribution, we present the most recent developments of our web application concerning technical, state-of-the-art approaches as well as practical experiences. One of the key features is the use of workspaces, i.e. user-configurable connections to remote machines supplying resources and local file access. Thereby, workspaces enable the management of data, computing resources (e.g. remote clusters or computing grids), and additional software either centralized or individually. We further report on the results of an application with more than 100 third-year students using VISPA for their regular particle physics exercises during the winter term 2012/13. Besides the ambition to support and simplify the development cycle of physics analyses, new use cases such as fast, location-independent status queries, the validation of results, and the ability to share analyses within worldwide collaborations with a single click become conceivable.
DOI: 10.2172/1882567
2022
Data Science and Machine Learning in Education
The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data sets, and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data has inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit greatly from widely available materials for use in education, training, and workforce development. They are also contributing to these materials and providing software to DS/ML-related fields. Increasingly, physics departments are offering courses at the intersection of DS, ML, and physics, often using curricula developed by HEP researchers and involving open software and data used in HEP. In this white paper, we explore synergies between HEP research and DS/ML education, discuss opportunities and challenges at this intersection, and propose community activities that will be mutually beneficial.
DOI: 10.1088/1742-6596/1525/1/012094
2020
Cited 3 times
Adversarial Neural Network-based data-simulation corrections for jet-tagging at CMS
Abstract Variable-dependent scale factors are commonly used in HEP to improve the shape agreement of data and simulation. The choice of the underlying model is of great importance, but often requires a lot of manual tuning, e.g. of bin sizes or fitted functions. This can be alleviated through the use of neural networks and their inherent powerful data modeling capabilities. We present a novel and generalized method for producing scale factors using an adversarial neural network. This method is investigated in the context of the bottom-quark jet-tagging algorithms within the CMS experiment. The primary network uses the jet variables as inputs to derive the scale factor for a single jet. It is trained through the use of a second network, the adversary, which aims to differentiate between the data and rescaled simulation.
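A minimal PyTorch sketch of the adversarial reweighting idea is given below: a primary network predicts per-jet scale factors for simulation, and an adversary is trained to separate data from the reweighted simulation, while the primary is updated to defeat it. The architectures, toy data, and mean-weight constraint are illustrative assumptions, not the CMS implementation.

```python
import torch
import torch.nn as nn

# primary: jet variables -> positive per-jet scale factor; adversary: data vs. sim classifier
primary = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1), nn.Softplus())
adversary = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_p = torch.optim.Adam(primary.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCELoss(reduction="none")

# toy "jet variables": simulation slightly shifted with respect to data
data = torch.randn(2048, 4)
sim = torch.randn(2048, 4) + 0.3

for step in range(200):
    # 1) adversary step: classify data (label 1) vs scale-factor-weighted sim (label 0)
    sf = primary(sim).squeeze(1).detach()
    pred_d = adversary(data).squeeze(1)
    pred_s = adversary(sim).squeeze(1)
    loss_a = bce(pred_d, torch.ones_like(pred_d)).mean() + \
             (sf * bce(pred_s, torch.zeros_like(pred_s))).mean()
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()

    # 2) primary step: adjust scale factors so the adversary can no longer separate
    #    reweighted simulation from data; the mean-weight penalty is an illustrative
    #    constraint to keep the overall normalization near unity
    sf = primary(sim).squeeze(1)
    pred_s = adversary(sim).squeeze(1)
    loss_p = -(sf * bce(pred_s, torch.zeros_like(pred_s))).mean() + (sf.mean() - 1.0).pow(2)
    opt_p.zero_grad(); loss_p.backward(); opt_p.step()
```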
DOI: 10.1088/1742-6596/664/3/032031
2015
The VISPA internet platform for outreach, education and scientific research in various experiments
VISPA provides a graphical front-end to computing infrastructures, giving its users all the functionality needed for working conditions comparable to a personal computer. It is a framework that can be extended with custom applications to support individual needs, e.g. graphical interfaces for experiment-specific software. By design, VISPA serves as a multipurpose platform for many disciplines and experiments, as demonstrated by the following use cases: a GUI to the analysis framework OFFLINE of the Pierre Auger collaboration, submission and monitoring of computing jobs, university teaching of hundreds of students, and outreach activities, especially in CERN's open data initiative. Serving heterogeneous user groups and applications gave us a lot of experience. This helps us in maturing the system, i.e. improving the robustness and responsiveness, and the interplay of the components. Among the lessons learned are the choice of a file system, the implementation of websockets, efficient load balancing, and the fine-tuning of existing technologies like RPC over SSH. We present in detail the improved server setup and report on the performance, the user acceptance, and the realized applications of the system.
DOI: 10.1016/j.nuclphysbps.2015.09.466
2016
The VISPA Internet Platform for Students
The VISPA internet platform enables users to remotely run Python scripts and view resulting plots or inspect their output data. With a standard web browser as the only user requirement on the client side, the system becomes suitable for blended learning approaches for university physics students. VISPA was used in two consecutive years, each by approx. 100 third-year physics students at RWTH Aachen University, for their homework assignments. For example, in one exercise students gained a deeper understanding of Einstein's mass-energy relation by analyzing experimental data of electron-positron pairs revealing J/Ψ and Z particles. Because the students were free to choose their working hours, only few users accessed the platform simultaneously. The positive feedback from students and the stability of the platform led to further development of the concept. This year, students accessed the platform in parallel while they analyzed data recorded by experiments demonstrated live in the lecture hall. The platform is based on experience in the development of professional analysis tools. It combines core technologies from previous projects: an object-oriented C++ library, a modular data-driven analysis flow, and visual analysis steering. We present the platform and discuss its benefits in the context of teaching, based on surveys that are conducted each semester.
DOI: 10.1088/1742-6596/898/7/072045
2017
Experiment Software and Projects on the Web with VISPA
The Visual Physics Analysis (VISPA) project defines a toolbox for accessing software via the web. It is based on the latest web technologies and provides a powerful extension mechanism that enables interfacing a wide range of applications. Beyond basic applications such as a code editor, a file browser, or a terminal, it meets the demands of sophisticated experiment-specific use cases that focus on physics data analyses and typically require a high degree of interactivity. As an example, we developed a data inspector that is capable of browsing interactively through the event content of several data formats, e.g., MiniAOD, which is utilized by the CMS collaboration. The VISPA extension mechanism can also be used to embed external web-based applications that benefit from dynamic allocation of user-defined computing resources via SSH. For example, by wrapping the JSROOT project, ROOT files located on any remote machine can be inspected directly through a VISPA server instance. We introduced domains that combine groups of users and role-based permissions. Thereby, tailored projects are enabled, e.g. for teaching, where access to students' homework is restricted to a team of tutors, or for experiment-specific data that may only be accessible to members of the collaboration. We present the extension mechanism, including corresponding applications, and give an outlook on the new permission system.
DOI: 10.1088/1742-6596/1085/4/042044
2018
The VISPA internet-platform in deep learning applications
Latest developments in many research fields indicate that deep learning methods have the potential to significantly improve physics analyses. They not only enhance the performance of existing algorithms, but also pave the way for new measurement techniques that are not possible with conventional methods. As the computation is highly resource-intensive, both dedicated hardware and software are required to obtain results in a reasonable time, which poses a substantial entry barrier. We provide direct access to this technology after a revision of the internet platform VISPA to serve the needs of researchers as well as students. VISPA equips its users with working conditions on remote computing resources comparable to a local computer through a standard web browser. To provide the required hardware resources for deep learning applications, we extend the CPU infrastructure with a GPU cluster consisting of 10 nodes with two GeForce GTX 1080 cards each. Direct access through VISPA, preinstalled analysis software, and a workload management system allowed us, on the one hand, to support more than 100 participants in a workshop on deep learning and in corresponding university classes, and, on the other hand, to achieve significant progress in particle and astroparticle research. We present the setup of the system and report on the performance and achievements in the above-mentioned use cases.
DOI: 10.1051/epjconf/201921405021
2019
Evolution of the VISPA-project
VISPA (Visual Physics Analysis) is a web platform that enables users to work on any secure shell (SSH) reachable resource using just their web browser. It is used successfully in research and education for HEP data analysis. The emerging JupyterLab is an ideal choice for a comprehensive, browser-based, and extensible work environment, and we seek to unify it with the efforts of the VISPA project. The primary objective is to provide the user with the freedom to access any external resources at their disposal, while maintaining a smooth integration of preconfigured ones, including their access permissions. Additionally, specialized HEP tools, such as native-format data browsers (ROOT, PXL), are being migrated from VISPA to JupyterLab extensions as well. We present these concepts and their implementation progress.
DOI: 10.1088/1742-6596/513/6/062034
2014
A Browser-Based Multi-User Working Environment for Physicists
Many programs in experimental particle physics do not yet have a graphical interface, or impose strong platform and software requirements. With the most recent development of the VISPA project, we provide graphical interfaces to existing software programs and access to multiple computing clusters through standard web browsers. The scalable client-server system allows analyses to be performed in sizable teams, and relieves the individual physicist from installing and maintaining a software environment. The VISPA graphical interfaces are implemented in HTML, JavaScript, and extensions to the Python webserver. The webserver uses SSH and RPC to access user data, code, and processes on remote sites. As example applications, we present graphical interfaces for steering the reconstruction framework OFFLINE of the Pierre Auger experiment and the analysis development toolkit PXL. The browser-based VISPA system was field-tested in the biweekly homework of a third-year physics course by more than 100 students. We discuss the system deployment and the evaluation by the students.
DOI: 10.1088/1742-6596/368/1/012039
2012
Visual physics analysis – from desktop to physics analysis at your fingertips
Visual Physics Analysis (VISPA) is an analysis environment with applications in high energy and astroparticle physics. Based on a data-flow-driven paradigm, it allows users to combine graphical steering with self-written C++ and Python modules. This contribution presents new concepts integrated in VISPA: layers, convenient analysis execution, and web-based physics analysis. While the convenient execution offers full flexibility to vary settings for the execution phase of an analysis, layers allow the creation of different views of the analysis already during its design phase. Thus, one application of layers is to define different stages of an analysis (e.g. event selection and statistical analysis). However, there are other use cases, such as independently optimizing settings for different types of input data in order to guide all data through the same analysis flow. The new execution feature makes job submission to local clusters as well as the LHC Computing Grid possible directly from VISPA. Web-based physics analysis is realized in the VISPA@Web project, which represents a whole new way to design and execute analyses via a standard web browser.
DOI: 10.1088/1742-6596/608/1/012027
2015
VISPA: Direct Access and Execution of Data Analyses for Collaborations
The VISPA project provides a graphical frontend to computing infrastructures. Currently, the focus of the project is to provide an online environment for the development of data analyses. Access is provided through a web GUI, which has all the functionality needed for working conditions comparable to a personal computer. This includes a new preference system as well as user-configurable shortcut keys. As all relevant software, data, and computing resources are supplied on a common remote infrastructure, the VISPA web framework offers a new way of collaborative work where analyses of colleagues can be reviewed and executed with just one click. Furthermore, VISPA can be extended to the specific needs of an experiment or other scientific use cases. This is presented in the form of a new GUI to the analysis framework Offline of the Pierre Auger collaboration.
DOI: 10.1088/1742-6596/762/1/012008
2016
Bringing Experiment Software to the Web with VISPA
The Visual Physics Analysis (VISPA) software is a toolbox for accessing analysis software via the web. It is based on the latest web technologies and provides a powerful extension mechanism that enables interfacing a wide range of applications. It especially meets the demands of sophisticated experiment-specific use cases that focus on physics data analyses and typically require a high degree of interactivity. As an example, we developed a data inspector which is capable of browsing interactively through the event content of several data formats, e.g., MiniAOD, which is utilized by the CMS collaboration. Visual control of a chain of user analysis modules, as well as visualization of user-specific workflows, supports users in rather complex analyses at the level of tt̄H cross section measurements. The VISPA extension mechanism is also used to embed external web-based applications which benefit from dynamic allocation of user-defined computing resources via SSH. For example, by wrapping the JSROOT project, ROOT files located on any remote machine can be inspected directly through a VISPA server instance. We present the techniques of the extension mechanism and corresponding applications.
DOI: 10.1088/1742-6596/396/5/052015
2012
A Server-Client-Based Graphical Development Environment for Physics Analyses (VISPA)
The Visual Physics Analysis (VISPA) project provides a graphical development environment for data analysis. It addresses the typical development cycle of (re-)designing, executing, and verifying an analysis. We present the new server-client-based web application of the VISPA project to perform physics analyses via a standard internet browser. This enables individual scientists to work with a large variety of devices including touch screens, and teams of scientists to share, develop, and execute analyses on a server via the web interface.
DOI: 10.1088/1742-6596/898/9/092043
2017
Workflow Management for Complex HEP Analyses
We present the novel Analysis Workflow Management (AWM) that provides users with the tools and competences of professional large-scale workflow systems, e.g. Apache's Airavata [1]. The approach presents a paradigm shift from executing parts of the analysis to defining the analysis. Within AWM, an analysis consists of steps. For example, a step defines running a certain executable for multiple files of an input data collection. Each call to the executable for one of those input files can be submitted to the desired run location, which could be the local computer or a remote batch system. An integrated software manager enables automated user installation of dependencies in the working directory at the run location. Each execution of a step item creates one report for bookkeeping purposes containing error codes and output data or file references. Required files, e.g. created by previous steps, are retrieved automatically. Since data storage and run locations are exchangeable from the steps' perspective, computing resources can be used opportunistically. A visualization of the workflow as a graph of the steps in the web browser provides a high-level view of the analysis. The workflow system is developed and tested alongside a tt̄bb̄ cross section measurement where, for instance, the event selection is represented by one step and a Bayesian statistical inference is performed by another. The clear interface and dependencies between steps enable a make-like execution of the whole analysis.
DOI: 10.48550/arxiv.1706.00955
2017
Design and Execution of make-like, distributed Analyses based on Spotify's Pipelining Package Luigi
In high-energy particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual workflows manually which is time-consuming and often leads to undocumented relations between particular workloads. We present a generic analysis design pattern that copes with the sophisticated demands of end-to-end HEP analyses and provides a make-like execution system. It is based on the open-source pipelining package Luigi which was developed at Spotify and enables the definition of arbitrary workloads, so-called Tasks, and the dependencies between them in a lightweight and scalable structure. Further features are multi-user support, automated dependency resolution and error handling, central scheduling, and status visualization in the web. In addition to already built-in features for remote jobs and file systems like Hadoop and HDFS, we added support for WLCG infrastructure such as LSF and CREAM job submission, as well as remote file access through the Grid File Access Library. Furthermore, we implemented automated resubmission functionality, software sandboxing, and a command line interface with auto-completion for a convenient working environment. For the implementation of a $t\bar{t}H$ cross section measurement, we created a generic Python interface that provides programmatic access to all external information such as datasets, physics processes, statistical models, and additional files and values. In summary, the setup enables the execution of the entire analysis in a parallelized and distributed fashion with a single command.
DOI: 10.48550/arxiv.2207.09060
2022
Data Science and Machine Learning in Education
The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data sets, and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data has inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit greatly from widely available materials for use in education, training, and workforce development. They are also contributing to these materials and providing software to DS/ML-related fields. Increasingly, physics departments are offering courses at the intersection of DS, ML, and physics, often using curricula developed by HEP researchers and involving open software and data used in HEP. In this white paper, we explore synergies between HEP research and DS/ML education, discuss opportunities and challenges at this intersection, and propose community activities that will be mutually beneficial.
2008
Accuracy of CT-guided computer-assisted brain punctures
DOI: 10.1088/1742-6596/1085/3/032002
2018
Design and Execution of make-like Distributed Analyses
In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual workflows manually, which is time-consuming and often leads to undocumented relations between particular workloads. We present a generic analysis design pattern that copes with the sophisticated demands of end-to-end HEP analyses. The approach presents a paradigm shift from executing parts of the analysis to defining the analysis. The clear interface and dependencies between individual workloads then enables a make-like execution.
DOI: 10.1088/1742-6596/1525/1/012035
2020
Design Pattern for Analysis Automation on Distributed Resources using Luigi Analysis Workflows
Abstract In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo event generation. However, physicists performing data analyses are usually required to steer their individual workflows manually, which is time-consuming and often leads to undocumented relations between particular workloads. We present the Luigi Analysis Workflows (Law) Python package, which is based on the open-source pipelining tool Luigi, originally developed by Spotify. It establishes a generic design pattern for analyses of arbitrary scale and complexity, and shifts the focus from executing to defining the analysis logic. Law provides the building blocks to seamlessly integrate interchangeable remote resources without, however, limiting itself to a specific choice of infrastructure. In particular, it encourages and enables the separation of analysis algorithms on the one hand, and run locations, storage locations, and software environments on the other hand. To cope with the sophisticated demands of end-to-end HEP analyses, Law supports job execution on WLCG infrastructure (ARC, gLite) as well as on local computing clusters (HTCondor, LSF), remote file access via most common protocols through the GFAL2 library, and an environment sandboxing mechanism with support for Docker and Singularity containers. Moreover, the novel approach ultimately aims for analysis preservation out-of-the-box. Law is entirely experiment independent and developed open-source.
DOI: 10.1088/1742-6596/1525/1/012107
2020
Physics inspired feature engineering with Lorentz Boost Networks
Abstract We present a neural network architecture designed to autonomously create characteristic features of high energy physics collision events from basic four-vector information. It consists of two stages, the first of which we call the Lorentz Boost Network (LBN). The LBN creates composite particles and rest frames from the combination of final state particles, and then boosts said particles into their corresponding rest frames. From these boosted particles, characteristic features are created and used by the second network stage to solve a given physics problem. We apply our model to the task of separating top-quark pair associated Higgs boson events from a tt̄ background, and observe improved performance compared to using domain unspecific deep neural networks. We also investigate the learned combinations and boosts to gain insights into what the network is learning.
DOI: 10.1051/epjconf/202024505040
2020
Knowledge sharing on deep learning in physics research using VISPA
The VISPA (VISual Physics Analysis) project provides a streamlined work environment for physics analyses and hands-on teaching experiences with a focus on deep learning. VISPA has already been successfully used in HEP analyses and teaching and is now being further developed into an interactive deep learning platform. One specific example is to meet knowledge-sharing needs in deep learning by combining paper, code, and data in a central place. Additionally, the possibility to run it directly from the web browser is a key feature of this development. Any SSH-reachable resource can be accessed via the VISPA web interface. This enables a flexible and experiment-agnostic computing experience. The user interface is based on JupyterLab and is extended with analysis-specific tools, such as a parametric file browser and TensorBoard. Our VISPA instance is backed by extensive GPU resources and a rich software environment. We present the current status of the VISPA project and its upcoming new features.
DOI: 10.1051/epjconf/202024505025
2020
Design Pattern for Analysis Automation on Distributed Resources using Luigi Analysis Workflows
In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo event generation. However, physicists performing data analyses are usually required to steer their individual workflows manually, which is time-consuming and often leads to undocumented relations between particular workloads. We present the Luigi Analysis Workflows (Law) Python package, which is based on the open-source pipelining tool Luigi, originally developed by Spotify. It establishes a generic design pattern for analyses of arbitrary scale and complexity, and shifts the focus from executing to defining the analysis logic. Law provides the building blocks to seamlessly integrate interchangeable remote resources without, however, limiting itself to a specific choice of infrastructure. In particular, it encourages and enables the separation of analysis algorithms on the one hand, and run locations, storage locations, and software environments on the other hand. To cope with the sophisticated demands of end-to-end HEP analyses, Law supports job execution on WLCG infrastructure (ARC, gLite) as well as on local computing clusters (HTCondor, LSF), remote file access via most common protocols through the GFAL2 library, and an environment sandboxing mechanism with support for Docker and Singularity containers. Moreover, the novel approach ultimately aims for analysis preservation out-of-the-box. Law is entirely experiment independent and developed open-source. It is successfully used in tt̄H cross section measurements and searches for di-Higgs boson production with the CMS experiment.
DOI: 10.18154/rwth-2019-06415
2019
Search for Higgs boson production in association with top quarks and decaying into bottom quarks using deep learning techniques with the CMS experiment
DOI: 10.21468/scipost.report.951
2019
Report on 1902.09914v2
Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. We find that they are extremely powerful and great fun.
DOI: 10.21468/scipost.report.962
2019
Report on 1902.09914v2
Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. We find that they are extremely powerful and great fun.
DOI: 10.21468/scipost.report.955
2019
Report on 1902.09914v2
Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. We find that they are extremely powerful and great fun.
DOI: 10.1007/978-3-030-65380-4
2021
Search for tt̄H Production in the H → bb̅ Decay Channel
DOI: 10.1007/978-3-030-65380-4_6
2021
Event Samples and Selection
This section introduces the samples of measured and simulated events that constitute the basis for the analysis at hand. Firstly, the sample of events and corresponding integrated luminosities measured with the CMS detector in the data-taking period of 2016 are described. Subsequently, the procedure of event simulation at the CMS experiment is summarized, the sample of simulated “Monte Carlo” (MC) events is introduced, and a detailed overview of the employed generator setup is presented.
DOI: 10.1007/978-3-030-65380-4_2
2021
The $$t\bar{t}H$$ Process in the Standard Model of Particle Physics
The endeavor of particle physics lies in the observation, formulation, and validation of rules that describe the properties of matter particles and the forces that act between them. The status of our current understanding is summarized in the Standard Model of Particle Physics (SM), which is discussed in the beginning of this section. It introduces the elementary matter particles consisting of quarks, leptons, and their antiparticles as well as the three interactions that act on subatomic scales. Gravity, as the fourth known fundamental force, is not described by the SM as its influence on fundamental particle processes is considered to be irrelevant at accessible energies. The “Higgs mechanism”, which explains the origin of particle masses, is described subsequently. The second part of this section introduces the \(t\bar{t}H\) event in the context of its production and decay characteristics at hadron colliders. The section continues with the specification of Higgs boson and top quark pair decay channels as studied in this thesis, and closes with a brief presentation of previous measurement results.
DOI: 10.1007/978-3-030-65380-4_7
2021
Event Classification
As outlined in the measurement strategy in Sect. 4.2.2, the event categorization procedure employed in this analysis is composed of two stages.
DOI: 10.1007/978-3-030-65380-4_8
2021
Measurement
This section presents the results of the search for \(t\bar{t}H\) production conducted in this analysis.
DOI: 10.1007/978-3-030-65380-4_3
2021
Experimental Setup
This section describes the experimental environment in which this analysis is performed. It comprises the Large Hadron Collider (LHC), which provides proton-proton collision events at high energies, the Compact Muon Solenoid (CMS) detector at one of its interaction points to study induced interaction processes, as well as software and algorithms to reconstruct physics objects based on detector measurements.
DOI: 10.1007/978-3-030-65380-4_9
2021
Conclusion
After the discovery of the Higgs boson during the first run of the LHC, precise measurements of its properties and couplings to other particles are conducted by the ATLAS and CMS experiments.
DOI: 10.1007/978-3-030-65380-4_5
2021
Analysis Technologies
The strategy and implementation of this analysis are based on a set of technological key components. They can be divided into three parts and are described in the following sections.
DOI: 10.1007/978-3-030-65380-4_1
2021
Introduction
The Standard Model of Particle Physics (SM) describes our current understanding of the constituents of matter and the forces that act between them. In 1964, a mechanism explaining the origin of masses of force-carrying particles was proposed by Robert Brout, François Englert, and Peter W. Higgs, as well as by Thomas W. B. Kibble, Carl R. Hagen, and Gerald Guralnik.
DOI: 10.1007/978-3-030-65380-4_4
2021
Analysis Strategy
This section presents an overview of the strategy that is applied in this analysis of proton-proton collisions recorded by the CMS experiment in 2016 to search for \(t\bar{t}H\,(H\rightarrow b\bar{b})\) production. Several challenges have to be addressed in order to measure the rare signal process with considerable sensitivity in the presence of dominating backgrounds from \(t\bar{t}\) contributions. While these challenges stem from physical considerations, they directly affect the pursued measurement strategy and the technical design of the analysis. The following paragraphs describe the emerging challenges as well as the key concepts employed for their accomplishment.
1991
Tracking leaf and root tips by computer vision