
Lukas Layer

Here are all the papers by Lukas Layer that you can download and read on OA.mg.
Lukas Layer’s last known institution is not listed. Download Lukas Layer PDFs here.

DOI: 10.1016/j.revip.2023.100085
2023
Cited 5 times
Toward the end-to-end optimization of particle physics instruments with differentiable programming
The full optimization of the design and operation of instruments whose functioning relies on the interaction of radiation with matter is a super-human task, due to the large dimensionality of the space of possible choices for geometry, detection technology, materials, data-acquisition, and information-extraction techniques, and the interdependence of the related parameters. On the other hand, massive potential gains in performance over standard, "experience-driven" layouts are in principle within our reach if an objective function fully aligned with the final goals of the instrument is maximized through a systematic search of the configuration space. The stochastic nature of the involved quantum processes makes the modeling of these systems an intractable problem from a classical statistics point of view, yet the construction of a fully differentiable pipeline and the use of deep learning techniques may allow the simultaneous optimization of all design parameters. In this white paper, we lay down our plans for the design of a modular and versatile modeling tool for the end-to-end optimization of complex instruments for particle physics experiments as well as industrial and medical applications that share the detection of radiation as their basic ingredient. We consider a selected set of use cases to highlight the specific needs of different applications.
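The core idea of the white paper, gradients flowing through a differentiable model of the instrument so that design parameters can be tuned directly against the physics objective, can be sketched with a toy example. The objective, cost model, and all constants below are illustrative assumptions, not taken from the paper; in a real pipeline the gradient would come from automatic differentiation of the full simulation chain rather than a hand-written derivative.

```python
import numpy as np

# Toy differentiable design optimization: choose a calorimeter depth d
# (in radiation lengths) trading shower containment against cost.
# All functions and constants are illustrative assumptions.

def objective(d):
    containment = 1.0 - np.exp(-d / 10.0)   # fraction of shower contained
    cost = 0.01 * d                          # linear cost penalty
    return containment - cost                # utility to maximize

def grad(d):
    # Analytic derivative of the objective; in an end-to-end pipeline this
    # would be produced by autodiff through the whole simulation.
    return np.exp(-d / 10.0) / 10.0 - 0.01

d = 1.0                                      # initial design choice
for _ in range(500):
    d += 50.0 * grad(d)                      # gradient ascent step

# The optimum satisfies exp(-d/10)/10 = 0.01, i.e. d = 10*ln(10) ≈ 23.03
print(round(d, 2))
```

The same ascent loop generalizes unchanged to thousands of coupled parameters once the objective is differentiable, which is precisely what makes the systematic configuration-space search described above conceivable.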
DOI: 10.48550/arxiv.2405.02678
2024
Position Paper: Quo Vadis, Unsupervised Time Series Anomaly Detection?
The current state of machine learning scholarship in Time Series Anomaly Detection (TAD) is plagued by the persistent use of flawed evaluation metrics, inconsistent benchmarking practices, and a lack of proper justification for the choices made in novel deep learning-based model designs. Our paper presents a critical analysis of the status quo in TAD, revealing the misleading track of current research and highlighting problematic methods and evaluation practices. Our position advocates for a shift in focus from pursuing only novelty in model design to improving benchmarking practices, creating non-trivial datasets, and placing renewed emphasis on studying the utility of model architectures for specific tasks. Our findings demonstrate the need for rigorous evaluation protocols, the creation of simple baselines, and the revelation that state-of-the-art deep anomaly detection models effectively learn linear mappings. These findings suggest the need for further exploration and development of simple and interpretable TAD methods. The increased model complexity of state-of-the-art deep-learning-based models unfortunately offers very little improvement. We offer insights and suggestions for the field to move forward.
DOI: 10.3389/feart.2020.581742
2021
Cited 9 times
Clustering of Experimental Seismo-Acoustic Events Using Self-Organizing Map (SOM)
The analogue experiments that produce seismo-acoustic events are relevant for understanding the degassing processes of a volcanic system. The aim of this work is to design an unsupervised neural network for clustering experimental seismo-acoustic events in order to investigate the possible cause-effect relationships between the obtained signals and the processes. We focused on two tasks: 1) identify an appropriate strategy for parameterizing experimental seismo-acoustic events recorded during analogue experiments devoted to the study of degassing behavior at basaltic volcanoes; 2) define the setup of the selected neural network, the Self-Organizing Map (SOM), suitable for clustering the features extracted from the experimental events. The seismo-acoustic events were generated using an ad hoc experimental setup under different physical conditions of the analogue magma (variable viscosity), injected gas flux (variable flux velocity) and conduit surface (variable surface roughness). We tested the SOM's ability to group the experimental seismo-acoustic events generated under controlled conditions and conduit geometry of the analogue volcanic system. We used 616 seismo-acoustic events characterized by different analogue magma viscosity (10, 100, 1000 Pa s), gas flux (5, 10, 30, 60, 90, 120, 150, 180 × 10⁻³ l/s) and conduit roughness (i.e. different fractal dimension corresponding to 2, 2.18, 2.99). We parameterized the seismo-acoustic events in the frequency domain by applying Linear Predictive Coding to both accelerometric and acoustic signals generated by the dynamics of various degassing regimes, and in the time domain, applying a waveform function. Then we applied the SOM algorithm to cluster the feature vectors extracted from the seismo-acoustic data through the parameterization phase, and identified four main clusters.
The results were consistent with the experimental findings on the role of viscosity, flux velocity and conduit roughness on the degassing regime. The neural network is capable of separating events generated under different experimental conditions. This suggests that the SOM is appropriate for clustering natural events such as the seismo-acoustic transients accompanying Strombolian explosions and that the adopted parameterization strategy may be suitable to extract the significant features of the seismo-acoustic (and/or infrasound) signals linked to the physical conditions of the volcanic system.
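The SOM clustering step described above can be sketched in a few lines: a grid of codebook vectors is iteratively pulled toward the feature vectors, with a neighborhood kernel that shrinks over training. The grid size, learning-rate schedule, and the synthetic stand-in "feature vectors" below are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

# Minimal Self-Organizing Map (SOM) sketch in NumPy.
rng = np.random.default_rng(0)

# Synthetic stand-ins for parameterized seismo-acoustic feature vectors:
# two well-separated groups in a 4-dimensional feature space.
X = np.vstack([rng.normal(0.0, 0.1, (50, 4)),
               rng.normal(1.0, 0.1, (50, 4))])

rows, cols, dim = 3, 3, X.shape[1]
W = rng.random((rows * cols, dim))              # codebook vectors
grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)

for epoch in range(50):
    lr = 0.5 * (1 - epoch / 50)                 # decaying learning rate
    sigma = 1.5 * (1 - epoch / 50) + 0.1        # shrinking neighborhood
    for x in X:
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))   # best-matching unit
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)    # grid distance to BMU
        h = np.exp(-d2 / (2 * sigma ** 2))            # neighborhood kernel
        W += lr * h[:, None] * (x - W)                # pull units toward x

# After training, events from the two groups should map to different
# regions of the map; the BMU index serves as the cluster label.
bmus = np.array([np.argmin(((W - x) ** 2).sum(axis=1)) for x in X])
print(sorted(set(bmus[:50])), sorted(set(bmus[50:])))
```

In the paper the inputs are the LPC and waveform features extracted from the signals, and clusters are read off from which map units the events land on, exactly as the BMU assignment does here.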
DOI: 10.3390/rs14051287
2022
Cited 5 times
Changes in the Eruptive Style of Stromboli Volcano before the 2019 Paroxysmal Phase Discovered through SOM Clustering of Seismo-Acoustic Features Compared with Camera Images and GBInSAR Data
Two paroxysmal explosions occurred at Stromboli on 3 July and 28 August 2019, the first of which caused the death of a young tourist. After the first paroxysm an effusive activity began from the summit vents and affected the NW flank of the island for the entire period between the two paroxysms. We carried out an unsupervised analysis of seismic and infrasonic data of Strombolian explosions over 10 months (15 November 2018–15 September 2019) using a Self-Organizing Map (SOM) neural network to recognize changes in the eruptive patterns of Stromboli that preceded the paroxysms. We used a dataset of 14,289 events. The SOM analysis identified three main clusters that showed different occurrences with time indicating a clear change in Stromboli’s eruptive style before the paroxysm of 3 July 2019. We compared the main clusters with the recordings of the fixed monitoring cameras and with the Ground-Based Interferometric Synthetic Aperture Radar measurements, and found that the clusters are associated with different types of Strombolian explosions and different deformation patterns of the summit area. Our findings provide new insights into Strombolian eruptive mechanisms and new perspectives to improve the monitoring of Stromboli and other open conduit volcanoes.
DOI: 10.1080/10619127.2021.1881364
2021
Cited 8 times
Toward Machine Learning Optimization of Experimental Design
The design of instruments that rely on the interaction of radiation with matter for their operation is a quite complex task if our goal is to achieve near optimality on some well-defined utility fu...
DOI: 10.1140/epjc/s10052-022-09993-5
2022
Cited 4 times
Calorimetric Measurement of Multi-TeV Muons via Deep Regression
Abstract The performance demands of future particle-physics experiments investigating the high-energy frontier pose a number of new challenges, forcing us to find improved solutions for the detection, identification, and measurement of final-state particles in subnuclear collisions. One such challenge is the precise measurement of muon momentum at very high energy, where an estimate of the curvature provided by conceivable magnetic fields in realistic detectors proves insufficient for achieving good momentum resolution when detecting, e.g., a narrow, high mass resonance decaying to a muon pair. In this work we study the feasibility of an entirely new avenue for the measurement of the energy of muons based on their radiative losses in a dense, finely segmented calorimeter. This is made possible by exploiting spatial information of the clusters of energy from radiated photons in a regression task. The use of a task-specific deep learning architecture based on convolutional layers allows us to treat the problem as one akin to image reconstruction, where images are constituted by the pattern of energy released in successive layers of the calorimeter. A measurement of muon energy with better than 20% relative resolution is shown to be achievable for ultra-TeV muons.
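The image-like treatment described above, energy deposits in successive calorimeter layers stacked as channels of an image, fed through convolutions, pooled, and mapped to a single regressed energy, can be sketched schematically. The shapes, filter counts, and random (untrained) weights below are illustrative assumptions; the paper's actual deep architecture differs.

```python
import numpy as np

# Schematic of convolutional regression on calorimeter "images".
rng = np.random.default_rng(0)

def conv2d(img, kernels):
    """Valid-mode 2D convolution: img (C,H,W) -> (K, H-kh+1, W-kw+1)."""
    C, H, W = img.shape
    K, _, kh, kw = kernels.shape
    out = np.zeros((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[k, i, j] = np.sum(img[:, i:i+kh, j:j+kw] * kernels[k])
    return out

# One event: 8 longitudinal calorimeter layers of 16x16 cells each,
# filled with hypothetical energy deposits.
event = rng.exponential(0.1, (8, 16, 16))

kernels = rng.normal(0, 0.1, (4, 8, 3, 3))        # 4 untrained 3x3 filters
features = np.maximum(conv2d(event, kernels), 0)  # ReLU activation
pooled = features.mean(axis=(1, 2))               # global average pooling
w, b = rng.normal(0, 1, 4), 0.0
energy_estimate = pooled @ w + b                  # linear regression head
print(features.shape, pooled.shape)               # (4, 14, 14) (4,)
```

Training would fit the kernels and the readout weights to minimize a regression loss against the true muon energy; the point of the sketch is only how the spatial pattern of radiative losses becomes the input image.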
DOI: 10.1051/epjconf/202024503006
2020
Cited 7 times
Automatic log analysis with NLP for the CMS workflow handling
The central Monte-Carlo production of the CMS experiment utilizes the WLCG infrastructure and manages thousands of tasks daily, each comprising up to thousands of jobs. The distributed computing system is bound to sustain a certain rate of failures of various types, which are currently handled by computing operators a posteriori. Within the context of computing operations and operation intelligence, we propose a Machine Learning technique to learn from the operators with a view to reducing the operational workload and delays. This work continues CMS efforts on operation intelligence aimed at reaching accurate predictions with Machine Learning. We present an approach that treats the log files of the workflows as regular text in order to leverage modern techniques from Natural Language Processing (NLP). In general, log files contain a substantial amount of text that is not human language. Therefore, different log parsing approaches are studied in order to map the log files’ words to high dimensional vectors. These vectors are then exploited as feature space to train a model that predicts the action that the operator has to take. This approach has the advantage that the information of the log files is extracted automatically and the format of the logs can be arbitrary. In this work the performance of the log file analysis with NLP is presented and compared to previous approaches.
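The pipeline described above — tokenize logs, map words to vectors, predict an operator action — can be illustrated with a toy sketch. The log snippets, the bag-of-words vectorization, and the nearest-centroid "model" below are illustrative assumptions, far simpler than the parsing and embedding approaches studied in the paper.

```python
import numpy as np

# Toy log-to-vector pipeline with a nearest-centroid action predictor.
train_logs = [
    ("segfault in step cmsRun exit code 139", "resubmit"),
    ("exit code 139 segmentation violation", "resubmit"),
    ("stageout failure could not transfer file", "kill"),
    ("transfer timeout stageout error", "kill"),
]

vocab = sorted({w for text, _ in train_logs for w in text.split()})

def vectorize(text):
    # Bag-of-words counts over the training vocabulary;
    # words not seen in training are simply ignored.
    v = np.zeros(len(vocab))
    for w in text.split():
        if w in vocab:
            v[vocab.index(w)] += 1
    return v

# One centroid per operator action.
actions = sorted({a for _, a in train_logs})
centroids = {a: np.mean([vectorize(t) for t, aa in train_logs if aa == a],
                        axis=0) for a in actions}

def predict(text):
    v = vectorize(text)
    return min(actions, key=lambda a: np.linalg.norm(v - centroids[a]))

print(predict("job failed with exit code 139"))   # expected: resubmit
```

The advantage highlighted in the abstract carries over even to this sketch: nothing about the vectorization depends on the log format, so arbitrary logs can be fed in unchanged.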
DOI: 10.48550/arxiv.2203.13818
2022
Cited 3 times
Toward the End-to-End Optimization of Particle Physics Instruments with Differentiable Programming: a White Paper
The full optimization of the design and operation of instruments whose functioning relies on the interaction of radiation with matter is a super-human task, given the large dimensionality of the space of possible choices for geometry, detection technology, materials, data-acquisition, and information-extraction techniques, and the interdependence of the related parameters. On the other hand, massive potential gains in performance over standard, "experience-driven" layouts are in principle within our reach if an objective function fully aligned with the final goals of the instrument is maximized by means of a systematic search of the configuration space. The stochastic nature of the involved quantum processes makes the modeling of these systems an intractable problem from a classical statistics point of view, yet the construction of a fully differentiable pipeline and the use of deep learning techniques may allow the simultaneous optimization of all design parameters. In this document we lay down our plans for the design of a modular and versatile modeling tool for the end-to-end optimization of complex instruments for particle physics experiments as well as industrial and medical applications that share the detection of radiation as their basic ingredient. We consider a selected set of use cases to highlight the specific needs of different applications.
DOI: 10.5194/egusphere-egu22-10482
2022
Variations of Stromboli activity related to the 2019 paroxysmal phase revealed by SOM clustering of seismo-acoustic data and its comparison with video recordings and GBInSAR measurements 
Two paroxysmal explosions occurred on Stromboli in the summer of 2019 (July 3 and August 28). The first of these explosions resulted in the death of one person. Furthermore, an effusive phase began on July 3 and lasted until August 30, 2019. This dangerous eruptive phase of Stromboli was not preceded by evident variations in the routinely monitored geophysical parameters; therefore, the volcano was considered to be in a state of normal activity.

To investigate the precursors of the 2019 eruptive crisis and explain the absence of variations in the routinely monitored parameters, we analyzed the seismo-acoustic signals with an unsupervised neural network capable of discovering hidden structures in the data. We clustered about 14,200 seismo-acoustic events recorded over 10 months (November 15, 2018 - September 15, 2019) using a Self-Organizing Map (SOM). Then we compared the clustering result with the images of visible and thermal monitoring cameras installed and managed by the Istituto Nazionale di Geofisica e Vulcanologia, Italy, and with the Ground-Based Interferometric Synthetic Aperture Radar displacement measurements of the summit area of the volcano recorded by GBInSAR devices installed and managed by Università degli Studi di Firenze, Italy.

The SOM analysis of the seismo-acoustic features associated with the selected dataset of explosions allowed us to recognize three main clusters in the period November 15, 2018 - September 15, 2019. We named these three clusters Red, Blue, and Green. The analysis of a subset of the selected explosions (approximately 180 events) through the videos of the visible and thermal monitoring cameras allowed us to associate distinct explosive types to the three main seismo-acoustic clusters. In particular, the cluster Red was associated with explosions characterized by well-collimated, oriented jets of ~200 m height, which eject incandescent ballistics and produce a significant infrasonic transient. The cluster Blue was associated with gas explosions with a height of 10 - 20 m and with little or no ash and pyroclastic fragment ejection. These types of explosions may not be detected by the camera recordings and infrasonic sensors. On the contrary, they are well recorded in the VLP seismic signals (filtered in the 0.05 - 0.5 Hz frequency band). The cluster Green includes explosions characterized by the emission of incandescent spatter-like fragments, with a wide range of ejection angles and hemispherical shape. The explosions of the cluster Red are mainly generated in the NE vent region, whereas the explosions of clusters Blue and Green are generally located in the central and SW vent regions.

Comparing these results with the temporal evolution of the displacement of the summit area measured by the GBInSAR device, we discovered that the variations of the eruptive style that were highlighted by the SOM clustering of the seismo-acoustic features are recognizable in the ground deformation temporal pattern. Our findings are relevant for the improvement of monitoring of volcanoes with persistent activity and volcano early warning.
DOI: 10.48550/arxiv.2301.10358
2023
Application of Inferno to a Top Pair Cross Section Measurement with CMS Open Data
In recent years, novel inference techniques have been developed that construct non-linear summary statistics with neural networks by minimising inference-motivated losses. One such technique is inferno (P. de Castro and T. Dorigo, Comp. Phys. Comm. 244 (2019) 170), which was shown on toy problems to outperform classical summary statistics for the problem of confidence interval estimation in the presence of nuisance parameters. In order to test and benchmark the algorithm in a real world application, a full, systematics-dominated analysis produced by the CMS experiment, "Measurement of the top-antitop production cross section in the tau+jets channel in pp collisions at sqrt(s) = 7 TeV" (CMS Collaboration, The European Physical Journal C, 2013), is reproduced with CMS Open Data. The application of the inferno-powered neural network architecture to this analysis demonstrates the potential to reduce the impact of systematic uncertainties in real LHC analyses. This work also exemplifies the extent to which LHC analyses can be reproduced with open data.
2023
Exploiting Differentiable Programming for the End-to-end Optimization of Detectors
DOI: 10.5281/zenodo.5163817
2021
Preprocessed Dataset for "Calorimetric Measurement of Multi-TeV Muons via Deep Regression"
This record contains the fully-preprocessed training/validation and testing datasets used to train and evaluate the final models for "Calorimetric Measurement of Multi-TeV Muons via Deep Regression" by Jan Kieseler, Giles C. Strong, Filippo Chiandotto, Tommaso Dorigo, & Lukas Layer, (2021), arXiv:2107.02119 [physics.ins-det] (https://arxiv.org/abs/2107.02119). The files are LZF-compressed HDF5 format and designed to be used directly with the code-base available at https://github.com/GilesStrong/calo_muon_regression. Please use the 'issues' tab on the GitHub repo for any questions or problems with these datasets. The training dataset consists of 886,716 muons with energies in the continuous range [50,8000] GeV split into 36 subsamples (folds). The zeroth fold of this dataset is used as our validation data. The testing dataset contains 429,750 muons, generated at fixed values of muon energy (E=100, 500, 900, 1300, 1700, 2100, 2500, 2900, 3300, 3700, 4100 GeV), and split into 18 folds. The input features are the raw hits in the calorimeter (stored in a sparse COO representation), and the high-level features discussed in the paper.
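The sparse COO ("coordinate") representation mentioned above stores each calorimeter hit as its cell coordinates plus the deposited energy, and the dense grid is rebuilt on demand. The array sizes and example hits below are illustrative assumptions, not the actual calorimeter geometry of the dataset.

```python
import numpy as np

# Sketch of densifying a sparse COO hit record.
# (layer, row, column) indices of three hypothetical hits:
coords = np.array([[0, 2, 3],
                   [1, 2, 3],
                   [5, 0, 7]])
energies = np.array([12.5, 3.1, 0.7])   # deposited energy per hit

dense = np.zeros((8, 16, 16))           # layers x rows x columns
dense[coords[:, 0], coords[:, 1], coords[:, 2]] = energies

print(dense.sum())                      # total deposited energy
```

Storing only the occupied cells is what keeps the HDF5 files compact: calorimeter images are overwhelmingly empty, so the COO triplets are far smaller than the dense grid they reconstruct.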
DOI: 10.48550/arxiv.2203.02841
2022
Deep Regression of Muon Energy with a K-Nearest Neighbor Algorithm
Within the context of studies for novel measurement solutions for future particle physics experiments, we developed a performant kNN-based regressor to infer the energy of highly relativistic muons from the pattern of their radiation losses in a dense and granular calorimeter. The regressor is based on a pool of weak kNN learners, which learn by adapting weights and biases to each training event through stochastic gradient descent. The effective number of parameters optimized by the procedure is in the 60 million range, thus comparable to that of large deep learning architectures. We test the performance of the regressor on the considered application by comparing it to that of several machine learning algorithms, showing accuracy comparable to that achieved by boosted decision trees and neural networks.
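For intuition, the baseline behind the approach above is plain k-nearest-neighbor regression: predict a target as the average over the k closest training points. The paper's regressor is far more elaborate (a pool of weak kNN learners with weights and biases trained by SGD); the 1-D toy target below is an assumption used only to keep the sketch self-contained.

```python
import numpy as np

# Plain kNN regression sketch on a synthetic 1-D problem.
rng = np.random.default_rng(0)

X_train = rng.uniform(0, 10, (500, 1))                    # inputs
y_train = np.sin(X_train[:, 0]) + rng.normal(0, 0.05, 500)  # noisy targets

def knn_predict(x, k=10):
    # Average the targets of the k training points closest to x.
    dists = np.abs(X_train[:, 0] - x)
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()

print(knn_predict(np.pi / 2))   # should be close to sin(pi/2) = 1
```

The paper's weak-learner pool effectively makes the neighbor weighting itself trainable, which is how the parameter count quoted above reaches the tens of millions despite the simplicity of the base algorithm.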
DOI: 10.48550/arxiv.2008.10958
2020
Muon Energy Measurement from Radiative Losses in a Calorimeter for a Collider Detector
The performance demands of future particle-physics experiments investigating the high-energy frontier pose a number of new challenges, forcing us to find new solutions for the detection, identification, and measurement of final-state particles in subnuclear collisions. One such challenge is the precise measurement of muon momenta at very high energy, where the curvature provided by conceivable magnetic fields in realistic detectors proves insufficient to achieve the desired resolution. In this work we show the feasibility of an entirely new avenue for the measurement of the energy of muons based on their radiative losses in a dense, finely segmented calorimeter. This is made possible by the use of the spatial information of the clusters of deposited photon energy in the regression task. Using a homogeneous lead-tungstate calorimeter as a benchmark, we show how energy losses may provide significant complementary information for the estimate of muon energies above 1 TeV.
DOI: 10.2172/1637601
2019
Automatic log analysis with NLP for the CMS workflow handling [Slides]
The automatization of failing workflow handling is discussed. An implementation of a pipeline for the data acquisition and machine learning analysis of error logs using big data analysis tools is available on GitHub. The development of a prototype NLP model in Keras is explored.
DOI: 10.5281/zenodo.5163816
2021
Preprocessed Dataset for "Calorimetric Measurement of Multi-TeV Muons via Deep Regression"
This record contains the fully-preprocessed training/validation and testing datasets used to train and evaluate the final models for "Calorimetric Measurement of Multi-TeV Muons via Deep Regression" by Jan Kieseler, Giles C. Strong, Filippo Chiandotto, Tommaso Dorigo, & Lukas Layer, (2021), arXiv:2107.02119 [physics.ins-det] (https://arxiv.org/abs/2107.02119). The files are LZF-compressed HDF5 format and designed to be used directly with the code-base available at https://github.com/GilesStrong/calo_muon_regression. Please use the 'issues' tab on the GitHub repo for any questions or problems with these datasets. The training dataset consists of 886,716 muons with energies in the continuous range [50,8000] GeV split into 36 subsamples (folds). The zeroth fold of this dataset is used as our validation data. The testing dataset contains 429,750 muons, generated at fixed values of muon energy (E=100, 500, 900, 1300, 1700, 2100, 2500, 2900, 3300, 3700, 4100 GeV), and split into 18 folds. The input features are the raw hits in the calorimeter (stored in a sparse COO representation), and the high-level features discussed in the paper.