
S. Jindariani

Here are all the papers by S. Jindariani that you can download and read on OA.mg.

DOI: 10.1088/1748-0221/13/07/p07027
2018
Cited 269 times
Fast inference of deep neural networks in FPGAs for particle physics
Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA hardware has only just begun. FPGA-based trigger and data acquisition (DAQ) systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. We develop a package based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.
DOI: 10.1088/2632-2153/aba042
2020
Cited 60 times
Compressing deep neural networks on FPGAs to binary and ternary precision with hls4ml
We present the implementation of binary and ternary neural networks in the hls4ml library, designed to automatically convert deep neural network models to digital circuits with field-programmable gate arrays (FPGA) firmware. Starting from benchmark models trained with floating point precision, we investigate different strategies to reduce the network's resource consumption by reducing the numerical precision of the network parameters to binary or ternary. We discuss the trade-off between model accuracy and resource consumption. In addition, we show how to balance between latency and accuracy by retaining full precision on a selected subset of network components. As an example, we consider two multiclass classification tasks: handwritten digit recognition with the MNIST data set and jet identification with simulated proton-proton collisions at the CERN Large Hadron Collider. The binary and ternary implementation has similar performance to the higher precision implementation while using drastically fewer FPGA resources.
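As a rough illustration of what reducing parameters "to binary or ternary" means, the sketch below rounds a list of floating-point weights to {-1, +1} or {-1, 0, +1}. It is a minimal sketch only: the function names and the `threshold` value are hypothetical, and it shows plain post-training rounding rather than the training strategies or the hls4ml implementation studied in the paper.

```python
def binarize(weights):
    """Map each weight to +1 or -1 by its sign (zero is mapped to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1.

    Weights whose magnitude does not exceed `threshold` (a hypothetical
    tuning parameter) are set to 0; the rest keep only their sign.
    """
    result = []
    for w in weights:
        if w > threshold:
            result.append(1)
        elif w < -threshold:
            result.append(-1)
        else:
            result.append(0)
    return result


# Illustrative weights, not taken from any benchmark model.
weights = [0.8, -0.05, 0.3, -0.6, 0.02]
binary = binarize(weights)
ternary = ternarize(weights, threshold=0.1)
print(binary)
print(ternary)
```

With values restricted to these small sets, a multiplication by a weight reduces to a sign flip or a skip, which is one intuition for why such networks can use far fewer FPGA resources.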
DOI: 10.1088/2632-2153/ac0ea1
2021
Cited 53 times
Fast convolutional neural networks on FPGAs with hls4ml
We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.
DOI: 10.3389/fdata.2020.598927
2021
Cited 41 times
Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics
Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1 $\mu\mathrm{s}$ on an FPGA. To do so, we consider a representative task associated with particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the $\mathtt{hls4ml}$ library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage.
DOI: 10.3389/fdata.2022.787421
2022
Cited 24 times
Applications and Techniques for Fast Machine Learning in Science
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science: the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.
DOI: 10.1007/s41781-019-0027-2
2019
Cited 43 times
FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing
Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) ms with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600–700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
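The latency figures quoted above can be cross-checked with simple arithmetic. The sketch below multiplies each FPGA service latency by its quoted improvement factor to see what CPU baseline the two figures imply. That baseline is derived here and is not stated in the abstract, and the function name is hypothetical.

```python
def implied_cpu_latency_ms(service_latency_ms, speedup_factor):
    """CPU latency implied by a service latency and its quoted speedup factor."""
    return service_latency_ms * speedup_factor


# Figures quoted in the abstract: 60 ms (cloud) and 10 ms (edge/on-premises),
# with improvement factors of roughly 30 and 175 respectively.
cpu_from_cloud_ms = implied_cpu_latency_ms(60.0, 30.0)
cpu_from_edge_ms = implied_cpu_latency_ms(10.0, 175.0)

print(cpu_from_cloud_ms)  # 1800.0
print(cpu_from_edge_ms)   # 1750.0
```

Both figures point to a CPU inference time of roughly 1.75 to 1.8 s per image, so the two quoted factors are mutually consistent.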
DOI: 10.1088/1748-0221/15/05/p05026
2020
Cited 41 times
Fast inference of Boosted Decision Trees in FPGAs for particle physics
We describe the implementation of Boosted Decision Trees in the hls4ml library, which allows the translation of a trained model into FPGA firmware through an automated conversion process. Thanks to its fully on-chip implementation, hls4ml performs inference of Boosted Decision Tree models with extremely low latency. With a typical latency less than 100 ns, this solution is suitable for FPGA-based real-time processing, such as in the Level-1 Trigger system of a collider experiment. These developments open up prospects for physicists to deploy BDTs in FPGAs for identifying the origin of jets, better reconstructing the energies of muons, and enabling better selection of rare signal processes.
DOI: 10.1016/j.nima.2013.07.015
2013
Cited 25 times
Operational experience, improvements, and performance of the CDF Run II silicon vertex detector
The Collider Detector at Fermilab (CDF) pursues a broad physics program at Fermilab's Tevatron collider. Between Run II commissioning in early 2001 and the end of operations in September 2011, the Tevatron delivered 12 fb$^{-1}$ of integrated luminosity of p-pbar collisions at sqrt(s)=1.96 TeV. Many physics analyses undertaken by CDF require heavy flavor tagging with large charged particle tracking acceptance. To realize these goals, in 2001 CDF installed eight layers of silicon microstrip detectors around its interaction region. These detectors were designed for 2-5 years of operation, radiation doses up to 2 Mrad (0.02 MGy), and were expected to be replaced in 2004. The sensors were not replaced, and the Tevatron run was extended for several years beyond its design, exposing the sensors and electronics to much higher radiation doses than anticipated. In this paper we describe the operational challenges encountered over the past 10 years of running the CDF silicon detectors, the preventive measures undertaken, and the improvements made along the way to ensure their optimal performance for collecting high quality physics data. In addition, we describe the quantities and methods used to monitor radiation damage in the sensors for optimal performance and summarize the detector performance quantities important to CDF's physics program, including vertex resolution, heavy flavor tagging, and silicon vertex trigger performance.
DOI: 10.48550/arxiv.2012.01563
2020
Cited 15 times
Accelerated Charged Particle Tracking with Graph Neural Networks on FPGAs
We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. The two complementary FPGA designs are based on OpenCL, a framework for writing programs that execute across heterogeneous platforms, and hls4ml, a high-level-synthesis-based compiler for neural network to firmware conversion. We evaluate and compare the resource usage, latency, and tracking performance of our implementations based on a benchmark dataset. We find a considerable speedup over CPU-based execution is possible, potentially enabling such algorithms to be used effectively in future computing workflows and the FPGA-based Level-1 trigger at the CERN Large Hadron Collider.
2024
ACE Science Workshop Report
We summarize the Fermilab Accelerator Complex Evolution (ACE) Science Workshop, held on June 14-15, 2023. The workshop presented the strategy for the ACE program in two phases: ACE Main Injector Ramp and Target (MIRT) upgrade and ACE Booster Replacement (BR) upgrade. Four plenary sessions covered the primary experimental physics thrusts: Muon Collider, Neutrinos, Charged Lepton Flavor Violation, and Dark Sectors. Additional physics and technology ideas were presented from the community that could expand or augment the ACE science program. Given the physics framing, a parallel session at the workshop was dedicated to discussing priorities for accelerator R&D. Finally, physics discussion sessions concluded the workshop where experts from the different experimental physics thrusts were brought together to begin understanding the synergies between the different physics drivers and technologies. In December of 2023, the P5 report was released setting the physics priorities for the field in the next decade and beyond, and identified ACE as an important component of the future US accelerator-based program. Given the presentations and discussions at the ACE Science Workshop and the findings of the P5 report, we lay out the topics for study to determine the physics priorities and design goals of the Fermilab ACE project in the near-term.
DOI: 10.2172/1863003
2022
Cited 5 times
Higgs-Energy LEptoN (HELEN) Collider based on advanced superconducting radio frequency technology
This Snowmass 2021 contributed paper discusses a Higgs-Energy LEptoN (HELEN) e⁺e⁻ linear collider based on advanced superconducting radio frequency technology. The proposed collider offers cost and AC power savings, smaller footprint (relative to the ILC), and could be built at Fermilab with an Interaction Region within the site boundaries. After the initial physics run at 250 GeV, the collider could be upgraded either to higher luminosity or to higher (up to 500 GeV) energies. If the ILC could not be realized in Japan in a timely fashion, the HELEN collider would be a viable option to build a Higgs factory in the U.S.
DOI: 10.3390/jlpea8030025
2018
Cited 12 times
Multi-Vdd Design for Content Addressable Memories (CAM): A Power-Delay Optimization Analysis
In this paper, we characterize the interplay between power consumption and performance of a matchline-based Content Addressable Memory and then propose the use of a multi-Vdd design to save power and increase post-fabrication tunability. Exploration of the power consumption behavior of a CAM chip shows the drastically different behavior among the components and suggests the use of different and independent power supplies. The complete design, simulation and testing of a multi-Vdd CAM chip along with an exploration of the multi-Vdd design space are presented. Our analysis has been applied to simulated models on two different technology nodes (130 nm and 45 nm), followed by experiments on a 246-kb test chip fabricated in 130 nm Global Foundries Low Power CMOS technology. The proposed design, operating at an optimal operating point in a triple-Vdd configuration, increases the power-delay operation range by 2.4 times and consumes 25.3% less dynamic power when compared to a conventional single-Vdd design operating over the same voltage range with equivalent noise margin. Our multi-Vdd design also helps save 51.3% standby power. Measurement results from the test chip combined with the simulation analysis at the two nodes validate our thesis.
DOI: 10.1088/1748-0221/10/02/c02029
2015
Cited 9 times
Design and testing of the first 2D Prototype Vertically Integrated Pattern Recognition Associative Memory
An associative memory-based track finding approach has been proposed for a Level 1 tracking trigger to cope with increasing luminosities at the LHC. The associative memory uses a massively parallel architecture to tackle the intrinsically complex combinatorics of track finding algorithms, thus avoiding the typical power law dependence of execution time on occupancy and solving the pattern recognition in times roughly proportional to the number of hits. This is of crucial importance given the large occupancies typical of hadronic collisions. The design of an associative memory system capable of dealing with the complexity of HL-LHC collisions and with the short latency required by Level 1 triggering poses significant, as yet unsolved, technical challenges. For this reason, an aggressive R&D program has been launched at Fermilab to advance state-of-the-art associative memory technology, the so-called VIPRAM (Vertically Integrated Pattern Recognition Associative Memory) project. The VIPRAM leverages emerging 3D vertical integration technology to build faster and denser Associative Memory devices. The first step is to implement in conventional VLSI the associative memory building blocks that can be used in 3D stacking; in other words, the building blocks are laid out as if they were part of a 3D design. In this paper, we report on the first successful implementation of a 2D VIPRAM demonstrator chip (protoVIPRAM00). The results show that these building blocks are ready for 3D stacking.
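The pattern-matching idea described above can be sketched in software. In the toy model below, each stored pattern is a tuple of coarse hit addresses, one per detector layer, and every incoming hit is compared against all stored patterns; a pattern (or "road") fires once all of its layers have matched. This is a minimal sketch under stated assumptions: the class name, method names, and example data are hypothetical, and the Python loop only stands in for the comparison that the hardware performs on all patterns simultaneously. It is not the VIPRAM design.

```python
class ToyAssociativeMemory:
    """Toy associative memory: one match flag per (pattern, layer)."""

    def __init__(self, patterns):
        self.patterns = [tuple(p) for p in patterns]
        self.n_layers = len(self.patterns[0])
        self.matched = [[False] * self.n_layers for _ in self.patterns]

    def load_hit(self, layer, address):
        # In hardware every pattern compares itself to the hit in parallel,
        # so the cost per hit is one broadcast; this loop emulates that step.
        for i, pattern in enumerate(self.patterns):
            if pattern[layer] == address:
                self.matched[i][layer] = True

    def fired_roads(self):
        """Indices of patterns whose every layer has seen a matching hit."""
        return [i for i, flags in enumerate(self.matched) if all(flags)]


# Illustrative pattern bank and hits as (layer, coarse address) pairs.
patterns = [(3, 7, 2), (3, 8, 2), (1, 1, 1)]
am = ToyAssociativeMemory(patterns)
hits = [(0, 3), (1, 7), (2, 2), (1, 5)]
for layer, address in hits:
    am.load_hit(layer, address)

roads = am.fired_roads()
print(roads)  # [0]
```

Because each hit is presented once and no combinations of hits are ever enumerated, the number of broadcast steps grows with the number of hits, which is the scaling behavior the abstract refers to.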
DOI: 10.1007/s41781-021-00067-x
2021
Cited 7 times
Full Detector Simulation with Unprecedented Background Occupancy at a Muon Collider
In recent years, a Muon collider has attracted a lot of interest in the high-energy physics community, thanks to its ability to achieve clean interaction signatures at multi-TeV collision energies in the most cost-effective way. Estimation of the physics potential of such an experiment must take into account the impact of beam-induced background on the detector performance, which has to be carefully evaluated using full detector simulation. Tracing of all the background particles entering the detector region in a single bunch crossing is out of reach for any realistic computing facility due to the unprecedented number of such particles. To make it feasible, a number of optimisations have been applied to the detector simulation workflow. This contribution presents an overview of the main characteristics of the beam-induced background at a Muon collider, the detector technologies considered for the experiment and how they are taken into account to strongly reduce the number of irrelevant computations performed during the detector simulation. Special attention is dedicated to the optimisation of track reconstruction with the conformal tracking algorithm in this high-occupancy environment, which is the most computationally demanding part of event reconstruction.
DOI: 10.48550/arxiv.2203.07261
2022
Cited 4 times
The physics case of a 3 TeV muon collider stage
On the path towards a muon collider with center of mass energy of 10 TeV or more, a stage at 3 TeV emerges as an appealing option. Reviewing the physics potential of such a muon collider is the main purpose of this document. In order to outline the progression of the physics performance across the stages, a few sensitivity projections for higher energy are also presented. There are many opportunities for probing new physics at a 3 TeV muon collider. Some of them are in common with the extensively documented physics case of the CLIC 3 TeV energy stage, and include measuring the Higgs trilinear coupling and testing the possible composite nature of the Higgs boson and of the top quark at the 20 TeV scale. Other opportunities are unique to a 3 TeV muon collider, and stem from the fact that muons are collided rather than electrons. This is exemplified by studying the potential to explore the microscopic origin of the current $g$-2 and $B$-physics anomalies, which are both related to muons.
DOI: 10.1088/1748-0221/18/08/t08007
2023
DIMUS: super-compact Dimuonium Spectroscopy collider at Fermilab
While dimuonium (μ⁺μ⁻), the “smallest QED atom”, has not yet been observed, it is of utmost fundamental interest. By virtue of the larger mass, dimuonium has greater sensitivity to beyond the standard model (BSM) effects than its cousins positronium or muonium, both discovered long ago, while not suffering from large QCD uncertainties. Dimuonium atoms can be created in e⁺e⁻ collisions with large longitudinal momentum, allowing them to decay a small distance away from the beam crossing point and avoid prompt backgrounds. We envision a unique cost-effective and fast-timeline opportunity for copious production of (μ⁺μ⁻) atoms at the production threshold via a modest modification of the existing Fermilab Accelerator Science and Technology (FAST) facility to arrange collisions of 408 MeV electrons and positrons at a 75° angle. This compact 23 m circumference collider (DIMUS) will allow for precision tests of QED and open the door for searches for new physics coupled to the muon. The FAST facility is perfectly suited for DIMUS as there are existing SRF accelerators and infrastructure, capable of producing high energy, high current electron and positron beams, sufficient for O(10³²) cm⁻² s⁻¹ luminosity and ∼0.5 million dimuons per year. The expansion will require installation of a second SRF cryomodule, positron production and accumulation system, fast injection/extraction kickers and two small circumference intersecting rings. An approximately meter-sized detector with several layers of modern pixelated silicon detector and crystal-based electromagnetic calorimeters will ensure observation of the decays of dimuonium to electron-positron pairs in the presence of the Bhabha scattering background. An expansion of the system to include a solenoidal magnet outside of the calorimeter system, a layer of steel shielding behind the magnet, and a set of dedicated muon detectors would extend the physics program of DIMUS to include precision studies of rare processes with muons, pions, and η mesons produced in e⁺e⁻ collisions.
DOI: 10.1103/physrevd.108.093009
2023
Anomalous production of massive gauge boson pairs at muon colliders
The prospects of searches for anomalous production of hadronically decaying weak boson pairs at proposed high-energy muon colliders are reported. Muon-muon collision events are simulated at $\sqrt{s}=6$, 10, and 30 TeV, corresponding to an integrated luminosity of 4, 10, and 10 ab$^{-1}$, respectively. Simulated $\mu\mu\rightarrow\mathrm{W}\mathrm{W}+\nu\nu/\mu\mu$ events are used to set expected constraints on the structure of quartic vector boson interactions in the framework of a dimension-8 effective field theory. Similarly, $\mu\mu\rightarrow\mathrm{W}\mathrm{W}/\mathrm{Z}\mathrm{Z}+\nu\nu$ events are used to report constraints on the product of the cross section and branching fraction for vector boson fusion production of a heavy neutral Higgs boson decaying to weak boson pairs. These results are interpreted in the context of the Georgi-Machacek model.
DOI: 10.2172/1881962
2022
Cited 3 times
Promising Technologies and R&D Directions for the Future Muon Collider Detectors
Among the post-LHC generation of particle accelerators, the muon collider represents a unique machine with capability to provide very high energy leptonic collisions and to open the path to a vast and mostly unexplored physics programme. However, on the experimental side, such great physics potential is accompanied by unprecedented technological challenges, due to the fact that muons are unstable particles. Their decay products interact with the machine elements and produce an intense flux of background particles that eventually reach the detector and may degrade its performance. In this paper, we present technologies that have a potential to match the challenging specifications of a muon collider detector and outline a path forward for the future R&D efforts.
DOI: 10.1109/iccd.2015.7357156
2015
Cited 5 times
A methodology for power characterization of associative memories
Content Addressable Memories (CAM) have become increasingly important in applications requiring high speed memory search due to their inherent massively parallel processing architecture. We present a complete power analysis methodology for CAM systems to aid the exploration of their power-performance trade-offs in future systems. Our proposed methodology uses detailed transistor level circuit simulation of power behavior and a handful of input data types to simulate full chip power consumption. Furthermore, we applied our power analysis methodology on a custom designed associative memory test chip. This chip was developed by Fermilab for the purpose of developing high performance real-time pattern recognition on high volume data produced by a future large-scale scientific experiment. We applied our methodology to configure a power model for this test chip. Our model is capable of predicting the total average power within 4% of actual power measurements. Our power analysis methodology can be generalized and applied to other CAM-like memory systems and accurately characterize their power behavior.
DOI: 10.1016/j.nima.2019.05.018
2019
Cited 4 times
A high-performance track fitter for use in ultra-fast electronics
This article describes a new charged-particle track fitting algorithm designed for use in high-speed electronics applications such as hardware-based triggers in high-energy physics experiments. Following a novel technique designed for fast electronics, the positions of the hits on the detector are transformed before being passed to a linearized track parameter fit. This transformation results in fitted track parameters with a very linear dependence on the hit positions. The approach is demonstrated in a representative detector geometry based on the CMS detector at the Large Hadron Collider. The fit is implemented in FPGA chips and optimized for track fitting throughput and obtains excellent track parameter performance. Such an algorithm is potentially useful in any high-speed track-fitting application.
DOI: 10.1145/3289602.3293986
2019
Cited 4 times
Fast Inference of Deep Neural Networks for Real-time Particle Physics Applications
Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of such techniques in low-latency, low-power FPGA (Field Programmable Gate Array) hardware has only just begun. FPGA-based trigger and data acquisition systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable many new physics measurements. While we focus on a specific example, the lessons are far-reaching. A compiler package is developed based on High-Level Synthesis (HLS) called HLS4ML to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to allow for directed resource tuning in the low latency environment and assess the impact on our benchmark physics performance scenario. For our example jet substructure model, we fit well within the available resources of modern FPGAs with latency on the scale of 100 ns.
DOI: 10.1109/mwscas.2017.8052945
2017
Cited 3 times
A content addressable memory with multi-Vdd scheme for low power tunable operation
This paper reports on a content addressable memory (CAM) employing a multi-Vdd scheme for low power pattern recognition applications. The complete design, simulation and testing of the chip is presented along with an exploration of the multi-Vdd design space. The proposed design, operating at an optimal operating point in a triple-Vdd configuration, increases the delay range by 2.4 times and consumes 25.3% less power when compared to a conventional single-Vdd design operating over the same voltage range. Measurement results from a 246 kb test chip fabricated in 130nm Global Foundries Low Power CMOS technology are presented to validate the model and analysis.
2021
Cited 3 times
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.
DOI: 10.2172/1975518
2023
Options for DIMUS: Di-Muon-Spectroscopy Collider
is a precision measurement of electroweak mixing angle, $\sin^2\theta_W$, which can be achieved to the precision equivalent to $\delta M_W \approx 30$ MeV.
2021
Review of opportunities for new long-lived particle triggers in Run 3 of the Large Hadron Collider
Long-lived particles (LLPs) are highly motivated signals of physics Beyond the Standard Model (BSM) with great discovery potential and unique experimental challenges. The LLP search programme made great advances during Run 2 of the Large Hadron Collider (LHC), but many important regions of signal space remain unexplored. Dedicated triggers are crucial to improve the potential of LLP searches, and their development and expansion is necessary for the full exploitation of the new data. The public discussion of triggers has therefore been a relevant theme in the recent LLP literature, in the meetings of the LLP@LHC Community workshop and in the respective experiments. This paper documents the ideas collected during talks and discussions at these Workshops, benefiting as well from the ideas under development by the trigger community within the experimental collaborations. We summarise the theoretical motivations of various LLP scenarios leading to highly elusive signals, reviewing concrete ideas for triggers that could greatly extend the reach of the LHC experiments. We thus expect this document to encourage further thinking for both the phenomenological and experimental communities, as a stepping stone to further develop the LLP@LHC physics programme.
DOI: 10.48550/arxiv.2203.08135
2022
Anomalous production of massive gauge boson pairs at muon colliders
The prospects of searches for anomalous production of hadronically decaying weak boson pairs at proposed high-energy muon colliders are reported. Muon-muon collision events are simulated at $\sqrt{s}=6$, 10, and 30 TeV, corresponding to an integrated luminosity of $4$, $10$, and $10$ ab$^{-1}$, respectively. Simulated $\mu\mu\rightarrow\mathrm{W}\mathrm{W}+\nu\nu/\mu\mu$ events are used to set expected constraints on the structure of quartic vector boson interactions in the framework of a dimension-8 effective field theory. Similarly, $\mu\mu\rightarrow\mathrm{W}\mathrm{W}/\mathrm{Z}\mathrm{Z}+\nu\nu$ events are used to report constraints on the product of the cross section and branching fraction for vector boson fusion production of a heavy neutral Higgs boson decaying to weak boson pairs. These results are interpreted in the context of the Georgi-Machacek model.
DOI: 10.1103/physrevd.108.093009
2022
Anomalous production of massive gauge boson pairs at muon colliders
Prospects for searches of anomalous quartic gauge couplings at a future high-energy muon collider using the production of $\mathrm{WW}$ boson pairs are reported. Muon-muon collision events are simulated at $\sqrt{s}=6$ TeV corresponding to an integrated luminosity of $4$ ab$^{-1}$. The simulated events are used to study the $\mathrm{W}\mathrm{W}\nu\nu$ and $\mathrm{W}\mathrm{W}\mu\mu$ final states with the $\mathrm{W}$ bosons decaying hadronically. The events are analyzed to report expected constraints on the structure of quartic vector boson interactions in the framework of dimension-8 effective field theory operators.
DOI: 10.1590/s0103-97332007000500040
2007
New results on jet fragmentation at CDF
Presented are the latest results of jet fragmentation studies at the Tevatron using the CDF Run II detector. Studies include the distribution of transverse momenta ($k_T$) of particles in jets, two-particle momentum correlations, and indirectly global event shapes in $p\bar{p}$ collisions. Results are discussed within the context of recent Next-to-Leading Log calculations as well as earlier experimental results from the Tevatron and $e^+e^-$ colliders.
DOI: 10.48550/arxiv.1709.08303
2017
Performance Study of the First 2D Prototype of Vertically Integrated Pattern Recognition Associative Memory (VIPRAM)
Extremely fast pattern recognition capabilities are necessary to find and fit billions of tracks at the hardware trigger level produced every second anticipated at high luminosity LHC (HL-LHC) running conditions. Associative Memory (AM) based approaches for fast pattern recognition have been proposed as a potential solution to the tracking trigger. However, at the HL-LHC, there is much less time available and speed performance must be improved over previous systems while maintaining a comparable number of patterns. The Vertically Integrated Pattern Recognition Associative Memory (VIPRAM) Project aims to achieve the target pattern density and performance goal using 3DIC technology. The first step taken in the VIPRAM work was the development of a 2D prototype (protoVIPRAM00) in which the associative memory building blocks were designed to be compatible with the 3D integration. In this paper, we present the results from extensive performance studies of the protoVIPRAM00 chip in both realistic HL-LHC and extreme conditions. Results indicate that the chip operates at the design frequency of 100 MHz with perfect correctness in realistic conditions and conclude that the building blocks are ready for 3D stacking. We also present performance boundary characterization of the chip under extreme conditions.
DOI: 10.2172/1570210
2019
FPGAs as a Service to Accelerate Machine Learning Inference [PowerPoint]
Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC, and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600--700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
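As a rough consistency check on the figures quoted in this abstract, the short sketch below multiplies each accelerated latency by its quoted speedup factor to recover the implied CPU latency, and converts the quoted throughput into an effective time per inference. The function and variable names are illustrative only, and the derived values are back-of-the-envelope estimates rather than numbers reported by the authors (the abstract gives the speedups only as approximate).

```python
# Figures quoted in the abstract (latencies in milliseconds).
CLOUD_LATENCY_MS = 60.0   # Brainwave as a cloud service
EDGE_LATENCY_MS = 10.0    # Brainwave as an edge / on-premises service
CLOUD_SPEEDUP = 30.0      # "approximately 30" over CPU inference
EDGE_SPEEDUP = 175.0      # "approximately 175" over CPU inference


def implied_cpu_latency_ms(accel_latency_ms, speedup):
    """CPU latency implied by an accelerated latency and its speedup factor."""
    return accel_latency_ms * speedup


def per_inference_ms(throughput_per_s):
    """Effective time per inference for a given sustained throughput."""
    return 1000.0 / throughput_per_s


# Both estimates should describe the same CPU baseline, up to rounding.
cpu_from_cloud = implied_cpu_latency_ms(CLOUD_LATENCY_MS, CLOUD_SPEEDUP)
cpu_from_edge = implied_cpu_latency_ms(EDGE_LATENCY_MS, EDGE_SPEEDUP)

print(f"Implied CPU latency (cloud figures): {cpu_from_cloud:.0f} ms")
print(f"Implied CPU latency (edge figures):  {cpu_from_edge:.0f} ms")
print(f"Time per inference at 600/s: {per_inference_ms(600):.2f} ms")
print(f"Time per inference at 700/s: {per_inference_ms(700):.2f} ms")
```

The two implied CPU latencies agree to within a few percent, which is what one would expect if both speedups are measured against the same CPU baseline. The effective time per inference at the quoted throughput is well below the single-request latency, consistent with one FPGA service being shared by many CPU clients.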
DOI: 10.1109/tns.2020.2968860
2020
Performance Study of the First 2-D Prototype of Vertically Integrated Pattern Recognition Associative Memory
Extremely fast pattern recognition capabilities are necessary to find and fit billions of tracks at the hardware trigger level produced every second anticipated at high-luminosity Large Hadron Collider (HL-LHC) running conditions. Associative memory (AM)-based approaches for fast pattern recognition have been proposed as a potential solution to the tracking trigger. However, at the HL-LHC, there is much less time available, and the speed performance must be improved over previous systems while maintaining a comparable number of patterns. The vertically integrated pattern recognition AM (VIPRAM) project aims to achieve the target pattern density and performance goal using 3DIC technology. The first step taken in the VIPRAM work was the development of a 2-D prototype (protoVIPRAM00) in which the AM building blocks were designed to be compatible with the 3-D integration. In this article, we present the results from extensive performance studies of the protoVIPRAM00 chip in both realistic HL-LHC and extreme conditions. Results indicate that the chip operates at the design frequency of 100 MHz with perfect correctness in realistic conditions and conclude that the building blocks are ready for 3-D stacking. We also present performance boundary characterization of the chip under extreme conditions.
DOI: 10.1016/j.nima.2014.01.017
2014
CDF Run II silicon vertex detector annealing study
Between Run II commissioning in early 2001 and the end of operations in September 2011, the Tevatron collider delivered 12 fb$^{-1}$ of $p\bar{p}$ collisions at $\sqrt{s}=1.96$ TeV to the Collider Detector at Fermilab (CDF). During that time, the CDF silicon vertex detector was subject to radiation doses of up to 12 Mrad. After the end of operations, the silicon detector was annealed for 24 days at 18 °C. In this paper, we present a measurement of the change in the bias currents for a subset of sensors during the annealing period. We also introduce a novel method for monitoring the depletion voltage throughout the annealing period. The observed bias current evolution can be characterized by a falling exponential term with time constant $\tau_I = 17.88 \pm 0.36\,(\mathrm{stat.}) \pm 0.25\,(\mathrm{syst.})$ days. We observe an average decrease of $(27 \pm 3)\%$ in the depletion voltage, whose evolution can similarly be described by an exponential time constant of $\tau_V = 6.21 \pm 0.21$ days. These results are consistent with the Hamburg model within the measurement uncertainties.
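To give a feel for the two time constants quoted above, the sketch below evaluates how much of a pure falling exponential term, exp(-t/τ), remains after the 24-day annealing period. This is an illustration under a simplifying assumption: only the exponential term is modelled, with no constant offset or amplitude, since the abstract does not give those; the names are illustrative and not taken from the paper.

```python
import math

ANNEAL_DAYS = 24.0   # annealing period quoted in the abstract
TAU_I_DAYS = 17.88   # bias-current time constant (central value)
TAU_V_DAYS = 6.21    # depletion-voltage time constant (central value)


def remaining_fraction(t_days, tau_days):
    """Fraction of a falling exponential term exp(-t/tau) left after t days."""
    return math.exp(-t_days / tau_days)


# Assumption: a single exponential term; offsets and amplitudes are not modelled.
frac_current = remaining_fraction(ANNEAL_DAYS, TAU_I_DAYS)
frac_voltage = remaining_fraction(ANNEAL_DAYS, TAU_V_DAYS)

print(f"Bias-current term remaining after 24 days:      {frac_current:.3f}")
print(f"Depletion-voltage term remaining after 24 days: {frac_voltage:.3f}")
```

Under this assumption the depletion-voltage term has almost fully decayed by the end of the annealing period (24 days is nearly four time constants), while the bias-current term, with its longer time constant, still retains roughly a quarter of its initial size.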
DOI: 10.1016/j.nuclphysbps.2015.09.375
2016
Measurements of top quark properties in top pair production and decay at the LHC using the CMS detector
Measurements are presented of the properties of top quarks in pair production and decay from proton-proton collisions at the LHC. The data were collected at centre-of-mass energies of 7 and 8 TeV by the CMS experiment during the years 2011 and 2012. The top quark-antiquark charge asymmetry is measured using the difference of the absolute rapidities of the reconstructed top and anti-top kinematics, as well as from distributions of the top quark decay products. The measurements are performed in the decay channels of the $t\bar{t}$ pair into both one and two leptons in the final state. The polarization of top quarks and top pair spin correlations are measured from the angular distributions of top quark decay products. The W-boson helicity fractions and angular asymmetries are extracted and limits on anomalous contributions to the Wtb vertex are determined. The flavor content in top-quark pair events is measured using the fraction of top quarks decaying into a W-boson and a b-quark relative to all top quark decays, $R = B(t \to Wb)/B(t \to Wq)$, and the result is used to determine the CKM matrix element $V_{tb}$ as well as the width of the top quark resonance. All of the results are found to be in good agreement with standard model predictions.
2017
arXiv : Performance Study of the First 2D Prototype of Vertically Integrated Pattern Recognition Associative Memory (VIPRAM)
Extremely fast pattern recognition capabilities are necessary to find and fit billions of tracks at the hardware trigger level produced every second anticipated at high luminosity LHC (HL-LHC) running conditions. Associative Memory (AM) based approaches for fast pattern recognition have been proposed as a potential solution to the tracking trigger. However, at the HL-LHC, there is much less time available and speed performance must be improved over previous systems while maintaining a comparable number of patterns. The Vertically Integrated Pattern Recognition Associative Memory (VIPRAM) Project aims to achieve the target pattern density and performance goal using 3DIC technology. The first step taken in the VIPRAM work was the development of a 2D prototype (protoVIPRAM00) in which the associative memory building blocks were designed to be compatible with the 3D integration. In this paper, we present the results from extensive performance studies of the protoVIPRAM00 chip in both realistic HL-LHC and extreme conditions. Results indicate that the chip operates at the design frequency of 100 MHz with perfect correctness in realistic conditions and conclude that the building blocks are ready for 3D stacking. We also present performance boundary characterization of the chip under extreme conditions.
2010
The CDF Silicon Detector: Performance and Longevity
DOI: 10.22323/1.102.0050
2010
High mass Higgs at Tevatron
2009
High mass Higgs at Tevatron
DOI: 10.1016/j.nuclphysbps.2007.11.108
2008
Soft QCD and the underlying event at CDF
Presented are the latest results of jet fragmentation studies at the Tevatron using the CDF Run II detector. Studies include indirectly global event shapes in $p\bar{p}$ collisions, the distribution of transverse momenta ($k_T$) of particles in jets, the underlying event studies and two-particle momentum correlations. Results are compared to parton shower Monte Carlos and recent NLLA calculations as well as earlier experimental results from the Tevatron and $e^+e^-$ colliders.
2009
Status and Operational Experience with the CDF Run II Silicon Detector
DOI: 10.48550/arxiv.2203.07224
2022
Promising Technologies and R&D Directions for the Future Muon Collider Detectors
Among the post-LHC generation of particle accelerators, the muon collider represents a unique machine with capability to provide very high energy leptonic collisions and to open the path to a vast and mostly unexplored physics programme. However, on the experimental side, such great physics potential is accompanied by unprecedented technological challenges, due to the fact that muons are unstable particles. Their decay products interact with the machine elements and produce an intense flux of background particles that eventually reach the detector and may degrade its performance. In this paper, we present technologies that have a potential to match the challenging specifications of a muon collider detector and outline a path forward for the future R&D efforts.
2022
Simulated Detector Performance at the Muon Collider
In this paper we report on the current status of studies on the expected performance for a detector designed to operate in a muon collider environment. Beam-induced backgrounds (BIB) represent the main challenge in the design of the detector and the event reconstruction algorithms. The current detector design aims to show that satisfactory performance can be achieved, while further optimizations are expected to significantly improve the overall performance. We present the characterization of the expected beam-induced background, describe the detector design and software used for detailed event simulations taking into account BIB effects. The expected performance of charged-particle reconstruction, jets, electrons, photons and muons is discussed, including an initial study on heavy-flavor jet tagging. A simple method to measure the delivered luminosity is also described. Overall, the proposed design and reconstruction algorithms can successfully reconstruct the high transverse-momentum objects needed to carry out a broad physics program.
DOI: 10.48550/arxiv.2203.07144
2022
DIMUS: Super-Compact Dimuonium Spectroscopy Collider at Fermilab
While dimuonium $(\mu^+\mu^-)$ has not yet been observed, it is of utmost fundamental interest. By virtue of the larger mass, dimuonium has greater sensitivity to beyond the standard model effects than its cousins positronium or muonium, both discovered long ago, while not suffering from large QCD uncertainties. Dimuonium atoms can be created in $e^+e^-$ collisions with large longitudinal momentum, allowing them to decay a small distance away from the beam crossing point and avoid prompt backgrounds. We envision a unique cost-effective and fast-timeline opportunity for copious production of $(\mu^+\mu^-)$ atoms at the production threshold via a modest modification of Fermilab's existing FAST/NML facility to arrange collisions of 408 MeV electrons and positrons at a 75$^\circ$ angle. This compact 23 m circumference collider (DIMUS) will allow for precision tests of QED and open the door for searches for new physics coupled to the muon. Fermilab's FAST/NML is perfectly suited for DIMUS as there are existing SRF accelerators and infrastructure, capable of producing high energy, high current electron and positron beams, sufficient for $O(10^{32})\,\mathrm{cm}^{-2}\mathrm{s}^{-1}$ luminosity and $\sim$0.5 million dimuons per year. The expansion will require installation of a second SRF cryomodule, positron production and accumulation system, fast injection/extraction kickers and two small circumference intersecting rings. An approximately meter-sized detector with several layers of modern pixelated silicon detector and crystal-based electromagnetic calorimeters will ensure observation of the decays of dimuonium to electron-positron pairs in presence of the Bhabha scattering background. An expansion of the system to would extend the physics program of DIMUS to include precision studies of rare processes with muons, pions, and $\eta$ mesons produced in $e^{+}e^{-}$ collisions.
2022
Muon Collider Physics Summary
The prospect of designing muon colliders with high energy and luminosity, which is being investigated by the International Muon Collider Collaboration, has triggered a growing interest in their physics reach. We present a concise summary of the muon colliders' potential to explore new physics, leveraging the unique possibility of combining high available energy with very precise measurements.
2022
The physics case of a 3 TeV muon collider stage
On the path towards a muon collider with center of mass energy of 10 TeV or more, a stage at 3 TeV emerges as an appealing option. Reviewing the physics potential of such a muon collider is the main purpose of this document. In order to outline the progression of the physics performance across the stages, a few sensitivity projections for higher energy are also presented. There are many opportunities for probing new physics at a 3 TeV muon collider. Some of them are in common with the extensively documented physics case of the CLIC 3 TeV energy stage, and include measuring the Higgs trilinear coupling and testing the possible composite nature of the Higgs boson and of the top quark at the 20 TeV scale. Other opportunities are unique to a 3 TeV muon collider, and stem from the fact that muons are collided rather than electrons. This is exemplified by studying the potential to explore the microscopic origin of the current $g$-2 and $B$-physics anomalies, which are both related to muons.
2022
Muon Collider Physics Summary
DOI: 10.2172/1884523
2022
U.S. National Accelerator R&D Program on Future Colliders
Future colliders are an essential component of a strategic vision for particle physics. Conceptual studies and technical developments for several exciting future collider options are underway internationally. In order to realize a future collider, a concerted accelerator R&D program is required. The U.S. HEP accelerator R&D program currently has no direct effort in collider-specific R&D area. This shortcoming greatly compromises the U.S. leadership role in accelerator and particle physics. In this white paper, we propose a new national accelerator R&D program on future colliders and outline the important characteristics of such a program.
DOI: 10.48550/arxiv.2203.13900
2022
4-Dimensional Trackers
4-dimensional (4D) trackers with ultra fast timing (10-30 ps) and very fine spatial resolution (O(few $\mu$m)) represent a new avenue in the development of silicon trackers, enabling new physics capabilities beyond the reach of the existing tracking detectors. This paper reviews the impact of integrating 4D tracking capabilities on several physics benchmarks both in potential upgrades of the HL-LHC experiments and in several detectors at future colliders, and summarizes the currently available sensor technologies as well as electronics, along with their limitations and directions for R&D.
2022
4-Dimensional Trackers
DOI: 10.1088/1748-0221/17/12/p12002
2022
Charged particle tracking in real-time using a full-mesh data delivery architecture and associative memory techniques
We present a flexible and scalable approach to address the challenges of charged particle track reconstruction in real-time event filters (Level-1 triggers) in collider physics experiments. The method described here is based on a full-mesh architecture for data distribution and relies on the Associative Memory approach to implement a pattern recognition algorithm that quickly identifies and organizes hits associated with trajectories of particles originating from particle collisions. We describe a successful implementation of a demonstration system composed of several innovative hardware and algorithmic elements. The implementation of a full-size system relies on the assumption that an Associative Memory device with sufficient pattern density becomes available in the future, either through a dedicated ASIC or a modern FPGA. We demonstrate excellent performance in terms of track reconstruction efficiency, purity, momentum resolution, and processing time measured with data from a simulated LHC-like tracking detector.
DOI: 10.2172/1343954
2007
Fragmentation of Jets Produced in Proton-Antiproton Collisions at $\sqrt{s} = 1.96$ TeV
We present the first measurement of two-particle momentum correlations in jets produced in $p\bar{p}$ collisions at center of mass energy of 1.96 TeV. A comparison of the experimental data to theoretical predictions obtained for partons within the framework of resummed perturbative QCD (Next-to-Leading Log Approximation) shows that the predicted parton momentum correlations survive the hadronization stage of jet fragmentation and are present at the hadron level. We also present the measurement of the intrinsic transverse momenta of particles with respect to jet axis ($k_T$). Experimental data is compared to the theoretical predictions obtained for partons within the framework of Modified Leading Log Approximation and Next-to-Modified Leading Log Approximation, and shows good agreement in the range of validity of the theoretical predictions. The results of both measurements indicate that the perturbative stage of the jet formation must be dominant and give further support to the hypothesis of Local Parton-Hadron Duality.
DOI: 10.2172/1592124
2019
Accelerated Machine Learning as a Service for Particle Physics Computing
• Amount and complexity of high-energy physics data increases dramatically from 2025 onward
• Traditional algorithms will require too much CPU time
• Machine learning can solve combinatorially-scaling problems in constant time, but must be fast enough
DOI: 10.2172/1630707
2019
hls4ml: Deploying Deep Learning on FPGAs for L1 trigger and Data Acquisition
neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC, and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600--700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
2019
FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing
DOI: 10.5281/zenodo.3895029
2019
Accelerated Machine Learning as a Service for Particle Physics Computing
DOI: 10.5281/zenodo.3598989
2019
hls4ml: Deploying Deep Learning on FPGAs for L1 trigger and Data Acquisition [PowerPoint]
2018
A High-performance Track Fitter for Use in Ultra-fast Electronics
2007
Jet Fragmentation at CDF
DOI: 10.1063/1.2220247
2006
Two-particle Momentum Correlation in Jets at the Tevatron
Presented are the measurements of two‐particle momentum correlations in jets produced in p‐pbar collisions at center of mass frame energy 1.96 TeV. Studies were performed for charged particles within a restricted opening angle of 0.5 rad around the jet axis and for dijet events with various dijet masses. Comparison of the experimental results to the theoretical predictions obtained for partons within the framework of the resummed perturbative QCD (Next‐to‐Leading Log Approximation) shows that the parton momentum correlations do survive the hadronization stage of jet fragmentation, thus, giving further support to the hypothesis of Local Parton‐Hadron Duality.
2006
Two-particle momentum correlations in jets at the Tevatron
2006
Two-particle momentum correlation in jets at the Tevatron
2006
New results on jet fragmentation at CDF
2005
Two-particle momentum correlations in jets at Tevatron
2021
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.
2021
Design a detector for a Muon Collider experiment
2021
Review of opportunities for new long-lived particle triggers in Run 3 of the Large Hadron Collider
DOI: 10.48550/arxiv.2110.14675
2021
Review of opportunities for new long-lived particle triggers in Run 3 of the Large Hadron Collider
Long-lived particles (LLPs) are highly motivated signals of physics Beyond the Standard Model (BSM) with great discovery potential and unique experimental challenges. The LLP search programme made great advances during Run 2 of the Large Hadron Collider (LHC), but many important regions of signal space remain unexplored. Dedicated triggers are crucial to improve the potential of LLP searches, and their development and expansion is necessary for the full exploitation of the new data. The public discussion of triggers has therefore been a relevant theme in the recent LLP literature, in the meetings of the LLP@LHC Community workshop and in the respective experiments. This paper documents the ideas collected during talks and discussions at these Workshops, benefiting as well from the ideas under development by the trigger community within the experimental collaborations. We summarise the theoretical motivations of various LLP scenarios leading to highly elusive signals, reviewing concrete ideas for triggers that could greatly extend the reach of the LHC experiments. We thus expect this document to encourage further thinking for both the phenomenological and experimental communities, as a stepping stone to further develop the LLP@LHC physics programme.
2021
DIMUS: Proposal for a Dimuonium Spectroscopy Collider on Fermilab site.
DOI: 10.48550/arxiv.2110.13041
2021
Applications and Techniques for Fast Machine Learning in Science
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.