S. Jindariani papers and PDFs

DOI: 10.1088/1748-0221/13/07/p07027

2018

Cited 269 times

Fast inference of deep neural networks in FPGAs for particle physics

Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA hardware has only just begun. FPGA-based trigger and data acquisition (DAQ) systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. We develop a package based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.

DOI: 10.1088/2632-2153/aba042

2020

Cited 60 times

Compressing deep neural networks on FPGAs to binary and ternary precision with hls4ml

We present the implementation of binary and ternary neural networks in the hls4ml library, designed to automatically convert deep neural network models to digital circuits with field-programmable gate array (FPGA) firmware. Starting from benchmark models trained with floating point precision, we investigate different strategies to reduce the network's resource consumption by reducing the numerical precision of the network parameters to binary or ternary. We discuss the trade-off between model accuracy and resource consumption. In addition, we show how to balance between latency and accuracy by retaining full precision on a selected subset of network components. As an example, we consider two multiclass classification tasks: handwritten digit recognition with the MNIST data set and jet identification with simulated proton-proton collisions at the CERN Large Hadron Collider. The binary and ternary implementation has similar performance to the higher precision implementation while using drastically fewer FPGA resources.
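To make the idea of reducing parameters to binary or ternary precision concrete, the following is a minimal sketch in plain Python. It is not the hls4ml implementation: the function names and the ternary threshold of 0.5 are illustrative assumptions, and the papers' training strategies are not reproduced here.

```python
def binarize(weights):
    """Map each weight to +1 or -1 by sign (zero is mapped to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def ternarize(weights, threshold=0.5):
    """Map each weight to -1, 0, or +1; small magnitudes become 0.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


# Hypothetical floating-point weights of a trained layer.
weights = [0.8, -0.1, 0.3, -1.2, 0.0]

binary = binarize(weights)    # one bit per weight
ternary = ternarize(weights)  # two bits per weight, with explicit zeros

print(binary)   # [1, -1, 1, -1, 1]
print(ternary)  # [1, 0, 0, -1, 0]
```

With weights restricted to these values, a multiplication reduces to a sign flip or a skip, which is one intuition for why such networks can use far fewer FPGA resources.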

DOI: 10.1088/2632-2153/ac0ea1

2021

Cited 53 times

Fast convolutional neural networks on FPGAs with hls4ml

We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.
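As a rough illustration of pruning, the sketch below zeroes out the smallest-magnitude weights of a layer. Magnitude-based pruning is assumed here as a common, simple variant; the abstract does not specify the exact pruning schedule, and the function name and the 50% sparsity are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Hypothetical weights of one layer.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, -0.03, 0.6]
pruned = prune_by_magnitude(weights, sparsity=0.5)

print(pruned)  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0, 0.0, 0.6]
```

Each zeroed weight is a multiplication that never has to be instantiated in firmware, which is how pruning translates into lower resource utilization.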

DOI: 10.3389/fdata.2020.598927

2021

Cited 41 times

Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics

Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1 $\mu\mathrm{s}$ on an FPGA. To do so, we consider a representative task associated with particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the hls4ml library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage.

DOI: 10.3389/fdata.2022.787421

2022

Cited 24 times

Applications and Techniques for Fast Machine Learning in Science

In this community review report, we discuss applications and techniques for fast machine learning (ML) in science: the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.

DOI: 10.1007/s41781-019-0027-2

2019

Cited 43 times

FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing

Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) ms with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600–700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
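The quoted latencies and speedup factors can be cross-checked with a short calculation. The sketch below only multiplies the numbers given in the abstract to recover the implied CPU baseline; the variable names are illustrative, and since the speedups are quoted as approximate the result is an order-of-magnitude check, not a measurement.

```python
# Figures quoted in the abstract: mean inference time and speedup over CPU.
cloud_ms, cloud_speedup = 60, 30    # Brainwave as a cloud service
edge_ms, edge_speedup = 10, 175     # Brainwave as an edge / on-premises service

# Implied CPU baseline latency = accelerated latency * speedup factor.
cpu_from_cloud_ms = cloud_ms * cloud_speedup
cpu_from_edge_ms = edge_ms * edge_speedup

print(cpu_from_cloud_ms)  # 1800
print(cpu_from_edge_ms)   # 1750
```

Both estimates point to a CPU inference time of roughly 1.8 s per image, so the two quoted speedup factors are mutually consistent.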

DOI: 10.1088/1748-0221/15/05/p05026

2020

Cited 41 times

Fast inference of Boosted Decision Trees in FPGAs for particle physics

We describe the implementation of Boosted Decision Trees in the hls4ml library, which allows the translation of a trained model into FPGA firmware through an automated conversion process. Thanks to its fully on-chip implementation, hls4ml performs inference of Boosted Decision Tree models with extremely low latency. With a typical latency less than 100 ns, this solution is suitable for FPGA-based real-time processing, such as in the Level-1 Trigger system of a collider experiment. These developments open up prospects for physicists to deploy BDTs in FPGAs for identifying the origin of jets, better reconstructing the energies of muons, and enabling better selection of rare signal processes.

DOI: 10.1016/j.nima.2013.07.015

2013

Cited 25 times

Operational experience, improvements, and performance of the CDF Run II silicon vertex detector

The Collider Detector at Fermilab (CDF) pursues a broad physics program at Fermilab's Tevatron collider. Between Run II commissioning in early 2001 and the end of operations in September 2011, the Tevatron delivered 12 fb$^{-1}$ of integrated luminosity of $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV. Many physics analyses undertaken by CDF require heavy flavor tagging with large charged particle tracking acceptance. To realize these goals, in 2001 CDF installed eight layers of silicon microstrip detectors around its interaction region. These detectors were designed for 2-5 years of operation, radiation doses up to 2 Mrad (0.02 MGy), and were expected to be replaced in 2004. The sensors were not replaced, and the Tevatron run was extended for several years beyond its design, exposing the sensors and electronics to much higher radiation doses than anticipated. In this paper we describe the operational challenges encountered over the past 10 years of running the CDF silicon detectors, the preventive measures undertaken, and the improvements made along the way to ensure their optimal performance for collecting high quality physics data. In addition, we describe the quantities and methods used to monitor radiation damage in the sensors for optimal performance and summarize the detector performance quantities important to CDF's physics program, including vertex resolution, heavy flavor tagging, and silicon vertex trigger performance.

DOI: 10.48550/arxiv.2012.01563

2020

Cited 15 times

Accelerated Charged Particle Tracking with Graph Neural Networks on FPGAs

We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. The two complementary FPGA designs are based on OpenCL, a framework for writing programs that execute across heterogeneous platforms, and hls4ml, a high-level-synthesis-based compiler for neural network to firmware conversion. We evaluate and compare the resource usage, latency, and tracking performance of our implementations based on a benchmark dataset. We find a considerable speedup over CPU-based execution is possible, potentially enabling such algorithms to be used effectively in future computing workflows and the FPGA-based Level-1 trigger at the CERN Large Hadron Collider.

2024

ACE Science Workshop Report

We summarize the Fermilab Accelerator Complex Evolution (ACE) Science Workshop, held on June 14-15, 2023. The workshop presented the strategy for the ACE program in two phases: ACE Main Injector Ramp and Target (MIRT) upgrade and ACE Booster Replacement (BR) upgrade. Four plenary sessions covered the primary experimental physics thrusts: Muon Collider, Neutrinos, Charged Lepton Flavor Violation, and Dark Sectors. Additional physics and technology ideas were presented from the community that could expand or augment the ACE science program. Given the physics framing, a parallel session at the workshop was dedicated to discussing priorities for accelerator R&D. Finally, physics discussion sessions concluded the workshop where experts from the different experimental physics thrusts were brought together to begin understanding the synergies between the different physics drivers and technologies. In December of 2023, the P5 report was released setting the physics priorities for the field in the next decade and beyond, and identified ACE as an important component of the future US accelerator-based program. Given the presentations and discussions at the ACE Science Workshop and the findings of the P5 report, we lay out the topics for study to determine the physics priorities and design goals of the Fermilab ACE project in the near-term.

DOI: 10.2172/1863003

2022

Cited 5 times

Higgs-Energy LEptoN (HELEN) Collider based on advanced superconducting radio frequency technology

This Snowmass 2021 contributed paper discusses a Higgs-Energy LEptoN (HELEN) e⁺e⁻ linear collider based on advanced superconducting radio frequency technology. The proposed collider offers cost and AC power savings, smaller footprint (relative to the ILC), and could be built at Fermilab with an Interaction Region within the site boundaries. After the initial physics run at 250 GeV, the collider could be upgraded either to higher luminosity or to higher (up to 500 GeV) energies. If the ILC could not be realized in Japan in a timely fashion, the HELEN collider would be a viable option to build a Higgs factory in the U.S.

DOI: 10.3390/jlpea8030025

2018

Cited 12 times

Multi-Vdd Design for Content Addressable Memories (CAM): A Power-Delay Optimization Analysis

In this paper, we characterize the interplay between power consumption and performance of a matchline-based Content Addressable Memory and then propose the use of a multi-Vdd design to save power and increase post-fabrication tunability. Exploration of the power consumption behavior of a CAM chip shows the drastically different behavior among the components and suggests the use of different and independent power supplies. The complete design, simulation and testing of a multi-Vdd CAM chip along with an exploration of the multi-Vdd design space are presented. Our analysis has been applied to simulated models on two different technology nodes (130 nm and 45 nm), followed by experiments on a 246-kb test chip fabricated in 130 nm Global Foundries Low Power CMOS technology. The proposed design, operating at an optimal operating point in a triple-Vdd configuration, increases the power-delay operation range by 2.4 times and consumes 25.3% less dynamic power when compared to a conventional single-Vdd design operating over the same voltage range with equivalent noise margin. Our multi-Vdd design also helps save 51.3% standby power. Measurement results from the test chip combined with the simulation analysis at the two nodes validate our thesis.

DOI: 10.1088/1748-0221/10/02/c02029

2015

Cited 9 times

Design and testing of the first 2D Prototype Vertically Integrated Pattern Recognition Associative Memory

An associative memory-based track finding approach has been proposed for a Level 1 tracking trigger to cope with increasing luminosities at the LHC. The associative memory uses a massively parallel architecture to tackle the intrinsically complex combinatorics of track finding algorithms, thus avoiding the typical power law dependence of execution time on occupancy and solving the pattern recognition in times roughly proportional to the number of hits. This is of crucial importance given the large occupancies typical of hadronic collisions. The design of an associative memory system capable of dealing with the complexity of HL-LHC collisions and with the short latency required by Level 1 triggering poses significant, as yet unsolved, technical challenges. For this reason, an aggressive R&D program has been launched at Fermilab to advance state-of-the-art associative memory technology, the so-called VIPRAM (Vertically Integrated Pattern Recognition Associative Memory) project. The VIPRAM leverages emerging 3D vertical integration technology to build faster and denser Associative Memory devices. The first step is to implement in conventional VLSI the associative memory building blocks that can be used in 3D stacking; in other words, the building blocks are laid out as if they were part of a 3D design. In this paper, we report on the first successful implementation of a 2D VIPRAM demonstrator chip (protoVIPRAM00). The results show that these building blocks are ready for 3D stacking.
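The claim that matching time scales with the number of hits rather than with hit combinations can be illustrated with a small software analogy. This is only a sketch under stated assumptions, not the VIPRAM design: the pattern format (one address per detector layer), the function names, and the example data are all hypothetical, and a dictionary lookup stands in for the hardware comparison that happens in all pattern cells at once.

```python
from collections import defaultdict


def build_cam(patterns):
    """Index patterns by (layer, address) so one hit reaches all its matches at once."""
    cam = defaultdict(list)
    for pid, pattern in enumerate(patterns):
        for layer, address in enumerate(pattern):
            cam[(layer, address)].append(pid)
    return cam


def find_roads(patterns, hits):
    """Return ids of patterns matched on every layer, using one pass over the hits."""
    cam = build_cam(patterns)
    n_layers = len(patterns[0])
    matched = defaultdict(set)  # pattern id -> set of layers matched so far
    for layer, address in hits:  # each hit is presented exactly once
        for pid in cam.get((layer, address), []):
            matched[pid].add(layer)
    return sorted(pid for pid, layers in matched.items() if len(layers) == n_layers)


# Hypothetical four-layer patterns: one coarse hit address per layer.
patterns = [(3, 7, 1, 4), (3, 8, 2, 4), (5, 7, 1, 9)]
# Hypothetical hits as (layer, address) pairs.
hits = [(0, 3), (1, 7), (2, 1), (3, 4), (1, 8), (0, 5)]

roads = find_roads(patterns, hits)
print(roads)  # [0]
```

No combinations of hits are ever enumerated: each hit is broadcast once, and a pattern fires when all of its layers have been matched.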

DOI: 10.1007/s41781-021-00067-x

2021

Cited 7 times

Full Detector Simulation with Unprecedented Background Occupancy at a Muon Collider

In recent years, a muon collider has attracted a lot of interest in the high-energy physics community, thanks to its ability to achieve clean interaction signatures at multi-TeV collision energies in the most cost-effective way. Estimation of the physics potential of such an experiment must take into account the impact of beam-induced background on the detector performance, which has to be carefully evaluated using full detector simulation. Tracing of all the background particles entering the detector region in a single bunch crossing is out of reach for any realistic computing facility due to the unprecedented number of such particles. To make it feasible, a number of optimisations have been applied to the detector simulation workflow. This contribution presents an overview of the main characteristics of the beam-induced background at a muon collider, the detector technologies considered for the experiment and how they are taken into account to strongly reduce the number of irrelevant computations performed during the detector simulation. Special attention is dedicated to the optimisation of track reconstruction with the conformal tracking algorithm in this high-occupancy environment, which is the most computationally demanding part of event reconstruction.

DOI: 10.48550/arxiv.2203.07261

2022

Cited 4 times

The physics case of a 3 TeV muon collider stage

In the path towards a muon collider with center of mass energy of 10 TeV or more, a stage at 3 TeV emerges as an appealing option. Reviewing the physics potential of such a muon collider is the main purpose of this document. In order to outline the progression of the physics performance across the stages, a few sensitivity projections for higher energy are also presented. There are many opportunities for probing new physics at a 3 TeV muon collider. Some of them are in common with the extensively documented physics case of the CLIC 3 TeV energy stage, and include measuring the Higgs trilinear coupling and testing the possible composite nature of the Higgs boson and of the top quark at the 20 TeV scale. Other opportunities are unique to a 3 TeV muon collider, and stem from the fact that muons are collided rather than electrons. This is exemplified by studying the potential to explore the microscopic origin of the current $g$-2 and $B$-physics anomalies, which are both related to muons.

DOI: 10.1088/1748-0221/18/08/t08007

2023

DIMUS: super-compact Dimuonium Spectroscopy collider at Fermilab

While dimuonium (μ⁺μ⁻), the “smallest QED atom”, has not yet been observed, it is of utmost fundamental interest. By virtue of the larger mass, dimuonium has greater sensitivity to beyond the standard model (BSM) effects than its cousins positronium or muonium, both discovered long ago, while not suffering from large QCD uncertainties. Dimuonium atoms can be created in e⁺e⁻ collisions with large longitudinal momentum, allowing them to decay a small distance away from the beam crossing point and avoid prompt backgrounds. We envision a unique cost-effective and fast-timeline opportunity for copious production of (μ⁺μ⁻) atoms at the production threshold via a modest modification of the existing Fermilab Accelerator Science and Technology (FAST) facility to arrange collisions of 408 MeV electrons and positrons at a 75° angle. This compact 23 m circumference collider (DIMUS) will allow for precision tests of QED and open the door for searches for new physics coupled to the muon. The FAST facility is perfectly suited for DIMUS as there are existing SRF accelerators and infrastructure, capable of producing high energy, high current electron and positron beams, sufficient for O(10³²) cm⁻² s⁻¹ luminosity and ∼0.5 million dimuons per year. The expansion will require installation of a second SRF cryomodule, positron production and accumulation system, fast injection/extraction kickers and two small circumference intersecting rings. An approximately meter-sized detector with several layers of modern pixelated silicon detector and crystal-based electromagnetic calorimeters will ensure observation of the decays of dimuonium to electron-positron pairs in the presence of the Bhabha scattering background. An expansion of the system to include a solenoidal magnet outside of the calorimeter system, a layer of steel shielding behind the magnet, and a set of dedicated muon detectors would extend the physics program of DIMUS to include precision studies of rare processes with muons, pions, and η mesons produced in e⁺e⁻ collisions.

DOI: 10.1103/physrevd.108.093009

2023

Anomalous production of massive gauge boson pairs at muon colliders

The prospects of searches for anomalous production of hadronically decaying weak boson pairs at proposed high-energy muon colliders are reported. Muon-muon collision events are simulated at $\sqrt{s}=6$, 10, and 30 TeV, corresponding to an integrated luminosity of 4, 10, and 10 ab$^{-1}$, respectively. Simulated $\mu\mu\rightarrow\mathrm{W}\mathrm{W}+\nu\nu/\mu\mu$ events are used to set expected constraints on the structure of quartic vector boson interactions in the framework of a dimension-8 effective field theory. Similarly, $\mu\mu\rightarrow\mathrm{W}\mathrm{W}/\mathrm{Z}\mathrm{Z}+\nu\nu$ events are used to report constraints on the product of the cross section and branching fraction for vector boson fusion production of a heavy neutral Higgs boson decaying to weak boson pairs. These results are interpreted in the context of the Georgi-Machacek model.

DOI: 10.2172/1881962

2022

Cited 3 times

Promising Technologies and R&D Directions for the Future Muon Collider Detectors

Among the post-LHC generation of particle accelerators, the muon collider represents a unique machine with capability to provide very high energy leptonic collisions and to open the path to a vast and mostly unexplored physics programme. However, on the experimental side, such great physics potential is accompanied by unprecedented technological challenges, due to the fact that muons are unstable particles. Their decay products interact with the machine elements and produce an intense flux of background particles that eventually reach the detector and may degrade its performance. In this paper, we present technologies that have a potential to match the challenging specifications of a muon collider detector and outline a path forward for the future R&D efforts.

DOI: 10.1109/iccd.2015.7357156

2015

Cited 5 times

A methodology for power characterization of associative memories

Content Addressable Memories (CAM) have become increasingly important in applications requiring high speed memory search due to their inherent massively parallel processing architecture. We present a complete power analysis methodology for CAM systems to aid the exploration of their power-performance trade-offs in future systems. Our proposed methodology uses detailed transistor level circuit simulation of power behavior and a handful of input data types to simulate full chip power consumption. Furthermore, we applied our power analysis methodology to a custom designed associative memory test chip. This chip was developed by Fermilab for the purpose of developing high performance real-time pattern recognition on high volume data produced by a future large-scale scientific experiment. We applied our methodology to configure a power model for this test chip. Our model is capable of predicting the total average power within 4% of actual power measurements. Our power analysis methodology can be generalized and applied to other CAM-like memory systems to accurately characterize their power behavior.

DOI: 10.1016/j.nima.2019.05.018

2019

Cited 4 times

A high-performance track fitter for use in ultra-fast electronics

This article describes a new charged-particle track fitting algorithm designed for use in high-speed electronics applications such as hardware-based triggers in high-energy physics experiments. Following a novel technique designed for fast electronics, the positions of the hits on the detector are transformed before being passed to a linearized track parameter fit. This transformation results in fitted track parameters with a very linear dependence on the hit positions. The approach is demonstrated in a representative detector geometry based on the CMS detector at the Large Hadron Collider. The fit is implemented in FPGA chips and optimized for track fitting throughput and obtains excellent track parameter performance. Such an algorithm is potentially useful in any high-speed track-fitting application.

DOI: 10.1145/3289602.3293986

2019

Cited 4 times

Fast Inference of Deep Neural Networks for Real-time Particle Physics Applications

Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of such techniques in low-latency, low-power FPGA (Field Programmable Gate Array) hardware has only just begun. FPGA-based trigger and data acquisition systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable many new physics measurements. While we focus on a specific example, the lessons are far-reaching. A compiler package is developed based on High-Level Synthesis (HLS) called HLS4ML to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to allow for directed resource tuning in the low latency environment and assess the impact on our benchmark physics performance scenario. For our example jet substructure model, we fit well within the available resources of modern FPGAs with latency on the scale of 100 ns.

DOI: 10.1109/mwscas.2017.8052945

2017

Cited 3 times

A content addressable memory with multi-Vdd scheme for low power tunable operation

This paper reports on a content addressable memory (CAM) employing a multi-Vdd scheme for low power pattern recognition applications. The complete design, simulation and testing of the chip is presented along with an exploration of the multi-Vdd design space. The proposed design, operating at an optimal operating point in a triple-Vdd configuration, increases the delay range by 2.4 times and consumes 25.3% less power when compared to a conventional single-Vdd design operating over the same voltage range. Measurement results from a 246 kb test chip fabricated in 130 nm Global Foundries Low Power CMOS technology are presented to validate the model and analysis.

2021

Cited 3 times

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.

DOI: 10.2172/1975518

2023

Options for DIMUS: Di-Muon-Spectroscopy Collider

is a precision measurement of electroweak mixing angle, $\sin^2\theta_W$, which can be achieved to the precision equivalent to $\delta M_W \approx 30$ MeV.

2021

arXiv : Review of opportunities for new long-lived particle triggers in Run 3 of the Large Hadron Collider

Long-lived particles (LLPs) are highly motivated signals of physics Beyond the Standard Model (BSM) with great discovery potential and unique experimental challenges. The LLP search programme made great advances during Run 2 of the Large Hadron Collider (LHC), but many important regions of signal space remain unexplored. Dedicated triggers are crucial to improve the potential of LLP searches, and their development and expansion is necessary for the full exploitation of the new data. The public discussion of triggers has therefore been a relevant theme in the recent LLP literature, in the meetings of the LLP@LHC Community workshop and in the respective experiments. This paper documents the ideas collected during talks and discussions at these Workshops, benefiting as well from the ideas under development by the trigger community within the experimental collaborations. We summarise the theoretical motivations of various LLP scenarios leading to highly elusive signals, reviewing concrete ideas for triggers that could greatly extend the reach of the LHC experiments. We thus expect this document to encourage further thinking for both the phenomenological and experimental communities, as a stepping stone to further develop the LLP@LHC physics programme.

DOI: 10.48550/arxiv.2203.08135

2022

Anomalous production of massive gauge boson pairs at muon colliders

The prospects of searches for anomalous production of hadronically decaying weak boson pairs at proposed high-energy muon colliders are reported. Muon-muon collision events are simulated at $\sqrt{s}=6$, 10, and 30 TeV, corresponding to an integrated luminosity of $4$, $10$, and $10$ ab$^{-1}$, respectively. Simulated $\mu\mu\rightarrow\mathrm{W}\mathrm{W}+\nu\nu/\mu\mu$ events are used to set expected constraints on the structure of quartic vector boson interactions in the framework of a dimension-8 effective field theory. Similarly, $\mu\mu\rightarrow\mathrm{W}\mathrm{W}/\mathrm{Z}\mathrm{Z}+\nu\nu$ events are used to report constraints on the product of the cross section and branching fraction for vector boson fusion production of a heavy neutral Higgs boson decaying to weak boson pairs. These results are interpreted in the context of the Georgi-Machacek model.

DOI: 10.1103/physrevd.108.093009

2022

Anomalous production of massive gauge boson pairs at muon colliders

Prospects for searches of anomalous quartic gauge couplings at a future high-energy muon collider using the production of $\mathrm{WW}$ boson pairs are reported. Muon-muon collision events are simulated at $\sqrt{s}=6$ TeV corresponding to an integrated luminosity of $4$ ab$^{-1}$. The simulated events are used to study the $\mathrm{W}\mathrm{W}\nu\nu$ and $\mathrm{W}\mathrm{W}\mu\mu$ final states with the $\mathrm{W}$ bosons decaying hadronically. The events are analyzed to report expected constraints on the structure of quartic vector boson interactions in the framework of dimension-8 effective field theory operators.

DOI: 10.1590/s0103-97332007000500040

2007

New results on jet fragmentation at CDF

Presented are the latest results of jet fragmentation studies at the Tevatron using the CDF Run II detector. Studies include the distribution of transverse momenta ($k_T$) of particles in jets, two-particle momentum correlations, and indirectly global event shapes in $p\bar{p}$ collisions. Results are discussed within the context of recent Next-to-Leading Log calculations as well as earlier experimental results from the Tevatron and $e^+e^-$ colliders.

DOI: 10.48550/arxiv.1709.08303

2017

Performance Study of the First 2D Prototype of Vertically Integrated Pattern Recognition Associative Memory (VIPRAM)

Extremely fast pattern recognition capabilities are necessary to find and fit billions of tracks at the hardware trigger level produced every second anticipated at high luminosity LHC (HL-LHC) running conditions. Associative Memory (AM) based approaches for fast pattern recognition have been proposed as a potential solution to the tracking trigger. However, at the HL-LHC, there is much less time available and speed performance must be improved over previous systems while maintaining a comparable number of patterns. The Vertically Integrated Pattern Recognition Associative Memory (VIPRAM) Project aims to achieve the target pattern density and performance goal using 3DIC technology. The first step taken in the VIPRAM work was the development of a 2D prototype (protoVIPRAM00) in which the associative memory building blocks were designed to be compatible with the 3D integration. In this paper, we present the results from extensive performance studies of the protoVIPRAM00 chip in both realistic HL-LHC and extreme conditions. Results indicate that the chip operates at the design frequency of 100 MHz with perfect correctness in realistic conditions and conclude that the building blocks are ready for 3D stacking. We also present performance boundary characterization of the chip under extreme conditions.

DOI: 10.2172/1570210

2019

FPGAs as a Service to Accelerate Machine Learning Inference [PowerPoint]

Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC, and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600--700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
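
As a rough consistency check on the figures quoted above, the two latency and speed-up pairs should imply about the same CPU baseline. The sketch below multiplies them out; the dictionary keys and function name are illustrative, and the resulting CPU latency is inferred here from the abstract's numbers, not quoted by the authors.

```python
# Latency (ms) and approximate speed-up factor quoted in the abstract,
# keyed by deployment mode (key names are illustrative).
quoted = {
    "cloud": (60.0, 30.0),
    "edge": (10.0, 175.0),
}

def implied_cpu_latency_ms(fpga_latency_ms, speedup):
    """CPU latency implied by an accelerated latency and its speed-up factor."""
    return fpga_latency_ms * speedup

implied = {mode: implied_cpu_latency_ms(*pair) for mode, pair in quoted.items()}

for mode, latency_ms in implied.items():
    print(f"{mode}: implied CPU latency ~{latency_ms / 1000:.2f} s")
```

Both deployment modes point to a CPU baseline of roughly 1.8 s per inference, so the two quoted factors are mutually consistent.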

DOI: 10.1109/tns.2020.2968860

2020

Performance Study of the First 2-D Prototype of Vertically Integrated Pattern Recognition Associative Memory

Extremely fast pattern recognition capabilities are necessary to find and fit billions of tracks at the hardware trigger level produced every second anticipated at high-luminosity Large Hadron Collider (HL-LHC) running conditions. Associative memory (AM)-based approaches for fast pattern recognition have been proposed as a potential solution to the tracking trigger. However, at the HL-LHC, there is much less time available, and the speed performance must be improved over previous systems while maintaining a comparable number of patterns. The vertically integrated pattern recognition AM (VIPRAM) project aims to achieve the target pattern density and performance goal using 3DIC technology. The first step taken in the VIPRAM work was the development of a 2-D prototype (protoVIPRAM00) in which the AM building blocks were designed to be compatible with the 3-D integration. In this article, we present the results from extensive performance studies of the protoVIPRAM00 chip in both realistic HL-LHC and extreme conditions. Results indicate that the chip operates at the design frequency of 100 MHz with perfect correctness in realistic conditions and conclude that the building blocks are ready for 3-D stacking. We also present performance boundary characterization of the chip under extreme conditions.

DOI: 10.1016/j.nima.2014.01.017

2014

CDF Run II silicon vertex detector annealing study

Between Run II commissioning in early 2001 and the end of operations in September 2011, the Tevatron collider delivered 12 fb$^{-1}$ of $p\bar{p}$ collisions at $\sqrt{s}=1.96$ TeV to the Collider Detector at Fermilab (CDF). During that time, the CDF silicon vertex detector was subject to radiation doses of up to 12 Mrad. After the end of operations, the silicon detector was annealed for 24 days at 18 °C. In this paper, we present a measurement of the change in the bias currents for a subset of sensors during the annealing period. We also introduce a novel method for monitoring the depletion voltage throughout the annealing period. The observed bias current evolution can be characterized by a falling exponential term with time constant $\tau_I=17.88\pm0.36\,(\mathrm{stat.})\pm0.25\,(\mathrm{syst.})$ days. We observe an average decrease of $(27\pm3)\%$ in the depletion voltage, whose evolution can similarly be described by an exponential time constant of $\tau_V=6.21\pm0.21$ days. These results are consistent with the Hamburg model within the measurement uncertainties.
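
To give a feel for the quoted time constants, the sketch below evaluates what fraction of a purely exponential term remains after the 24-day annealing period. It assumes a single term of the form exp(-t/τ) with no constant offset, which is a simplification introduced here; the function and constant names are illustrative.

```python
import math

ANNEALING_DAYS = 24.0  # annealing duration quoted in the abstract
TAU_CURRENT = 17.88    # bias-current time constant, in days
TAU_VOLTAGE = 6.21     # depletion-voltage time constant, in days

def remaining_fraction(t_days, tau_days):
    """Fraction of a pure exponential term left after t_days (assumed form exp(-t/tau))."""
    return math.exp(-t_days / tau_days)

frac_current = remaining_fraction(ANNEALING_DAYS, TAU_CURRENT)
frac_voltage = remaining_fraction(ANNEALING_DAYS, TAU_VOLTAGE)

print(f"bias-current term remaining: {frac_current:.2f}")
print(f"depletion-voltage term remaining: {frac_voltage:.2f}")
```

Under this assumption roughly a quarter of the bias-current term and about 2% of the depletion-voltage term remain at the end of the period, which suggests the depletion voltage had essentially settled while the bias current was still evolving.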

DOI: 10.1016/j.nuclphysbps.2015.09.375

2016

Measurements of top quark properties in top pair production and decay at the LHC using the CMS detector

Measurements are presented of the properties of top quarks in pair production and decay from proton-proton collisions at the LHC. The data were collected at centre-of-mass energies of 7 and 8 TeV by the CMS experiment during the years 2011 and 2012. The top quark-antiquark charge asymmetry is measured using the difference of the absolute rapidities of the reconstructed top and anti-top kinematics, as well as from distributions of the top quark decay products. The measurements are performed in the decay channels of the $t\bar{t}$ pair into both one and two leptons in the final state. The polarization of top quarks and top pair spin correlations are measured from the angular distributions of top quark decay products. The W-boson helicity fractions and angular asymmetries are extracted and limits on anomalous contributions to the Wtb vertex are determined. The flavor content in top-quark pair events is measured using the fraction of top quarks decaying into a W-boson and a b-quark relative to all top quark decays, $R=\mathcal{B}(t\to Wb)/\mathcal{B}(t\to Wq)$, and the result is used to determine the CKM matrix element $V_{tb}$ as well as the width of the top quark resonance. All of the results are found to be in good agreement with standard model predictions.
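
For context, the link between the ratio $R$ and $V_{tb}$ can be written out as follows. This is the standard relation, stated here under the assumption of three quark generations and a unitary CKM matrix; it is not spelled out in the abstract itself.

```latex
R \;=\; \frac{\mathcal{B}(t\to Wb)}{\mathcal{B}(t\to Wq)}
  \;=\; \frac{|V_{tb}|^{2}}{|V_{tb}|^{2}+|V_{ts}|^{2}+|V_{td}|^{2}},
\qquad
|V_{tb}|^{2}+|V_{ts}|^{2}+|V_{td}|^{2}=1
\;\Longrightarrow\;
R=|V_{tb}|^{2}.
```

Under these assumptions a measured value of $R$ translates directly into $|V_{tb}|=\sqrt{R}$.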

2010

The CDF Silicon Detector: Performance and Longevity

DOI: 10.22323/1.102.0050

2010

High mass Higgs at Tevatron

2009

High mass Higgs at Tevatron

DOI: 10.1016/j.nuclphysbps.2007.11.108

2008

Soft QCD and the underlying event at CDF

Presented are the latest results of jet fragmentation studies at the Tevatron using the CDF Run II detector. Studies include indirectly global event shapes in $p\bar{p}$ collisions, the distribution of transverse momenta ($k_T$) of particles in jets, the underlying event studies and two-particle momentum correlations. Results are compared to parton shower Monte Carlos and recent NLLA calculations as well as earlier experimental results from the Tevatron and $e^+e^-$ colliders.

2009

Status and Operational Experience with the CDF Run II Silicon Detector

DOI: 10.48550/arxiv.2203.07224

2022

Promising Technologies and R&D Directions for the Future Muon Collider Detectors

Among the post-LHC generation of particle accelerators, the muon collider represents a unique machine with capability to provide very high energy leptonic collisions and to open the path to a vast and mostly unexplored physics programme. However, on the experimental side, such great physics potential is accompanied by unprecedented technological challenges, due to the fact that muons are unstable particles. Their decay products interact with the machine elements and produce an intense flux of background particles that eventually reach the detector and may degrade its performance. In this paper, we present technologies that have a potential to match the challenging specifications of a muon collider detector and outline a path forward for the future R&D efforts.

2022

Simulated Detector Performance at the Muon Collider

In this paper we report on the current status of studies on the expected performance for a detector designed to operate in a muon collider environment. Beam-induced backgrounds (BIB) represent the main challenge in the design of the detector and the event reconstruction algorithms. The current detector design aims to show that satisfactory performance can be achieved, while further optimizations are expected to significantly improve the overall performance. We present the characterization of the expected beam-induced background, describe the detector design and software used for detailed event simulations taking into account BIB effects. The expected performance of charged-particle reconstruction, jets, electrons, photons and muons is discussed, including an initial study on heavy-flavor jet tagging. A simple method to measure the delivered luminosity is also described. Overall, the proposed design and reconstruction algorithms can successfully reconstruct the high transverse-momentum objects needed to carry out a broad physics program.

DOI: 10.48550/arxiv.2203.07144

2022

DIMUS: Super-Compact Dimuonium Spectroscopy Collider at Fermilab

While dimuonium $(\mu^+\mu^-)$ has not yet been observed, it is of utmost fundamental interest. By virtue of the larger mass, dimuonium has greater sensitivity to beyond the standard model effects than its cousins positronium or muonium, both discovered long ago, while not suffering from large QCD uncertainties. Dimuonium atoms can be created in $e^+e^-$ collisions with large longitudinal momentum, allowing them to decay a small distance away from the beam crossing point and avoid prompt backgrounds. We envision a unique cost-effective and fast-timeline opportunity for copious production of $(\mu^+\mu^-)$ atoms at the production threshold via a modest modification of Fermilab's existing FAST/NML facility to arrange collisions of 408 MeV electrons and positrons at a 75$^{\circ}$ angle. This compact 23 m circumference collider (DIMUS) will allow for precision tests of QED and open the door for searches for new physics coupled to the muon. Fermilab's FAST/NML is perfectly suited for DIMUS as there are existing SRF accelerators and infrastructure, capable of producing high energy, high current electron and positron beams, sufficient for $O(10^{32})\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$ luminosity and $\sim$0.5 million dimuons per year. The expansion will require installation of a second SRF cryomodule, positron production and accumulation system, fast injection/extraction kickers and two small circumference intersecting rings. An approximately meter-sized detector with several layers of modern pixelated silicon detector and crystal-based electromagnetic calorimeters will ensure observation of the decays of dimuonium to electron-positron pairs in the presence of the Bhabha scattering background. An expansion of the system to would extend the physics program of DIMUS to include precision studies of rare processes with muons, pions, and $\eta$ mesons produced in $e^{+}e^{-}$ collisions.

2022

Muon Collider Physics Summary

The prospect of designing muon colliders with high energy and luminosity, which is being investigated by the International Muon Collider Collaboration, has triggered a growing interest in their physics reach. We present a concise summary of the muon colliders' potential to explore new physics, leveraging the unique possibility of combining high available energy with very precise measurements.

2022

The physics case of a 3 TeV muon collider stage

On the path towards a muon collider with center of mass energy of 10 TeV or more, a stage at 3 TeV emerges as an appealing option. Reviewing the physics potential of such a muon collider is the main purpose of this document. In order to outline the progression of the physics performance across the stages, a few sensitivity projections for higher energy are also presented. There are many opportunities for probing new physics at a 3 TeV muon collider. Some of them are in common with the extensively documented physics case of the CLIC 3 TeV energy stage, and include measuring the Higgs trilinear coupling and testing the possible composite nature of the Higgs boson and of the top quark at the 20 TeV scale. Other opportunities are unique to a 3 TeV muon collider, and stem from the fact that muons are collided rather than electrons. This is exemplified by studying the potential to explore the microscopic origin of the current $g$-2 and $B$-physics anomalies, which are both related to muons.

DOI: 10.2172/1884523

2022

U.S. National Accelerator R&D Program on Future Colliders

Future colliders are an essential component of a strategic vision for particle physics. Conceptual studies and technical developments for several exciting future collider options are underway internationally. In order to realize a future collider, a concerted accelerator R&D program is required. The U.S. HEP accelerator R&D program currently has no direct effort in collider-specific R&D area. This shortcoming greatly compromises the U.S. leadership role in accelerator and particle physics. In this white paper, we propose a new national accelerator R&D program on future colliders and outline the important characteristics of such a program.

DOI: 10.48550/arxiv.2203.13900

2022

4-Dimensional Trackers

4-dimensional (4D) trackers with ultra fast timing (10-30 ps) and very fine spatial resolution (O(few $\mu$m)) represent a new avenue in the development of silicon trackers, enabling new physics capabilities beyond the reach of the existing tracking detectors. This paper reviews the impact of integrating 4D tracking capabilities on several physics benchmarks both in potential upgrades of the HL-LHC experiments and in several detectors at future colliders, and summarizes the currently available sensor technologies as well as electronics, along with their limitations and directions for R&D.

DOI: 10.1088/1748-0221/17/12/p12002

2022

Charged particle tracking in real-time using a full-mesh data delivery architecture and associative memory techniques

We present a flexible and scalable approach to address the challenges of charged particle track reconstruction in real-time event filters (Level-1 triggers) in collider physics experiments. The method described here is based on a full-mesh architecture for data distribution and relies on the Associative Memory approach to implement a pattern recognition algorithm that quickly identifies and organizes hits associated with trajectories of particles originating from particle collisions. We describe a successful implementation of a demonstration system composed of several innovative hardware and algorithmic elements. The implementation of a full-size system relies on the assumption that an Associative Memory device with sufficient pattern density becomes available in the future, either through a dedicated ASIC or a modern FPGA. We demonstrate excellent performance in terms of track reconstruction efficiency, purity, momentum resolution, and processing time measured with data from a simulated LHC-like tracking detector.

DOI: 10.2172/1343954

2007

Fragmentation of Jets Produced in Proton-Antiproton Collisions at $\sqrt{s} = 1.96$ TeV

We present the first measurement of two-particle momentum correlations in jets produced in $p\bar{p}$ collisions at center of mass energy of 1.96 TeV. A comparison of the experimental data to theoretical predictions obtained for partons within the framework of resummed perturbative QCD (Next-to-Leading Log Approximation) shows that the predicted parton momentum correlations survive the hadronization stage of jet fragmentation and are present at the hadron level. We also present the measurement of the intrinsic transverse momenta of particles with respect to jet axis ($k_T$). Experimental data is compared to the theoretical predictions obtained for partons within the framework of Modified Leading Log Approximation and Next-to-Modified Leading Log Approximation, and shows good agreement in the range of validity of the theoretical predictions. The results of both measurements indicate that the perturbative stage of the jet formation must be dominant and give further support to the hypothesis of Local Parton-Hadron Duality.

DOI: 10.2172/1592124

2019

Accelerated Machine Learning as a Service for Particle Physics Computing

Accelerated Machine Learning as a Service for Particle Physics Computing:
• Amount and complexity of high-energy physics data increases dramatically from 2025 onward
• Traditional algorithms will require too much CPU time
• Machine learning can solve combinatorially-scaling problems in constant time, but must be fast enough

DOI: 10.2172/1630707

2019

hls4ml: Deploying Deep Learning on FPGAs for L1 trigger and Data Acquisition

neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC, and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600--700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.

2019

FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing

DOI: 10.5281/zenodo.3895029

2019

Accelerated Machine Learning as a Service for Particle Physics Computing

DOI: 10.5281/zenodo.3598989

2019

hls4ml: Deploying Deep Learning on FPGAs for L1 trigger and Data Acquisition [PowerPoint]

2018

A High-performance Track Fitter for Use in Ultra-fast Electronics

2007

Jet Fragmentation at CDF

DOI: 10.1063/1.2220247

2006

Two-particle Momentum Correlation in Jets at the Tevatron

Presented are the measurements of two-particle momentum correlations in jets produced in $p\bar{p}$ collisions at center of mass frame energy 1.96 TeV. Studies were performed for charged particles within a restricted opening angle of 0.5 rad around the jet axis and for dijet events with various dijet masses. Comparison of the experimental results to the theoretical predictions obtained for partons within the framework of the resummed perturbative QCD (Next-to-Leading Log Approximation) shows that the parton momentum correlations do survive the hadronization stage of jet fragmentation, thus giving further support to the hypothesis of Local Parton-Hadron Duality.

2006

Two-particle momentum correlations in jets at the Tevatron

2006

Two-particle momentum correlation in jets at the Tevatron

2006

New results on jet fragmentation at CDF

2005

Two-particle momentum correlations in jets at Tevatron

2021

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.

2021

Design a detector for a Muon Collider experiment

DOI: 10.48550/arxiv.2110.14675

2021

Review of opportunities for new long-lived particle triggers in Run 3 of the Large Hadron Collider

Long-lived particles (LLPs) are highly motivated signals of physics Beyond the Standard Model (BSM) with great discovery potential and unique experimental challenges. The LLP search programme made great advances during Run 2 of the Large Hadron Collider (LHC), but many important regions of signal space remain unexplored. Dedicated triggers are crucial to improve the potential of LLP searches, and their development and expansion is necessary for the full exploitation of the new data. The public discussion of triggers has therefore been a relevant theme in the recent LLP literature, in the meetings of the LLP@LHC Community workshop and in the respective experiments. This paper documents the ideas collected during talks and discussions at these Workshops, benefiting as well from the ideas under development by the trigger community within the experimental collaborations. We summarise the theoretical motivations of various LLP scenarios leading to highly elusive signals, reviewing concrete ideas for triggers that could greatly extend the reach of the LHC experiments. We thus expect this document to encourage further thinking for both the phenomenological and experimental communities, as a stepping stone to further develop the LLP@LHC physics programme.

2021

DIMUS: Proposal for a Dimuonium Spectroscopy Collider on Fermilab site.

DOI: 10.48550/arxiv.2110.13041

2021

Applications and Techniques for Fast Machine Learning in Science

In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.
