
Ka Hei Martin Kwok

Here are all the papers by Ka Hei Martin Kwok that you can download and read on OA.mg.

DOI: 10.48550/arxiv.2401.14221
Application of performance portability solutions for GPUs and many-core CPUs to track reconstruction kernels
Next generation High-Energy Physics (HEP) experiments are presented with significant computational challenges, both in terms of data volume and processing power. Using compute accelerators, such as GPUs, is one of the promising ways to provide the necessary computational power to meet the challenge. The current programming models for compute accelerators often involve using architecture-specific programming languages promoted by the hardware vendors and hence limit the set of platforms that the code can run on. Developing software with platform restrictions is especially unfeasible for HEP communities as it takes significant effort to convert typical HEP algorithms into ones that are efficient for compute accelerators. Multiple performance portability solutions have recently emerged and provide an alternative path for using compute accelerators, which allow the code to be executed on hardware from different vendors. We apply several portability solutions, such as Kokkos, SYCL, C++17 std::execution::par and Alpaka, on two mini-apps extracted from the mkFit project: p2z and p2r. These apps include basic kernels for a Kalman filter track fit, such as propagation and update of track parameters, for detectors at a fixed z or fixed r position, respectively. The two mini-apps explore different memory layout formats. We report on the development experience with different portability solutions, as well as their performance on GPUs and many-core CPUs, measured as the throughput of the kernels from different GPU and CPU vendors such as NVIDIA, AMD and Intel.
DOI: 10.1051/epjconf/202429511003
Application of performance portability solutions for GPUs and many-core CPUs to track reconstruction kernels
Next generation High-Energy Physics (HEP) experiments are presented with significant computational challenges, both in terms of data volume and processing power. Using compute accelerators, such as GPUs, is one of the promising ways to provide the necessary computational power to meet the challenge. The current programming models for compute accelerators often involve using architecture-specific programming languages promoted by the hardware vendors and hence limit the set of platforms that the code can run on. Developing software with platform restrictions is especially unfeasible for HEP communities as it takes significant effort to convert typical HEP algorithms into ones that are efficient for compute accelerators. Multiple performance portability solutions have recently emerged and provide an alternative path for using compute accelerators, which allow the code to be executed on hardware from different vendors. We apply several portability solutions, such as Kokkos, SYCL, C++17 std::execution::par, Alpaka, and OpenMP/OpenACC, on two mini-apps extracted from the mkFit project: p2z and p2r. These apps include basic kernels for a Kalman filter track fit, such as propagation and update of track parameters, for detectors at a fixed z or fixed r position, respectively. The two mini-apps explore different memory layout formats. We report on the development experience with different portability solutions, as well as their performance on GPUs and many-core CPUs, measured as the throughput of the kernels from different GPU and CPU vendors such as NVIDIA, AMD and Intel.
DOI: 10.1016/j.nima.2017.03.065
Cited 11 times
On the timing performance of thin planar silicon sensors
We report on the signal timing capabilities of thin silicon sensors when traversed by multiple simultaneous minimum ionizing particles (MIP). Three different planar sensors, with depletion thicknesses 133, 211, and 285 µm, have been exposed to high energy muons and electrons at CERN. We describe signal shape and timing resolution measurements as well as the response of these devices as a function of the multiplicity of MIPs. We compare these measurements to simulations where possible. We achieve better than 20 ps timing resolution for signals larger than a few tens of MIPs.
DOI: 10.48550/arxiv.2306.15869
Evaluating Portable Parallelization Strategies for Heterogeneous Architectures in High Energy Physics
High-energy physics (HEP) experiments have developed millions of lines of code over decades that are optimized to run on traditional x86 CPU systems. However, we are seeing a rapidly increasing fraction of floating point computing power in leadership-class computing facilities and traditional data centers coming from new accelerator architectures, such as GPUs. HEP experiments are now faced with the untenable prospect of rewriting millions of lines of x86 CPU code, for the increasingly dominant architectures found in these computational accelerators. This task is made more challenging by the architecture-specific languages and APIs promoted by manufacturers such as NVIDIA, Intel and AMD. Producing multiple, architecture-specific implementations is not a viable scenario, given the available person power and code maintenance issues. The Portable Parallelization Strategies team of the HEP Center for Computational Excellence is investigating the use of Kokkos, SYCL, OpenMP, std::execution::parallel and alpaka as potential portability solutions that promise to execute on multiple architectures from the same source code, using representative use cases from major HEP experiments, including the DUNE experiment of the Long Baseline Neutrino Facility, and the ATLAS and CMS experiments of the Large Hadron Collider. This cross-cutting evaluation of portability solutions using real applications will help inform and guide the HEP community when choosing their software and hardware suites for the next generation of experimental frameworks. We present the outcomes of our studies, including performance metrics, porting challenges, API evaluations, and build system integration.
DOI: 10.1002/jmrs.323
Reviewer Acknowledgement
Inclusive Search for a Lorentz-Boosted Higgs Boson Decaying into Bottom Quarks