Beamline Steering Using Deep Learning Models
- URL: http://arxiv.org/abs/2408.13657v1
- Date: Sat, 24 Aug 2024 19:16:10 GMT
- Title: Beamline Steering Using Deep Learning Models
- Authors: Dexter Allen, Isaac Kante, Dorian Bohler
- Abstract summary: The Linac To Undulator section is difficult to steer and aim because conditions change with each use of the accelerator.
Human operators spend a substantial amount of time and resources on the task.
A lack of training time and computational power limited the ability of our models to mature.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Beam steering involves calibrating the angle and position at which a particle accelerator's electron beam is incident upon the x-ray target with respect to the rotation axis of the collimator. Beam steering is an essential task for light sources. The Linac To Undulator section is difficult to steer and aim because conditions change with each use of the accelerator, so the magnets must be re-calibrated. The current steering method runs into issues when calibrating these angles and positions, and human operators spend a substantial amount of time and resources on the task. We developed multiple feed-forward neural networks with varying hyper-parameters, inputs, and outputs, and compared their performance. Our smaller models, with 33 inputs and 13 outputs, outperformed the larger models with 73 inputs and 50 outputs. We propose the following explanations for the weaker performance of the larger models. First, a lack of training time and computational power limited the ability of our models to mature; given more time, our models would outperform SVD. Second, as a model's input size increases, the noise increases as well; in this case, more inputs corresponded to a greater length of the LINAC accelerator. Larger, less specific models that seek to make more predictions will inherently perform worse than SVD.
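The abstract gives only the input and output dimensionality of the smaller model (33 inputs, 13 outputs) and states that the models are feed-forward neural networks. The sketch below is a minimal illustration under those constraints; the hidden-layer sizes, activation, loss, optimizer, and synthetic data are assumptions rather than the authors' configuration, and the linear least-squares fit at the end is only one plausible reading of the SVD baseline mentioned in the abstract.

```python
# Minimal sketch (assumptions noted inline): a feed-forward steering model with
# 33 inputs and 13 outputs, as described in the abstract. Hidden sizes, activation,
# loss, and optimizer are illustrative choices only.
import torch
import torch.nn as nn

class SteeringMLP(nn.Module):
    def __init__(self, n_inputs: int = 33, n_outputs: int = 13, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inputs, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_outputs),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = SteeringMLP()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Synthetic placeholder data: in practice the inputs would be beamline readings
# and the outputs the corresponding steering corrections.
x = torch.randn(64, 33)
y = torch.randn(64, 13)

# One training step.
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()

# A linear least-squares map from readings to corrections, as a stand-in for the
# SVD baseline referenced in the abstract (an assumption about that baseline).
A = torch.linalg.lstsq(x, y).solution  # shape (33, 13)
linear_pred = x @ A
```

The same sketch scales to the larger 73-input, 50-output configuration by changing the constructor arguments, which is the regime where the abstract reports performance degrading.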
Related papers
- Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners [72.37408197157453]
Recent advancements have demonstrated that the performance of large language models (LLMs) can be significantly enhanced by scaling computational resources at test time.
This raises a fundamental question: can models with lower complexity leverage their superior generation throughput to outperform similarly sized Transformers for a fixed computational budget?
To address this question and overcome the lack of strong subquadratic reasoners, we distill pure and hybrid Mamba models from pretrained Transformers.
arXiv Detail & Related papers (2025-02-27T18:08:16Z) - The Mamba in the Llama: Distilling and Accelerating Hybrid Models [76.64055251296548]
We show how to distill large Transformers into linear RNNs by reusing the linear projection weights from attention layers with academic GPU resources.
The resulting hybrid model achieves performance comparable to the original Transformer in chat benchmarks.
We also introduce a hardware-aware speculative decoding algorithm that accelerates the inference speed of Mamba and hybrid models.
arXiv Detail & Related papers (2024-08-27T17:56:11Z) - SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior [53.52396082006044]
Current methods struggle to maintain rendering quality at viewpoints that deviate significantly from the training viewpoints.
This issue stems from the sparse training views captured by a fixed camera on a moving vehicle.
We propose a novel approach that enhances the capacity of 3DGS by leveraging prior from a Diffusion Model.
arXiv Detail & Related papers (2024-03-29T09:20:29Z) - ODTFormer: Efficient Obstacle Detection and Tracking with Stereo Cameras Based on Transformer [12.58804521609764]
ODTFormer is a Transformer-based model to address both obstacle detection and tracking problems.
We report comparable accuracy to state-of-the-art obstacle tracking models while requiring only a fraction of their cost.
arXiv Detail & Related papers (2024-03-21T17:59:55Z) - Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance [87.19164603145056]
We propose LoRAT, a method that unveils the power of large ViT model for tracking within laboratory-level resources.
The essence of our work lies in adapting LoRA, a technique that fine-tunes a small subset of model parameters without adding inference latency.
We design an anchor-free head solely based on multilayer perceptrons (MLPs) to adapt PETR, enabling better performance with less computational overhead.
arXiv Detail & Related papers (2024-03-08T11:41:48Z) - Machine Learning For Beamline Steering [0.0]
The LINAC To Undulator section of the beamline is difficult to aim.
Each use of the accelerator requires re-calibration of the magnets in this section.
We investigate the use of deep neural networks to assist in this task.
arXiv Detail & Related papers (2023-11-13T18:00:06Z) - TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer [34.790081960470964]
We present TransNormerLLM, the first linear attention-based Large Language Model (LLM).
We make advanced modifications that include positional embedding, linear attention acceleration, gating mechanisms, tensor normalization, and inference acceleration and stabilization.
We validate our model design through a series of ablations and train models with sizes of 385M, 1B, and 7B on our self-collected corpus.
arXiv Detail & Related papers (2023-07-27T16:45:33Z) - Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators called WTA-CRS for matrix multiplication with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones.
arXiv Detail & Related papers (2023-05-24T15:52:08Z) - Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models [12.062489591946457]
We study the accuracy vs. efficiency trade-off on two widely used long-sequence models.
We find that LED consistently achieves better accuracy at lower energy costs than Big Bird.
For question answering, we find that smaller models are both more efficient and more accurate, owing to the larger training batch sizes they make possible.
arXiv Detail & Related papers (2022-04-15T01:52:45Z) - STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model can achieve comparable performance while utilizing much less trainable parameters and achieve high speed in training and inference.
arXiv Detail & Related papers (2021-07-15T02:53:11Z) - DA-Transformer: Distance-aware Transformer [87.20061062572391]
In this paper, we propose DA-Transformer, which is a distance-aware Transformer that can exploit the real distance.
arXiv Detail & Related papers (2020-10-14T10:09:01Z) - AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models [4.247712017691596]
AxFormer is a framework that applies accuracy-driven approximations to create optimized transformer models for a given downstream task.
Our experiments show that AxFormer models are up to 4.5% more accurate, while also being up to 2.5X faster and up to 3.2X smaller than conventional fine-tuned models.
arXiv Detail & Related papers (2020-10-07T23:29:34Z) - Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers [94.43313684188819]
We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute.
We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps.
This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models.
arXiv Detail & Related papers (2020-02-26T21:17:13Z)