fSEAD: a Composable FPGA-based Streaming Ensemble Anomaly Detection Library
- URL: http://arxiv.org/abs/2406.05999v1
- Date: Mon, 10 Jun 2024 03:38:35 GMT
- Title: fSEAD: a Composable FPGA-based Streaming Ensemble Anomaly Detection Library
- Authors: Binglei Lou, David Boland, Philip H. W. Leong
- Abstract summary: Machine learning ensembles combine multiple base models to produce a more accurate output.
We propose a flexible computing architecture consisting of multiple partially reconfigurable regions, pblocks, which each implement anomaly detectors.
Our proof-of-concept design supports three state-of-the-art anomaly detection algorithms: Loda, RS-Hash and xStream.
- Score: 1.8570740863168362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning ensembles combine multiple base models to produce a more accurate output. They can be applied to a range of machine learning problems, including anomaly detection. In this paper, we investigate how to maximize the composability and scalability of an FPGA-based streaming ensemble anomaly detector (fSEAD). To achieve this, we propose a flexible computing architecture consisting of multiple partially reconfigurable regions, pblocks, which each implement anomaly detectors. Our proof-of-concept design supports three state-of-the-art anomaly detection algorithms: Loda, RS-Hash and xStream. Each algorithm is scalable, meaning multiple instances can be placed within a pblock to improve performance. Moreover, fSEAD is implemented using High-level synthesis (HLS), meaning further custom anomaly detectors can be supported. Pblocks are interconnected via an AXI-switch, enabling them to be composed in an arbitrary fashion before combining and merging results at run-time to create an ensemble that maximizes the use of FPGA resources and accuracy. Through utilizing reconfigurable Dynamic Function eXchange (DFX), the detector can be modified at run-time to adapt to changing environmental conditions. We compare fSEAD to an equivalent central processing unit (CPU) implementation using four standard datasets, with speed-ups ranging from $3\times$ to $8\times$.
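For readers unfamiliar with the base detectors, the sketch below illustrates the kind of streaming computation a single pblock might host: a minimal Loda-style detector (sparse random projections feeding streaming one-dimensional histograms), with the scores of several instances averaged to stand in for the run-time merge step. It is plain software C++ written only for illustration; the class name `LodaDetector`, the histogram range, and the score-averaging rule are assumptions of this sketch, not the fSEAD HLS implementation or its pblock interface.

```cpp
// Illustrative sketch only: a Loda-style streaming anomaly detector plus a
// simple score-averaging ensemble, in plain software C++.  In fSEAD each
// detector instance would instead be compiled with HLS into a pblock and the
// merge would be done by the AXI-connected combining logic; all names and
// parameters below are assumptions made for this example.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

class LodaDetector {
public:
    LodaDetector(int dim, int n_proj, int n_bins, double lo, double hi, unsigned seed)
        : n_proj_(n_proj), n_bins_(n_bins), lo_(lo), hi_(hi),
          hist_(n_proj, std::vector<double>(n_bins, 1.0)),     // Laplace-smoothed counts
          total_(n_proj, static_cast<double>(n_bins)) {
        // Sparse random projections: ~sqrt(dim) non-zero Gaussian weights each.
        std::mt19937 rng(seed);
        std::normal_distribution<double> gauss(0.0, 1.0);
        std::uniform_int_distribution<int> pick(0, dim - 1);
        const int nnz = std::max(1, static_cast<int>(std::sqrt(static_cast<double>(dim))));
        proj_.assign(n_proj, std::vector<double>(dim, 0.0));
        for (auto& w : proj_)
            for (int k = 0; k < nnz; ++k) w[pick(rng)] = gauss(rng);
    }

    // Fold one sample into the histograms and return its anomaly score:
    // the average negative log estimated density over all projections.
    double update_and_score(const std::vector<double>& x) {
        double score = 0.0;
        for (int i = 0; i < n_proj_; ++i) {
            double z = 0.0;
            for (std::size_t d = 0; d < x.size(); ++d) z += proj_[i][d] * x[d];
            const int b = bin(z);
            const double p = hist_[i][b] / total_[i];   // estimated bin probability
            score += -std::log(p);
            hist_[i][b] += 1.0;                         // streaming histogram update
            total_[i] += 1.0;
        }
        return score / n_proj_;
    }

private:
    int bin(double z) const {
        const double t = (z - lo_) / (hi_ - lo_);
        const int b = static_cast<int>(t * n_bins_);
        return std::min(std::max(b, 0), n_bins_ - 1);   // clamp outliers to edge bins
    }
    int n_proj_, n_bins_;
    double lo_, hi_;
    std::vector<std::vector<double>> proj_;
    std::vector<std::vector<double>> hist_;
    std::vector<double> total_;
};

int main() {
    const int dim = 8;
    // Three detector instances stand in for three pblocks; their scores are
    // merged by simple averaging (one possible run-time combination rule).
    std::vector<LodaDetector> ensemble;
    for (unsigned s = 0; s < 3; ++s)
        ensemble.emplace_back(dim, /*n_proj=*/20, /*n_bins=*/32, -6.0, 6.0, s);

    std::mt19937 rng(42);
    std::normal_distribution<double> data(0.0, 1.0);
    for (int t = 0; t < 1000; ++t) {
        std::vector<double> x(dim);
        for (double& v : x) v = data(rng);
        if (t == 999) for (double& v : x) v += 5.0;     // inject an obvious outlier
        double merged = 0.0;
        for (auto& det : ensemble) merged += det.update_and_score(x);
        merged /= ensemble.size();
        if (t >= 997) std::printf("t=%d merged score=%.3f\n", t, merged);
    }
    return 0;
}
```

In the actual design, each such detector would be synthesized with HLS into a partially reconfigurable pblock, and the software averaging above would be replaced by the run-time combining and merging over the AXI-switch described in the abstract.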
Related papers
- AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer [54.713778961605115]
Vision Transformer (ViT) has become one of the most prevailing fundamental backbone networks in the computer vision community.
We propose a novel non-uniform quantizer, dubbed the Adaptive Logarithm (AdaLog) quantizer.
arXiv Detail & Related papers (2024-07-17T18:38:48Z)
- All-to-all reconfigurability with sparse and higher-order Ising machines [0.0]
We introduce a multiplexed architecture that emulates all-to-all network functionality.
We show that running the adaptive parallel tempering algorithm demonstrates competitive algorithmic and prefactor advantages.
Scaled magnetic versions of p-bit IMs could lead to orders-of-magnitude improvements over the state of the art for generic optimization.
arXiv Detail & Related papers (2023-11-21T20:27:02Z)
- Deep Learning Assisted Multiuser MIMO Load Modulated Systems for Enhanced Downlink mmWave Communications [68.96633803796003]
This paper is focused on multiuser load modulation arrays (MU-LMAs) which are attractive due to their low system complexity and reduced cost for millimeter wave (mmWave) multi-input multi-output (MIMO) systems.
The existing precoding algorithm for downlink MU-LMA relies on a sub-array structured (SAS) transmitter which may suffer from decreased degrees of freedom and complex system configuration.
In this paper, we conceive an MU-LMA system employing a full-array structured (FAS) transmitter and propose two algorithms accordingly.
arXiv Detail & Related papers (2023-11-08T08:54:56Z)
- Configurable Spatial-Temporal Hierarchical Analysis for Flexible Video Anomaly Detection [10.956907116728267]
Video anomaly detection (VAD) is a vital task with great practical applications in industrial surveillance, security systems, and traffic control.
We design a spatial-temporal hierarchical architecture (STHA) to flexibly detect different degrees of anomaly.
We conduct experiments on three benchmarks and perform extensive analysis; the results demonstrate that our method performs comparably to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T09:03:38Z)
- Performance Optimization using Multimodal Modeling and Heterogeneous GNN [1.304892050913381]
We propose a technique for tuning parallel code regions that is general enough to be adapted to multiple tasks.
In this paper, we analyze IR-based programming models to make task-specific performance optimizations.
Our experiments show that this multimodal learning based approach outperforms the state-of-the-art in all experiments.
arXiv Detail & Related papers (2023-04-25T04:27:43Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Implementing Neural Network-Based Equalizers in a Coherent Optical Transmission System Using Field-Programmable Gate Arrays [3.1543509940301946]
We show the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems.
The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware.
arXiv Detail & Related papers (2022-12-09T07:28:45Z)
- Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design [66.39546326221176]
Attention-based neural networks have become pervasive in many AI tasks.
The use of the attention mechanism and feed-forward network (FFN) demands excessive computational and memory resources.
This paper proposes a hardware-friendly variant that adopts a unified butterfly sparsity pattern to approximate both the attention mechanism and the FFNs.
arXiv Detail & Related papers (2022-09-20T09:28:26Z)
- Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-device edge artificial intelligence (AI) system, which jointly exploits AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
- In-filter Computing For Designing Ultra-light Acoustic Pattern Recognizers [6.335302509003343]
We present a novel in-filter computing framework that can be used for designing ultra-light acoustic classifiers.
The proposed architecture integrates the convolution and nonlinear filtering operations directly into the kernels of a Support Vector Machine.
We show that the system can achieve robust classification performance on benchmark sound recognition tasks using only 1.5k Look-Up Tables (LUTs) and 2.8k Flip-Flops (FFs).
arXiv Detail & Related papers (2021-09-11T08:16:53Z)
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.