Fast Parallel Exact Inference on Bayesian Networks: Poster
- URL: http://arxiv.org/abs/2212.04241v1
- Date: Thu, 8 Dec 2022 12:50:02 GMT
- Title: Fast Parallel Exact Inference on Bayesian Networks: Poster
- Authors: Jiantong Jiang, Zeyi Wen, Atif Mansoor, Ajmal Mian
- Abstract summary: We propose a fast BN exact inference solution named Fast-BNI on multi-core CPUs.
Fast-BNI enhances the efficiency of exact inference through hybrid parallelism.
We also propose techniques to further simplify the bottleneck operations of BN exact inference.
- Score: 33.63789467363392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian networks (BNs) are attractive, because they are graphical and
interpretable machine learning models. However, exact inference on BNs is
time-consuming, especially for complex problems. To improve the efficiency, we
propose a fast BN exact inference solution named Fast-BNI on multi-core CPUs.
Fast-BNI enhances the efficiency of exact inference through hybrid parallelism
that tightly integrates coarse- and fine-grained parallelism. We also propose
techniques to further simplify the bottleneck operations of BN exact inference.
Fast-BNI source code is freely available at
https://github.com/jjiantong/FastBN.
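The abstract attributes Fast-BNI's speedups to hybrid parallelism that tightly integrates coarse- and fine-grained parallelism on multi-core CPUs. As a rough, hedged illustration of that general idea (not the Fast-BNI implementation; the junction-tree layout, the pairwise marginalization, and the thread counts below are assumptions made for the example), the following OpenMP sketch processes the independent cliques of one tree level in parallel while also splitting the table operation inside each clique across threads:

```cpp
// Hedged sketch of hybrid (coarse + fine) parallelism for BN exact inference.
// Not the Fast-BNI code: cliques, potentials, and the marginalization step are
// simplified stand-ins for a junction-tree message-passing workload.
#include <omp.h>
#include <cstdio>
#include <vector>

struct Clique {
    std::vector<double> potential;  // flattened potential table (toy data)
};

// Fine-grained level: sum out a hypothetical binary variable by adding
// adjacent pairs of table entries, splitting the loop across threads.
std::vector<double> compute_message(const Clique& c) {
    std::vector<double> msg(c.potential.size() / 2, 0.0);
    #pragma omp parallel for num_threads(2) schedule(static)
    for (long i = 0; i < static_cast<long>(msg.size()); ++i)
        msg[i] = c.potential[2 * i] + c.potential[2 * i + 1];
    return msg;
}

int main() {
    omp_set_max_active_levels(2);  // allow the coarse and fine levels to nest

    // Toy "level" of a junction tree: cliques with no mutual dependencies.
    std::vector<Clique> level(8, Clique{std::vector<double>(1 << 20, 0.5)});
    std::vector<std::vector<double>> messages(level.size());

    // Coarse-grained level: independent cliques are processed in parallel.
    #pragma omp parallel for num_threads(4) schedule(dynamic)
    for (long i = 0; i < static_cast<long>(level.size()); ++i)
        messages[i] = compute_message(level[i]);

    std::printf("first message entry: %g\n", messages[0][0]);
    return 0;
}
```

Compiled with, e.g., `g++ -O2 -fopenmp`, the sketch only shows how the two levels of parallelism can coexist; how Fast-BNI actually organizes cliques and table operations is described in the paper and in the source code at https://github.com/jjiantong/FastBN.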
Related papers
- Benchmarking Edge AI Platforms for High-Performance ML Inference [0.0]
Edge computing's growing prominence, due to its ability to reduce communication latency and enable real-time processing, is promoting the rise of high-performance, heterogeneous System-on-Chip solutions.
While current approaches often involve scaling down modern hardware, the performance characteristics of neural network workloads can vary significantly.
We compare the latency and throughput of various linear algebra and neural network inference tasks across CPU-only, CPU/GPU, and CPU/NPU integrated solutions.
arXiv Detail & Related papers (2024-09-23T08:27:27Z)
- BEBLID: Boosted efficient binary local image descriptor [2.8538628855541397]
We introduce BEBLID, an efficient learned binary image descriptor.
It improves our previous real-valued descriptor, BELID, making it both more efficient for matching and more accurate.
In experiments, BEBLID achieves an accuracy close to SIFT and better computational efficiency than ORB, the fastest algorithm in the literature.
arXiv Detail & Related papers (2024-02-07T00:14:32Z)
- Tensor Slicing and Optimization for Multicore NPUs [2.670309629218727]
This paper proposes a compiler optimization pass for Multicore NPUs, called Tensor Slicing Optimization (TSO).
Results show that TSO identifies the best tensor slicing that minimizes execution time for a set of CNN models.
arXiv Detail & Related papers (2023-04-06T12:03:03Z)
- Fast Parallel Bayesian Network Structure Learning [37.46185698921754]
We propose a fast solution named Fast-BNS on multi-core CPUs to enhance the efficiency of BN structure learning.
Fast-BNS is powered by a series of efficiency optimizations, including grouping the conditional independence (CI) tests of edges that share the same endpoints to reduce the number of unnecessary tests; a small illustrative sketch of this grouping appears after this list.
A comprehensive experimental study shows that the sequential version of Fast-BNS is up to 50 times faster than its counterpart.
arXiv Detail & Related papers (2022-12-08T13:17:02Z)
- Improved Branch and Bound for Neural Network Verification via Lagrangian Decomposition [161.09660864941603]
We improve the scalability of Branch and Bound (BaB) algorithms for formally proving input-output properties of neural networks.
We present a novel activation-based branching strategy and a BaB framework, named Branch and Dual Network Bound (BaDNB).
BaDNB outperforms previous complete verification systems by a large margin, cutting average verification times by factors up to 50 on adversarial properties.
arXiv Detail & Related papers (2021-04-14T09:22:42Z)
- Parareal Neural Networks Emulating a Parallel-in-time Algorithm [1.988145627448243]
As deep neural networks (DNNs) become deeper, the training time increases.
In this paper, we introduce a novel methodology to construct a parallel neural network.
arXiv Detail & Related papers (2021-03-16T02:03:39Z)
- Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Complete and Incomplete Neural Network Verification [151.62491805851107]
We develop β-CROWN, a bound propagation based verifier that can fully encode per-neuron splits.
β-CROWN is close to three orders of magnitude faster than LP-based BaB methods for robustness verification.
By terminating BaB early, our method can also be used for incomplete verification.
arXiv Detail & Related papers (2021-03-11T11:56:54Z)
- Batch Normalization with Enhanced Linear Transformation [73.9885755599221]
Properly enhancing a linear transformation module can effectively improve the ability of Batch Normalization (BN).
Our method, named BNET, can be implemented with 2-3 lines of code in most deep learning libraries.
We verify that BNET accelerates the convergence of network training and enhances spatial information by assigning larger weights to important neurons.
arXiv Detail & Related papers (2020-11-28T15:42:36Z)
- Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers [112.23981192818721]
We propose to use backward mode linear relaxation based analysis (LiRPA) to replace Linear Programming (LP) during the BaB process.
Unlike LP, LiRPA applied naively can produce much weaker bounds and cannot even check certain conflicts between sub-domains during splitting.
We demonstrate an order of magnitude speedup compared to existing LP-based approaches.
arXiv Detail & Related papers (2020-11-27T16:42:12Z)
- Distillation Guided Residual Learning for Binary Convolutional Neural Networks [83.6169936912264]
It is challenging to bridge the performance gap between a binary CNN (BCNN) and a floating-point CNN (FCNN).
We observe that this performance gap leads to substantial residuals between the intermediate feature maps of the BCNN and the FCNN.
To minimize the performance gap, we force the BCNN to produce intermediate feature maps similar to those of the FCNN.
This training strategy, i.e., optimizing each binary convolutional block with a block-wise distillation loss derived from the FCNN, leads to more effective optimization of the BCNN.
arXiv Detail & Related papers (2020-07-10T07:55:39Z)
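The Fast Parallel Bayesian Network Structure Learning entry above mentions grouping the CI tests of edges that share the same endpoints. The sketch below is only a hedged illustration of why such grouping can remove unnecessary tests (it is not the Fast-BNS code; the edge list, the candidate conditioning sets, and the placeholder `ci_test` are invented for the example): all candidate tests for one edge are handled together, so testing stops as soon as one of them reports independence.

```cpp
// Hedged sketch: group CI tests by edge so an edge is dropped after the first
// test that finds independence, skipping the remaining tests for that edge.
// The CI test is a placeholder; a real one would compute a statistic from data.
#include <cstdio>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;
using CondSet = std::vector<int>;

// Placeholder: pretend X and Y are independent given any non-empty set.
bool ci_test(int x, int y, const CondSet& s) {
    (void)x;
    (void)y;
    return !s.empty();
}

int main() {
    // Candidate conditioning sets, grouped by the edge they belong to.
    std::vector<std::pair<Edge, std::vector<CondSet>>> grouped = {
        {{0, 1}, {{}, {2}, {3}, {2, 3}}},
        {{1, 2}, {{}, {0}, {3}, {0, 3}}},
    };

    for (const auto& [edge, sets] : grouped) {
        bool removed = false;
        int tests_run = 0;
        for (const auto& s : sets) {   // all tests for one edge run together
            ++tests_run;
            if (ci_test(edge.first, edge.second, s)) {
                removed = true;        // edge deleted; skip its remaining tests
                break;
            }
        }
        std::printf("edge (%d,%d): %s after %d test(s)\n",
                    edge.first, edge.second,
                    removed ? "removed" : "kept", tests_run);
    }
    return 0;
}
```

How Fast-BNS actually groups and dispatches these tests across cores is specific to its implementation; this only shows the early-termination benefit that grouping by endpoints enables.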
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.