Deep Forest with Hashing Screening and Window Screening
- URL: http://arxiv.org/abs/2207.11951v1
- Date: Mon, 25 Jul 2022 07:39:55 GMT
- Title: Deep Forest with Hashing Screening and Window Screening
- Authors: Pengfei Ma, Youxi Wu, Yan Li, Lei Guo, He Jiang, Xingquan Zhu, and
Xindong Wu
- Abstract summary: We introduce a hashing screening mechanism for multi-grained scanning of gcForest.
We propose a model called HW-Forest which adopts two strategies, hashing screening and window screening.
Our experimental results show that HW-Forest has higher accuracy than other models, and the time cost is also reduced.
- Score: 25.745779145969053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a novel deep learning model, gcForest has been widely used in various
applications. However, the current multi-grained scanning of gcForest produces
many redundant feature vectors, and this increases the time cost of the model.
To screen out redundant feature vectors, we introduce a hashing screening
mechanism for multi-grained scanning and propose a model called HW-Forest which
adopts two strategies, hashing screening and window screening. HW-Forest
employs a perceptual hashing algorithm to calculate the similarity between
feature vectors in the hashing screening strategy, which is used to remove the
redundant feature vectors produced by multi-grained scanning and can
significantly decrease the time cost and memory consumption. Furthermore, we
adopt a self-adaptive instance screening strategy, called window screening, to
improve the performance of our approach; it can achieve higher accuracy
without hyperparameter tuning on different datasets. Our experimental results
show that HW-Forest has higher accuracy than other models, and the time cost is
also reduced.
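The hashing screening idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the hash, so the sketch assumes a simple average hash (aHash, a common perceptual hash) applied to 1-D sliding windows, and the names `sliding_windows`, `average_hash`, `screen_windows`, and the `max_distance` threshold are all hypothetical.

```python
from typing import List, Sequence


def sliding_windows(x: Sequence[float], size: int, stride: int = 1) -> List[List[float]]:
    """Multi-grained scanning on one 1-D instance: every window of `size` values."""
    return [list(x[i:i + size]) for i in range(0, len(x) - size + 1, stride)]


def average_hash(window: Sequence[float]) -> int:
    """aHash-style fingerprint (assumed here, not taken from the paper):
    bit i is 1 when element i is above the window mean."""
    mean = sum(window) / len(window)
    bits = 0
    for value in window:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def screen_windows(windows: Sequence[Sequence[float]], max_distance: int = 0) -> List[int]:
    """Return indices of windows to keep. A window is treated as redundant when
    its hash is within `max_distance` bits of the hash of a window already kept."""
    kept: List[int] = []
    kept_hashes: List[int] = []
    for idx, window in enumerate(windows):
        h = average_hash(window)
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept.append(idx)
            kept_hashes.append(h)
    return kept


# A periodic toy signal produces many near-identical windows.
signal = [0, 0, 5, 0, 0, 5, 0, 0, 5, 0]
windows = sliding_windows(signal, size=3)
kept = screen_windows(windows)
print(len(windows), kept)  # 8 [0, 1, 2]
```

In this toy case, eight windows collapse to three distinct patterns, so only three would be passed on to the forests. Raising `max_distance` screens more aggressively at the risk of discarding informative windows.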
Related papers
- Leveraging VAE-Derived Latent Spaces for Enhanced Malware Detection with Machine Learning Classifiers [0.0]
This paper assesses the performance of five machine learning classifiers: Decision Tree, Naive Bayes, LightGBM, Logistic Regression, and Random Forest.
Results from the experiments conducted on different training-test splits with different random seeds reveal that all the models perform well in detecting malware.
arXiv Detail & Related papers (2025-03-24T14:44:55Z)
- ETS: Efficient Tree Search for Inference-Time Scaling [61.553681244572914]
One promising approach for test-time compute scaling is search against a process reward model.
Diversity of trajectories in the tree search process affects the accuracy of the search, since increasing diversity promotes more exploration.
We propose Efficient Tree Search (ETS), which promotes KV sharing by pruning redundant trajectories while maintaining necessary diverse trajectories.
arXiv Detail & Related papers (2025-02-19T09:30:38Z)
- SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer [62.11796778482088]
We present a novel model-agnostic sparse vision transformer, dubbed SparseFormer, to bridge the gap of object detection between close-up and HRW shots.
The proposed SparseFormer selectively uses attentive tokens to scrutinize the sparsely distributed windows that may contain objects.
Experiments on two HRW benchmarks, PANDA and DOTA-v1.0, demonstrate that the proposed SparseFormer significantly improves detection accuracy (up to 5.8%) and speed (up to 3x) over the state-of-the-art approaches.
arXiv Detail & Related papers (2025-02-11T03:21:25Z)
- Efficient Generative Modeling with Residual Vector Quantization-Based Tokens [5.949779668853557]
ResGen is an efficient RVQ-based discrete diffusion model that generates high-fidelity samples without compromising sampling speed.
We validate the efficacy and generalizability of the proposed method on two challenging tasks: conditional image generation on ImageNet 256x256 and zero-shot text-to-speech synthesis.
As we scale the depth of RVQ, our generative models exhibit enhanced generation fidelity or faster sampling speeds compared to similarly sized baseline models.
arXiv Detail & Related papers (2024-12-13T15:31:17Z)
- Q-VLM: Post-training Quantization for Large Vision-Language Models [73.19871905102545]
We propose a post-training quantization framework of large vision-language models (LVLMs) for efficient multi-modal inference.
We mine the cross-layer dependency that significantly influences discretization errors of the entire vision-language model, and embed this dependency into optimal quantization strategy.
Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x on the 13B LLaVA model without performance degradation.
arXiv Detail & Related papers (2024-10-10T17:02:48Z)
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
- MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings [15.275864151890511]
We introduce MUVERA (MUlti-VEctor Retrieval Algorithm), a retrieval mechanism which reduces multi-vector search to single-vector similarity search.
MUVERA achieves consistently good end-to-end recall and latency across a diverse set of the BEIR retrieval datasets.
arXiv Detail & Related papers (2024-05-29T20:40:20Z)
- Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control [66.78146440275093]
Learned sparse retrieval (LSR) is a family of neural methods that encode queries and documents into sparse lexical vectors.
We explore the application of LSR to the multi-modal domain, with a focus on text-image retrieval.
Current approaches like LexLIP and STAIR require complex multi-step training on massive datasets.
Our proposed approach efficiently transforms dense vectors from a frozen dense model into sparse lexical vectors.
arXiv Detail & Related papers (2024-02-27T14:21:56Z)
- Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization [60.91600465922932]
We present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder.
Our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods.
arXiv Detail & Related papers (2022-10-23T00:32:04Z)
- Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity-regularized models and apply it to shared-memory and distributed-memory architectures, respectively.
We prove that the proposed method achieves the linear convergence rate with lower overall complexity and can eliminate almost all the inactive features in a finite number of iterations almost surely.
arXiv Detail & Related papers (2022-04-23T02:45:55Z)
- SADet: Learning An Efficient and Accurate Pedestrian Detector [68.66857832440897]
This paper proposes a series of systematic optimization strategies for the detection pipeline of a one-stage detector.
It forms a single shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it achieves state-of-the-art results and a real-time speed of 20 FPS for VGA-resolution images.
arXiv Detail & Related papers (2020-07-26T12:32:38Z)
- Representation Sharing for Fast Object Detector Search and Beyond [38.18583590914755]
We propose Fast And Diverse (FAD) to better explore the optimal configuration of receptive fields and convolution types in the sub-networks for one-stage detectors.
FAD achieves prominent improvements on two types of one-stage detectors with various backbones.
arXiv Detail & Related papers (2020-07-23T15:39:44Z)
- SparseIDS: Learning Packet Sampling with Reinforcement Learning [1.978587235008588]
Recurrent Neural Networks (RNNs) have been shown to be valuable for constructing Intrusion Detection Systems (IDSs) for network data.
We show that by using a novel Reinforcement Learning (RL)-based approach called SparseIDS, we can reduce the number of consumed packets by more than three fourths.
arXiv Detail & Related papers (2020-02-10T15:38:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.