Least Squares Maximum and Weighted Generalization-Memorization Machines
- URL: http://arxiv.org/abs/2308.16456v1
- Date: Thu, 31 Aug 2023 04:48:59 GMT
- Title: Least Squares Maximum and Weighted Generalization-Memorization Machines
- Authors: Shuai Wang, Zhen Wang and Yuan-Hai Shao
- Abstract summary: We propose a new way of remembering by introducing a memory influence mechanism for the least squares support vector machine (LSSVM).
The maximum memory impact model (MIMM) and the weighted impact memory model (WIMM) are then proposed.
- Score: 14.139758779594667
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a new way of remembering by introducing a memory
influence mechanism for the least squares support vector machine (LSSVM).
Without changing the equality constraints of the original LSSVM, this
mechanism allows an accurate partitioning of the training set without
overfitting. The maximum memory impact model (MIMM) and the weighted impact
memory model (WIMM) are then proposed. It is demonstrated that these models can
be reduced to the LSSVM. Furthermore, we propose several different memory impact
functions for the MIMM and WIMM. The experimental results show that our
MIMM and WIMM have better generalization performance than the LSSVM and
a significant advantage in time cost over other memory models.
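For context, the baseline that MIMM and WIMM build on is the standard LSSVM primal with equality constraints, shown below as a minimal sketch; the memory-influence term itself is defined in the paper and is not reproduced here.

  \min_{w,\,b,\,e}\ \frac{1}{2}\lVert w\rVert^{2} + \frac{\gamma}{2}\sum_{i=1}^{n} e_i^{2}
  \quad \text{s.t.}\quad y_i\big(w^{\top}\phi(x_i) + b\big) = 1 - e_i,\qquad i = 1,\dots,n,

  with decision function f(x) = \operatorname{sign}\big(w^{\top}\phi(x) + b\big).

Since dropping any added memory-influence term leaves exactly this problem, this baseline is consistent with the abstract's statement that MIMM and WIMM can be reduced to the LSSVM.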
Related papers
- Sparse Gradient Compression for Fine-Tuning Large Language Models [58.44973963468691]
Fine-tuning large language models (LLMs) for downstream tasks has become increasingly crucial due to their widespread use and the growing availability of open-source models.
High memory costs associated with fine-tuning remain a significant challenge, especially as models increase in size.
We propose Sparse Gradient Compression (SGC) to address these limitations.
arXiv Detail & Related papers (2025-02-01T04:18:28Z) - SMMF: Square-Matricized Momentum Factorization for Memory-Efficient Optimization [0.5755004576310332]
SMMF is a memory-efficient optimizer that reduces the memory requirement of widely used adaptive learning rate optimizers, such as Adam, by up to 96%.
We conduct a regret bound analysis of SMMF, which shows that it converges similarly to non-memory-efficient adaptive learning rate optimizers, such as AdamNC.
In our experiments, SMMF uses up to 96% less memory than state-of-the-art memory-efficient optimizers, e.g., Adafactor, CAME, and SM3, while achieving comparable model performance.
arXiv Detail & Related papers (2024-12-12T03:14:50Z) - Hermes: Memory-Efficient Pipeline Inference for Large Models on Edge Devices [19.96064012736243]
This paper introduces PIPELOAD, a memory-efficient pipeline execution mechanism.
It reduces memory usage by incorporating dynamic memory management and minimizes inference latency.
We present Hermes, a framework optimized for large model inference on edge devices.
arXiv Detail & Related papers (2024-09-06T12:55:49Z) - MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing large language models (LLMs) by integrating a structured and explicit read-and-write memory module.
Our experiments indicate that MemLLM enhances the LLM's performance and interpretability in language modeling in general and in knowledge-intensive tasks in particular.
arXiv Detail & Related papers (2024-04-17T18:13:16Z) - Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models [73.48675708831328]
We propose a novel parameter- and computation-efficient tuning method for Multi-modal Large Language Models (MLLMs).
The Efficient Attention Skipping (EAS) method evaluates the attention redundancy and skips the less important MHAs to speed up inference.
The experiments show that EAS not only retains high performance and parameter efficiency, but also greatly speeds up inference.
arXiv Detail & Related papers (2024-03-22T14:20:34Z) - MEMORYLLM: Towards Self-Updatable Large Language Models [101.3777486749529]
Existing Large Language Models (LLMs) usually remain static after deployment.
We introduce MEMORYLLM, a model that comprises a transformer and a fixed-size memory pool.
MEMORYLLM can self-update with text knowledge and memorize the knowledge injected earlier.
arXiv Detail & Related papers (2024-02-07T07:14:11Z) - Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference [23.207326766883405]
Mixture-of-Experts (MoE) is able to scale its model size without proportionally scaling up its computational requirements.
Pre-gated MoE employs our novel pre-gating function which alleviates the dynamic nature of sparse expert activation.
We demonstrate that Pre-gated MoE improves performance and reduces GPU memory consumption, while maintaining the same level of model quality.
arXiv Detail & Related papers (2023-08-23T11:25:37Z) - A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model that adapts to new classes without forgetting old ones under a limited memory size.
We show that when the model size is counted into the total budget and methods are compared at aligned memory sizes, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
arXiv Detail & Related papers (2022-05-26T08:24:01Z) - Master memory function for delay-based reservoir computers with single-variable dynamics [0.0]
We show that many delay-based reservoir computers can be characterized by a universal master memory function (MMF).
Once computed for two independent parameters, this function provides linear memory capacity for any delay-based single-variable reservoir with small inputs.
arXiv Detail & Related papers (2021-08-28T13:17:24Z) - Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems [27.419109620575313]
A key challenge for deep learning models is to work with millions of categorical classes or tokens.
We propose a novel formulation of memory shared embedding, where memory is shared in proportion to the overlap in semantic information.
We demonstrate a significant reduction in the memory footprint while maintaining performance.
arXiv Detail & Related papers (2021-02-24T19:55:49Z) - Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.