Related papers: Submodlib: A Submodular Optimization Library

URL: http://arxiv.org/abs/2202.10680v2
Date: Wed, 23 Feb 2022 06:30:37 GMT
Title: Submodlib: A Submodular Optimization Library
Authors: Vishal Kaushal, Ganesh Ramakrishnan, Rishabh Iyer
Abstract summary: Submodlib is an open-source, easy-to-use, efficient and scalable Python library for submodular optimization. Submodlib finds application in summarization, data subset selection, hyperparameter tuning, efficient training and more.
Score: 17.596860081700115
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Submodular functions are a special class of set functions which naturally model notions such as representativeness, diversity and coverage, and have been shown to be computationally very efficient. A lot of past work has applied submodular optimization to find optimal subsets in various contexts. Some examples include data summarization for efficient human consumption, finding effective smaller subsets of training data to reduce the model development time (training, hyperparameter tuning), and finding effective subsets of unlabeled data to reduce labeling costs. Recent work has also leveraged submodular functions to propose submodular information measures, which have been found to be very useful in solving the problems of guided subset selection and guided summarization. In this work, we present Submodlib, an open-source, easy-to-use, efficient and scalable Python library for submodular optimization with a C++ optimization engine. Submodlib finds application in summarization, data subset selection, hyperparameter tuning, efficient training and more. Through a rich API, it offers a great deal of flexibility in the way it can be used. The source code of Submodlib is available at https://github.com/decile-team/submodlib.
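
To make the idea of selecting a representative subset concrete, here is a minimal sketch of greedy maximization of a facility-location function, a standard submodular objective. It is written in plain Python and does not use or reproduce Submodlib's API; the function names (`facility_location`, `greedy_maximize`, `similarity_from_points`) and the similarity kernel 1 / (1 + distance) are assumptions chosen for illustration only.

```python
from typing import List, Sequence


def similarity_from_points(points: Sequence[float]) -> List[List[float]]:
    """Build a dense similarity matrix for 1-D points.

    Illustrative kernel (an assumption): sim(a, b) = 1 / (1 + |a - b|).
    """
    return [[1.0 / (1.0 + abs(a - b)) for b in points] for a in points]


def facility_location(similarity: Sequence[Sequence[float]],
                      subset: Sequence[int]) -> float:
    """f(S) = sum over all items i of max_{j in S} sim[i][j], with f(empty) = 0."""
    if not subset:
        return 0.0
    return sum(max(row[j] for j in subset) for row in similarity)


def greedy_maximize(similarity: Sequence[Sequence[float]],
                    budget: int) -> List[int]:
    """Naive greedy: repeatedly add the item with the largest marginal gain."""
    n = len(similarity)
    selected: List[int] = []
    current_value = 0.0
    for _ in range(min(budget, n)):
        best_gain, best_item = float("-inf"), -1
        for j in range(n):
            if j in selected:
                continue
            gain = facility_location(similarity, selected + [j]) - current_value
            if gain > best_gain:
                best_gain, best_item = gain, j
        selected.append(best_item)
        current_value += best_gain
    return selected


if __name__ == "__main__":
    # Two loose clusters: one near 0 and one near 5.
    points = [0.0, 0.1, 0.2, 5.0, 5.1]
    sim = similarity_from_points(points)
    print(greedy_maximize(sim, budget=2))
```

With a budget of two, the greedy procedure picks one item from each cluster: once the cluster near 0 is represented, adding another point from it yields a much smaller marginal gain than covering the cluster near 5. This diminishing-returns behaviour is what makes such functions suitable for summarization and data subset selection. This naive version recomputes the objective from scratch for every candidate and is meant only to show the principle.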

Related papers

Optimizing Datasets for Code Summarization: Is Code-Comment Coherence Enough? [11.865113785648932]
We explore the extent to which code-comment coherence, a specific quality attribute of code summaries, can be used to optimize code summarization datasets. We examine multiple levels of training instances from two state-of-the-art datasets (TL-CodeSum and Funcom) and evaluate the resulting models on three manually curated test sets.
arXiv Detail & Related papers (2025-02-11T15:02:19Z)
LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method. We propose a higher-order Candecomp/Parafac (CP) decomposition, enabling a more compact and flexible representation. Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling [62.19438812624467]
Large language models (LLMs) have exhibited their problem-solving abilities in mathematical reasoning. We propose OptiBench, a benchmark for end-to-end optimization problem-solving with human-readable inputs and outputs.
arXiv Detail & Related papers (2024-07-13T13:27:57Z)
AdaLomo: Low-memory Optimization with Adaptive Learning Rate [59.64965955386855]
We introduce low-memory optimization with adaptive learning rate (AdaLomo) for large language models. AdaLomo achieves results on par with AdamW while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models.
arXiv Detail & Related papers (2023-10-16T09:04:28Z)
Maximizing Submodular Functions for Recommendation in the Presence of Biases [25.081136190260015]
Subset selection tasks arise in recommendation systems and search engines and ask to select a subset of items that maximizes the value for the user. In many applications, inputs have been observed to have social biases that reduce the utility of the output subset. We show that fairness constraint-based interventions can not only ensure proportional representation but also achieve near-optimal utility in the presence of biases.
arXiv Detail & Related papers (2023-05-03T15:20:00Z)
MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training. Our empirical results indicate that MILO can train models $3\times - 10\times$ faster and tune hyperparameters $20\times - 75\times$ faster than full-dataset training or tuning without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z)
Neural Estimation of Submodular Functions with Applications to Differentiable Subset Selection [50.14730810124592]
Submodular functions and variants, through their ability to characterize diversity and coverage, have emerged as a key tool for data selection and summarization. We propose FLEXSUBNET, a family of flexible neural models for both monotone and non-monotone submodular functions.
arXiv Detail & Related papers (2022-10-20T06:00:45Z)
abess: A Fast Best Subset Selection Library in Python and R [1.6208003359512848]
We introduce a new library named abess that implements a unified framework of best-subset selection. abess certifiably finds the optimal solution in polynomial time under the linear model. The core of the library is programmed in C++, and it can be installed from the Python Package Index.
arXiv Detail & Related papers (2021-10-19T02:34:55Z)
Captum: A unified and generic model interpretability library for PyTorch [49.72749684393332]
We introduce a novel, unified, open-source model interpretability library for PyTorch. The library contains generic implementations of a number of gradient and perturbation-based attribution algorithms. It can be used for both classification and non-classification models.
arXiv Detail & Related papers (2020-09-16T18:57:57Z)
Scalable Combinatorial Bayesian Optimization with Tractable Statistical models [44.25245545568633]
We study the problem of optimizing blackbox functions over combinatorial spaces (e.g., sets, sequences, trees, and graphs). Based on recent advances in submodular relaxation, we study an approach referred to as Parametrized Submodular Relaxation (PSR) towards the goal of improving the scalability and accuracy of solving acquisition function optimization (AFO) problems for the BOCS model. Experiments on diverse benchmark problems show significant improvements with PSR for the BOCS model.
arXiv Detail & Related papers (2020-08-18T22:56:46Z)
From Sets to Multisets: Provable Variational Inference for Probabilistic Integer Submodular Models [82.95892656532696]
Submodular functions have been studied extensively in machine learning and data mining. In this work, we propose a continuous DR-submodular extension for integer submodular functions. We formulate a new probabilistic model which is defined through integer submodular functions.
arXiv Detail & Related papers (2020-06-01T22:20:45Z)
Flexible numerical optimization with ensmallen [15.78308411537254]
This report provides an introduction to the ensmallen numerical optimization library. The library provides a fast and flexible C++ framework for mathematical optimization of arbitrary user-supplied functions.
arXiv Detail & Related papers (2020-03-09T12:57:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.