Deep Submodular Networks for Extractive Data Summarization
- URL: http://arxiv.org/abs/2010.08593v1
- Date: Fri, 16 Oct 2020 19:06:15 GMT
- Title: Deep Submodular Networks for Extractive Data Summarization
- Authors: Suraj Kothawade, Jiten Girdhar, Chandrashekhar Lavania, Rishabh Iyer
- Abstract summary: We propose an end-to-end learning framework for summarization problems.
The Deep Submodular Networks (DSN) framework can be used to learn features appropriate for summarization from scratch.
In particular, we show that DSNs outperform simple mixture models using off-the-shelf features.
- Score: 0.46898263272139784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep models are increasingly prevalent in summarization problems
(e.g., documents, videos, and images) due to their ability to learn complex feature
interactions and representations. However, they do not model characteristics
such as diversity, representation, and coverage, which are also very important
for summarization tasks. On the other hand, submodular functions naturally
model these characteristics because of their diminishing returns property. Most
approaches for modelling and learning submodular functions rely on very simple
models, such as weighted mixtures of submodular functions. Unfortunately, these
models only learn the relative importance of the different submodular functions
(such as diversity, representation or importance), but cannot learn more
complex feature representations, which are often required for state-of-the-art
performance. We propose Deep Submodular Networks (DSN), an end-to-end learning
framework that facilitates the learning of more complex features and richer
functions, crafted for better modelling of all aspects of summarization. The
DSN framework can be used to learn features appropriate for summarization from
scratch. We demonstrate the utility of DSNs on both generic and query focused
image-collection summarization, and show significant improvement over the
state-of-the-art. First, we show that DSNs outperform simple mixture
models using off-the-shelf features. Second, we show that just using
four submodular functions in a DSN with end-to-end learning performs comparably
to the state-of-the-art mixture model with a hand-crafted set of 594 components
and outperforms other methods for image collection summarization.
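To make the contrast in the abstract concrete, below is a minimal sketch (not the authors' implementation; the component choices, weights, and feature construction are illustrative assumptions) of the kind of weighted mixture of submodular functions that DSNs are designed to go beyond: a facility-location term for representation plus a concave-over-features coverage term, combined with fixed weights and maximized greedily.

# Illustrative sketch only: a weighted mixture of submodular functions for
# extractive summarization, the "simple mixture model" baseline the abstract
# contrasts DSNs against. Names, weights, and features are assumptions.
import numpy as np

def facility_location(summary, sim):
    # Representation term: how well the summary covers every item,
    # measured by the best similarity to any selected item.
    if not summary:
        return 0.0
    return float(np.max(sim[:, summary], axis=1).sum())

def feature_coverage(summary, feats):
    # Coverage term: concave (sqrt) over accumulated non-negative features,
    # giving the diminishing-returns behaviour described in the abstract.
    if not summary:
        return 0.0
    return float(np.sqrt(feats[summary].sum(axis=0)).sum())

def mixture_score(summary, sim, feats, w=(0.5, 0.5)):
    # A weighted mixture only learns w; it cannot reshape the features,
    # which is the limitation DSNs are designed to remove.
    return w[0] * facility_location(summary, sim) + w[1] * feature_coverage(summary, feats)

def greedy_summary(sim, feats, budget=3):
    # Standard greedy maximization, which enjoys a (1 - 1/e) guarantee
    # for monotone submodular objectives.
    selected = []
    for _ in range(budget):
        gains = [(mixture_score(selected + [i], sim, feats) -
                  mixture_score(selected, sim, feats), i)
                 for i in range(sim.shape[0]) if i not in selected]
        _, best_i = max(gains)
        selected.append(best_i)
    return selected

rng = np.random.default_rng(0)
feats = np.abs(rng.normal(size=(10, 8)))   # non-negative stand-in "image" features
sim = feats @ feats.T                      # similarity kernel over items
print(greedy_summary(sim, feats, budget=3))

In a DSN, both the features feeding such components and the way the components are composed would themselves be learned end to end, rather than fixed in advance as above.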
Related papers
- MLP-KAN: Unifying Deep Representation and Function Learning [7.634331640151854]
We introduce a unified method designed to eliminate the need for manual model selection.
By integrating Multi-Layer Perceptrons (MLPs) for representation learning and Kolmogorov-Arnold Networks (KANs) for function learning, we achieve remarkable results.
arXiv Detail & Related papers (2024-10-03T22:22:43Z) - Self-Supervised Representation Learning with Meta Comprehensive Regularization [11.387994024747842]
We introduce a module called CompMod with Meta Comprehensive Regularization (MCR), embedded into existing self-supervised frameworks.
We update our proposed model through a bi-level optimization mechanism, enabling it to capture comprehensive features.
We provide theoretical support for our proposed method from information-theoretic and causal counterfactual perspectives.
arXiv Detail & Related papers (2024-03-03T15:53:48Z) - Representation Surgery for Multi-Task Model Merging [57.63643005215592]
Multi-task learning (MTL) compresses the information from multiple tasks into a unified backbone to improve computational efficiency and generalization.
Recent work directly merges multiple independently trained models to perform MTL instead of collecting their raw data for joint training.
By visualizing the representation distribution of existing model merging schemes, we find that the merged model often suffers from the dilemma of representation bias.
arXiv Detail & Related papers (2024-02-05T03:39:39Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Neural Basis Models for Interpretability [33.51591891812176]
Generalized Additive Models (GAMs) are an inherently interpretable class of models.
We propose an entirely new subfamily of GAMs that utilize basis decomposition of shape functions.
A small number of basis functions are shared among all features, and are learned jointly for a given task.
arXiv Detail & Related papers (2022-05-27T17:31:19Z) - Submodularity In Machine Learning and Artificial Intelligence [0.0]
We offer a plethora of submodular definitions and a full description of example submodular functions and their generalizations (the basic diminishing-returns definitions are sketched after this list).
We then turn to how submodularity is useful in machine learning and artificial intelligence.
arXiv Detail & Related papers (2022-01-31T22:41:35Z) - Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models [61.480085460269514]
We propose a framework for building interpretable systems that learn to solve complex tasks by decomposing them into simpler ones solvable by existing models.
We use this framework to build ModularQA, a system that can answer multi-hop reasoning questions by decomposing them into sub-questions answerable by a neural factoid single-span QA model and a symbolic calculator.
arXiv Detail & Related papers (2020-09-01T23:45:42Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards dynamic structures that are capable of simultaneously exploiting both modular and temporal structures.
We find our models to be robust to the number of available views and better able to generalize to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z) - Revealing the Invisible with Model and Data Shrinking for Composite-database Micro-expression Recognition [49.463864096615254]
We analyze the influence of learning complexity, including the input complexity and model complexity.
We propose a recurrent convolutional network (RCN) to explore a shallower architecture and lower-resolution input data.
We develop three parameter-free modules to integrate with RCN without increasing any learnable parameters.
arXiv Detail & Related papers (2020-06-17T06:19:24Z) - From Sets to Multisets: Provable Variational Inference for Probabilistic Integer Submodular Models [82.95892656532696]
Submodular functions have been studied extensively in machine learning and data mining.
In this work, we propose a continuous DR-submodular extension for integer submodular functions.
We formulate a new probabilistic model which is defined through integer submodular functions.
arXiv Detail & Related papers (2020-06-01T22:20:45Z)
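For reference (standard definitions, not specific to any single paper above), the diminishing-returns property underlying the submodular entries in this list, and the DR-submodular extension to the integer lattice mentioned in the last entry, can be written as:

% Submodularity (diminishing returns) on sets:
% for all A \subseteq B \subseteq V and v \in V \setminus B,
\[ f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B). \]

% DR-submodularity on the integer lattice (the multiset setting):
% for all x \le y in \mathbb{Z}_{\ge 0}^{n} and every unit vector e_i,
\[ f(x + e_i) - f(x) \;\ge\; f(y + e_i) - f(y). \]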