Interlocking-free Selective Rationalization Through Genetic-based Learning
- URL: http://arxiv.org/abs/2412.10312v1
- Date: Fri, 13 Dec 2024 17:52:48 GMT
- Title: Interlocking-free Selective Rationalization Through Genetic-based Learning
- Authors: Federico Ruggeri, Gaetano Signorelli,
- Abstract summary: We present GenSPP, the first interlocking-free architecture for selective rationalization that does not require any learning overhead.
Experiments on a synthetic and a real-world benchmark show that our model outperforms several state-of-the-art competitors.
- Score: 7.504525967508676
- License:
- Abstract: A popular end-to-end architecture for selective rationalization is the select-then-predict pipeline, comprising a generator to extract highlights fed to a predictor. Such a cooperative system suffers from suboptimal equilibrium minima due to the dominance of one of the two modules, a phenomenon known as interlocking. While several contributions aimed at addressing interlocking, they only mitigate its effect, often by introducing feature-based heuristics, sampling, and ad-hoc regularizations. We present GenSPP, the first interlocking-free architecture for selective rationalization that does not require any learning overhead, as the above-mentioned. GenSPP avoids interlocking by performing disjoint training of the generator and predictor via genetic global search. Experiments on a synthetic and a real-world benchmark show that our model outperforms several state-of-the-art competitors.
Related papers
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion [53.90516061351706]
We present InterHandGen, a novel framework that learns the generative prior of two-hand interaction.
For sampling, we combine anti-penetration and synthesis-free guidance to enable plausible generation.
Our method significantly outperforms baseline generative models in terms of plausibility and diversity.
arXiv Detail & Related papers (2024-03-26T06:35:55Z) - Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs)
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
Our approach CIE not only significantly enhances the performance of GNNs but outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z) - Unsupervised Selective Rationalization with Noise Injection [7.17737088382948]
unsupervised selective rationalization produces rationales alongside predictions by chaining two jointly-trained components, a rationale generator and a predictor.
We introduce a novel training technique that effectively limits generation of implausible rationales by injecting noise between the generator and the predictor.
We achieve sizeable improvements in rationale plausibility and task accuracy over the state-of-the-art across a variety of tasks, including our new benchmark.
arXiv Detail & Related papers (2023-05-27T17:34:36Z) - Decoupled Rationalization with Asymmetric Learning Rates: A Flexible
Lipschitz Restraint [16.54547887989801]
Self-explaining rationalization model is generally constructed by a cooperative game where a generator selects the most human-intelligible pieces from the input text as rationales, followed by a predictor that makes predictions based on the selected rationales.
Such a cooperative game may incur the degeneration problem where the predictor overfits to the uninformative pieces generated by a not yet well-trained generator and in turn, leads the generator to converge to a sub-optimal model that tends to select senseless pieces.
We empirically propose a simple but effective method named DR, which can naturally and flexibly restrain the Lipschitz constant of the
arXiv Detail & Related papers (2023-05-23T02:01:13Z) - IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint
Multi-Agent Trajectory Prediction [73.25645602768158]
IPCC-TP is a novel relevance-aware module based on Incremental Pearson Correlation Coefficient to improve multi-agent interaction modeling.
Our module can be conveniently embedded into existing multi-agent prediction methods to extend original motion distribution decoders.
arXiv Detail & Related papers (2023-03-01T15:16:56Z) - Random Feature Models for Learning Interacting Dynamical Systems [2.563639452716634]
We consider the problem of constructing a data-based approximation of the interacting forces directly from noisy observations of the paths of the agents in time.
The learned interaction kernels are then used to predict the agents behavior over a longer time interval.
In addition, imposing sparsity reduces the kernel evaluation cost which significantly lowers the simulation cost for forecasting the multi-agent systems.
arXiv Detail & Related papers (2022-12-11T20:09:36Z) - Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of an automatic algorithms that provide personalized ranking by adapting to the current conditions.
For the former, we propose novel algorithm called SAROS that take into account both kinds of feedback for learning over the sequence of interactions.
The proposed idea of taking into account the neighbour lines shows statistically significant results in comparison with the initial approach for faults detection in power grid.
arXiv Detail & Related papers (2022-05-13T21:09:41Z) - Understanding Interlocking Dynamics of Cooperative Rationalization [90.6863969334526]
Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output.
We reveal a major problem with such cooperative rationalization paradigm -- model interlocking.
We propose a new rationalization framework, called A2R, which introduces a third component into the architecture, a predictor driven by soft attention as opposed to selection.
arXiv Detail & Related papers (2021-10-26T17:39:18Z) - Explaining a Series of Models by Propagating Local Feature Attributions [9.66840768820136]
Pipelines involving several machine learning models improve performance in many domains but are difficult to understand.
We introduce a framework to propagate local feature attributions through complex pipelines of models based on a connection to the Shapley value.
Our framework enables us to draw higher-level conclusions based on groups of gene expression features for Alzheimer's and breast cancer histologic grade prediction.
arXiv Detail & Related papers (2021-04-30T22:20:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.