Using Interventions to Improve Out-of-Distribution Generalization of
Text-Matching Recommendation Systems
- URL: http://arxiv.org/abs/2210.10636v2
- Date: Wed, 14 Jun 2023 10:17:56 GMT
- Title: Using Interventions to Improve Out-of-Distribution Generalization of
Text-Matching Recommendation Systems
- Authors: Parikshit Bansal, Yashoteja Prabhu, Emre Kiciman, Amit Sharma
- Abstract summary: Fine-tuning a large, base language model on paired item relevance data can be counter-productive for generalization.
For a product recommendation task, fine-tuning obtains worse accuracy than the base model when recommending items in a new category or for a future time period.
We propose an intervention-based regularizer that constrains the causal effect of any token on the model's relevance score to be similar to its effect under the base model.
- Score: 14.363532867533012
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a user's input text, text-matching recommender systems output relevant
items by comparing the input text to the available items' descriptions, such as
product-to-product recommendation on e-commerce platforms. As users' interests
and item inventory are expected to change, it is important for a text-matching
system to generalize to data shifts, a task known as out-of-distribution (OOD)
generalization. However, we find that the popular approach of fine-tuning a
large, base language model on paired item relevance data (e.g., user clicks)
can be counter-productive for OOD generalization. For a product recommendation
task, fine-tuning obtains worse accuracy than the base model when recommending
items in a new category or for a future time period. To explain this
generalization failure, we consider an intervention-based importance metric,
which shows that a fine-tuned model captures spurious correlations and fails to
learn the causal features that determine the relevance between any two text
inputs. Moreover, standard methods for causal regularization do not apply in
this setting, because unlike in images, there exist no universally spurious
features in a text-matching task (the same token may be spurious or causal
depending on the text it is being matched to). For OOD generalization on text
inputs, therefore, we highlight a different goal: avoiding high importance
scores for certain features. We do so using an intervention-based regularizer
that constrains the causal effect of any token on the model's relevance score
to be similar to its effect under the base model. Results on an Amazon product
dataset and three question recommendation datasets show that our proposed
regularizer improves
generalization for both in-distribution and OOD evaluation, especially in
difficult scenarios when the base model is not accurate.
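To make the proposed regularizer concrete, here is a minimal sketch in PyTorch: it defines a token's causal effect as the change in relevance score when the token is masked out (an intervention), and penalizes the squared gap between the fine-tuned and base models' effects. This is an illustration of the idea, not the authors' implementation; the scoring functions, `mask_id`, and `lambda_reg` are hypothetical names.

```python
# Hedged sketch of the intervention-based regularizer described above.
# Assumptions (not from the paper): `score_fn(query_ids, item_ids)` returns a
# relevance score tensor, and replacing a token with `mask_id` implements the
# intervention "remove this token".
import torch


def token_effect(score_fn, query_ids, item_ids, position, mask_id):
    """Causal effect of one item token on the relevance score:
    score with the token present minus score with the token masked out."""
    original = score_fn(query_ids, item_ids)
    intervened = item_ids.clone()
    intervened[..., position] = mask_id  # do(token = MASK)
    return original - score_fn(query_ids, intervened)


def intervention_regularizer(ft_score_fn, base_score_fn,
                             query_ids, item_ids, positions, mask_id):
    """Penalize token effects under the fine-tuned model that drift from the
    (frozen) base model's effects, so no token gains spurious importance."""
    penalty = 0.0
    for pos in positions:  # in practice, a small sampled subset of positions
        effect_ft = token_effect(ft_score_fn, query_ids, item_ids,
                                 pos, mask_id)
        with torch.no_grad():  # the base model provides a fixed target
            effect_base = token_effect(base_score_fn, query_ids, item_ids,
                                       pos, mask_id)
        penalty = penalty + (effect_ft - effect_base).pow(2).mean()
    return penalty / max(len(positions), 1)


# Sketch of the combined objective: the usual relevance loss on click data
# plus the penalty, weighted by a hypothetical hyperparameter `lambda_reg`:
# loss = relevance_loss + lambda_reg * intervention_regularizer(...)
```

Each token effect costs an extra forward pass, so sampling only a few token positions per batch would keep the regularizer tractable during training.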
Related papers
- Disentangling Likes and Dislikes in Personalized Generative Explainable Recommendation [26.214148426964794]
We introduce new datasets and evaluation methods that focus on the users' sentiments.
We construct the datasets by explicitly extracting users' positive and negative opinions from their post-purchase reviews.
We propose to evaluate systems based on whether the generated explanations align well with the users' sentiments.
arXiv Detail & Related papers (2024-10-17T06:15:00Z)
- JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction [59.94977231327573]
We propose a multi-task learning model with value generation/classification and attribute prediction called JPAVE.
Two variants of our model are designed for open-world and closed-world scenarios.
Experimental results on a public dataset demonstrate the superiority of our model compared with strong baselines.
arXiv Detail & Related papers (2023-11-07T18:36:16Z)
- Text Matching Improves Sequential Recommendation by Reducing Popularity Biases [48.272381505993366]
TASTE verbalizes items and user-item interactions using identifiers and attributes of items.
Our experiments show that TASTE outperforms the state-of-the-art methods on widely used sequential recommendation datasets.
arXiv Detail & Related papers (2023-08-27T07:44:33Z)
- SWING: Balancing Coverage and Faithfulness for Dialogue Summarization [67.76393867114923]
We propose to utilize natural language inference (NLI) models to improve coverage while avoiding factual inconsistencies.
We use NLI to compute fine-grained training signals that encourage the model to generate content from the reference summaries that has not yet been covered.
Experiments on the DialogSum and SAMSum datasets confirm the effectiveness of the proposed approach.
arXiv Detail & Related papers (2023-01-25T09:33:11Z)
- SMART: Sentences as Basic Units for Text Evaluation [48.5999587529085]
In this paper, we introduce a new metric called SMART to mitigate such limitations.
We treat sentences as basic units of matching instead of tokens, and use a sentence matching function to soft-match candidate and reference sentences.
Our results show that, with a model-based matching function, our proposed metric achieves higher system-level correlations than all competing metrics.
arXiv Detail & Related papers (2022-08-01T17:58:05Z)
- Sequential Recommendation via Stochastic Self-Attention [68.52192964559829]
Transformer-based approaches embed items as vectors and use dot-product self-attention to measure the relationship between items.
We propose a novel STOchastic Self-Attention (STOSA) mechanism to overcome these issues.
We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences.
arXiv Detail & Related papers (2022-01-16T12:38:45Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn ⟨sentiment, aspect⟩ joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)