When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE
Systems for Downstream Applications
- URL: http://arxiv.org/abs/2211.08228v1
- Date: Tue, 15 Nov 2022 15:48:27 GMT
- Title: When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE
Systems for Downstream Applications
- Authors: Kevin Pei (Grainger College of Engineering, University of Illinois at
Urbana-Champaign), Ishan Jindal (IBM Research), Kevin Chen-Chuan Chang
(Grainger College of Engineering, University of Illinois at
Urbana-Champaign), Chengxiang Zhai (Grainger College of Engineering,
University of Illinois at Urbana-Champaign), Yunyao Li (Apple Knowledge
Platform)
- Abstract summary: We present an application-focused empirical survey of neural OpenIE models, training sets, and benchmarks.
We find that the different assumptions made by different models and datasets have a statistically significant effect on performance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open Information Extraction (OpenIE) has been used in the pipelines of
various NLP tasks. Unfortunately, there is no clear consensus on which models
to use in which tasks. Muddying things further is the lack of comparisons that
take differing training sets into account. In this paper, we present an
application-focused empirical survey of neural OpenIE models, training sets,
and benchmarks in an effort to help users choose the most suitable OpenIE
systems for their applications. We find that the different assumptions made by
different models and datasets have a statistically significant effect on
performance, making it important to choose the most appropriate model for one's
applications. We demonstrate the applicability of our recommendations on a
downstream Complex QA application.
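OpenIE systems emit (subject, relation, object) triples that downstream tasks consume. As a purely illustrative sketch of that output format, the hypothetical `extract_triples` helper below matches a few copula/verb patterns with a regex; it is a toy stand-in, not any of the neural systems surveyed in the paper.

```python
# Toy illustration of the (subject, relation, object) triple format that
# OpenIE systems emit. extract_triples() is a hypothetical rule-based
# sketch for illustration only, not a model from the survey.
import re

def extract_triples(sentence):
    """Extract naive SVO triples from '<subject> is/was/has/uses <object>' clauses."""
    pattern = re.compile(
        r"^(?P<subj>.+?)\s+(?P<rel>is|was|has|uses)\s+(?P<obj>.+?)\.?$"
    )
    m = pattern.match(sentence.strip())
    if m is None:
        return []
    return [(m.group("subj"), m.group("rel"), m.group("obj"))]

print(extract_triples("OpenIE is a precursor step in many NLP pipelines."))
# → [('OpenIE', 'is', 'a precursor step in many NLP pipelines')]
```

Real neural OpenIE models differ precisely in the assumptions they make about such triples (e.g. n-ary arguments, implied relations), which is what makes model choice application-dependent.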
Related papers
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods independently score and select data in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- Discourse-Aware In-Context Learning for Temporal Expression Normalization [7.621550020607368]
In this work, we explore the feasibility of proprietary and open-source large language models (LLMs) for TE normalization.
By using a window-based prompt design approach, we can perform TE normalization across sentences, while leveraging the LLM knowledge without training the model.
Our experiments show competitive results to models designed for this task.
arXiv Detail & Related papers (2024-04-11T14:13:44Z)
- LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface-form cues to identify data with the necessary reasoning skills for the intended downstream application.
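The core selection step can be pictured as scoring each training example by the cosine similarity between its (low-rank-projected) gradient and a target-task gradient, then keeping the top fraction. The NumPy sketch below is schematic: the gradients are random stand-ins and the helper name is hypothetical, not the paper's actual API.

```python
# Schematic sketch of gradient-similarity data selection in the spirit of
# LESS: rank training examples by cosine similarity between their gradients
# and a target-task gradient, keep the top 5%. Gradients here are random
# placeholders; select_top_fraction() is a hypothetical helper.
import numpy as np

def select_top_fraction(train_grads, target_grad, fraction=0.05):
    """Return indices of the top `fraction` of examples by cosine similarity."""
    g = train_grads / np.linalg.norm(train_grads, axis=1, keepdims=True)
    t = target_grad / np.linalg.norm(target_grad)
    scores = g @ t  # cosine similarity per training example
    k = max(1, int(len(scores) * fraction))
    return np.argsort(scores)[::-1][:k]  # indices, highest scores first

rng = np.random.default_rng(0)
train_grads = rng.normal(size=(200, 16))  # 200 examples, 16-dim projected grads
target_grad = rng.normal(size=16)
idx = select_top_fraction(train_grads, target_grad, fraction=0.05)
print(len(idx))  # 10 selected examples
```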
arXiv Detail & Related papers (2024-02-06T19:18:04Z)
- Streamlined Framework for Agile Forecasting Model Development towards Efficient Inventory Management [2.0625936401496237]
This paper proposes a framework for developing forecasting models by streamlining the connections between core components of the developmental process.
The proposed framework enables swift and robust integration of new datasets, experimentation on different algorithms, and selection of the best models.
arXiv Detail & Related papers (2023-04-13T08:52:32Z)
- Cross-Modal Fine-Tuning: Align then Refine [83.37294254884446]
ORCA is a cross-modal fine-tuning framework that extends the applicability of a single large-scale pretrained model to diverse modalities.
We show that ORCA obtains state-of-the-art results on 3 benchmarks containing over 60 datasets from 12 modalities.
arXiv Detail & Related papers (2023-02-11T16:32:28Z)
- Reusable Self-Attention Recommender Systems in Fashion Industry Applications [0.0]
We present live experimental results demonstrating improvements in user retention of up to 30%.
We focus on fashion inspiration use-cases, such as outfit ranking, outfit recommendation and real-time personalized outfit generation.
arXiv Detail & Related papers (2023-01-17T10:00:17Z)
- A Data-Centric AI Paradigm Based on Application-Driven Fine-grained Dataset Design [2.2223262422197907]
We propose a novel paradigm for fine-grained design of datasets, driven by industrial applications.
We flexibly select positive and negative sample sets according to the essential features of the data and application requirements.
Compared with traditional data design methods, our method achieves better results and effectively reduces false alarms.
arXiv Detail & Related papers (2022-09-20T03:56:53Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
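The general scheme being automated is column-wise iterative imputation: each column with missing values is repeatedly refit from the others. The toy NumPy loop below uses ordinary least squares for every column as a minimal sketch of that scheme, under the assumption of numeric data; it is illustrative only, not the paper's implementation.

```python
# Minimal sketch of column-wise iterative imputation, the general scheme
# that HyperImpute automates model selection for. Each column's missing
# entries are refit from the other columns via ordinary least squares;
# illustrative only, not the paper's implementation.
import numpy as np

def iterative_impute(X, n_iters=10):
    """Impute NaNs in a 2-D float array via round-robin linear regressions."""
    X = X.astype(float).copy()
    mask = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[mask] = np.take(col_means, np.where(mask)[1])  # mean initialization
    for _ in range(n_iters):
        for j in range(X.shape[1]):
            if not mask[:, j].any():
                continue  # column is fully observed
            others = np.delete(X, j, axis=1)
            A = np.column_stack([others, np.ones(len(X))])  # add intercept
            obs = ~mask[:, j]
            coef, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            X[mask[:, j], j] = A[mask[:, j]] @ coef  # refit missing entries
    return X

X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, np.nan], [4.0, 8.0]])
print(iterative_impute(X))  # the NaN converges to 6.0 (column 2 = 2 * column 1)
```

HyperImpute's contribution is choosing the per-column learner adaptively rather than fixing one model class as this sketch does.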
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness of MACE, with better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Top-N Recommendation with Counterfactual User Preference Simulation [26.597102553608348]
Top-N recommendation, which aims to learn user ranking-based preference, has long been a fundamental problem in a wide range of applications.
In this paper, we propose to reformulate the recommendation task within the causal inference framework to handle the data scarcity problem.
arXiv Detail & Related papers (2021-09-02T14:28:46Z)
- Information Directed Reward Learning for Reinforcement Learning [64.33774245655401]
We learn a model of the reward function that allows standard RL algorithms to achieve high expected return with as few expert queries as possible.
In contrast to prior active reward learning methods designed for specific types of queries, IDRL naturally accommodates different query types.
We support our findings with extensive evaluations in multiple environments and with different types of queries.
arXiv Detail & Related papers (2021-02-24T18:46:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.