Enriching Social Science Research via Survey Item Linking
- URL: http://arxiv.org/abs/2412.15831v1
- Date: Fri, 20 Dec 2024 12:14:33 GMT
- Title: Enriching Social Science Research via Survey Item Linking
- Authors: Tornike Tsereteli, Daniel Ruffinelli, Simone Paolo Ponzetto
- Abstract summary: We model a task called Survey Item Linking (SIL) in two stages: mention detection and entity disambiguation.
We create a high-quality and richly annotated dataset consisting of 20,454 English and German sentences.
We demonstrate that the task is feasible, but observe that errors propagate from the first stage, leading to a lower overall task performance.
- Score: 11.902701975866595
- Abstract: Questions within surveys, called survey items, are used in the social sciences to study latent concepts, such as the factors influencing life satisfaction. Instead of using explicit citations, researchers paraphrase the content of the survey items they use in-text. However, this makes it challenging to find survey items of interest when comparing related work. Automatically parsing and linking these implicit mentions to survey items in a knowledge base can provide more fine-grained references. We model this task, called Survey Item Linking (SIL), in two stages: mention detection and entity disambiguation. Because the task has previously been imprecisely defined, existing datasets for evaluating SIL performance are too small and of low quality. We argue that latent concepts and survey item mentions should be differentiated. To this end, we create a high-quality and richly annotated dataset consisting of 20,454 English and German sentences. By benchmarking deep learning systems for each of the two stages independently and sequentially, we demonstrate that the task is feasible, but we observe that errors propagate from the first stage, lowering overall task performance. Moreover, mentions that require the context of multiple sentences are harder for models to identify in the first stage. Modeling the entire context of a document and combining the two stages into an end-to-end system could mitigate these problems in future work, and errors could be further reduced by collecting more diverse data and by improving the quality of the knowledge base. The data and code are available at https://github.com/e-tornike/SIL .
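The abstract describes a two-stage pipeline: detect survey-item mentions, then link each mention to an item in a knowledge base. Below is a minimal sketch of that structure, not the authors' implementation (which is in the linked repository); the mention detector is a stub, the disambiguator uses off-the-shelf sentence embeddings, and the toy knowledge base, encoder name, and helper functions are all assumptions for illustration.

```python
# Minimal sketch of a two-stage SIL pipeline (illustrative, not the
# paper's system): stage 1 detects mentions, stage 2 links them to
# survey items by embedding similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy knowledge base of survey items (hypothetical examples).
SURVEY_ITEMS = {
    "q1": "All things considered, how satisfied are you with your life?",
    "q2": "How interested would you say you are in politics?",
}

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder
item_ids = list(SURVEY_ITEMS)
item_vecs = encoder.encode([SURVEY_ITEMS[i] for i in item_ids])

def detect_mentions(sentence: str) -> list[str]:
    """Stage 1 (mention detection) stub: a trained sequence tagger
    would go here; we naively treat the whole sentence as a mention."""
    return [sentence]

def disambiguate(mention: str) -> str:
    """Stage 2 (entity disambiguation): link a mention to the most
    similar survey item by cosine similarity."""
    v = encoder.encode([mention])[0]
    sims = item_vecs @ v / (np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(v))
    return item_ids[int(np.argmax(sims))]

sentence = "Respondents rated their overall satisfaction with life."
for mention in detect_mentions(sentence):
    print(mention, "->", disambiguate(mention))
```

The sketch also makes the abstract's error-propagation point concrete: if stage 1 misses or mis-spans a mention, stage 2 never gets a chance to link it, so end-to-end performance is bounded by mention detection.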
Related papers
- Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning [12.377363857246602]
We propose a novel model named MI-DELIGHT for short text classification.
It first performs multi-source information exploration to alleviate sparsity issues.
Then, a graph learning approach is adopted to learn representations of short texts.
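The summary above is high-level; as a hedged illustration of the contrastive component only, here is a generic InfoNCE-style loss in PyTorch. It is not MI-DELIGHT's dual-level objective, and the batch size and embedding dimension are arbitrary assumptions.

```python
# Generic InfoNCE-style contrastive loss (illustrative only; the
# paper's objective is dual-level and differs from this sketch).
import torch
import torch.nn.functional as F

def info_nce(anchors: torch.Tensor, positives: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """anchors/positives: (batch, dim) embeddings of two views of the
    same short texts; other in-batch items act as negatives."""
    a = F.normalize(anchors, dim=1)
    p = F.normalize(positives, dim=1)
    logits = a @ p.T / temperature      # (batch, batch) similarities
    targets = torch.arange(a.size(0))   # matching pairs on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 64), torch.randn(8, 64))
print(loss.item())
```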
arXiv Detail & Related papers (2025-01-16T00:26:15Z) - Adaptive Question Answering: Enhancing Language Model Proficiency for Addressing Knowledge Conflicts with Source Citations [3.3018718917393297]
We propose the novel task of Question Answering with source citation in ambiguous settings, where multiple valid answers exist.
We create a comprehensive framework consisting of: (1) five novel datasets; (2) the first ambiguous multi-hop QA dataset featuring real-world, naturally occurring contexts; and (3) two new metrics to evaluate models' performances.
We hope that this new task, datasets, metrics, and baselines will inspire the community to push the boundaries of QA research and develop more trustworthy and interpretable systems.
arXiv Detail & Related papers (2024-10-05T17:37:01Z) - List-aware Reranking-Truncation Joint Model for Search and Retrieval-augmented Generation [80.12531449946655]
We propose a Reranking-Truncation joint model (GenRT) that can perform the two tasks concurrently.
GenRT integrates reranking and truncation via a generative paradigm based on an encoder-decoder architecture.
Our method achieves SOTA performance on both reranking and truncation tasks for web search and retrieval-augmented LLMs.
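GenRT itself couples the two decisions inside a generative encoder-decoder model; the sketch below only shows what reranking followed by truncation means operationally, with a word-overlap stub standing in for a learned relevance scorer and an assumed threshold.

```python
# Illustrative reranking + truncation over a retrieved list (a stub,
# not GenRT: the paper makes both decisions jointly with a generative
# encoder-decoder model).

def relevance(query: str, doc: str) -> float:
    """Stub scorer via word overlap; a learned model would go here."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def rerank_and_truncate(query: str, docs: list[str],
                        threshold: float = 0.3) -> list[str]:
    ranked = sorted(docs, key=lambda d: relevance(query, d), reverse=True)
    # Truncation: cut the ranked list once relevance drops below the
    # threshold, so downstream consumers (e.g. a RAG prompt) only see
    # useful context.
    return [d for d in ranked if relevance(query, d) >= threshold]

docs = [
    "survey items measure life satisfaction",
    "entity linking maps mentions to a knowledge base",
    "the weather was pleasant in December",
]
print(rerank_and_truncate("life satisfaction survey", docs))
```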
arXiv Detail & Related papers (2024-02-05T06:52:53Z) - DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain
Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z) - Detection, Disambiguation, Re-ranking: Autoregressive Entity Linking as a Multi-Task Problem [46.028180604304985]
We propose an autoregressive entity linking model that is trained with two auxiliary tasks and learns to re-rank generated samples at inference time.
We show through ablation studies that each of the two auxiliary tasks increases performance and that re-ranking is an important factor in the increase.
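As a hedged illustration of re-ranking generated candidates at inference time, the stubs below stand in for the paper's trained generator and re-ranker; the candidate strings and overlap scorer are invented for the example.

```python
# Illustrative candidate generation + re-ranking (stubs only; the
# paper's generator is autoregressive and its re-ranker is trained
# jointly with two auxiliary tasks).

def generate_candidates(mention: str) -> list[str]:
    """Stub generator: real systems produce these via beam search."""
    return ["Life satisfaction", "Satisfaction (album)", "Life (magazine)"]

def rerank_score(mention: str, candidate: str) -> float:
    """Stub re-ranker: Jaccard overlap between token sets."""
    m, c = set(mention.lower().split()), set(candidate.lower().split())
    return len(m & c) / max(len(m | c), 1)

mention = "overall life satisfaction"
best = max(generate_candidates(mention), key=lambda c: rerank_score(mention, c))
print(best)  # -> "Life satisfaction"
```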
arXiv Detail & Related papers (2022-04-12T17:55:22Z) - Double-Barreled Question Detection at Momentive [6.783610970053343]
A double-barreled question (DBQ) is a common type of biased question that asks about two aspects in a single question.
Momentive aims to detect DBQs and recommend that survey creators revise them, in order to gather high-quality, unbiased survey data.
We present an end-to-end machine learning approach for DBQ classification in this work.
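As a minimal stand-in for such a classifier (not Momentive's production system), a bag-of-words baseline makes the binary task concrete; the tiny training set below is invented.

```python
# Toy DBQ-detection baseline (illustrative only; Momentive's system
# is an end-to-end model, not this sketch).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

questions = [
    "How satisfied are you with the price and quality of the product?",  # DBQ
    "Was the staff friendly and was the checkout fast?",                  # DBQ
    "How satisfied are you with the price of the product?",
    "Was the staff friendly?",
]
labels = [1, 1, 0, 0]  # 1 = double-barreled

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(questions, labels)
print(clf.predict(["Do you find the app useful and easy to install?"]))
```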
arXiv Detail & Related papers (2022-02-12T00:04:24Z) - Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z) - Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
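The snippet below shows only the generation call with an off-the-shelf GPT-2 from Hugging Face transformers; the paper's models are fine-tuned on INQUISITIVE, so outputs here will not match theirs, and the context string is invented.

```python
# Off-the-shelf GPT-2 generation (illustrative; the paper fine-tunes
# on INQUISITIVE, which this sketch does not reproduce).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

context = ("A new study links commuting time to life satisfaction. "
           "Question:")
inputs = tokenizer(context, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=True,                       # sample so questions vary
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```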
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.