Preface to the Special Issue of the TAL Journal on Scholarly Document Processing
- URL: http://arxiv.org/abs/2506.03587v1
- Date: Wed, 04 Jun 2025 05:35:39 GMT
- Authors: Florian Boudin, Akiko Aizawa
- Abstract summary: The rapid growth of scholarly literature makes it increasingly difficult for researchers to keep up with new knowledge. This special issue of the TAL journal highlights research on natural language processing and information retrieval for scholarly and scientific documents.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid growth of scholarly literature makes it increasingly difficult for researchers to keep up with new knowledge. Automated tools are now more essential than ever to help navigate and interpret this vast body of information. Scientific papers pose unique difficulties, with their complex language, specialized terminology, and diverse formats, requiring advanced methods to extract reliable and actionable insights. Large language models (LLMs) offer new opportunities, enabling tasks such as literature reviews, writing assistance, and interactive exploration of research. This special issue of the TAL journal highlights research addressing these challenges and, more broadly, research on natural language processing and information retrieval for scholarly and scientific documents.
Related papers
- Patience is all you need! An agentic system for performing scientific literature review [0.0]
Large language models (LLMs) are increasingly used to support question answering across numerous disciplines. We have built an LLM-based system that searches and distills information contained in scientific literature. We evaluate our keyword-based search and information distillation system against a set of biology-related questions from previously released literature benchmarks.
Date: 2025-03-28T08:08:46Z
- Zero-Shot Complex Question-Answering on Long Scientific Documents [0.0]
We present a zero-shot pipeline framework that enables social science researchers to perform question-answering tasks on full-length research papers. Our approach integrates pre-trained language models to handle challenging scenarios including multi-span extraction, multi-hop reasoning, and long-answer generation.
Date: 2025-03-04T15:12:18Z
- LLAssist: Simple Tools for Automating Literature Review Using Large Language Models [0.0]
LLAssist is an open-source tool designed to streamline literature reviews in academic research. It uses Large Language Models (LLMs) and Natural Language Processing (NLP) techniques to automate key aspects of the review process.
Date: 2024-07-19T02:48:54Z
- Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval enhancement can be extended to a broader spectrum of machine learning (ML).
This work introduces a formal framework for this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature across various ML domains with consistent notation, which is missing from the current literature.
The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
Date: 2024-07-17T20:01:21Z
- SurveyAgent: A Conversational System for Personalized and Efficient Research Survey [50.04283471107001]
This paper introduces SurveyAgent, a novel conversational system designed to provide personalized and efficient research survey assistance to researchers.
SurveyAgent integrates three key modules: Knowledge Management for organizing papers, Recommendation for discovering relevant literature, and Query Answering for engaging with content on a deeper level.
Our evaluation demonstrates SurveyAgent's effectiveness in streamlining research activities, showcasing its capability to facilitate how researchers interact with scientific literature.
Date: 2024-04-09T15:01:51Z
- The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces [54.2590226904332]
We describe the Semantic Reader Project, an effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers.
Ten prototype interfaces have been developed and more than 300 participants and real-world users have shown improved reading experiences.
We structure this paper around challenges scholars and the public face when reading research papers.
Date: 2023-03-25T02:47:09Z
- A Search Engine for Discovery of Biomedical Challenges and Directions [38.72769142277108]
We construct and release an expert-annotated corpus of texts sampled from full-length papers.
We focus on a large corpus of interdisciplinary work relating to the COVID-19 pandemic.
We apply a model trained on our data to identify challenges and directions across the corpus and build a dedicated search engine for this information.
Date: 2021-08-31T11:08:20Z
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
Date: 2021-06-03T03:00:12Z
- A New Neural Search and Insights Platform for Navigating and Organizing AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature.
We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
Date: 2020-10-30T19:12:25Z
- Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain [1.9004296236396943]
We present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications.
Within this research work, we tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools.
We generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain.
Date: 2020-10-28T08:31:40Z