Automating Chapter-Level Classification for Electronic Theses and Dissertations
- URL: http://arxiv.org/abs/2411.17614v1
- Date: Tue, 26 Nov 2024 17:27:18 GMT
- Title: Automating Chapter-Level Classification for Electronic Theses and Dissertations
- Authors: Bipasha Banerjee, William A. Ingram, Edward A. Fox
- Abstract summary: We propose a machine learning and AI-driven solution to automatically categorize ETD chapters.
This solution is intended to improve discoverability and promote understanding of chapters.
We aim to support interdisciplinary research and make ETDs more accessible.
- Abstract: Traditional archival practices for describing electronic theses and dissertations (ETDs) rely on broad, high-level metadata schemes that fail to capture the depth, complexity, and interdisciplinary nature of these long scholarly works. The lack of detailed, chapter-level content descriptions impedes researchers' ability to locate specific sections or themes, thereby reducing discoverability and overall accessibility. By providing chapter-level metadata information, we improve the effectiveness of ETDs as research resources. This makes it easier for scholars to navigate them efficiently and extract valuable insights. The absence of such metadata further obstructs interdisciplinary research by obscuring connections across fields, hindering new academic discoveries and collaboration. In this paper, we propose a machine learning and AI-driven solution to automatically categorize ETD chapters. This solution is intended to improve discoverability and promote understanding of chapters. Our approach enriches traditional archival practices by providing context-rich descriptions that facilitate targeted navigation and improved access. We aim to support interdisciplinary research and make ETDs more accessible. By providing chapter-level classification labels and using them to index in our developed prototype system, we make content in ETD chapters more discoverable and usable for a diverse range of scholarly needs. Implementing this AI-enhanced approach allows archives to serve researchers better, enabling efficient access to relevant information and supporting deeper engagement with ETDs. This will increase the impact of ETDs as research tools, foster interdisciplinary exploration, and reinforce the role of archives in scholarly communication within the data-intensive academic landscape.
Related papers
- Comparison of Feature Learning Methods for Metadata Extraction from PDF Scholarly Documents [8.516310581591426]
This study evaluates various feature learning and prediction methods, including natural language processing (NLP), computer vision (CV), and multimodal approaches, for extracting metadata from documents with high template variance.
We aim to improve the accessibility of scientific documents and facilitate their wider use.
arXiv Detail & Related papers (2025-01-09T09:03:43Z)
- AI in Archival Science -- A Systematic Review [0.9749638953163389]
This study underscores the benefits of integrating artificial intelligence (AI) within the broad realm of archival science.
Our findings highlight key AI-driven strategies that promise to streamline record-keeping processes and enhance data retrieval efficiency.
arXiv Detail & Related papers (2024-10-07T14:39:12Z)
- Knowledge Navigator: LLM-guided Browsing Framework for Exploratory Search in Scientific Literature [48.572336666741194]
We present Knowledge Navigator, a system designed to enhance exploratory search abilities.
It organizes retrieved documents into a navigable, two-level hierarchy of named and descriptive scientific topics and subtopics.
arXiv Detail & Related papers (2024-08-28T14:48:37Z)
- DiscipLink: Unfolding Interdisciplinary Information Seeking Process via Human-AI Co-Exploration [34.23942131024738]
In this paper, we introduce DiscipLink, a novel interactive system that facilitates collaboration between researchers and large language models (LLMs).
Based on users' topics of interest, DiscipLink initiates exploratory questions from the perspectives of possible relevant fields of study.
Our evaluation, comprising a within-subject comparative experiment and an open-ended exploratory study, reveals that DiscipLink can effectively support researchers in breaking down disciplinary boundaries.
arXiv Detail & Related papers (2024-08-01T10:36:00Z)
- Incremental hierarchical text clustering methods: a review [49.32130498861987]
This study aims to analyze various hierarchical and incremental clustering techniques.
The main contribution of this research is the organization and comparison of the techniques used by studies published between 2010 and 2018 that addressed text document clustering.
arXiv Detail & Related papers (2023-12-12T22:27:29Z)
- Exploring Federated Unlearning: Analysis, Comparison, and Insights [101.64910079905566]
Federated unlearning enables the selective removal of data from models trained in federated systems.
This paper reviews existing federated unlearning approaches, examining their algorithmic efficiency, impact on model accuracy, and effectiveness in preserving privacy.
We propose the OpenFederatedUnlearning framework, a unified benchmark for evaluating federated unlearning methods.
arXiv Detail & Related papers (2023-10-30T01:34:33Z)
- DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research [96.10765714077208]
Traditional keyword-based search engines fall short in assisting users who may not be familiar with specific terminologies.
We present a knowledge graph-based paper search engine for biomedical research to enhance the user experience.
The system, dubbed DiscoverPath, employs Named Entity Recognition (NER) and part-of-speech (POS) tagging to extract terminologies and relationships from article abstracts to create a KG.
arXiv Detail & Related papers (2023-09-04T20:52:33Z)
- The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces [54.2590226904332]
We describe the Semantic Reader Project, an effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers.
Ten prototype interfaces have been developed and more than 300 participants and real-world users have shown improved reading experiences.
We structure this paper around challenges scholars and the public face when reading research papers.
arXiv Detail & Related papers (2023-03-25T02:47:09Z)
- Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature [21.440724685950443]
We present two novel lay summarisation datasets, PLOS (large-scale) and eLife (medium-scale).
We provide a thorough characterisation of our lay summaries, highlighting differing levels of readability and abstractiveness between datasets.
arXiv Detail & Related papers (2022-10-18T15:28:30Z)
- A New Neural Search and Insights Platform for Navigating and Organizing AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature.
We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
arXiv Detail & Related papers (2020-10-30T19:12:25Z)