Automatic Construction of Multiple Classification Dimensions for Managing Approaches in Scientific Papers
- URL: http://arxiv.org/abs/2505.23252v2
- Date: Fri, 13 Jun 2025 09:00:02 GMT
- Title: Automatic Construction of Multiple Classification Dimensions for Managing Approaches in Scientific Papers
- Authors: Bing Ma, Hai Zhuge
- Abstract summary: This paper identifies approach patterns in a top-down manner, refining the patterns through four distinct linguistic levels. Approaches in scientific papers are extracted based on approach patterns. Five dimensions for categorizing approaches are identified using these patterns.
- Score: 2.790757012827162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approaches form the foundation for conducting scientific research. Querying approaches from a vast body of scientific papers is extremely time-consuming, and without a well-organized management framework, researchers may face significant challenges in querying and utilizing relevant approaches. Constructing multiple dimensions on approaches and managing them from these dimensions can provide an efficient solution. First, this paper identifies approach patterns in a top-down manner, refining the patterns through four distinct linguistic levels: semantic level, discourse level, syntactic level, and lexical level. Approaches in scientific papers are extracted based on approach patterns. Additionally, five dimensions for categorizing approaches are identified using these patterns. This paper proposes using a tree structure to represent each step and measuring the similarity between different steps with a tree-structure-based similarity measure that focuses on syntactic-level similarities. A collection similarity measure is proposed to compute the similarity between approaches. A bottom-up clustering algorithm is proposed to construct class trees for approach components within each dimension by merging each approach component or class with its most similar approach component or class in each iteration. The class labels generated during the clustering process indicate the common semantics of the step components within the approach components in each class and are used to manage the approaches within the class. The class trees of the five dimensions collectively form a multi-dimensional approach space. The application of approach queries on the multi-dimensional approach space demonstrates that querying within this space ensures strong relevance between user queries and results and rapidly reduces the search space through a class-based query mechanism.
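The bottom-up construction of a class tree described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the collection similarity measure, so Jaccard similarity over sets of terms is used here as a stand-in, and the sketch merges only the single most similar pair per iteration. The names `jaccard` and `build_class_tree`, and the use of shared terms as class labels, are assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a: frozenset, b: frozenset) -> float:
    """Stand-in collection similarity (assumption): shared terms over all terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_class_tree(components: dict[str, set[str]]):
    """Agglomerative (bottom-up) clustering over approach components.

    Each component is a set of terms. A merged class is returned as a tuple
    (label, left_subtree, right_subtree), where label is the sorted list of
    terms common to all members (a hypothetical labeling rule).
    """
    if not components:
        raise ValueError("at least one component is required")
    # Each node holds: (terms shared by all members, all terms, subtree).
    nodes = [(frozenset(t), frozenset(t), name) for name, t in components.items()]
    while len(nodes) > 1:
        # Pick the pair of nodes with the highest similarity.
        i, j = max(
            combinations(range(len(nodes)), 2),
            key=lambda p: jaccard(nodes[p[0]][1], nodes[p[1]][1]),
        )
        (label_a, terms_a, tree_a), (label_b, terms_b, tree_b) = nodes[i], nodes[j]
        label = label_a & label_b
        merged = (label, terms_a | terms_b, (sorted(label), tree_a, tree_b))
        # Replace the two merged nodes with the new class node.
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][2]


example = {
    "a": {"train", "model"},
    "b": {"train", "network"},
    "c": {"parse", "tree"},
}
print(build_class_tree(example))
```

In this toy input, components `a` and `b` share the term `train` and are merged first, giving a class labeled `['train']`; `c` joins at the root, whose label is empty because no term is common to all three. A query mechanism of the kind the abstract describes could then descend only into classes whose labels match the query, which is one way the search space might shrink quickly.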
Related papers
- Semantic Correspondence: Unified Benchmarking and a Strong Baseline [14.012377730820342]
We present the first extensive survey of semantic correspondence methods. We aggregate and summarize the results of methods in the literature across various benchmarks into a unified comparative table. We propose a simple yet effective baseline that achieves state-of-the-art performance on multiple benchmarks.
arXiv Detail & Related papers (2025-05-23T16:07:16Z)
- Evaluating LLM-based Agents for Multi-Turn Conversations: A Survey [64.08485471150486]
This survey examines evaluation methods for large language model (LLM)-based agents in multi-turn conversational settings. We systematically reviewed nearly 250 scholarly sources, capturing the state of the art from various venues of publication.
arXiv Detail & Related papers (2025-03-28T14:08:40Z)
- Towards a Unified View of Preference Learning for Large Language Models: A Survey [88.66719962576005]
Large Language Models (LLMs) exhibit remarkably powerful capabilities.
One of the crucial factors to achieve success is aligning the LLM's output with human preferences.
We decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm.
arXiv Detail & Related papers (2024-09-04T15:11:55Z)
- Empirical and Experimental Perspectives on Big Data in Recommendation Systems: A Comprehensive Survey [2.6319554262325924]
This survey paper provides a comprehensive analysis of big data algorithms in recommendation systems.
It proposes a two-pronged approach: a thorough analysis of current algorithms and a novel, hierarchical taxonomy for precise categorization.
arXiv Detail & Related papers (2024-02-01T23:51:29Z)
- Incremental hierarchical text clustering methods: a review [49.32130498861987]
This study aims to analyze various hierarchical and incremental clustering techniques.
The main contribution of this research is the organization and comparison of the techniques used by studies published between 2010 and 2018 that aimed at clustering text documents.
arXiv Detail & Related papers (2023-12-12T22:27:29Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
- Information Retrieval in long documents: Word clustering approach for improving Semantics [0.0]
We propose an alternative to deep neural networks for semantic information retrieval for the case of long documents. This new approach, which exploits clustering techniques, takes into account the meaning of words in Information Retrieval systems targeting long as well as short documents.
arXiv Detail & Related papers (2023-02-20T18:32:57Z)
- Ontology Matching Through Absolute Orientation of Embedding Spaces [1.5169370091868053]
Ontology matching is a core task when creating interoperable and linked open datasets.
In this paper, we explore a structure-based mapping approach which is based on knowledge graph embeddings.
In experiments with synthetic data, we find that the approach works very well on similarly structured datasets.
arXiv Detail & Related papers (2022-04-08T12:59:31Z)
- Semantic Search for Large Scale Clinical Ontologies [63.71950996116403]
We present a deep learning approach to build a search system for large clinical vocabularies.
We propose a Triplet-BERT model and a method that generates training data based on semantic training data.
The model is evaluated using five real benchmark data sets, and the results show that our approach achieves high results on both free-text-to-concept and concept-to-concept searching over concept vocabularies.
arXiv Detail & Related papers (2022-01-01T05:15:42Z)
- Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method can improve several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z)
- Leveraging Class Hierarchies with Metric-Guided Prototype Learning [5.070542698701158]
In many classification tasks, the set of target classes can be organized into a hierarchy.
This structure induces a semantic distance between classes, and can be summarised under the form of a cost matrix.
We propose to model the hierarchical class structure by integrating this metric in the supervision of a prototypical network.
arXiv Detail & Related papers (2020-07-06T20:22:08Z)