Related papers: Falcon 7b for Software Mention Detection in Scholarly Documents

URL: http://arxiv.org/abs/2405.08514v1
Date: Tue, 14 May 2024 11:37:26 GMT
Title: Falcon 7b for Software Mention Detection in Scholarly Documents
Authors: AmeerAli Khan, Qusai Ramadan, Cong Yang, Zeyd Boukhers
Abstract summary: This paper investigates the application of Falcon-7b for the detection and classification of software mentions within scholarly texts. Through comprehensive experimentation, the paper explores different training strategies, including a dual-classifier approach, adaptive sampling, and weighted loss scaling. The findings highlight the benefits of selective labelling and adaptive sampling in improving the model's performance.
Score: 7.0413463890126735
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper aims to tackle the challenge posed by the increasing integration of software tools in research across various disciplines by investigating the application of Falcon-7b for the detection and classification of software mentions within scholarly texts. Specifically, the study focuses on solving Subtask I of the Software Mention Detection in Scholarly Publications (SOMD), which entails identifying and categorizing software mentions from academic literature. Through comprehensive experimentation, the paper explores different training strategies, including a dual-classifier approach, adaptive sampling, and weighted loss scaling, to enhance detection accuracy while overcoming the complexities of class imbalance and the nuanced syntax of scholarly writing. The findings highlight the benefits of selective labelling and adaptive sampling in improving the model's performance. However, they also indicate that integrating multiple strategies does not necessarily result in cumulative improvements. This research offers insights into the effective application of large language models for specific tasks such as SOMD, underlining the importance of tailored approaches to address the unique challenges presented by academic text analysis.
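The abstract names weighted loss scaling as one way to handle class imbalance but does not specify the scheme. As a minimal sketch only, assuming inverse-frequency class weights and a weighted cross-entropy (both are common choices, not confirmed as the authors' method), the idea could look like the following. The function names and the label `B-Application` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight each class by total / (n_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-token negative log-likelihoods.

    probs  : list of dicts mapping label -> predicted probability
    labels : list of gold labels, same length as probs
    weights: dict mapping label -> class weight
    """
    numerator = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    denominator = sum(weights[y] for y in labels)
    return numerator / denominator


# Hypothetical toy data: 8 "outside" tokens and 2 software-mention tokens.
labels = ["O"] * 8 + ["B-Application"] * 2
weights = inverse_frequency_weights(labels)

# A model that predicts 0.5 for every gold label, for illustration only.
probs = [{y: 0.5} for y in labels]
loss = weighted_cross_entropy(probs, labels, weights)
```

With these toy counts the rare class receives a weight four times larger than the frequent one, so errors on software mentions contribute more to the loss than errors on ordinary tokens.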

Related papers

A Systematic Literature Review on Detecting Software Vulnerabilities with Large Language Models [2.518519330408713]
Large Language Models (LLMs) in software engineering have sparked interest in their use for software vulnerability detection. The rapid development of this field has resulted in a fragmented research landscape. This fragmentation makes it difficult to obtain a clear overview of the state-of-the-art or compare and categorize studies meaningfully.
arXiv Detail & Related papers (2025-07-30T13:17:16Z)
Text-ADBench: Text Anomaly Detection Benchmark based on LLMs Embedding [27.02879006439693]
This work performs a comprehensive empirical study and introduces a benchmark for text anomaly detection. Our work systematically evaluates the effectiveness of embedding-based text anomaly detection. By open-sourcing our benchmark toolkit, this work provides a foundation for future research in robust and scalable text anomaly detection systems.
arXiv Detail & Related papers (2025-07-16T14:47:41Z)
Towards an Understanding of Context Utilization in Code Intelligence [37.85380387094615]
Code intelligence aims to improve the effectiveness and efficiency of various code-related tasks. Recent research suggests that incorporating contextual information beyond the basic original task inputs can substantially enhance model performance. Despite growing academic interest, there is a lack of systematic analysis of context in code intelligence.
arXiv Detail & Related papers (2025-04-11T17:59:53Z)
Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning. We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads. We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anomaly Detection [18.14471932503304]
Text anomaly detection is crucial for identifying spam, misinformation, and offensive language in natural language processing tasks. Despite the growing adoption of embedding-based methods, their effectiveness and generalizability across diverse application scenarios remain under-explored. We present TAD-Bench, a benchmark designed to systematically evaluate embedding-based approaches for text anomaly detection.
arXiv Detail & Related papers (2025-01-21T08:13:10Z)
Words Matter: Leveraging Individual Text Embeddings for Code Generation in CLIP Test-Time Adaptation [21.20806568508201]
We show how to leverage class text information to mitigate distribution drifts encountered by vision-language models (VLMs) during test-time inference. We propose to generate pseudo-labels for the test-time samples by exploiting generic class text embeddings as fixed centroids of a label assignment problem. Experiments on multiple popular test-time adaptation benchmarks presenting diverse complexity empirically show the superiority of CLIP-OT.
arXiv Detail & Related papers (2024-11-26T00:15:37Z)
A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources. We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
arXiv Detail & Related papers (2024-10-25T23:52:28Z)
Automating Bibliometric Analysis with Sentence Transformers and Retrieval-Augmented Generation (RAG): A Pilot Study in Semantic and Contextual Search for Customized Literature Characterization for High-Impact Urban Research [2.1728621449144763]
Bibliometric analysis is essential for understanding research trends, scope, and impact in urban science. Traditional methods, relying on keyword searches, often fail to uncover valuable insights not explicitly stated in article titles or keywords. We leverage Generative AI models, specifically transformers and Retrieval-Augmented Generation (RAG), to automate and enhance bibliometric analysis.
arXiv Detail & Related papers (2024-10-08T05:13:27Z)
Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored. We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches. We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
A Reliable Knowledge Processing Framework for Combustion Science using Foundation Models [0.0]
The study introduces an approach to process diverse combustion research data, spanning experimental studies, simulations, and literature. The developed approach minimizes computational and economic expenses while optimizing data privacy and accuracy. The framework consistently delivers accurate domain-specific responses with minimal human oversight.
arXiv Detail & Related papers (2023-12-31T17:15:25Z)
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains. This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
Multi-label and Multi-target Sampling of Machine Annotation for Computational Stance Detection [44.90471123149513]
We introduce a multi-label and multi-target sampling strategy to optimize the annotation quality. Experimental results on the benchmark stance detection corpora show that our method can significantly improve performance and learning efficacy.
arXiv Detail & Related papers (2023-11-08T06:54:34Z)
GLUECons: A Generic Benchmark for Learning Under Constraints [102.78051169725455]
In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision. We model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints.
arXiv Detail & Related papers (2023-02-16T16:45:36Z)
Navigating the Pitfalls of Active Learning Evaluation: A Systematic Framework for Meaningful Performance Assessment [3.3064235071867856]
Active Learning (AL) aims to reduce the labeling burden by interactively selecting the most informative samples from a pool of unlabeled data. Some studies have questioned the effectiveness of AL compared to emerging paradigms such as semi-supervised (Semi-SL) and self-supervised learning (Self-SL).
arXiv Detail & Related papers (2023-01-25T15:07:44Z)
Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs). We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date. We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, author, and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
A Field Guide to Federated Optimization [161.3779046812383]
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data. This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms.
arXiv Detail & Related papers (2021-07-14T18:09:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.