CareLab at #SMM4H-HeaRD 2025: Insomnia Detection and Food Safety Event Extraction with Domain-Aware Transformers
- URL: http://arxiv.org/abs/2506.18185v1
- Date: Sun, 22 Jun 2025 21:56:59 GMT
- Title: CareLab at #SMM4H-HeaRD 2025: Insomnia Detection and Food Safety Event Extraction with Domain-Aware Transformers
- Authors: Zihan Liang, Ziwen Pan, Sumon Kanti Dey, Azra Ismail
- Abstract summary: This paper presents our system for the SMM4H-HeaRD 2025 shared tasks, specifically Task 4 (Subtasks 1, 2a, and 2b) and Task 5 (Subtasks 1 and 2). Task 4 focused on detecting mentions of insomnia in clinical notes, while Task 5 addressed the extraction of food safety events from news articles. We participated in all subtasks and report key findings across them, with particular emphasis on Task 5 Subtask 1, where our system achieved strong performance, securing first place with an F1 score of 0.958 on the test set.
- Score: 8.631763683448117
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents our system for the SMM4H-HeaRD 2025 shared tasks, specifically Task 4 (Subtasks 1, 2a, and 2b) and Task 5 (Subtasks 1 and 2). Task 4 focused on detecting mentions of insomnia in clinical notes, while Task 5 addressed the extraction of food safety events from news articles. We participated in all subtasks and report key findings across them, with particular emphasis on Task 5 Subtask 1, where our system achieved strong performance, securing first place with an F1 score of 0.958 on the test set. To attain this result, we employed encoder-based models (e.g., RoBERTa), alongside GPT-4 for data augmentation. This paper outlines our approach, including preprocessing, model architecture, and subtask-specific adaptations.
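As a rough illustration of the modelling recipe the abstract describes, the sketch below fine-tunes a RoBERTa classifier on labelled clinical notes with Hugging Face Transformers. The dataset, label scheme, and hyperparameters are illustrative assumptions, not the authors' exact configuration; in their pipeline the training set would additionally include GPT-4-generated augmentations.

```python
# Minimal sketch: binary insomnia detection with a fine-tuned RoBERTa
# classifier. Data, labels, and hyperparameters are placeholders.
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

class NoteDataset(Dataset):
    """Clinical notes paired with 0/1 insomnia labels."""
    def __init__(self, texts, labels, tokenizer, max_len=512):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)

# Hypothetical examples; in practice minority-class notes could be
# supplemented with GPT-4-generated paraphrases for augmentation.
train_texts = ["Patient reports difficulty falling asleep most nights.",
               "No sleep complaints documented."]
train_labels = [1, 0]

args = TrainingArguments(output_dir="insomnia-roberta", num_train_epochs=3,
                         per_device_train_batch_size=8, learning_rate=2e-5)
trainer = Trainer(model=model, args=args,
                  train_dataset=NoteDataset(train_texts, train_labels, tokenizer))
trainer.train()
```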
Related papers
- How do Pre-Trained Models Support Software Engineering? An Empirical Study in Hugging Face [52.257764273141184]
Open-Source Pre-Trained Models (PTMs) provide extensive resources for various Machine Learning (ML) tasks. These resources lack a classification tailored to Software Engineering (SE) needs. We derive a taxonomy encompassing 147 SE tasks and apply an SE-oriented classification to PTMs in a popular open-source ML repository, Hugging Face (HF). We find that code generation is the most common SE task among PTMs, while requirements engineering and software design activities receive limited attention.
arXiv Detail & Related papers (2025-06-03T15:51:17Z)
- Document Attribution: Examining Citation Relationships using Large Language Models [62.46146670035751]
We propose a zero-shot approach that frames attribution as a straightforward textual entailment task. We also explore the role of the attention mechanism in enhancing the attribution process.
arXiv Detail & Related papers (2025-05-09T04:40:11Z)
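A minimal sketch of the attribution-as-entailment idea, assuming an off-the-shelf NLI model; the model name and decision threshold are illustrative choices, not the paper's.

```python
# Attribution as textual entailment: the source document is the premise,
# the generated claim is the hypothesis.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

def is_attributable(document: str, claim: str, threshold: float = 0.5) -> bool:
    # The claim counts as attributed if the NLI model predicts ENTAILMENT.
    result = nli({"text": document, "text_pair": claim}, top_k=None)
    scores = {r["label"]: r["score"] for r in result}
    return scores.get("ENTAILMENT", 0.0) >= threshold

print(is_attributable("The study enrolled 120 patients in 2021.",
                      "120 patients participated in the study."))
```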
- TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks [0.0]
This paper addresses the challenge of classifying and assigning programming tasks to experts.
A novel dataset of 4,112 programming tasks was created by systematically scraping problems from various websites.
arXiv Detail & Related papers (2024-09-30T11:04:56Z)
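The collection step implies a scraping pipeline along these lines; the URL handling and CSS selectors below are hypothetical placeholders rather than the paper's actual code.

```python
# Illustrative scraping sketch: fetch a listing page and extract task
# titles and statements. Selectors are invented for this example.
import requests
from bs4 import BeautifulSoup

def scrape_tasks(url: str) -> list[dict]:
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    tasks = []
    for card in soup.select("div.problem"):  # hypothetical selector
        title = card.select_one("h2")
        body = card.select_one("p.statement")
        if title and body:
            tasks.append({"title": title.get_text(strip=True),
                          "statement": body.get_text(strip=True)})
    return tasks
```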
- Localizing Task Information for Improved Model Merging and Compression [61.16012721460561]
We show that the information required to solve each task is still preserved after merging, as different tasks mostly use non-overlapping sets of weights.
We propose Consensus Merging, an algorithm that eliminates such weights and improves the general performance of existing model merging approaches.
arXiv Detail & Related papers (2024-05-13T14:54:37Z)
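A rough sketch of the consensus idea over task vectors: parameters whose task vectors are negligible for all but one task are zeroed out before averaging. The threshold, vote count, and state-dict interface are assumptions for illustration, not the paper's exact algorithm.

```python
# Consensus-style merging sketch: keep only parameter deltas that
# several tasks agree on, then average them onto the base model.
import torch

def consensus_merge(base, finetuned_list, tau=1e-3, min_votes=2):
    merged = {}
    for name, base_w in base.items():
        task_vecs = [ft[name] - base_w for ft in finetuned_list]
        # A parameter "votes" for a task if its task-vector entry is large.
        votes = sum((tv.abs() > tau).int() for tv in task_vecs)
        mask = votes >= min_votes          # consensus: used by several tasks
        delta = torch.stack(task_vecs).mean(dim=0)
        merged[name] = base_w + mask * delta
    return merged
```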
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
arXiv Detail & Related papers (2023-05-23T16:47:22Z)
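Such frameworks build on standard contrastive objectives; a compact InfoNCE-style loss with in-batch negatives, of the kind this pre-training typically uses, looks roughly like the following (the temperature value is illustrative).

```python
# InfoNCE with in-batch negatives over paired embeddings.
import torch
import torch.nn.functional as F

def info_nce(queries, keys, temperature=0.07):
    # queries/keys: (batch, dim); row i of each forms a positive pair.
    q = F.normalize(queries, dim=-1)
    k = F.normalize(keys, dim=-1)
    logits = q @ k.t() / temperature            # cosine similarities
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)     # other rows act as negatives
```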
- TarViS: A Unified Approach for Target-based Video Segmentation [115.5770357189209]
TarViS is a novel, unified network architecture that can be applied to any task that requires segmenting a set of arbitrarily defined 'targets' in video.
Our approach is flexible with respect to how tasks define these targets, since it models the latter as abstract 'queries' which are then used to predict pixel-precise target masks.
To demonstrate its effectiveness, we apply TarViS to four different tasks, namely Video Instance Segmentation (VIS), Video Panoptic Segmentation (VPS), Video Object Segmentation (VOS), and Point Exemplar-guided Tracking (PET).
arXiv Detail & Related papers (2023-01-06T18:59:52Z)
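A toy sketch of the query-to-mask mechanism: learned target queries attend over per-frame pixel features and are decoded into mask logits via a dot product. Shapes and module choices are illustrative, not TarViS's actual architecture.

```python
# Query-based mask prediction: abstract target 'queries' read the frame
# features through attention, then dot-product against pixels for masks.
import torch
import torch.nn as nn

class QueryMaskHead(nn.Module):
    def __init__(self, num_queries=16, dim=256):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)  # abstract targets
        self.decoder = nn.TransformerDecoderLayer(d_model=dim, nhead=8,
                                                  batch_first=True)

    def forward(self, pixel_feats):
        # pixel_feats: (B, H*W, dim) flattened per-frame features
        b = pixel_feats.size(0)
        q = self.queries.weight.unsqueeze(0).expand(b, -1, -1)
        q = self.decoder(q, pixel_feats)               # queries attend to frame
        return torch.einsum("bqd,bpd->bqp", q, pixel_feats)  # mask logits
```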
- Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations [63.04466647849211]
Methods typically encode task information with a simple dataset name as a prefix to the encoder.
We propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization.
We show this not only allows the model to better learn shared knowledge across different tasks during training, but also allows us to control the model by composing new configurations.
arXiv Detail & Related papers (2022-12-17T02:20:14Z)
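A minimal sketch of what a compositional configuration might look like in practice; the segment vocabulary and format below are invented for illustration.

```python
# Compositional task configuration: structured prompt segments replace
# a bare dataset-name prefix before the linearized table input.
def build_input(linearized_table: str, task: str, inputs: str, outputs: str) -> str:
    config = f"[task: {task}] [input: {inputs}] [output: {outputs}]"
    return f"{config} {linearized_table}"

# Composing a new configuration at inference time steers the model's task:
print(build_input("singer: Adele | album: 25 | year: 2015",
                  task="table_qa", inputs="table+question", outputs="answer"))
```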
- Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis [72.9124467710526]
Generative approaches have been proposed to extract all four elements as (one or more) quadruplets from text as a single task.
We propose a unified framework for solving ABSA and the associated sub-tasks to improve performance in few-shot scenarios.
arXiv Detail & Related papers (2022-10-12T23:38:57Z)
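As a hedged illustration, quadruplet extraction can be cast as plain text generation with an instruction-tuned seq2seq model; the prompt format and model choice here are assumptions, not the paper's setup.

```python
# ABSA quadruplet extraction framed as instruction-following generation.
from transformers import pipeline

gen = pipeline("text2text-generation", model="google/flan-t5-base")

review = "The pasta was delicious but the service was painfully slow."
prompt = ("Extract (aspect, category, opinion, sentiment) quadruplets "
          f"from the review: {review}")
print(gen(prompt, max_new_tokens=64)[0]["generated_text"])
```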
- Learning Task Automata for Reinforcement Learning using Hidden Markov Models [37.69303106863453]
This paper proposes a novel pipeline for learning non-Markovian task specifications as succinct finite-state 'task automata'.
We learn a product MDP, a model composed of the specification's automaton and the environment's MDP, by treating the product MDP as a partially observable MDP and using the well-known Baum-Welch algorithm for learning hidden Markov models.
Our learnt task automaton enables the decomposition of a task into its constituent sub-tasks, which improves the rate at which an RL agent can later synthesise an optimal policy.
arXiv Detail & Related papers (2022-08-25T02:58:23Z)
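The Baum-Welch step at the heart of this pipeline is the standard EM procedure for HMMs; a minimal sketch with hmmlearn follows. The state count and observation symbols are placeholders, not the paper's product-MDP construction.

```python
# Fitting a discrete HMM with Baum-Welch (EM) via hmmlearn.
import numpy as np
from hmmlearn import hmm

# Observation sequences of discrete symbols (e.g., event labels from traces).
obs = np.concatenate([[0, 1, 2, 2], [0, 2, 1, 2]]).reshape(-1, 1)
lengths = [4, 4]

model = hmm.CategoricalHMM(n_components=3, n_iter=100, random_state=0)
model.fit(obs, lengths)          # Baum-Welch runs under the hood
print(model.transmat_.round(2))  # learnt hidden-state transition matrix
```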
- ML_LTU at SemEval-2022 Task 4: T5 Towards Identifying Patronizing and Condescending Language [1.3445335428144554]
This paper describes the system used by the Machine Learning Group of LTU in subtask 1 of the SemEval-2022 Task 4: Patronizing and Condescending Language (PCL) Detection.
Our system consists of finetuning a pretrained Text-to-Text-Transfer Transformer (T5) and innovatively reducing its out-of-class predictions.
arXiv Detail & Related papers (2022-04-15T12:00:25Z)
- BERT based Transformers lead the way in Extraction of Health Information from Social Media [0.0]
We participated in two tasks: (1) classification, extraction, and normalization of adverse drug effect (ADE) mentions in English tweets (Task-1) and (2) classification of COVID-19 tweets containing symptoms (Task-6).
Our system ranked first among all the submissions for subtask-1(a) with an F1-score of 61%.
For subtask-1(b), our system obtained an F1-score of 50% with improvements up to +8% F1 over the score averaged across all submissions.
The BERTweet model achieved an F1 score of 94% on SMM4H 2021 Task-6.
arXiv Detail & Related papers (2021-04-15T10:50:21Z)
- YNU-HPCC at SemEval-2020 Task 11: LSTM Network for Detection of Propaganda Techniques in News Articles [5.352512345142247]
This paper summarizes our studies on propaganda detection techniques for news articles in the SemEval-2020 task 11.
We implement the GloVe word representation, the BERT pretraining model, and the LSTM model architecture to accomplish this task.
Our method significantly outperforms the officially released baseline, and our systems rank 17th and 22nd on the SI and TC subtasks, respectively, on the test set.
arXiv Detail & Related papers (2020-08-24T02:42:12Z)
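For reference, a bare-bones bidirectional LSTM classifier over pretrained word embeddings, the kind of architecture this system combines with GloVe vectors, might look like the following sketch (dimensions are illustrative).

```python
# Bidirectional LSTM text classifier over an embedding layer that would,
# in practice, be initialized from GloVe vectors.
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)
        _, (h, _) = self.lstm(x)                  # final hidden states
        h = torch.cat([h[-2], h[-1]], dim=-1)     # both directions
        return self.fc(h)
```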
- BUT-FIT at SemEval-2020 Task 5: Automatic detection of counterfactual statements with deep pre-trained language representation models [6.853018135783218]
This paper describes BUT-FIT's submission at SemEval-2020 Task 5: Modelling Causal Reasoning in Language: Detecting Counterfactuals.
The challenge focused on detecting whether a given statement contains a counterfactual.
We found the RoBERTa language representation model (LRM) to perform best in both subtasks.
arXiv Detail & Related papers (2020-07-28T11:16:11Z)