Related papers: DICoE@FinSim-3: Financial Hypernym Detection using Augmented Terms and Distance-based Features

URL: http://arxiv.org/abs/2109.14906v1
Date: Thu, 30 Sep 2021 08:01:48 GMT
Title: DICoE@FinSim-3: Financial Hypernym Detection using Augmented Terms and Distance-based Features
Authors: Lefteris Loukas, Konstantinos Bougiatiotis, Manos Fergadiotis, Dimitris Mavroeidis, Elias Zavitsanos
Abstract summary: We present the submission of team DICoE for FinSim-3, the 3rd Shared Task on Learning Semantic Similarities for the Financial Domain. The task provides a set of terms in the financial domain and requires classifying each of them into the most relevant hypernym from a financial ontology. Our best-performing submission ranked 4th on the task's leaderboard.
Score: 2.6599014990168834
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present the submission of team DICoE for FinSim-3, the 3rd Shared Task on Learning Semantic Similarities for the Financial Domain. The task provides a set of terms in the financial domain and requires classifying each of them into the most relevant hypernym from a financial ontology. After augmenting the terms with their Investopedia definitions, our system employs a Logistic Regression classifier over financial word embeddings and a mix of hand-crafted and distance-based features. Also, for the first time in this task, we employ different replacement methods for out-of-vocabulary terms, leading to improved performance. Finally, we have also experimented with word representations generated from various financial corpora. Our best-performing submission ranked 4th on the task's leaderboard.
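
The abstract mentions distance-based features and replacement of out-of-vocabulary (OOV) terms without giving details. The following is a minimal sketch of what such features might look like, not the authors' implementation. The toy 3-dimensional vectors, the hypernym list, the helper names (`lookup`, `embed`, `distance_features`), and the string-similarity OOV fallback are all assumptions made for illustration.

```python
import math
from difflib import get_close_matches

# Toy 3-dimensional "embeddings" (assumption); a real system would load
# pretrained financial word vectors instead.
EMBEDDINGS = {
    "bond": [0.9, 0.1, 0.0],
    "bonds": [0.8, 0.2, 0.0],
    "convertible": [0.7, 0.3, 0.0],
    "equity": [0.1, 0.9, 0.0],
    "index": [0.2, 0.7, 0.1],
    "swap": [0.0, 0.1, 0.9],
}
# Hypothetical hypernym labels standing in for ontology concepts.
HYPERNYMS = ["bonds", "equity index", "swap"]


def lookup(token):
    """Return a vector for token; OOV tokens fall back to the closest spelling."""
    if token in EMBEDDINGS:
        return EMBEDDINGS[token]
    # Assumed replacement strategy: nearest in-vocabulary word by string similarity.
    match = get_close_matches(token, list(EMBEDDINGS), n=1, cutoff=0.6)
    return EMBEDDINGS[match[0]] if match else None


def embed(text):
    """Average the vectors of all tokens that could be resolved."""
    vectors = [v for v in (lookup(t) for t in text.lower().split()) if v is not None]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def distance_features(term):
    """One cosine distance per hypernym label, in HYPERNYMS order."""
    term_vector = embed(term)
    return [1.0 - cosine(term_vector, embed(h)) for h in HYPERNYMS]


# "bondz" is out of vocabulary and is replaced by its closest spelling.
features = distance_features("Convertible Bondz")
closest = HYPERNYMS[features.index(min(features))]
```

In a pipeline like the one the abstract describes, a feature vector of this kind would presumably be concatenated with the term embedding and any hand-crafted features, then passed to a classifier such as scikit-learn's `LogisticRegression`.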

Related papers

FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting [58.70072722290475]
Financial time series (FinTS) record the behavior of human-brain-augmented decision-making. FinTSB is a comprehensive and practical benchmark for financial time series forecasting.
arXiv Detail & Related papers (2025-02-26T05:19:16Z)
Demystifying Domain-adaptive Post-training for Financial LLMs [79.581577578952]
FINDAP is a systematic and fine-grained investigation into domain-adaptive post-training of large language models (LLMs). Our approach consists of four key components: FinCap, FinRec, FinTrain and FinEval. The resulting model, Llama-Fin, achieves state-of-the-art performance across a wide range of financial tasks.
arXiv Detail & Related papers (2025-01-09T04:26:15Z)
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey [93.72125112643596]
Next Token Prediction (NTP) is a versatile training objective for machine learning tasks across various modalities. This survey introduces a comprehensive taxonomy that unifies both understanding and generation within multimodal learning. The proposed taxonomy covers five key aspects: Multimodal tokenization, MMNTP model architectures, unified task representation, datasets & evaluation, and open challenges.
arXiv Detail & Related papers (2024-12-16T05:02:25Z)
FinBen: A Holistic Financial Benchmark for Large Language Models [75.09474986283394]
FinBen is the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks. FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three novel open-source evaluation datasets for text summarization, question answering, and stock trading.
arXiv Detail & Related papers (2024-02-20T02:16:16Z)
DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning [74.99318727786337]
We propose a Multiple Experts Fine-tuning Framework to build a financial large language model (LLM). We build a financial instruction-tuning dataset named DISC-FIN-SFT, including instruction samples of four categories (consulting, NLP tasks, computing and retrieval-augmented generation). Evaluations conducted on multiple benchmarks demonstrate that our model performs better than baseline models in various financial scenarios.
arXiv Detail & Related papers (2023-10-23T11:33:41Z)
Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions. This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
Learning Semantic Text Similarity to rank Hypernyms of Financial Terms [0.23940819037450983]
We propose a system capable of extracting and ranking hypernyms for a given financial term. The system has been trained with financial text corpora obtained from various sources like DBpedia. A novel approach has been used to augment the training set with negative samples.
arXiv Detail & Related papers (2023-03-20T16:53:36Z)
Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022 [72.12974259966592]
We present our approach for the EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022. We first parse sentences into semantic roles corresponding to verbs and nouns, then utilize self-attention to exploit semantic role contextualized video features.
arXiv Detail & Related papers (2022-06-29T03:24:43Z)
FinQA: A Dataset of Numerical Reasoning over Financial Data [52.7249610894623]
We focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents. We propose a new large-scale dataset, FinQA, with Question-Answering pairs over Financial reports, written by financial experts. The results demonstrate that popular, large, pre-trained models fall far short of expert humans in acquiring finance knowledge.
arXiv Detail & Related papers (2021-09-01T00:08:14Z)
Yseop at FinSim-3 Shared Task 2021: Specializing Financial Domain Learning with Phrase Representations [0.0]
We present our approaches for the FinSim-3 Shared Task 2021: Learning Semantic Similarities for the Financial Domain. The aim of this task is to correctly classify a list of given terms from the financial domain into the most relevant hypernym. Our system ranks 2nd overall on both metrics, scoring 0.917 on Average Accuracy and 1.141 on Mean Rank.
arXiv Detail & Related papers (2021-08-21T10:53:12Z)
Term Expansion and FinBERT fine-tuning for Hypernym and Synonym Ranking of Financial Terms [0.0]
We present systems that attempt to solve the hypernym and synonym matching problem. We designed these systems to participate in FinSim-3, a shared task of the FinNLP workshop at IJCAI-2021. Our best performing model (Accuracy: 0.917, Rank: 1.156) was developed by fine-tuning SentenceBERT [Reimers et al., 2019] over an extended labelled set created using the hierarchy of labels present in FIBO.
arXiv Detail & Related papers (2021-07-29T06:17:44Z)
FinMatcher at FinSim-2: Hypernym Detection in the Financial Services Domain using Knowledge Graphs [1.2891210250935146]
This paper presents the FinMatcher system and its results for the FinSim 2021 shared task. The FinSim-2 shared task consists of a set of concept labels from the financial services domain. The goal is to find the most relevant top-level concept from a given set of concepts.
arXiv Detail & Related papers (2021-03-02T08:56:28Z)
IITK at the FinSim Task: Hypernym Detection in Financial Domain via Context-Free and Contextualized Word Embeddings [2.515934533974176]
The FinSim 2020 task is to classify financial terms into the most relevant hypernym (or top-level) concept in an external ontology. We leverage both context-dependent and context-independent word embeddings in our analysis. Our system ranks 1st on both metrics, i.e., mean rank and accuracy.
arXiv Detail & Related papers (2020-07-22T04:56:23Z)
RUSSE'2020: Findings of the First Taxonomy Enrichment Task for the Russian language [70.27072729280528]
This paper describes the results of the first shared task on taxonomy enrichment for the Russian language. Sixteen teams participated in the task, demonstrating high results, with more than half of them outperforming the provided baseline.
arXiv Detail & Related papers (2020-05-22T13:30:37Z)
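
Several of the FinSim entries above report two metrics, accuracy and mean rank. The sketch below shows one common way to compute them from ranked predictions. It is illustrative only: the function name, the example labels, and the penalty applied when the gold label is missing from a ranking are assumptions, and the official scoring scripts may differ in such details.

```python
def accuracy_and_mean_rank(ranked_predictions, gold_labels):
    """Compute (accuracy, mean rank) for ranked hypernym predictions.

    ranked_predictions[i] is a list of labels, best guess first;
    gold_labels[i] is the correct label for term i.
    """
    if not gold_labels or len(ranked_predictions) != len(gold_labels):
        raise ValueError("inputs must be non-empty and of equal length")

    hits = 0
    rank_sum = 0
    for ranking, gold in zip(ranked_predictions, gold_labels):
        # Accuracy only credits the top-ranked label.
        if ranking and ranking[0] == gold:
            hits += 1
        # Rank is 1-based. Assumption: a gold label absent from the
        # ranking is penalized with rank len(ranking) + 1.
        if gold in ranking:
            rank_sum += ranking.index(gold) + 1
        else:
            rank_sum += len(ranking) + 1

    total = len(gold_labels)
    return hits / total, rank_sum / total


# Hypothetical labels; the gold label sits at rank 1, 2 and 3 respectively.
rankings = [
    ["Bonds", "Swap", "Funds"],
    ["Swap", "Bonds", "Funds"],
    ["Funds", "Swap", "Bonds"],
]
gold = ["Bonds", "Bonds", "Bonds"]
accuracy, mean_rank = accuracy_and_mean_rank(rankings, gold)
```

Under this definition, higher accuracy is better, while a lower mean rank is better, with 1.0 meaning the correct hypernym was always ranked first.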

This list is automatically generated from the titles and abstracts of the papers on this site.