THEME: Enhancing Thematic Investing with Semantic Stock Representations and Temporal Dynamics
- URL: http://arxiv.org/abs/2508.16936v2
- Date: Fri, 29 Aug 2025 08:56:06 GMT
- Title: THEME: Enhancing Thematic Investing with Semantic Stock Representations and Temporal Dynamics
- Authors: Hoyoung Lee, Wonbin Ahn, Suhwan Park, Jaehoon Lee, Minjae Kim, Sungdong Yoo, Taeyoon Lim, Woohyung Lim, Yongjae Lee,
- Abstract summary: Thematic investing aims to construct portfolios aligned with structural trends.<n>We introduce THEME, a framework that fine-tunes embeddings using hierarchical contrastive learning.<n>By jointly modeling thematic relationships from text and market dynamics from returns, THEME generates stock embeddings specifically tailored for a wide range of practical investment applications.
- Score: 30.94860968271092
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Thematic investing, which aims to construct portfolios aligned with structural trends, remains a challenging endeavor due to overlapping sector boundaries and evolving market dynamics. A promising direction is to build semantic representations of investment themes from textual data. However, despite their power, general-purpose LLM embedding models are not well-suited to capture the nuanced characteristics of financial assets, since the semantic representation of investment assets may differ fundamentally from that of general financial text. To address this, we introduce THEME, a framework that fine-tunes embeddings using hierarchical contrastive learning. THEME aligns themes and their constituent stocks using their hierarchical relationship, and subsequently refines these embeddings by incorporating stock returns. This process yields representations effective for retrieving thematically aligned assets with strong return potential. Empirical results demonstrate that THEME excels in two key areas. For thematic asset retrieval, it significantly outperforms leading large language models. Furthermore, its constructed portfolios demonstrate compelling performance. By jointly modeling thematic relationships from text and market dynamics from returns, THEME generates stock embeddings specifically tailored for a wide range of practical investment applications.
Related papers
- Reconstructing Content via Collaborative Attention to Improve Multimodal Embedding Quality [59.651410243721045]
CoCoA is a Content reconstruction pre-training paradigm based on Collaborative Attention for multimodal embedding optimization.<n>We introduce an EOS-based reconstruction task, encouraging the model to reconstruct input from the corresponding EOS> embeddings.<n>Experiments on MMEB-V1 demonstrate that CoCoA built upon Qwen2-VL and Qwen2.5-VL significantly improves embedding quality.
arXiv Detail & Related papers (2026-03-02T05:34:45Z) - CREM: Compression-Driven Representation Enhancement for Multimodal Retrieval and Comprehension [49.6969505536365]
We propose CREM, with a unified framework that enhances multimodal representations for retrieval while preserving generative ability.<n>CREM achieves state-of-the-art retrieval performance on MMEB while maintaining strong generative performance on multiple comprehension benchmarks.
arXiv Detail & Related papers (2026-02-22T08:09:51Z) - Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition [51.68340973140949]
Multimodal Named Entity Recognition (GMNER) aims to extract text-based entities, assign them semantic categories, and ground them to corresponding visual regions.<n> MLLMs exhibit $textbfmodality bias$, including visual bias and textual bias, which stems from their tendency to take unimodal shortcuts.<n>We propose Modality-aware Consistency Reasoning ($bfMCR$), which enforces structured cross-modal reasoning.
arXiv Detail & Related papers (2026-02-04T12:12:49Z) - Learning to Manage Investment Portfolios beyond Simple Utility Functions [0.9999629695552193]
We propose a generative framework that learns latent representations of fund manager strategies without requiring explicit utility specification.<n>We validate our framework on a dataset of 1436 U.S. equity mutual funds.<n>Our framework provides a data-driven approach for characterizing investment strategies for applications in market simulation, strategy attribution, and regulatory oversight.
arXiv Detail & Related papers (2025-10-30T06:01:20Z) - Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding [61.36285696607487]
Document understanding is critical for applications from financial analysis to scientific discovery.<n>Current approaches, whether OCR-based pipelines feeding Large Language Models (LLMs) or native Multimodal LLMs (MLLMs) face key limitations.<n>Retrieval-Augmented Generation (RAG) helps ground models in external data, but documents' multimodal nature, combining text, tables, charts, and layout, demands a more advanced paradigm: Multimodal RAG.
arXiv Detail & Related papers (2025-10-17T02:33:16Z) - Explaining multimodal LLMs via intra-modal token interactions [55.27436637894534]
Multimodal Large Language Models (MLLMs) have achieved remarkable success across diverse vision-language tasks, yet their internal decision-making mechanisms remain insufficiently understood.<n>We propose enhancing interpretability by leveraging intra-modal interaction.
arXiv Detail & Related papers (2025-09-26T14:39:13Z) - Leveraging Foundation Models for Multimodal Graph-Based Action Recognition [1.533133219129073]
We introduce a graph-based framework that integrates a vision-temporal foundation leveraging VideoMAE for dynamic visual encoding and BERT for contextual textual embedding.<n>We show that our method consistently outperforms state-of-the-art baselines on diverse benchmark datasets.
arXiv Detail & Related papers (2025-05-21T07:15:14Z) - SAFT: Structure-aware Transformers for Textual Interaction Classification [15.022958096869734]
Textual interaction networks (TINs) are an omnipresent data structure used to model the interplay between users and items on e-commerce websites, social networks, etc.<n>We propose SAFT, a new architecture that integrates language- and graph-based modules for the effective fusion of textual and structural semantics in the representation learning of interactions.
arXiv Detail & Related papers (2025-04-07T09:19:12Z) - GridMind: A Multi-Agent NLP Framework for Unified, Cross-Modal NFL Data Insights [0.0]
This paper introduces GridMind, a framework that unifies structured, semi-structured, and unstructured data through Retrieval-Augmented Generation (RAG) and large language models (LLMs)<n>This approach aligns with the evolving field of multimodal representation learning, where unified models are increasingly essential for real-time, cross-modal interactions.
arXiv Detail & Related papers (2025-03-24T18:33:36Z) - Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models [19.710059031046377]
Temporal graph neural networks (TGNNs) have shown remarkable performance in temporal graph modeling.<n>We present textbfCROSS, a flexible framework that seamlessly extends existing TGNNs for TTAG modeling.
arXiv Detail & Related papers (2025-03-18T16:50:10Z) - A Survey on Post-training of Large Language Models [185.51013463503946]
Large Language Models (LLMs) have fundamentally transformed natural language processing, making them indispensable across domains ranging from conversational systems to scientific exploration.<n>These challenges necessitate advanced post-training language models (PoLMs) to address shortcomings, such as restricted reasoning capacities, ethical uncertainties, and suboptimal domain-specific performance.<n>This paper presents the first comprehensive survey of PoLMs, systematically tracing their evolution across five core paradigms: Fine-tuning, which enhances task-specific accuracy; Alignment, which ensures ethical coherence and alignment with human preferences; Reasoning, which advances multi-step inference despite challenges in reward design; Integration and Adaptation, which
arXiv Detail & Related papers (2025-03-08T05:41:42Z) - Flex: End-to-End Text-Instructed Visual Navigation from Foundation Model Features [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.<n>Our findings are synthesized in Flex (Fly lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.<n>We demonstrate the effectiveness of this approach on a quadrotor fly-to-target task, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z) - Contrastive Learning of Asset Embeddings from Financial Time Series [8.595725772518332]
We propose a novel contrastive learning framework to generate asset embeddings from financial time series data.
Our approach leverages the similarity of asset returns over many subwindows to generate informative positive and negative samples.
Experiments on real-world datasets demonstrate the effectiveness of the learned asset embeddings on benchmark industry classification and portfolio optimization tasks.
arXiv Detail & Related papers (2024-07-26T10:26:44Z) - Financial Assets Dependency Prediction Utilizing Spatiotemporal Patterns [6.424226384944309]
We propose a novel approach that models the dependencies of assets as an Asset Dependency Matrix (ADM)
Unlike video images where neighboring pixels exhibit explicittemporal dependencies due to the natural continuity of object movements, assets in ADM do not have a natural order.
We propose the Asset Dependency Neural Network (ADNN), which employs the Conal Long Short-Term Memory (ConLSTM) network, a highly successful method for video prediction.
arXiv Detail & Related papers (2024-06-13T09:42:28Z) - Contextualization Distillation from Large Language Model for Knowledge
Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z) - MDGNN: Multi-Relational Dynamic Graph Neural Network for Comprehensive
and Dynamic Stock Investment Prediction [22.430266982219496]
Multi-relational Dynamic Graph Neural Network (MDGNN) framework is proposed.
Our proposed MDGNN framework achieves the best performance in public datasets compared with state-of-the-art (SOTA) stock investment methods.
arXiv Detail & Related papers (2024-01-19T02:51:29Z) - Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal
Structured Representations [70.41385310930846]
We present an end-to-end framework Structure-CLIP to enhance multi-modal structured representations.
We use scene graphs to guide the construction of semantic negative examples, which results in an increased emphasis on learning structured representations.
A Knowledge-Enhance (KEE) is proposed to leverage SGK as input to further enhance structured representations.
arXiv Detail & Related papers (2023-05-06T03:57:05Z) - Modeling Complex Financial Products [5.873416857161077]
We focus on residential mortgage backed securities, resMBS, that were at the heart of the 2008 US financial crisis.
We provide insight into the performance of the resMBS securities through a series of increasingly complex models.
arXiv Detail & Related papers (2021-02-03T23:20:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.