Advances and Challenges in Semantic Textual Similarity: A Comprehensive Survey
- URL: http://arxiv.org/abs/2601.03270v1
- Date: Fri, 19 Dec 2025 18:07:36 GMT
- Title: Advances and Challenges in Semantic Textual Similarity: A Comprehensive Survey
- Authors: Lokendra Kumar, Neelesh S. Upadhye, Kannan Piedy
- Abstract summary: This survey reviews progress across six key areas: transformer-based models, contrastive learning, domain-focused solutions, multi-modal methods, graph-based approaches, and knowledge-enhanced techniques. It aims to guide researchers and practitioners alike in navigating rapid advancements, highlighting emerging trends and future opportunities in the evolving field of Semantic Textual Similarity.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic Textual Similarity (STS) research has expanded rapidly since 2021, driven by advances in transformer architectures, contrastive learning, and domain-specific techniques. This survey reviews progress across six key areas: transformer-based models, contrastive learning, domain-focused solutions, multi-modal methods, graph-based approaches, and knowledge-enhanced techniques. Recent transformer models such as FarSSiBERT and DeBERTa-v3 have achieved remarkable accuracy, while contrastive methods like AspectCSE have established new benchmarks. Domain-adapted models, including CXR-BERT for medical texts and Financial-STS for finance, demonstrate how STS can be effectively customized for specialized fields. Moreover, multi-modal, graph-based, and knowledge-integrated models further enhance semantic understanding and representation. By organizing and analyzing these developments, the survey provides valuable insights into current methods, practical applications, and remaining challenges. It aims to guide researchers and practitioners alike in navigating rapid advancements, highlighting emerging trends and future opportunities in the evolving field of STS.
Related papers
- Monitoring Transformative Technological Convergence Through LLM-Extracted Semantic Entity Triple Graphs [35.70283902821063]
We propose a novel, data-driven pipeline to monitor the emergence of transformative technologies.
Our approach leverages advances in Large Language Models (LLMs) to extract semantic triples from unstructured text.
The pipeline includes multi-stage filtering, domain-specific clustering, and a temporal trend analysis of topic co-occurrence.
arXiv Detail & Related papers (2025-10-29T10:41:03Z)
- Survey of Multimodal Geospatial Foundation Models: Techniques, Applications, and Challenges [54.669838624278924]
Foundation models have transformed natural language processing and computer vision.
With powerful generalization and transfer learning capabilities, they align naturally with the multimodal, multi-resolution, and multi-temporal characteristics of remote sensing data.
This survey delivers a comprehensive review of multimodal GFMs from a modality-driven perspective.
arXiv Detail & Related papers (2025-10-27T03:40:00Z)
- Bridging Text and Video Generation: A Survey [0.41998444721319217]
Text-to-video technology holds potential to transform domains such as education, marketing, entertainment, and assistive technologies for individuals with visual or reading comprehension challenges.
We present a comprehensive survey of text-to-video generative models, tracing their development from early GANs and VAEs to hybrid Diffusion-Transformer (DiT) architectures.
We provide a systematic account of the datasets on which the surveyed text-to-video models were trained and evaluated, to support and assess the accessibility of training such models.
arXiv Detail & Related papers (2025-10-06T16:39:05Z)
- Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing.
Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest.
This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey [124.23247710880008]
Multimodal CoT (MCoT) reasoning has recently garnered significant research attention.
Existing MCoT studies design various methodologies to address the challenges of image, video, speech, audio, 3D, and structured data.
We present the first systematic survey of MCoT reasoning, elucidating the relevant foundational concepts and definitions.
arXiv Detail & Related papers (2025-03-16T18:39:13Z)
- Multimodal Alignment and Fusion: A Survey [11.3029945633295]
This survey provides a comprehensive overview of advances in multimodal alignment and fusion within the field of machine learning.
We systematically categorize and analyze key approaches to alignment and fusion through both structural perspectives.
This survey highlights critical challenges such as cross-modal misalignment, computational bottlenecks, data quality issues, and the modality gap.
arXiv Detail & Related papers (2024-11-26T02:10:27Z)
- From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems.
The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness.
This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying these into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z)
- 3D Gaussian Splatting: Survey, Technologies, Challenges, and Opportunities [57.444435654131006]
3D Gaussian Splatting (3DGS) has emerged as a prominent technique with the potential to become a mainstream method for 3D representations.
This survey aims to analyze existing 3DGS-related works from multiple intersecting perspectives.
arXiv Detail & Related papers (2024-07-24T16:53:17Z)
- A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product [4.528221075598755]
This paper introduces a new multi-modal model based on the Transformer architecture and tensor product fusion strategy.
It combines BERT's text vectors and ViT's image vectors to classify students' psychological conditions, with an accuracy of 93.65%.
arXiv Detail & Related papers (2024-03-13T13:16:26Z)
- A Recent Survey of Heterogeneous Transfer Learning [15.830786437956144]
Heterogeneous transfer learning has become a vital strategy in various tasks.
We offer an extensive review of over 60 HTL methods, covering both data-based and model-based approaches.
We explore applications in natural language processing, computer vision, multimodal learning, and biomedicine.
arXiv Detail & Related papers (2023-10-12T16:19:58Z)
- A Comprehensive Survey on Source-free Domain Adaptation [69.17622123344327]
The research of Source-Free Domain Adaptation (SFDA) has drawn growing attention in recent years.
We provide a comprehensive survey of recent advances in SFDA and organize them into a unified categorization scheme.
We compare the results of more than 30 representative SFDA methods on three popular classification benchmarks.
arXiv Detail & Related papers (2023-02-23T06:32:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.