Text Meets Topology: Rethinking Out-of-distribution Detection in Text-Rich Networks
- URL: http://arxiv.org/abs/2508.17690v2
- Date: Tue, 02 Sep 2025 11:53:15 GMT
- Title: Text Meets Topology: Rethinking Out-of-distribution Detection in Text-Rich Networks
- Authors: Danny Wang, Ruihong Qiu, Guangdong Bai, Zi Huang
- Abstract summary: Out-of-distribution (OOD) detection remains challenging in text-rich networks, where textual features intertwine with topological structures. We introduce the TextTopoOOD framework for evaluating detection across diverse OOD scenarios. We also propose TNT-OOD to model the complex interplay between Text aNd Topology using: 1) a novel cross-attention module to fuse local structure into node-level text representations, and 2) a HyperNetwork to generate node-specific transformation parameters.
- Score: 40.16812361254501
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Out-of-distribution (OOD) detection remains challenging in text-rich networks, where textual features intertwine with topological structures. Existing methods primarily address label shifts or rudimentary domain-based splits, overlooking the intricate textual-structural diversity. For example, in social networks, where users represent nodes with textual features (name, bio) while edges indicate friendship status, OOD may stem from the distinct language patterns between bot and normal users. To address this gap, we introduce the TextTopoOOD framework for evaluating detection across diverse OOD scenarios: (1) attribute-level shifts via text augmentations and embedding perturbations; (2) structural shifts through edge rewiring and semantic connections; (3) thematically-guided label shifts; and (4) domain-based divisions. Furthermore, we propose TNT-OOD to model the complex interplay between Text aNd Topology using: 1) a novel cross-attention module to fuse local structure into node-level text representations, and 2) a HyperNetwork to generate node-specific transformation parameters. This aligns topological and semantic features of ID nodes, enhancing ID/OOD distinction across structural and textual shifts. Experiments on 11 datasets across four OOD scenarios demonstrate the nuanced challenge of TextTopoOOD for evaluating OOD detection in text-rich networks.
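The two TNT-OOD components named in the abstract can be sketched in miniature. This is an illustrative pure-Python sketch, not the authors' implementation: all function names, dimensions, and the specific affine form of the hypernetwork output are assumptions made here for clarity. The first function fuses neighbors' structure vectors into a node's text representation via scaled dot-product cross-attention; the second stands in for a hypernetwork that maps a node's structure vector to node-specific transformation parameters.

```python
# Minimal sketch of the TNT-OOD idea (illustrative, NOT the paper's code):
# 1) cross-attention fuses local structure into the node-level text vector;
# 2) a toy "hypernetwork" derives node-specific (scale, shift) parameters
#    from the node's structure vector and applies them to the fused text.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(text_vec, neighbor_vecs):
    """Attend from the node's text vector over its neighbors' structure vectors."""
    d = len(text_vec)
    scores = [dot(text_vec, n) / math.sqrt(d) for n in neighbor_vecs]
    weights = softmax(scores)
    context = [sum(w * n[i] for w, n in zip(weights, neighbor_vecs))
               for i in range(d)]
    # residual connection: original text representation + attended structure
    return [t + c for t, c in zip(text_vec, context)]

def hypernetwork(structure_vec):
    """Toy stand-in: map structure to node-specific affine parameters."""
    scale = 1.0 + 0.1 * sum(structure_vec) / len(structure_vec)
    shift = 0.1 * structure_vec[0]
    return scale, shift

def tnt_node_repr(text_vec, structure_vec, neighbor_vecs):
    fused = cross_attention(text_vec, neighbor_vecs)
    scale, shift = hypernetwork(structure_vec)
    return [scale * f + shift for f in fused]
```

In the paper's setting, the resulting structure-aligned representations are what sharpen the ID/OOD boundary: ID nodes whose text and topology agree map consistently, while shifted nodes do not.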
Related papers
- Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse Motifs [19.073560504913356]
The line between human-crafted and machine-generated texts has become increasingly blurred.
This paper delves into the inquiry of identifying discernible and unique linguistic properties in texts that were written by humans.
arXiv Detail & Related papers (2024-02-16T11:20:30Z)
- Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis [52.01356859448068]
HTS can recognize text in an image and identify its 4-level hierarchical structure: characters, words, lines, and paragraphs.
HTS achieves state-of-the-art results on multiple word-level text spotting benchmark datasets as well as geometric layout analysis tasks.
arXiv Detail & Related papers (2023-10-25T22:23:54Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- TeKo: Text-Rich Graph Neural Networks with External Knowledge [75.91477450060808]
We propose a novel text-rich graph neural network with external knowledge (TeKo)
We first present a flexible heterogeneous semantic network that incorporates high-quality entities.
We then introduce two types of external knowledge, that is, structured triplets and unstructured entity description.
arXiv Detail & Related papers (2022-06-15T02:33:10Z)
- Heterformer: Transformer-based Deep Node Representation Learning on Heterogeneous Text-Rich Networks [29.33447325640058]
Heterformer performs contextualized text encoding and heterogeneous structure encoding in a unified model.
We conduct comprehensive experiments on three tasks (i.e., link prediction, node classification, and node clustering) on three large-scale datasets.
arXiv Detail & Related papers (2022-05-20T16:26:39Z)
- SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition [73.61592015908353]
We propose a new end-to-end scene text spotting framework termed SwinTextSpotter.
Using a transformer with dynamic head as the detector, we unify the two tasks with a novel Recognition Conversion mechanism.
The design results in a concise framework that requires neither additional rectification module nor character-level annotation.
arXiv Detail & Related papers (2022-03-19T01:14:42Z)
- Adversarial Context Aware Network Embeddings for Textual Networks [8.680676599607123]
Existing approaches learn embeddings of text and network structure by enforcing embeddings of connected nodes to be similar.
This implies that these approaches require edge information for learning embeddings and they cannot learn embeddings of unseen nodes.
We propose an approach that achieves both modality fusion and the capability to learn embeddings of unseen nodes.
arXiv Detail & Related papers (2020-11-05T05:20:01Z)
- BiTe-GCN: A New GCN Architecture via Bidirectional Convolution of Topology and Features on Text-Rich Networks [44.74164340799386]
BiTe-GCN is a novel GCN architecture with bidirectional convolution of both topology and features on text-rich networks.
Our new architecture outperforms the state of the art by a significant margin.
This architecture can also be applied to several e-commerce search scenarios, such as JD search.
arXiv Detail & Related papers (2020-10-23T04:38:30Z)
- Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.