Related papers: Scientific and Technological Information Oriented Semantics-adversarial and Media-adversarial Cross-media Retrieval

URL: http://arxiv.org/abs/2203.08615v3
Date: Wed, 30 Oct 2024 14:56:09 GMT
Title: Scientific and Technological Information Oriented Semantics-adversarial and Media-adversarial Cross-media Retrieval
Authors: Ang Li, Junping Du, Feifei Kou, Zhe Xue, Xin Xu, Mingying Xu, Yang Jiang
Abstract summary: Cross-media scientific and technological information retrieval is one of the important tasks in cross-media research. We propose a scientific and technological information oriented Semantics-adversarial and Media-adversarial Cross-media Retrieval method (SMCR) to find an effective common subspace. SMCR minimizes the loss of inter-media semantic consistency in addition to modeling intra-media semantic discrimination, to preserve semantic similarity before and after mapping.
Score: 21.630525836722036
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Cross-media retrieval of scientific and technological information is one of the important tasks in cross-media research. Cross-media scientific and technological information retrieval obtains target information from massive multi-source and heterogeneous scientific and technological resources, which helps to design applications that meet users' needs, including scientific and technological information recommendation, personalized scientific and technological information retrieval, etc. The core of cross-media retrieval is to learn a common subspace, so that data from different media can be directly compared with each other after being mapped into this subspace. In subspace learning, existing methods often focus on modeling the discrimination of intra-media data and the invariance of inter-media data after mapping; however, they ignore the semantic consistency of inter-media data before and after mapping and the media discrimination of intra-semantics data, which limits the results of cross-media retrieval. In light of this, we propose a scientific and technological information oriented Semantics-adversarial and Media-adversarial Cross-media Retrieval method (SMCR) to find an effective common subspace. Specifically, SMCR minimizes the loss of inter-media semantic consistency in addition to modeling intra-media semantic discrimination, to preserve semantic similarity before and after mapping. Furthermore, SMCR constructs a basic feature mapping network and a refined feature mapping network to jointly minimize the media discriminative loss within semantics, so as to enhance the feature mapping network's ability to confuse the media discriminant network. Experimental results on two datasets demonstrate that the proposed SMCR outperforms state-of-the-art methods in cross-media retrieval.
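Illustration: the abstract's central idea, that items from different media become directly comparable once mapped into a common subspace, can be sketched in a few lines. This is only a toy sketch and not the SMCR method: the fixed linear maps `TEXT_MAP` and `IMAGE_MAP`, the feature vectors, and the helper names `project`, `cosine`, and `retrieve` are all hypothetical stand-ins for mapping networks that SMCR would learn adversarially.

```python
import math

def project(features, weights):
    """Map a feature vector into the common subspace (one weight row per output dim)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, query_map, gallery, gallery_map, top_k=2):
    """Rank gallery items of another medium by similarity to the query in the shared space."""
    q = project(query_vec, query_map)
    scored = [(cosine(q, project(vec, gallery_map)), name) for name, vec in gallery]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]

# Hypothetical fixed maps (assumption): 3-d text and 2-d image features -> 2-d subspace.
TEXT_MAP = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
IMAGE_MAP = [[0.0, 1.0], [1.0, 0.0]]

# Made-up image features, for illustration only.
IMAGES = [("img_a", [0.1, 0.9]), ("img_b", [0.9, 0.1]), ("img_c", [0.5, 0.5])]

text_query = [1.0, 0.0, 0.3]
ranking = retrieve(text_query, TEXT_MAP, IMAGES, IMAGE_MAP, top_k=2)
print(ranking)
```

In a trained system the two maps would be the outputs of the feature mapping networks, and the ranking step would stay essentially the same: project both media, then compare in the shared space.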

Related papers

METER: Multi-modal Evidence-based Thinking and Explainable Reasoning -- Algorithm and Benchmark [48.78602579128459]
We introduce METER, a unified benchmark for interpretable forgery detection spanning images, videos, audio, and audio-visual content. Our dataset comprises four tracks, each requiring not only real-vs-fake classification but also evidence-chain-based explanations.
arXiv Detail & Related papers (2025-07-22T03:42:51Z)
Bridging Cognition and Emotion: Empathy-Driven Multimodal Misinformation Detection [56.644686934050576]
Social media has become a major conduit for information dissemination, yet it also facilitates the rapid spread of misinformation. Traditional misinformation detection methods primarily focus on surface-level features, overlooking the crucial roles of human empathy in the propagation process. We propose the Dual-Aspect Empathy Framework (DAE), which integrates cognitive and emotional empathy to analyze misinformation from both the creator and reader perspectives.
arXiv Detail & Related papers (2025-04-24T07:48:26Z)
Semantic Learning for Molecular Communication in Internet of Bio-Nano Things [0.0]
This paper proposes an end-to-end semantic learning framework to optimize task-oriented molecular communication. The proposed framework employs a deep encoder-decoder architecture to efficiently extract, quantize, and decode semantic features. Experimental results demonstrate that the proposed semantic framework improves diagnostic accuracy by at least 25% compared to conventional JPEG compression.
arXiv Detail & Related papers (2025-02-12T14:09:05Z)
CNN-Transformer Rectified Collaborative Learning for Medical Image Segmentation [60.08541107831459]
This paper proposes a CNN-Transformer rectified collaborative learning framework to learn stronger CNN-based and Transformer-based models for medical image segmentation. Specifically, we propose a rectified logit-wise collaborative learning (RLCL) strategy which introduces the ground truth to adaptively select and rectify the wrong regions in student soft labels. We also propose a class-aware feature-wise collaborative learning (CFCL) strategy to achieve effective knowledge transfer between CNN-based and Transformer-based models in the feature space.
arXiv Detail & Related papers (2024-08-25T01:27:35Z)
Ontology Embedding: A Survey of Methods, Applications and Resources [54.3453925775069]
Ontologies are widely used for representing domain knowledge and metadata. One straightforward solution is to integrate statistical analysis and machine learning. Numerous papers have been published on embedding, but a lack of systematic reviews hinders researchers from gaining a comprehensive understanding of this field.
arXiv Detail & Related papers (2024-06-16T14:49:19Z)
Computer Vision for Multimedia Geolocation in Human Trafficking Investigation: A Systematic Literature Review [0.1611401281366893]
This systematic literature review examines the state of the art in leveraging computer vision techniques for multimedia geolocation. It identifies their applicability in combating human trafficking, and highlights the potential implications of enhanced multimedia geolocation for prosecuting human trafficking. The findings suggest numerous potential paths for future impactful research on the subject.
arXiv Detail & Related papers (2024-02-23T17:23:06Z)
Inference of Media Bias and Content Quality Using Natural-Language Processing [6.092956184948962]
We present a framework to infer both political bias and content quality of media outlets from text. We apply a bidirectional long short-term memory (LSTM) neural network to a data set of more than 1 million tweets. Our results illustrate the importance of leveraging word order into machine-learning methods in text analysis.
arXiv Detail & Related papers (2022-12-01T03:04:55Z)
Imitation Learning-based Implicit Semantic-aware Communication Networks: Multi-layer Representation and Collaborative Reasoning [68.63380306259742]
Despite their promising potential, semantic communications and semantic-aware networking are still in their infancy. We propose a novel reasoning-based implicit semantic-aware communication network architecture that allows multiple tiers of CDC and edge servers to collaborate. We introduce a new multi-layer representation of semantic information taking into consideration both the hierarchical structure of implicit semantics and the personalized inference preference of individual users.
arXiv Detail & Related papers (2022-10-28T13:26:08Z)
Cross-Media Scientific Research Achievements Retrieval Based on Deep Language Model [2.900289363118179]
This paper proposes a cross-media scientific research achievements retrieval method based on a deep language model (CARDL). It achieves a unified cross-media semantic representation by learning the semantic association between different modal data. Cross-media retrieval is realized through semantic similarity matching between different modal data.
arXiv Detail & Related papers (2022-03-29T14:04:53Z)
Deep Learning Techniques for Future Intelligent Cross-Media Retrieval [58.20547387332133]
Cross-media retrieval plays a significant role in big data applications. We provide a novel taxonomy according to the challenges faced by multi-modal deep learning approaches. We present some well-known cross-media datasets used for retrieval.
arXiv Detail & Related papers (2020-07-21T09:49:33Z)
Context-Aware Refinement Network Incorporating Structural Connectivity Prior for Brain Midline Delineation [50.868845400939314]
We propose a context-aware refinement network (CAR-Net) to refine and integrate the feature pyramid representation generated by the UNet. For keeping the structural connectivity of the brain midline, we introduce a novel connectivity regular loss. The proposed method requires fewer parameters and outperforms three state-of-the-art methods in terms of four evaluation metrics.
arXiv Detail & Related papers (2020-07-10T14:01:20Z)
GCN for HIN via Implicit Utilization of Attention and Meta-paths [104.24467864133942]
Heterogeneous information network (HIN) embedding aims to map the structure and semantic information in a HIN to distributed representations. We propose a novel neural network method via implicitly utilizing attention and meta-paths. We first use the multi-layer graph convolutional network (GCN) framework, which performs a discriminative aggregation at each layer. We then give an effective relaxation and improvement via introducing a new propagation operation which can be separated from aggregation.
arXiv Detail & Related papers (2020-07-06T11:09:40Z)
On the Combined Use of Extrinsic Semantic Resources for Medical Information Search [0.0]
We develop a framework to highlight and expand head medical concepts in verbose medical queries. We also build semantically enhanced inverted index documents. To demonstrate the effectiveness of the proposed approach, we conducted several experiments over the CLEF 2014 dataset.
arXiv Detail & Related papers (2020-05-17T14:18:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.