Deep Learning Techniques for Future Intelligent Cross-Media Retrieval
- URL: http://arxiv.org/abs/2008.01191v1
- Date: Tue, 21 Jul 2020 09:49:33 GMT
- Title: Deep Learning Techniques for Future Intelligent Cross-Media Retrieval
- Authors: Sadaqat ur Rehman, Muhammad Waqas, Shanshan Tu, Anis Koubaa, Obaid ur Rehman, Jawad Ahmad, Muhammad Hanif, Zhu Han
- Abstract summary: Cross-media retrieval plays a significant role in big data applications.
We provide a novel taxonomy according to the challenges faced by multi-modal deep learning approaches.
We present some well-known cross-media datasets used for retrieval.
- Score: 58.20547387332133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the advancement in technology and the expansion of broadcasting,
cross-media retrieval has gained much attention. It plays a significant role in
big data applications and consists in searching and finding data from different
types of media. In this paper, we provide a novel taxonomy according to the
challenges faced by multi-modal deep learning approaches in solving cross-media
retrieval, namely: representation, alignment, and translation. These challenges
are evaluated on deep learning (DL) based methods, which are categorized into
four main groups: 1) unsupervised methods, 2) supervised methods, 3) pairwise
based methods, and 4) rank based methods. Then, we present some well-known
cross-media datasets used for retrieval, considering the importance of these
datasets in the context of deep learning based cross-media retrieval
approaches. Moreover, we also present an extensive review of the
state-of-the-art problems and their corresponding solutions for encouraging deep
learning in cross-media retrieval. The fundamental objective of this work is to
exploit Deep Neural Networks (DNNs) for bridging the "media gap", and provide
researchers and developers with a better understanding of the underlying
problems and the potential solutions of deep learning assisted cross-media
retrieval. To the best of our knowledge, this is the first comprehensive survey
to address cross-media retrieval under deep learning methods.
Related papers
- Datasets, Clues and State-of-the-Arts for Multimedia Forensics: An Extensive Review [19.30075248247771]
This survey focusses on approaches for tampering detection in multimedia data using deep learning models.
It presents a detailed analysis of benchmark datasets for malicious manipulation detection that are publicly available.
It also offers a comprehensive list of tampering clues and commonly used deep learning architectures.
arXiv Detail & Related papers (2024-01-13T07:03:58Z)
- Guided Depth Map Super-resolution: A Survey [88.54731860957804]
Guided depth map super-resolution (GDSR) aims to reconstruct a high-resolution (HR) depth map from a low-resolution (LR) observation with the help of a paired HR color image.
A myriad of novel and effective approaches have been proposed recently, especially with powerful deep learning techniques.
This survey presents a comprehensive overview of recent progress in GDSR.
arXiv Detail & Related papers (2023-02-19T15:43:54Z)
- Cross-Media Scientific Research Achievements Retrieval Based on Deep Language Model [2.900289363118179]
This paper proposes a cross-media scientific research achievements retrieval method based on a deep language model (CARDL).
It achieves a unified cross-media semantic representation by learning the semantic association between different modal data.
Cross-media retrieval is realized through semantic similarity matching between different modal data.
arXiv Detail & Related papers (2022-03-29T14:04:53Z)
- Scientific and Technological Information Oriented Semantics-adversarial and Media-adversarial Cross-media Retrieval [21.630525836722036]
Cross-media scientific and technological information retrieval is one of the important tasks in the cross-media study.
We propose a scientific and technological information oriented Semantics-adversarial and Media-adversarial Cross-media Retrieval method (SMCR) to find an effective common subspace.
SMCR minimizes the loss of inter-media semantic consistency in addition to modeling intra-media semantic discrimination, to preserve semantic similarity before and after mapping.
arXiv Detail & Related papers (2022-03-16T13:31:48Z)
- Weakly Supervised Object Localization and Detection: A Survey [145.5041117184952]
Weakly supervised object localization and detection plays an important role in developing new-generation computer vision systems.
We review (1) classic models, (2) approaches with feature representations from off-the-shelf deep networks, (3) approaches solely based on deep learning, and (4) publicly available datasets and standard evaluation metrics that are widely used in this field.
We discuss the field's key challenges and development history, the advantages and disadvantages of the methods in each category, the relationships between methods in different categories, applications of weakly supervised object localization and detection methods, and potential future directions for the field.
arXiv Detail & Related papers (2021-04-16T06:44:50Z)
- Continual learning in cross-modal retrieval [47.73014647702813]
We study how the interference caused by new tasks impacts the embedding spaces and their cross-modal alignment required for effective retrieval.
We propose a general framework that decouples the training, indexing and querying stages.
We also identify and study different factors that may lead to forgetting, and propose tools to alleviate it.
arXiv Detail & Related papers (2021-04-14T12:13:39Z)
- A Survey on Deep Semi-supervised Learning [51.26862262550445]
We first present a taxonomy for deep semi-supervised learning that categorizes existing methods.
We then offer a detailed comparison of these methods in terms of the type of losses, contributions, and architecture differences.
arXiv Detail & Related papers (2021-02-28T16:22:58Z)
- A Survey of Community Detection Approaches: From Statistical Modeling to Deep Learning [95.27249880156256]
We develop and present a unified architecture of network community-finding methods.
We introduce a new taxonomy that divides the existing methods into two categories, namely probabilistic graphical model and deep learning.
We conclude with discussions of the challenges of the field and suggestions of possible directions for future research.
arXiv Detail & Related papers (2021-01-03T02:32:45Z)
- A Survey of Deep Meta-Learning [1.2891210250935143]
Deep neural networks can achieve great successes when presented with large data sets and sufficient computational resources.
However, their ability to learn new concepts quickly is limited.
Deep Meta-Learning is one approach to address this issue, by enabling the network to learn how to learn.
arXiv Detail & Related papers (2020-10-07T17:09:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.