Memory-Guided Semantic Learning Network for Temporal Sentence Grounding
- URL: http://arxiv.org/abs/2201.00454v1
- Date: Mon, 3 Jan 2022 02:32:06 GMT
- Title: Memory-Guided Semantic Learning Network for Temporal Sentence Grounding
- Authors: Daizong Liu, Xiaoye Qu, Xing Di, Yu Cheng, Zichuan Xu, Pan Zhou
- Abstract summary: We propose a memory-augmented network that learns and memorizes the rarely appearing content in TSG tasks.
MGSL-Net consists of three main parts: a cross-modal interaction module, a memory augmentation module, and a heterogeneous attention module.
- Score: 55.31041933103645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal sentence grounding (TSG) is crucial and fundamental for video
understanding. Although the existing methods train well-designed deep networks
with a large amount of data, we find that they can easily forget rarely
appearing cases during training due to the imbalanced data distribution,
which influences the model generalization and leads to undesirable performance.
To tackle this issue, we propose a memory-augmented network, called
Memory-Guided Semantic Learning Network (MGSL-Net), that learns and memorizes
the rarely appearing content in TSG tasks. Specifically, MGSL-Net consists of
three main parts: a cross-modal interaction module, a memory augmentation
module, and a heterogeneous attention module. We first align the given
video-query pair by a cross-modal graph convolutional network, and then utilize
a memory module to record the cross-modal shared semantic features in the
domain-specific persistent memory. During training, the memory slots are
dynamically associated with both common and rare cases, alleviating the
forgetting issue. In testing, the rare cases can thus be enhanced by retrieving
the stored memories, resulting in better generalization. Finally, the
heterogeneous attention module is utilized to integrate the enhanced
multi-modal features in both video and query domains. Experimental results on
three benchmarks show the superiority of our method in both effectiveness and
efficiency: it substantially improves accuracy not only on the entire dataset
but also on rare cases.
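To make the memory-augmentation idea above concrete, the following is a minimal sketch of an attention-based read over a bank of persistent, learnable memory slots, with a residual fusion of the retrieved content back into the aligned video-query features. The module name, slot count, and feature dimension are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MemoryAugmentation(nn.Module):
    """Illustrative memory bank: attends over persistent slots to enhance
    aligned cross-modal features (names and sizes are assumptions)."""

    def __init__(self, num_slots: int = 128, dim: int = 256):
        super().__init__()
        # Persistent, learnable memory slots shared across the whole dataset.
        self.slots = nn.Parameter(torch.randn(num_slots, dim) * 0.02)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, seq_len, dim) aligned video-query features.
        attn = torch.softmax(
            features @ self.slots.t() / features.size(-1) ** 0.5, dim=-1
        )
        retrieved = attn @ self.slots  # (batch, seq_len, dim)
        # Residual fusion: rare cases are enhanced by the retrieved memories.
        return features + retrieved

# Usage: enhance aligned features before the attention/fusion stage.
mem = MemoryAugmentation()
enhanced = mem(torch.randn(2, 64, 256))
print(enhanced.shape)  # torch.Size([2, 64, 256])
```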
Related papers
- Benchmarking Hebbian learning rules for associative memory [0.0]
Associative memory is a key concept in cognitive and computational brain science.
We benchmark six different learning rules on storage capacity and prototype extraction.
arXiv Detail & Related papers (2023-12-30T21:49:47Z)
- Black-box Unsupervised Domain Adaptation with Bi-directional Atkinson-Shiffrin Memory [59.51934126717572]
Black-box unsupervised domain adaptation (UDA) learns with source predictions of target data without accessing either source data or source models during training.
We propose BiMem, a bi-directional memorization mechanism that learns to remember useful and representative information to correct noisy pseudo labels on the fly.
BiMem achieves superior domain adaptation performance consistently across various visual recognition tasks such as image classification, semantic segmentation and object detection.
arXiv Detail & Related papers (2023-08-25T08:06:48Z)
- GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction [50.248694764703714]
Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction.
These networks unroll iterative optimization algorithms by alternating between physics-based data consistency and neural-network-based regularization (a toy unrolled-step sketch follows this list).
We propose Greedy LEarning for Accelerated MRI reconstruction, an efficient training strategy for high-dimensional imaging settings.
arXiv Detail & Related papers (2022-07-18T06:01:29Z)
- Pin the Memory: Learning to Generalize Semantic Segmentation [68.367763672095]
We present a novel memory-guided domain generalization method for semantic segmentation based on meta-learning framework.
Our method abstracts the conceptual knowledge of semantic classes into categorical memory which is constant beyond the domains.
arXiv Detail & Related papers (2022-04-07T17:34:01Z)
- Universal Hopfield Networks: A General Framework for Single-Shot Associative Memory Models [41.58529335439799]
We propose a general framework for understanding the operation of memory networks as a sequence of three operations.
We derive existing memory models as instances of this general framework with differing similarity and separation functions (a recall sketch follows this list).
arXiv Detail & Related papers (2022-02-09T16:48:06Z)
- End-to-End Egospheric Spatial Memory [32.42361470456194]
We propose a parameter-free module, Egospheric Spatial Memory (ESM), which encodes the memory in an ego-sphere around the agent.
ESM can be trained end-to-end via either imitation or reinforcement learning.
We show applications to semantic segmentation on the ScanNet dataset, where ESM naturally combines image-level and map-level inference modalities.
arXiv Detail & Related papers (2021-02-15T18:59:07Z)
- Memorizing Comprehensively to Learn Adaptively: Unsupervised Cross-Domain Person Re-ID with Multi-level Memory [89.43986007948772]
Unlike the simple memory in previous works, we propose a novel multi-level memory network (MMN) to discover multi-level complementary information in the target domain.
arXiv Detail & Related papers (2020-01-13T09:48:03Z)
- Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation [117.29799759864127]
3D point cloud semantic and instance segmentation is crucial and fundamental for 3D scene understanding.
Deep networks can easily forget the non-dominant cases during the learning process, resulting in unsatisfactory performance.
We propose a memory-augmented network to learn and memorize the representative prototypes that cover diverse samples universally.
arXiv Detail & Related papers (2020-01-06T01:07:46Z)
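To unpack the unrolled-network idea mentioned in the GLEAM entry, here is a toy sketch of one unrolled iteration that alternates a physics-based data-consistency gradient step with a learned regularizer. The forward operator, step size, and the small convolutional denoiser are illustrative stand-ins, not GLEAM's architecture or its greedy training strategy.

```python
import torch
import torch.nn as nn

class UnrolledStep(nn.Module):
    """One unrolled iteration: data-consistency gradient step followed by a
    learned residual denoiser (both are illustrative simplifications)."""

    def __init__(self, channels: int = 2, step_size: float = 0.5):
        super().__init__()
        self.step_size = step_size
        self.denoiser = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x, y, forward_op, adjoint_op):
        # Physics-based consistency: gradient step on ||A x - y||^2.
        x = x - self.step_size * adjoint_op(forward_op(x) - y)
        # Neural-network regularization: residual denoising of the estimate.
        return x + self.denoiser(x)

# Toy usage with an identity operator standing in for the MRI physics.
step = UnrolledStep()
x0 = torch.zeros(1, 2, 8, 8)
y = torch.randn(1, 2, 8, 8)
x1 = step(x0, y, forward_op=lambda v: v, adjoint_op=lambda v: v)
print(x1.shape)  # torch.Size([1, 2, 8, 8])
```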
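As a concrete reading of the Universal Hopfield Networks entry, the following recall sketch expresses single-shot associative memory as similarity, separation, and projection. The dot-product similarity and exponential separation used here are illustrative choices of the two functions, not a claim about the paper's exact formulation.

```python
import numpy as np

def associative_recall(query, keys, values, beta=8.0):
    """Single-shot recall as similarity -> separation -> projection.
    Dot-product similarity and exponential separation are example choices."""
    sim = keys @ query                # similarity: score the cue against stored keys
    sep = np.exp(beta * sim)          # separation: sharpen the score distribution
    sep = sep / sep.sum()
    return values.T @ sep             # projection: weighted readout of stored values

# Usage: store 5 random patterns and recall one from a noisy cue.
rng = np.random.default_rng(0)
patterns = rng.standard_normal((5, 16))
cue = patterns[2] + 0.1 * rng.standard_normal(16)
recalled = associative_recall(cue, keys=patterns, values=patterns)
print(np.argmax(patterns @ recalled))  # expected to recover pattern index 2
```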