Referring Expression Comprehension Using Language Adaptive Inference
- URL: http://arxiv.org/abs/2306.04451v1
- Date: Tue, 6 Jun 2023 07:58:59 GMT
- Title: Referring Expression Comprehension Using Language Adaptive Inference
- Authors: Wei Su, Peihan Miao, Huanzhang Dou, Yongjian Fu, and Xi Li
- Abstract summary: This paper explores the adaptation between expressions and REC models for dynamic inference.
We propose a framework named Language Adaptive Dynamic Subnets (LADS), which can extract language-adaptive subnets from the REC model conditioned on the referring expressions.
Experiments on RefCOCO, RefCOCO+, RefCOCOg, and Referit show that the proposed method achieves faster inference speed and higher accuracy than state-of-the-art approaches.
- Score: 15.09309604460633
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Different from universal object detection, referring expression comprehension
(REC) aims to locate specific objects referred to by natural language
expressions. The expression provides high-level concepts of relevant visual and
contextual patterns, which vary significantly with different expressions and
account for only a few of those encoded in the REC model. This leads us to a
question: do we really need the entire network with a fixed structure for
various referring expressions? Ideally, given an expression, only
expression-relevant components of the REC model are required. These components
should be small in number as each expression only contains very few visual and
contextual clues. This paper explores the adaptation between expressions and
REC models for dynamic inference. Concretely, we propose a neat yet efficient
framework named Language Adaptive Dynamic Subnets (LADS), which can extract
language-adaptive subnets from the REC model conditioned on the referring
expressions. By using the compact subnet, the inference can be more economical
and efficient. Extensive experiments on RefCOCO, RefCOCO+, RefCOCOg, and
Referit show that the proposed method achieves faster inference speed and
higher accuracy against state-of-the-art approaches.
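The idea of running only an expression-relevant part of the network can be illustrated with a small sketch. This is not the LADS implementation: the bag-of-words `expression_embedding`, the top-k gating in `select_channels`, the `keep_ratio` parameter, and the single linear layer in `subnet_forward` are all hypothetical stand-ins chosen for illustration. The sketch only shows how a subset of channels could be picked from an expression and how computation could then be limited to that subset.

```python
import math
import random


def expression_embedding(expression, dim, seed=0):
    """Toy stand-in for a language encoder (assumption, not from the paper).

    Each token contributes a deterministic pseudo-random vector, and the
    token vectors are summed into one expression vector.
    """
    vec = [0.0] * dim
    for token in expression.lower().split():
        rng = random.Random(f"{seed}:{token}")
        for i in range(dim):
            vec[i] += rng.uniform(-1.0, 1.0)
    return vec


def select_channels(text_vec, gate_weights, keep_ratio):
    """Score every channel against the expression and keep the top fraction.

    gate_weights holds one row per channel; at least one channel is kept.
    Returns the kept channel indices in ascending order.
    """
    scores = [sum(w * t for w, t in zip(row, text_vec)) for row in gate_weights]
    k = max(1, math.ceil(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return sorted(ranked[:k])


def subnet_forward(features, layer_weights, active):
    """Linear layer followed by ReLU, evaluated only for the active channels."""
    out = {}
    for c in active:
        pre = sum(w * x for w, x in zip(layer_weights[c], features))
        out[c] = max(0.0, pre)
    return out


def demo():
    rng = random.Random(0)
    text_dim, feat_dim, channels = 4, 3, 8
    gate = [[rng.uniform(-1, 1) for _ in range(text_dim)] for _ in range(channels)]
    layer = [[rng.uniform(-1, 1) for _ in range(feat_dim)] for _ in range(channels)]
    features = [1.0, 2.0, 3.0]
    for expression in ("the red cup on the left", "man holding an umbrella"):
        vec = expression_embedding(expression, text_dim)
        active = select_channels(vec, gate, keep_ratio=0.25)
        print(expression, "->", active, subnet_forward(features, layer, active))


if __name__ == "__main__":
    demo()
```

With `keep_ratio=0.25` and 8 channels, only 2 channels are evaluated per expression, and different expressions may select different channels. That is the sense in which a compact, expression-dependent subnet can make inference cheaper than running the full layer.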
Related papers
- Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation [18.806738617249426]
Generalized Referring Expression introduces new challenges by allowing expressions to describe multiple objects or lack specific object references.
Existing RES methods usually rely on sophisticated encoder-decoder and feature fusion modules.
We propose a novel Model with Adaptive Binding Prototypes (MABP) that adaptively binds queries to object features in the corresponding region.
arXiv Detail & Related papers (2024-05-24T03:07:38Z)
- Dense X Retrieval: What Retrieval Granularity Should We Use? [56.90827473115201]
An often-overlooked design choice is the retrieval unit in which the corpus is indexed, e.g. document, passage, or sentence.
We introduce a novel retrieval unit, proposition, for dense retrieval.
Experiments reveal that indexing a corpus by fine-grained units such as propositions significantly outperforms passage-level units in retrieval tasks.
arXiv Detail & Related papers (2023-12-11T18:57:35Z)
- BERM: Training the Balanced and Extractable Representation for Matching to Improve Generalization Ability of Dense Retrieval [54.66399120084227]
We propose a novel method, called BERM, to improve the generalization of dense retrieval by capturing the matching signal.
Dense retrieval has shown promise in the first-stage retrieval process when trained on in-domain labeled datasets.
arXiv Detail & Related papers (2023-05-18T15:43:09Z)
- Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation [102.25240608024063]
Referring image segmentation segments an image from a language expression.
We develop an algorithm that shifts from being localization-centric to segmentation-centric.
Compared to its counterparts, our method is more versatile yet effective.
arXiv Detail & Related papers (2023-03-11T08:42:40Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning [11.141645707535599]
We propose a Dynamic Multi-step Reasoning Network, which allows the reasoning steps to be dynamically adjusted based on the reasoning state and expression complexity.
The work achieves state-of-the-art performance or significant improvements on several REC datasets.
arXiv Detail & Related papers (2022-07-31T04:51:27Z)
- UnifieR: A Unified Retriever for Large-Scale Retrieval [84.61239936314597]
Large-scale retrieval aims to recall relevant documents from a huge collection given a query.
Recent retrieval methods based on pre-trained language models (PLM) can be coarsely categorized into either dense-vector or lexicon-based paradigms.
We propose a new learning framework, UnifieR, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability.
arXiv Detail & Related papers (2022-05-23T11:01:59Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different encoder architecture and linguistic feature combinations trained on two datasets.
We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
- Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension [39.40351938417889]
Referring expression comprehension (REF) aims at identifying a particular object in a scene by a natural language expression.
Some popular referring expression datasets fail to provide an ideal test bed for evaluating the reasoning ability of the models.
We propose a new dataset for visual reasoning in the context of referring expression comprehension with two main features.
arXiv Detail & Related papers (2020-03-01T04:59:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.