Unbiased Scene Graph Generation via Rich and Fair Semantic Extraction
- URL: http://arxiv.org/abs/2002.00176v1
- Date: Sat, 1 Feb 2020 09:28:44 GMT
- Title: Unbiased Scene Graph Generation via Rich and Fair Semantic Extraction
- Authors: Bin Wen, Jie Luo, Xianglong Liu, Lei Huang
- Abstract summary: We propose a new and simple architecture named the Rich and Fair semantic extraction network (RiFa).
RiFa predicts subject-object relations based on both the visual and semantic features of entities within a certain contextual area.
Experiments on the popular Visual Genome dataset show that RiFa achieves state-of-the-art performance.
- Score: 42.37557498737781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extracting a graph representation of the visual scene in an image is a challenging task in computer vision. Although there has been encouraging progress in scene graph generation over the past decade, we surprisingly find that the performance of existing approaches is largely limited by strong biases, which mainly stem from (1) unconsciously assuming that relations have certain semantic properties, such as symmetry, and (2) imbalanced annotations across different relations. To alleviate the negative effects of these biases, we propose a new and simple architecture named the Rich and Fair semantic extraction network (RiFa for short), which not only captures rich semantic properties of relations but also fairly predicts relations with different scales of annotation. Using pseudo-siamese networks, RiFa embeds the subject and object separately to distinguish their semantic differences while preserving their underlying semantic properties. It then predicts subject-object relations based on both the visual and semantic features of entities within a certain contextual area, and fairly ranks the relation predictions for relations with few annotations. Experiments on the popular Visual Genome dataset show that RiFa achieves state-of-the-art performance under several challenging settings of the scene graph task. In particular, it performs significantly better at capturing different semantic properties of relations and obtains the best overall per-relation performance.
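Since no implementation accompanies this summary, the following is a minimal sketch of the pseudo-siamese idea described in the abstract, assuming PyTorch. The two towers share an architecture but not weights, which is what lets the model treat subject and object roles asymmetrically. All class names, dimensions, and the pooled context feature are illustrative assumptions, and the fair ranking mechanism for rarely annotated relations is omitted.

```python
import torch
import torch.nn as nn

class PseudoSiameseRelationHead(nn.Module):
    """Sketch of a RiFa-style relation head (not the authors' code).

    Two towers with identical architecture but *independent* weights embed
    the subject and the object separately, so asymmetric relations
    (e.g. "riding" vs. "ridden by") remain distinguishable.
    """

    def __init__(self, feat_dim=1024, embed_dim=256, num_relations=50):
        super().__init__()
        # Same architecture, unshared parameters: "pseudo-siamese".
        self.subject_tower = nn.Sequential(
            nn.Linear(feat_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim))
        self.object_tower = nn.Sequential(
            nn.Linear(feat_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim))
        self.context_proj = nn.Linear(feat_dim, embed_dim)
        # Classify the relation from subject, object, and context features.
        self.classifier = nn.Linear(3 * embed_dim, num_relations)

    def forward(self, subj_feat, obj_feat, context_feat):
        s = self.subject_tower(subj_feat)    # subject-role embedding
        o = self.object_tower(obj_feat)      # object-role embedding
        c = self.context_proj(context_feat)  # contextual-area embedding
        return self.classifier(torch.cat([s, o, c], dim=-1))  # relation logits

# Usage with one candidate pair of 1024-d entity features (assumed fused
# visual + semantic features) and a pooled context feature.
head = PseudoSiameseRelationHead()
logits = head(torch.randn(1, 1024), torch.randn(1, 1024), torch.randn(1, 1024))
print(logits.shape)  # torch.Size([1, 50])
```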
Related papers
- Situational Scene Graph for Structured Human-centric Situation Understanding [15.91717913059569]
We propose a graph-based representation called the Situational Scene Graph (SSG) to encode both human-object relationships and the corresponding semantic properties.
The semantic details are represented as predefined roles and values inspired by situation frames, which were originally designed to represent a single action.
We will release the code and the dataset soon.
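As an illustration only (the authors' dataset and code are not yet released), one SSG edge might pair a human-object relationship with situation-frame-style roles and values, roughly as in this hypothetical Python structure:

```python
from dataclasses import dataclass, field

@dataclass
class SituationalEdge:
    """Hypothetical encoding of one SSG edge: a human-object relationship
    plus situation-frame-style semantic roles and their values."""
    subject: str                                # e.g. "person"
    predicate: str                              # e.g. "cutting"
    obj: str                                    # e.g. "vegetable"
    roles: dict = field(default_factory=dict)   # predefined role -> value

edge = SituationalEdge(
    subject="person", predicate="cutting", obj="vegetable",
    roles={"tool": "knife", "place": "kitchen"},  # assumed role inventory
)
print(edge)
```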
arXiv Detail & Related papers (2024-10-30T09:11:25Z)
- Explainable Representations for Relation Prediction in Knowledge Graphs [0.0]
We propose SEEK, a novel approach for explainable representations to support relation prediction in knowledge graphs.
It is based on identifying relevant shared semantic aspects between entities and learning a representation for each such subgraph.
We evaluate SEEK on two real-world relation prediction tasks: protein-protein interaction prediction and gene-disease association prediction.
arXiv Detail & Related papers (2023-06-22T06:18:40Z)
- Sparse Relational Reasoning with Object-Centric Representations [78.83747601814669]
We investigate the composability of soft rules learned by relational neural architectures when operating over object-centric representations.
We find that increasing sparsity, especially on features, improves the performance of some models and leads to simpler relations.
arXiv Detail & Related papers (2022-07-15T14:57:33Z)
- Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction [88.6585431949086]
We propose a novel Hierarchical Visual Prefix fusion NeTwork (HVPNeT) for visual-enhanced entity and relation extraction.
We regard the visual representation as a pluggable visual prefix that guides the textual representation toward error-insensitive prediction decisions, as sketched below.
Experiments on three benchmark datasets demonstrate the effectiveness of the method, which achieves state-of-the-art performance.
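A minimal sketch of the pluggable-visual-prefix idea, assuming PyTorch: a pooled visual feature is projected into a few prefix pseudo-tokens and prepended to the token embeddings, so the text encoder can attend to vision without architectural changes. The names, dimensions, and prefix length are assumptions, and HVPNeT's hierarchical, per-layer injection is not reproduced here.

```python
import torch
import torch.nn as nn

class VisualPrefixFusion(nn.Module):
    """Sketch: prepend projected visual features as prefix pseudo-tokens."""

    def __init__(self, visual_dim=2048, text_dim=768, prefix_len=4):
        super().__init__()
        # Map one pooled visual feature to `prefix_len` pseudo-tokens.
        self.proj = nn.Linear(visual_dim, prefix_len * text_dim)
        self.prefix_len = prefix_len
        self.text_dim = text_dim

    def forward(self, visual_feat, token_embeds):
        # visual_feat: (batch, visual_dim); token_embeds: (batch, seq, text_dim)
        prefix = self.proj(visual_feat).view(-1, self.prefix_len, self.text_dim)
        # A downstream text encoder then attends over [visual prefix; tokens].
        return torch.cat([prefix, token_embeds], dim=1)

fusion = VisualPrefixFusion()
out = fusion(torch.randn(2, 2048), torch.randn(2, 16, 768))
print(out.shape)  # torch.Size([2, 20, 768])
```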
arXiv Detail & Related papers (2022-05-07T02:10:55Z)
- FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations [114.94628499698096]
We propose FactGraph, a method that decomposes the document and the summary into structured meaning representations (MRs).
MRs describe core semantic concepts and their relations, aggregating the main content in both document and summary in a canonical form, and reducing data sparsity.
Experiments on different benchmarks for evaluating factuality show that FactGraph outperforms previous approaches by up to 15%.
arXiv Detail & Related papers (2022-04-13T16:45:33Z)
- Biasing Like Human: A Cognitive Bias Framework for Scene Graph Generation [20.435023745201878]
We propose a novel three-paradigm framework that simulates how humans incorporate label linguistic features as guidance for vision-based representations.
Our framework is model-agnostic and can be applied to any scene graph model.
arXiv Detail & Related papers (2022-03-17T08:29:52Z)
- Semantic Compositional Learning for Low-shot Scene Graph Generation [122.51930904132685]
Many scene graph generation (SGG) models solely use the limited annotated relation triples for training.
We propose a novel semantic compositional learning strategy that makes it possible to construct additional, realistic relation triples.
For three recent SGG models, adding our strategy improves their performance by close to 50%, and all of them substantially exceed the current state-of-the-art.
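The strategy is only described at a high level here, so the following toy sketch shows one way additional triples could be composed from existing annotations: recombine entities seen with the same predicate. The paper's actual realism and compatibility criteria are replaced by a trivial membership test, and the data is made up.

```python
# Toy sketch of compositional triple construction (illustrative only).
seen = [
    ("man", "riding", "horse"),
    ("woman", "riding", "bike"),
    ("dog", "on", "sofa"),
]

def compose_triples(triples):
    """Create extra training triples by swapping objects between
    annotations that share the same predicate."""
    extra = set()
    for s1, p1, o1 in triples:
        for s2, p2, o2 in triples:
            if p1 == p2 and (s1, p1, o2) not in triples:
                extra.add((s1, p1, o2))  # new subject-predicate-object triple
    return extra

# e.g. {('man', 'riding', 'bike'), ('woman', 'riding', 'horse')}
print(compose_triples(seen))
```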
arXiv Detail & Related papers (2021-08-19T10:13:55Z)
- Semi-Supervised Graph-to-Graph Translation [31.47555366566109]
Graph translation is a promising research direction and has a wide range of potential real-world applications.
One important reason it remains underexplored is the lack of high-quality paired datasets.
We propose to construct a dual representation space, where transformation is performed explicitly to model the semantic transitions.
arXiv Detail & Related papers (2021-03-16T03:24:20Z)
- Adaptive Attentional Network for Few-Shot Knowledge Graph Completion [16.722373937828117]
Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs.
Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties.
This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations.
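As a rough sketch of the adaptive idea, assuming PyTorch: attention weights over the few-shot reference pairs are conditioned on the current query, so the aggregated reference representation adapts per query rather than staying static. All names and dimensions are illustrative assumptions, not the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveReferenceAggregator(nn.Module):
    """Sketch: aggregate few-shot references with query-conditioned attention."""

    def __init__(self, dim=100):
        super().__init__()
        self.score = nn.Bilinear(dim, dim, 1)  # query-reference compatibility

    def forward(self, query, references):
        # query: (dim,); references: (k, dim) encodings of reference pairs
        q = query.expand_as(references)
        weights = F.softmax(self.score(q, references).squeeze(-1), dim=0)
        # The reference representation adapts to the current query.
        return (weights.unsqueeze(-1) * references).sum(dim=0)

agg = AdaptiveReferenceAggregator()
rel = agg(torch.randn(100), torch.randn(3, 100))  # 3-shot reference set
print(rel.shape)  # torch.Size([100])
```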
arXiv Detail & Related papers (2020-10-19T16:27:48Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features, trained on two datasets.
We find that the biases induced by the architecture and by the inclusion of linguistic features are clearly reflected in probing-task performance.
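Probing of this kind typically follows a standard recipe: freeze the encoder, train a simple classifier on its fixed sentence representations to predict a linguistic property, and read the probe's accuracy as evidence of what the representations encode. A minimal sketch with scikit-learn and synthetic stand-in data (illustrative, not the paper's code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-ins for frozen-encoder outputs and a linguistic property label
# (e.g. a bucketed dependency-path length between the two entities).
rng = np.random.default_rng(0)
sentence_reps = rng.normal(size=(200, 768))      # fixed representations
property_labels = rng.integers(0, 4, size=200)   # probing-task labels

# The probe is deliberately simple: if a linear model can read the property
# off the representation, the encoder plausibly encodes it.
probe = LogisticRegression(max_iter=1000).fit(
    sentence_reps[:150], property_labels[:150])
print("probe accuracy:", probe.score(sentence_reps[150:], property_labels[150:]))
```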
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.