Are standard Object Segmentation models sufficient for Learning
Affordance Segmentation?
- URL: http://arxiv.org/abs/2107.02095v1
- Date: Mon, 5 Jul 2021 15:34:20 GMT
- Title: Are standard Object Segmentation models sufficient for Learning
Affordance Segmentation?
- Authors: Hugo Caselles-Dupré, Michael Garcia-Ortiz, David Filliat
- Abstract summary: We show that applying the out-of-the-box Mask R-CNN to the problem of affordance segmentation outperforms the current state-of-the-art.
We argue that better benchmarks for affordance learning should include action capacities.
- Score: 3.845877724862319
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Affordances are the possibilities of actions the environment offers to the
individual. Ordinary objects (hammer, knife) usually have many affordances
(grasping, pounding, cutting), and detecting these allows artificial agents to
understand what their possibilities are in the environment, with obvious
applications in robotics. Proposed benchmarks and state-of-the-art prediction
models for supervised affordance segmentation are usually modifications of
popular object segmentation models such as Mask R-CNN. We observe that
theoretically, these popular object segmentation methods should be sufficient
for detecting affordance masks. So we ask the question: is it necessary to
tailor new architectures to the problem of learning affordances? We show that
applying the out-of-the-box Mask R-CNN to the problem of affordance
segmentation outperforms the current state-of-the-art. We conclude that the
problem of supervised affordance segmentation is included in the problem of
object segmentation and argue that better benchmarks for affordance learning
should include action capacities.
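
The abstract's claim that affordance segmentation is contained in object segmentation can be illustrated with a minimal sketch, which is not the authors' code. It assumes a per-pixel affordance label map and splits it into one box, label, and binary mask per affordance, which is the kind of target an unmodified instance segmentation model such as Mask R-CNN is typically trained on. The label ids, the `AFFORDANCES` table, and the `affordance_targets` helper are hypothetical names introduced for illustration.

```python
from typing import Dict, List

Grid = List[List[int]]

# Hypothetical affordance label ids; 0 is treated as background.
AFFORDANCES = {1: "grasp", 2: "pound", 3: "cut"}


def affordance_targets(label_map: Grid) -> List[Dict]:
    """Split a per-pixel affordance label map into per-affordance targets.

    Simplification: one target per affordance class present in the image,
    not one per connected component.
    """
    present = sorted({v for row in label_map for v in row if v != 0})
    targets = []
    for label_id in present:
        # Binary mask for this affordance only.
        mask = [[1 if v == label_id else 0 for v in row] for row in label_map]
        ys = [y for y, row in enumerate(mask) if any(row)]
        xs = [x for row in mask for x, v in enumerate(row) if v]
        # Box as (x1, y1, x2, y2) with exclusive upper bounds.
        box = (min(xs), min(ys), max(xs) + 1, max(ys) + 1)
        targets.append({
            "label": label_id,
            "name": AFFORDANCES.get(label_id, "unknown"),
            "box": box,
            "mask": mask,
        })
    return targets


# Toy 4x4 "hammer": head pixels afford pounding (2), handle pixels grasping (1).
hammer = [
    [0, 2, 2, 0],
    [0, 2, 2, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
targets = affordance_targets(hammer)
for t in targets:
    print(t["name"], t["box"])
```

Under this framing each affordance region is simply another "object" class, so no architectural change is needed to produce the training targets; whether that is enough for good performance is the empirical question the paper addresses.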
Related papers
- MaskUno: Switch-Split Block For Enhancing Instance Segmentation [0.0]
We propose replacing mask prediction with a Switch-Split block that processes refined ROIs, classifies them, and assigns them to specialized mask predictors.
An increase in the mean Average Precision (mAP) of 2.03% was observed for the high-performing DetectoRS when trained on 80 classes.
arXiv Detail & Related papers (2024-07-31T10:12:14Z)
- OoDIS: Anomaly Instance Segmentation Benchmark [57.89836988990543]
We extend the most commonly used anomaly segmentation benchmarks to include the instance segmentation task.
Development in this area has been lagging, largely due to the lack of dedicated benchmarks.
Our evaluation of anomaly instance segmentation methods shows that this challenge remains an unsolved problem.
arXiv Detail & Related papers (2024-06-17T17:59:56Z)
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Robotic Scene Segmentation with Memory Network for Runtime Surgical Context Inference [8.600278838838163]
Space Time Correspondence Network (STCN) is a memory network that performs binary segmentation and minimizes the effects of class imbalance.
We show that STCN achieves superior segmentation performance for objects that are difficult to segment, such as needle and thread.
We also demonstrate that segmentation and context inference can be performed at runtime without compromising performance.
arXiv Detail & Related papers (2023-08-24T13:44:55Z)
- LISA: Reasoning Segmentation via Large Language Model [68.24075852136761]
We propose a new segmentation task -- reasoning segmentation.
The task is designed to output a segmentation mask given a complex and implicit query text.
We present LISA: large Language Instructed Segmentation Assistant, which inherits the language generation capabilities of multimodal Large Language Models.
arXiv Detail & Related papers (2023-08-01T17:50:17Z)
- Masked Supervised Learning for Semantic Segmentation [5.177947445379688]
Masked Supervised Learning (MaskSup) is an effective single-stage learning paradigm that models both short- and long-range context.
We show that the proposed method is computationally efficient, improving the mean intersection-over-union (mIoU) by 10%.
arXiv Detail & Related papers (2022-10-03T13:30:19Z)
- AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical of which is foreground-background imbalance.
We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ significantly improves accuracy on three widely used aerial benchmarks while running as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z)
- Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation.
We introduce a novel approach for more accurate and efficient spatio-temporal segmentation.
We evaluate the proposed approach on DAVIS$_{17}$ and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods both in segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
- Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals [78.12377360145078]
We introduce a novel two-step framework that adopts a predetermined prior in a contrastive optimization objective to learn pixel embeddings.
This marks a large deviation from existing works that relied on proxy tasks or end-to-end clustering.
In particular, when fine-tuning the learned representations using just 1% of labeled examples on PASCAL, we outperform supervised ImageNet pre-training by 7.1% mIoU.
arXiv Detail & Related papers (2021-02-11T18:54:47Z)
- A Three-Stage Self-Training Framework for Semi-Supervised Semantic Segmentation [0.9786690381850356]
We propose a holistic solution framed as a three-stage self-training framework for semantic segmentation.
The key idea of our technique is the extraction of statistical information from the pseudo-masks.
We then decrease the uncertainty of the pseudo-masks using a multi-task model that enforces consistency.
arXiv Detail & Related papers (2020-12-01T21:00:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.