OW-Rep: Open World Object Detection with Instance Representation Learning
- URL: http://arxiv.org/abs/2409.16073v2
- Date: Mon, 17 Mar 2025 04:24:20 GMT
- Title: OW-Rep: Open World Object Detection with Instance Representation Learning
- Authors: Sunoh Lee, Minsik Jeon, Jihong Min, Junwon Seo
- Abstract summary: Open World Object Detection (OWOD) addresses realistic scenarios where unseen object classes emerge. We extend the OWOD framework to jointly detect unknown objects and learn semantically rich instance embeddings.
- Score: 1.8749305679160366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open World Object Detection (OWOD) addresses realistic scenarios where unseen object classes emerge, enabling detectors trained on known classes to detect unknown objects and incrementally incorporate the knowledge they provide. While existing OWOD methods primarily focus on detecting unknown objects, they often overlook the rich semantic relationships between detected objects, which are essential for scene understanding and applications in open-world environments (e.g., open-world tracking and novel class discovery). In this paper, we extend the OWOD framework to jointly detect unknown objects and learn semantically rich instance embeddings, enabling the detector to capture fine-grained semantic relationships between instances. To this end, we propose two modules that leverage the rich and generalizable knowledge of Vision Foundation Models (VFM). First, the Unknown Box Refine Module uses instance masks from the Segment Anything Model to accurately localize unknown objects. The Embedding Transfer Module then distills instance-wise semantic similarities from VFM features to the detector's embeddings via a relaxed contrastive loss, enabling the detector to learn a semantically meaningful and generalizable instance feature. Extensive experiments show that our method significantly improves both unknown object detection and instance embedding quality, while also enhancing performance in downstream tasks such as open-world tracking.
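The abstract does not give the formula for the relaxed contrastive loss, so the following is only a minimal sketch of one common form of such a loss, not the authors' implementation. It assumes that pairwise similarities computed from the VFM (teacher) features act as soft labels: similar pairs are pulled together in the detector (student) embedding space, and dissimilar pairs are pushed apart up to a margin. The function names and the `sigma` and `margin` parameters are hypothetical.

```python
import numpy as np


def pairwise_dist(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    # Small epsilon keeps the square root stable for identical rows.
    return np.sqrt((diff ** 2).sum(-1) + 1e-12)


def relaxed_contrastive_loss(teacher, student, sigma=1.0, margin=1.0):
    """Illustrative relaxed contrastive loss (assumed form, see lead-in).

    teacher: (n, d_t) instance features from a foundation model.
    student: (n, d_s) instance embeddings from the detector.
    sigma, margin: hypothetical hyperparameters.
    """
    n = teacher.shape[0]
    # Soft pairwise labels in [0, 1]: close to 1 for similar teacher pairs.
    w = np.exp(-pairwise_dist(teacher) ** 2 / sigma)
    d = pairwise_dist(student)
    # Attract pairs the teacher considers similar.
    attract = w * d ** 2
    # Repel pairs the teacher considers dissimilar, up to the margin.
    repel = (1.0 - w) * np.maximum(margin - d, 0.0) ** 2
    # Ignore each instance's pairing with itself.
    off_diag = ~np.eye(n, dtype=bool)
    return float((attract + repel)[off_diag].sum() / n)


if __name__ == "__main__":
    # Two well-separated groups of instances in the teacher space.
    teacher = np.array([[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]])
    preserved = teacher.copy()         # student keeps the structure
    collapsed = np.zeros_like(teacher)  # student maps everything to one point
    print(relaxed_contrastive_loss(teacher, preserved))
    print(relaxed_contrastive_loss(teacher, collapsed))
```

In this toy setup, a student that preserves the teacher's group structure receives a lower loss than one that collapses all instances to a single point, which is the behavior the distillation is meant to encourage.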
Related papers
- From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning [71.41062111470414]
The proposed plug-and-play framework interfaces with any open-vocabulary detector.
At its core, our approach combines (i) a symbolic regression mechanism exploring relationship patterns among detected entities.
We compared our training-free framework against specialized event recognition systems across diverse application domains.
arXiv Detail & Related papers (2025-02-09T10:30:54Z) - From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects [0.6262268096839562]
Recent works on open vocabulary object detection (OVD) enable the detection of objects defined by an in-principle unbounded vocabulary.
OVD relies on accurate prompts provided by an "oracle", which limits its use in critical applications such as driving scene perception.
We propose a framework that enables OVD models to operate in open world settings, by identifying and incrementally learning previously unseen objects.
arXiv Detail & Related papers (2024-11-27T10:33:51Z) - Towards Open-World Object-based Anomaly Detection via Self-Supervised Outlier Synthesis [15.748043194987075]
This work aims to bridge the gap by leveraging an open-world object detector and an OoD detector via virtual outlier.
Our approach empowers our overall object detector architecture to learn anomaly-aware feature representations without relying on class labels.
Our method establishes state-of-the-art performance on object-level anomaly detection, achieving an average recall score improvement of over 5.4% for natural images.
arXiv Detail & Related papers (2024-07-22T16:16:38Z) - Open World Object Detection in the Era of Foundation Models [53.683963161370585]
We introduce a new benchmark that includes five real-world application-driven datasets.
We introduce a novel method, Foundation Object detection Model for the Open world, or FOMO, which identifies unknown objects based on their shared attributes with the base known objects.
arXiv Detail & Related papers (2023-12-10T03:56:06Z) - ECEA: Extensible Co-Existing Attention for Few-Shot Object Detection [52.16237548064387]
Few-shot object detection (FSOD) identifies objects from extremely few annotated samples.
Most recent FSOD methods apply the two-stage learning paradigm, which transfers the knowledge learned from abundant base classes to assist the few-shot detectors by learning the global features.
We propose an Extensible Co-Existing Attention (ECEA) module to enable the model to infer the global object according to the local parts.
arXiv Detail & Related papers (2023-09-15T06:55:43Z) - Unsupervised Recognition of Unknown Objects for Open-World Object Detection [28.787586991713535]
Open-World Object Detection (OWOD) extends the object detection problem to a realistic and dynamic scenario.
Current OWOD models, such as ORE and OW-DETR, focus on pseudo-labeling regions with high objectness scores as unknowns.
This paper proposes a novel approach that learns an unsupervised discriminative model to recognize true unknown objects.
arXiv Detail & Related papers (2023-08-31T08:17:29Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - Detecting the open-world objects with the help of the Brain [20.00772846521719]
Open World Object Detection (OWOD) is a novel computer vision task with a considerable challenge.
OWOD algorithms are expected to detect unseen/unknown objects and incrementally learn them.
We propose leveraging the VL as the "Brain" of the open-world detector by simply generating unknown labels.
arXiv Detail & Related papers (2023-03-21T06:44:02Z) - Open-World Object Detection via Discriminative Class Prototype Learning [4.055884768256164]
Open-world object detection (OWOD) is a challenging problem that combines object detection with incremental learning and open-set learning.
We propose a novel and efficient OWOD solution from a prototype perspective, which we call OCPL: Open-world object detection via discriminative Class Prototype Learning.
arXiv Detail & Related papers (2023-02-23T03:05:04Z) - Open World DETR: Transformer based Open World Object Detection [60.64535309016623]
We propose a two-stage training approach named Open World DETR for open world object detection based on Deformable DETR.
We fine-tune the class-specific components of the model with a multi-view self-labeling strategy and a consistency constraint.
Our proposed method outperforms other state-of-the-art open world object detection methods by a large margin.
arXiv Detail & Related papers (2022-12-06T13:39:30Z) - Towards Open-Set Object Detection and Discovery [38.81806249664884]
We present a new task, namely Open-Set Object Detection and Discovery (OSODD).
We propose a two-stage method that first uses an open-set object detector to predict both known and unknown objects.
Then, we study the representation of predicted objects in an unsupervised manner and discover new categories from the set of unknown objects.
arXiv Detail & Related papers (2022-04-12T08:07:01Z) - Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z) - Discovery-and-Selection: Towards Optimal Multiple Instance Learning for Weakly Supervised Object Detection [86.86602297364826]
We propose a discovery-and-selection approach fused with multiple instance learning (DS-MIL).
Our proposed DS-MIL approach can consistently improve the baselines, reporting state-of-the-art performance.
arXiv Detail & Related papers (2021-10-18T07:06:57Z) - Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed, if solely evaluated on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z) - Synthesizing the Unseen for Zero-shot Object Detection [72.38031440014463]
We propose to synthesize visual features for unseen classes, so that the model learns both seen and unseen objects in the visual domain.
We use a novel generative model that uses class-semantics to not only generate the features but also to discriminatively separate them.
arXiv Detail & Related papers (2020-10-19T12:36:11Z)