Related papers: Object Detection with Deep Reinforcement Learning

Related papers

Pattern Integration and Enhancement Vision Transformer for Self-Supervised Learning in Remote Sensing [11.626527403157922]
We present the Pattern Integration and Enhancement Vision Transformer (PIEViT), a novel self-supervised learning framework for remote sensing imagery. PIEViT enhances the representation of internal patch features, providing significant improvements over existing self-supervised baselines. It achieves excellent results in object detection, land cover classification, and change detection, underscoring its robustness, generalization, and transferability for remote sensing image interpretation tasks.
arXiv Detail & Related papers (2024-11-09T07:06:31Z)
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization [61.64304227831361]
Single-domain generalization aims to learn a model from single source domain data to achieve generalized performance on other unseen target domains. We propose a dynamic object-centric perception network based on prompt learning, aiming to adapt to the variations in image complexity.
arXiv Detail & Related papers (2024-02-28T16:16:51Z)
Graphical Object-Centric Actor-Critic [55.2480439325792]
We propose a novel object-centric reinforcement learning algorithm combining actor-critic and model-based approaches. We use a transformer encoder to extract object representations and graph neural networks to approximate the dynamics of an environment. Our algorithm performs better in a visually complex 3D robotic environment and a 2D environment with compositional structure than the state-of-the-art model-free actor-critic algorithm.
arXiv Detail & Related papers (2023-10-26T06:05:12Z)
Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner. We design a semantic-guided self-supervised learning model to extract high-level semantic features from images. We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
Dynamic-Resolution Model Learning for Object Pile Manipulation [33.05246884209322]
We investigate how to learn dynamic and adaptive representations at different levels of abstraction to achieve the optimal trade-off between efficiency and effectiveness. Specifically, we construct dynamic-resolution particle representations of the environment and learn a unified dynamics model using graph neural networks (GNNs) We show that our method achieves significantly better performance than state-of-the-art fixed-resolution baselines at the gathering, sorting, and redistribution of granular object piles.
arXiv Detail & Related papers (2023-06-29T05:51:44Z)
Learning-based Relational Object Matching Across Views [63.63338392484501]
We propose a learning-based approach which combines local keypoints with novel object-level features for matching object detections between RGB images. We train our object-level matching features based on appearance and inter-frame and cross-frame spatial relations between objects in an associative graph neural network.
arXiv Detail & Related papers (2023-05-03T19:36:51Z)
Localizing Object-level Shape Variations with Text-to-Image Diffusion Models [60.422435066544814]
We present a technique to generate a collection of images that depicts variations in the shape of a specific object. A particular challenge when generating object variations is accurately localizing the manipulation applied over the object's shape. To localize the image-space operation, we present two techniques that use the self-attention layers in conjunction with the cross-attention layers.
arXiv Detail & Related papers (2023-03-20T17:45:08Z)
Dynamic Object Removal for Effective Slam [1.8907108368038215]
The paper proposes a two-step process to address this challenge, which involves finding the dynamic objects in the scene using a Flow-based method and then using a deep Video inpainting algorithm to remove them. The study aims to test the validity of this approach by comparing it with baseline results using two state-of-the-art SLAM algorithms, ORB-SLAM2 and LSD, and understanding the impact of dynamic objects and the corresponding trade-offs.
arXiv Detail & Related papers (2023-03-20T07:47:36Z)
D2SLAM: Semantic visual SLAM based on the influence of Depth for Dynamic environments [0.483420384410068]
We propose a novel approach to determine dynamic elements that lack generalization and scene awareness. We use scene depth information that refines the accuracy of estimates from geometric and semantic modules. The obtained results demonstrate the efficacy of the proposed method in providing accurate localization and mapping in dynamic environments.
arXiv Detail & Related papers (2022-10-16T22:13:59Z)
SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding [5.715548995729382]
We propose an effective technique for image augmentation by injecting contextually meaningful knowledge into the scenes. Our method of semantically meaningful image augmentation for object detection via language grounding, SemAug, starts by calculating semantically appropriate new objects.
arXiv Detail & Related papers (2022-08-15T19:00:56Z)
Dynamic Proposals for Efficient Object Detection [48.66093789652899]
We propose a simple yet effective method which is adaptive to different computational resources by generating dynamic proposals for object detection. Our method achieves significant speed-up across a wide range of detection models including two-stage and query-based models.
arXiv Detail & Related papers (2022-07-12T01:32:50Z)
ObjectFormer for Image Manipulation Detection and Localization [118.89882740099137]
We propose ObjectFormer to detect and localize image manipulations. We extract high-frequency features of the images and combine them with RGB features as multimodal patch embeddings. We conduct extensive experiments on various datasets and the results verify the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-03-28T12:27:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.