Mechanical Search on Shelves using a Novel "Bluction" Tool
- URL: http://arxiv.org/abs/2201.08968v1
- Date: Sat, 22 Jan 2022 05:47:30 GMT
- Title: Mechanical Search on Shelves using a Novel "Bluction" Tool
- Authors: Huang Huang, Michael Danielczuk, Chung Min Kim, Letian Fu, Zachary
Tam, Jeffrey Ichnowski, Anelia Angelova, Brian Ichter, and Ken Goldberg
- Abstract summary: Storage efficiency comes at the cost of reduced visibility and accessibility.
We introduce a novel bluction tool, which combines a thin pushing blade and suction cup gripper.
Using suction grasping actions improves the success rate over the highest performing push-only policy by 26% in simulation and 67% in physical environments.
- Score: 39.44966150696158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shelves are common in homes, warehouses, and commercial settings due to their
storage efficiency. However, this efficiency comes at the cost of reduced
visibility and accessibility. When looking from a side (lateral) view of a
shelf, most objects will be fully occluded, resulting in a constrained
lateral-access mechanical search problem. To address this problem, we
introduce: (1) a novel bluction tool, which combines a thin pushing blade and
suction cup gripper, (2) an improved LAX-RAY simulation pipeline and perception
model that combines ray-casting with 2D Minkowski sums to efficiently generate
target occupancy distributions, and (3) a novel SLAX-RAY search policy, which
optimally reduces target object distribution support area using the bluction
tool. Experimental data from 2000 simulated shelf trials and 18 trials with a
physical Fetch robot equipped with the bluction tool suggest that using suction
grasping actions improves the success rate over the highest performing
push-only policy by 26% in simulation and 67% in physical environments.
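
The abstract's perception step (ray-casting combined with 2D Minkowski sums to produce a target occupancy distribution) can be illustrated with a minimal grid-based sketch. This is not the LAX-RAY implementation: the top-down grid, the one-ray-per-column (orthographic) camera model, the uniform prior over feasible positions, and all function names below are assumptions made for illustration only.

```python
def hidden_cells(occupied, depth, width):
    """Cast one ray per column from the shelf front (row 0) to the back.

    Free cells behind the first occupied cell of a column are hidden
    from the camera. (Assumed orthographic view, for simplicity.)
    """
    hidden = set()
    for col in range(width):
        blocked = False
        for row in range(depth):
            if (row, col) in occupied:
                blocked = True
            elif blocked:
                hidden.add((row, col))
    return hidden


def minkowski_sum(a, b):
    """2D Minkowski sum of two sets of integer grid cells."""
    return {(ra + rb, ca + cb) for (ra, ca) in a for (rb, cb) in b}


def target_distribution(occupied, footprint, depth, width):
    """Uniform distribution over target centers that keep the whole
    target footprint inside the hidden region.

    `footprint` is a set of (row, col) offsets relative to the center.
    """
    hidden = hidden_cells(occupied, depth, width)
    pad = max(max(abs(r), abs(c)) for (r, c) in footprint)
    # Everything that is visible, occupied, or outside the shelf
    # cannot contain any part of a fully occluded target.
    forbidden = {
        (r, c)
        for r in range(-pad, depth + pad)
        for c in range(-pad, width + pad)
        if (r, c) not in hidden
    }
    # A center is infeasible iff it lies in forbidden (+) reflected footprint.
    reflected = {(-r, -c) for (r, c) in footprint}
    blocked_centers = minkowski_sum(forbidden, reflected)
    grid = {(r, c) for r in range(depth) for c in range(width)}
    feasible = grid - blocked_centers
    if not feasible:
        return {}
    p = 1.0 / len(feasible)
    return {cell: p for cell in feasible}


# A 4 x 5 shelf with a 3-cell-wide box at the front, and a target
# that is 1 cell deep and 3 cells wide.
occupied = {(0, 1), (0, 2), (0, 3)}
footprint = {(0, -1), (0, 0), (0, 1)}
dist = target_distribution(occupied, footprint, depth=4, width=5)
print(sorted(dist))  # [(1, 2), (2, 2), (3, 2)]
print(len(dist))     # support area in cells: 3
```

In this sketch, the number of keys in the returned dictionary plays the role of the distribution's support area, the quantity that a search policy of the kind described above would try to reduce with each push or suction action.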
Related papers
- Learning to Optimize Package Picking for Large-Scale, Real-World Robot Induction [17.521846970697535]
We propose an ML-based framework that predicts transform adjustments and improves the selection of suction cups for sampled picks to enhance their success probabilities. The proposed method achieves a 20% reduction in pick failure rates compared to a sampled-based pick sampling baseline, demonstrating its effectiveness in large-scale warehouse automation scenarios.
arXiv Detail & Related papers (2025-06-11T14:04:50Z)
- Prompt Injection Attack to Tool Selection in LLM Agents [74.90338504778781]
We introduce ToolHijacker, a novel prompt injection attack targeting tool selection in no-box scenarios.
ToolHijacker injects a malicious tool document into the tool library to manipulate the LLM agent's tool selection process.
We show that ToolHijacker is highly effective, significantly outperforming existing manual-based and automated prompt injection attacks.
arXiv Detail & Related papers (2025-04-28T13:36:43Z)
- Dual-Path Enhancements in Event-Based Eye Tracking: Augmented Robustness and Adaptive Temporal Modeling [0.0]
Event-based eye tracking has become a pivotal technology for augmented reality and human-computer interaction.
Existing methods struggle with real-world challenges such as abrupt eye movements and environmental noise.
We introduce two key advancements. First, a robust data augmentation pipeline incorporating temporal shift, spatial flip, and event deletion improves model resilience.
Second, we propose KnightPupil, a hybrid architecture combining an EfficientNet-B3 backbone for spatial feature extraction, a bidirectional GRU for contextual temporal modeling, and a Linear Time-Varying State-Space Module.
arXiv Detail & Related papers (2025-04-14T07:57:22Z)
- Corner-Grasp: Multi-Action Grasp Detection and Active Gripper Adaptation for Grasping in Cluttered Environments [0.3565151496245486]
We propose a method for effectively grasping in cluttered bin-picking environments.
We utilize a multi-functional gripper that combines both suction and finger grasping.
We also present an active gripper adaptation strategy to minimize collisions between the gripper hardware and the surrounding environment.
arXiv Detail & Related papers (2025-04-02T16:12:28Z)
- TetraGrip: Sensor-Driven Multi-Suction Reactive Object Manipulation in Cluttered Scenes [2.941766061060261]
TetraGrip is a novel vacuum-based grasping strategy featuring four suction cups mounted on linear actuators.
We evaluate TetraGrip in a warehouse-style setting, demonstrating its ability to manipulate objects in stacked and obstructed configurations.
Our results show that our RL-based policy improves picking success in stacked-object scenarios by 22.86% compared to a single-suction gripper.
arXiv Detail & Related papers (2025-03-12T00:53:52Z)
- Object-Centric Latent Action Learning [70.3173534658611]
We propose a novel object-centric latent action learning approach, based on VideoSaur and LAPO.
This method effectively disentangles causal agent-object interactions from irrelevant background noise and reduces the performance degradation caused by distractors.
Our preliminary experiments with the Distracting Control Suite show that latent action pretraining based on object decompositions improves the quality of inferred latent actions by 2.7x and the efficiency of downstream fine-tuning with a small set of labeled actions, increasing return by 2.6x on average.
arXiv Detail & Related papers (2025-02-13T11:27:05Z)
- Grammarization-Based Grasping with Deep Multi-Autoencoder Latent Space Exploration by Reinforcement Learning Agent [0.0]
We propose a novel framework for robotic grasping based on the idea of compressing high-dimensional target and gripper features in a common latent space.
Our approach simplifies grasping by using three autoencoders dedicated to the target, the gripper, and a third one that fuses their latent representations.
arXiv Detail & Related papers (2024-11-13T12:26:08Z)
- Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector [97.92369017531038]
We build a new laRge-scale Adversarial images dataset with Diverse hArmful Responses (RADAR).
We then develop a novel iN-time Embedding-based AdveRSarial Image DEtection (NEARSIDE) method, which exploits a single vector distilled from the hidden states of Visual Language Models (VLMs) to detect adversarial images against benign ones in the input.
arXiv Detail & Related papers (2024-10-30T10:33:10Z)
- Practical Video Object Detection via Feature Selection and Aggregation [18.15061460125668]
Video object detection (VOD) must handle high across-frame variation in object appearance and diverse deterioration in some frames.
Most contemporary aggregation methods are tailored for two-stage detectors and suffer from high computational costs.
This study invents a very simple yet potent strategy of feature selection and aggregation, gaining significant accuracy at marginal computational expense.
arXiv Detail & Related papers (2024-07-29T02:12:11Z)
- LoLep: Single-View View Synthesis with Locally-Learned Planes and Self-Attention Occlusion Inference [66.45326873274908]
We propose a novel method, LoLep, which regresses Locally-Learned planes from a single RGB image to represent scenes accurately.
Compared to MINE, our approach has an LPIPS reduction of 4.8%-9.0% and an RV reduction of 73.9%-83.5%.
arXiv Detail & Related papers (2023-07-23T03:38:55Z)
- H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions [62.510951695174604]
"Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR) is a probabilistic generative framework that generates hypotheses about how objects articulate given input observations.
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework.
We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
arXiv Detail & Related papers (2022-10-22T18:39:33Z)
- A Large-scale Multiple-objective Method for Black-box Attack against Object Detection [70.00150794625053]
We propose to minimize the true positive rate and maximize the false positive rate, which can encourage more false positive objects to block the generation of new true positive bounding boxes.
We extend the standard Genetic Algorithm with Random Subset selection and Divide-and-Conquer, called GARSDC, which significantly improves the efficiency.
Compared with state-of-the-art attack methods, GARSDC decreases mAP by an average of 12.0 and the number of queries by about 1000 times in extensive experiments.
arXiv Detail & Related papers (2022-09-16T08:36:42Z)
- Self-Attentive Pooling for Efficient Deep Learning [6.822466048176652]
We propose a novel non-local self-attentive pooling method that can be used as a drop-in replacement to the standard pooling layers.
We surpass the test accuracy of existing pooling techniques on different variants of MobileNet-V2 on ImageNet by an average of 1.2%.
Our approach achieves 1.43% higher test accuracy compared to SOTA techniques with iso-memory footprints.
arXiv Detail & Related papers (2022-09-16T00:35:14Z)
- Efficient Decoder-free Object Detection with Transformers [75.00499377197475]
Vision transformers (ViTs) are changing the landscape of object detection approaches.
We propose a decoder-free fully transformer-based (DFFT) object detector.
DFFT_SMALL achieves high efficiency in both training and inference stages.
arXiv Detail & Related papers (2022-06-14T13:22:19Z)
- A DCNN-based Arbitrarily-Oriented Object Detector for Quality Control and Inspection Application [10.076629346147639]
A lightweight neural network is exploited to achieve oriented detection results using a regression method.
The first stage of the proposed method is capable of detecting the small targets considered in the two scenarios.
The second stage, despite its simplicity, detects elongated targets effectively while maintaining high running efficiency.
arXiv Detail & Related papers (2021-01-19T00:23:27Z)
- X-Ray: Mechanical Search for an Occluded Object by Minimizing Support of Learned Occupancy Distributions [44.39286120613235]
We introduce X-Ray, an algorithm based on learned occupancy distributions.
X-Ray minimizes support of the learned distribution as part of a mechanical search policy in both simulated and real environments.
Results suggest that X-Ray is significantly more efficient, as it succeeds in extracting the target object 82% of the time.
arXiv Detail & Related papers (2020-04-20T03:25:10Z)