Label-Efficient Grasp Joint Prediction with Point-JEPA
- URL: http://arxiv.org/abs/2509.13349v2
- Date: Thu, 25 Sep 2025 14:40:32 GMT
- Title: Label-Efficient Grasp Joint Prediction with Point-JEPA
- Authors: Jed Guzelkabaagac, Boris Petrović
- Abstract summary: 3D self-supervised pretraining with Point-JEPA enables label-efficient grasp joint-angle prediction. JEPA-style pretraining is a practical lever for data-efficient grasp learning.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study whether 3D self-supervised pretraining with Point-JEPA enables label-efficient grasp joint-angle prediction. Meshes are sampled to point clouds and tokenized; a ShapeNet-pretrained Point-JEPA encoder feeds a $K{=}5$ multi-hypothesis head trained with winner-takes-all and evaluated by top-logit selection. On a multi-finger hand dataset with strict object-level splits, Point-JEPA improves top-logit RMSE and Coverage@15$^{\circ}$ in low-label regimes (e.g., 26% lower RMSE at 25% of the data) and reaches parity at full supervision, suggesting JEPA-style pretraining is a practical lever for data-efficient grasp learning.
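The winner-takes-all multi-hypothesis scheme from the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the MSE regression loss, and the cross-entropy term on the winner's logit are assumptions; the paper only states that a $K{=}5$ head is trained with winner-takes-all and evaluated by top-logit selection.

```python
import numpy as np

def wta_loss(pred_angles, logits, target):
    """Winner-takes-all loss over K grasp hypotheses (illustrative).

    pred_angles: (K, J) predicted joint angles, one row per hypothesis
    logits:      (K,)   confidence logits, one per hypothesis
    target:      (J,)   ground-truth joint angles
    """
    per_hyp = np.mean((pred_angles - target) ** 2, axis=1)  # (K,) MSE per hypothesis
    winner = int(np.argmin(per_hyp))
    # Regression loss flows only through the best hypothesis; the
    # cross-entropy term raises the winner's logit so that top-logit
    # selection picks a good hypothesis at test time.
    log_probs = logits - np.log(np.sum(np.exp(logits)))
    return per_hyp[winner] - log_probs[winner], winner

def select_top_logit(pred_angles, logits):
    """At evaluation, return the hypothesis with the highest logit."""
    return pred_angles[int(np.argmax(logits))]
```

Only the winning hypothesis receives a gradient on its angles, which lets the K heads specialize to different grasp modes instead of averaging them.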
Related papers
- LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics [53.247652209132376]
Joint-Embedding Predictive Architectures (JEPAs) offer a promising blueprint, but a lack of practical guidance and theory has led to ad-hoc R&D. We present a comprehensive theory of JEPAs and instantiate it in LeJEPA, a lean, scalable, and theoretically grounded training objective.
arXiv Detail & Related papers (2025-11-11T18:21:55Z) - Gaussian Embeddings: How JEPAs Secretly Learn Your Data Density [51.15085346971361]
Joint Embedding Predictive Architectures (JEPAs) learn representations able to solve numerous downstream tasks out of the box. JEPAs combine two objectives: (i) a latent-space prediction term, i.e., the representation of a slightly perturbed sample must be predictable from the original sample's representation, and (ii) an anti-collapse term, i.e., not all samples should have the same representation.
arXiv Detail & Related papers (2025-10-07T14:06:30Z) - Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning [71.30276778807068]
We propose a unified framework that strategically coordinates sample pruning and token pruning. Q-Tuning achieves a +38% average improvement over the full-data SFT baseline using only 12.5% of the original training data.
arXiv Detail & Related papers (2025-09-28T13:27:38Z) - Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness [61.45587642780908]
We propose a three-step approach for parameter-efficient fine-tuning of image-text foundation models. Our method improves on two key components: minority-sample identification and the robust training algorithm. Our theoretical analysis shows that PPA enhances minority group identification and is Bayes optimal for minimizing the balanced group error.
arXiv Detail & Related papers (2025-03-12T15:46:12Z) - EPContrast: Effective Point-level Contrastive Learning for Large-scale Point Cloud Understanding [27.596165112950935]
This paper proposes an Effective Point-level Contrastive Learning method for large-scale point cloud understanding, dubbed EPContrast.
EPContrast constructs positive and negative pairs based on asymmetric embedding, while ChannelContrast imposes contrastive supervision between channel feature maps.
The efficacy of EPContrast is substantiated through comprehensive validation on S3DIS and ScanNetV2, encompassing tasks such as semantic segmentation, instance segmentation, and object detection.
arXiv Detail & Related papers (2024-10-22T17:27:16Z) - How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks [14.338754598043968]
Two competing paradigms exist for self-supervised learning of data representations.
Joint Embedding Predictive Architecture (JEPA) is a class of architectures in which semantically similar inputs are encoded into representations that are predictive of each other.
arXiv Detail & Related papers (2024-07-03T19:43:12Z) - Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
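The scorer-filtered self-training idea above can be expressed as a generic loop. This is a hedged sketch of the general pattern, not the ASQP paper's system: `model`, `scorer`, `threshold`, and `rounds` are illustrative names, and the scorer is assumed to return a confidence in [0, 1] for each (input, pseudo-label) pair.

```python
def self_train(model, scorer, labeled, unlabeled, threshold=0.8, rounds=3):
    """Generic self-training loop with a pseudo-label scorer (illustrative).

    labeled:   list of (x, y) gold pairs
    unlabeled: list of inputs x without labels
    scorer:    scorer(x, y) -> confidence in [0, 1] that y fits x;
               only pairs at or above `threshold` join the training set
    """
    data = list(labeled)
    for _ in range(rounds):
        model.fit(data)
        kept = []
        for x in unlabeled:
            y = model.predict(x)  # pseudo-label
            if scorer(x, y) >= threshold:
                kept.append((x, y))
        # Retrain next round on gold data plus high-confidence pseudo-labels
        data = list(labeled) + kept
    return model
```

Filtering by the scorer is what distinguishes this from naive self-training, where every pseudo-label is kept regardless of quality.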
arXiv Detail & Related papers (2024-06-26T05:30:21Z) - Well-calibrated Confidence Measures for Multi-label Text Classification
with a Large Number of Labels [1.1833906227033337]
We present a novel approach for addressing the computational inefficiency of the Label Powerset (LP) ICP, arising when dealing with a large number of unique labels.
We apply the LP-ICP on three deep Artificial Neural Network (ANN) classifiers of two types: one based on contextualised (BERT) and two on non-contextualised (word2vec) word embeddings.
Our approach deals with the increased computational burden of LP by eliminating from consideration a significant number of label-sets that will surely have p-values below the specified significance level.
arXiv Detail & Related papers (2023-12-14T19:17:42Z) - Unifying Token and Span Level Supervisions for Few-Shot Sequence
Labeling [18.24907067631541]
Few-shot sequence labeling aims to identify novel classes based on only a few labeled samples.
We propose a Consistent Dual Adaptive Prototypical (CDAP) network for few-shot sequence labeling.
Our model achieves new state-of-the-art results on three benchmark datasets.
arXiv Detail & Related papers (2023-07-16T04:50:52Z) - Implicit and Efficient Point Cloud Completion for 3D Single Object
Tracking [9.372859423951349]
We introduce two novel modules, i.e., Adaptive Refine Prediction (ARP) and Target Knowledge Transfer (TKT).
Our model achieves state-of-the-art performance while maintaining lower computational cost.
arXiv Detail & Related papers (2022-09-01T15:11:06Z) - Learning to Register Unbalanced Point Pairs [10.369750912567714]
Recent 3D registration methods can effectively handle large-scale or partially overlapping point pairs.
We present a novel 3D registration method, called UPPNet, for the unbalanced point pairs.
arXiv Detail & Related papers (2022-07-09T08:03:59Z) - Disentangle Your Dense Object Detector [82.22771433419727]
Deep learning-based dense object detectors have achieved great success in the past few years and have been applied to numerous multimedia applications such as video understanding.
However, the current training pipeline for dense detectors rests on many conjoined design choices that may not hold.
We propose Disentangled Dense Object Detector (DDOD), in which simple and effective disentanglement mechanisms are designed and integrated into the current state-of-the-art detectors.
arXiv Detail & Related papers (2021-07-07T00:52:16Z) - WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z) - EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with
Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully- and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.