Related papers: MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation

MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation

URL: http://arxiv.org/abs/2403.11576v1
Date: Mon, 18 Mar 2024 08:52:23 GMT
Title: MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation
Authors: Chih-Chung Hsu, Chia-Ming Lee,
Abstract summary: The strategic integration of a visual prior into the training dataset emerges as a potential solution to enhance congruity with the testing data distribution. Our empirical evaluations underscore the efficacy of MISS, demonstrating commendable performance in scenarios characterized by limited data availability and memory constraints.
Score: 8.727456619750983
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Instance segmentation, a cornerstone task in computer vision, has wide-ranging applications in diverse industries. The advent of deep learning and artificial intelligence has underscored the criticality of training effective models, particularly in data-scarce scenarios - a concern that resonates in both academic and industrial circles. A significant impediment in this domain is the resource-intensive nature of procuring high-quality, annotated data for instance segmentation, a hurdle that amplifies the challenge of developing robust models under resource constraints. In this context, the strategic integration of a visual prior into the training dataset emerges as a potential solution to enhance congruity with the testing data distribution, consequently reducing the dependency on computational resources and the need for highly complex models. However, effectively embedding a visual prior into the learning process remains a complex endeavor. Addressing this challenge, we introduce the MISS (Memory-efficient Instance Segmentation System) framework. MISS leverages visual inductive prior flow propagation, integrating intrinsic prior knowledge from the Synergy-basketball dataset at various stages: data preprocessing, augmentation, training, and inference. Our empirical evaluations underscore the efficacy of MISS, demonstrating commendable performance in scenarios characterized by limited data availability and memory constraints.

Related papers

AdvKT: An Adversarial Multi-Step Training Framework for Knowledge Tracing [64.79967583649407]
Knowledge Tracing (KT) monitors students' knowledge states and simulates their responses to question sequences. Existing KT models typically follow a single-step training paradigm, which leads to significant error accumulation. We propose a novel Adversarial Multi-Step Training Framework for Knowledge Tracing (AdvKT) which focuses on the multi-step KT task.
arXiv Detail & Related papers (2025-04-07T03:31:57Z)
Leveraging Language Models for Analyzing Longitudinal Experiential Data in Education [0.8026406775824594]
We propose a novel approach to leveraging pre-trained language models (LMs) for early forecasting of academic trajectories in STEM students. Key challenges in handling such data include high rates of missing values, limited dataset size due to costly data collection, and complex temporal variability across modalities.
arXiv Detail & Related papers (2025-03-27T15:37:23Z)
Paving the way for scientific foundation models: enhancing generalization and robustness in PDEs with constraint-aware pre-training [49.8035317670223]
A scientific foundation model (SciFM) is emerging as a promising tool for learning transferable representations across diverse domains. We propose incorporating PDE residuals into pre-training either as the sole learning signal or in combination with data loss to compensate for limited or infeasible training data. Our results show that pre-training with PDE constraints significantly enhances generalization, outperforming models trained solely on solution data.
arXiv Detail & Related papers (2025-03-24T19:12:39Z)
Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining [55.262510814326035]
Existing reweighting strategies primarily focus on group-level data importance. We introduce novel algorithms for dynamic, instance-level data reweighting. Our framework allows us to devise reweighting strategies deprioritizing redundant or uninformative data.
arXiv Detail & Related papers (2025-02-10T17:57:15Z)
Online Continual Learning: A Systematic Literature Review of Approaches, Challenges, and Benchmarks [1.3631535881390204]
Online Continual Learning (OCL) is a critical area in machine learning. This study conducts the first comprehensive Systematic Literature Review on OCL.
arXiv Detail & Related papers (2025-01-09T01:03:14Z)
HRVMamba: High-Resolution Visual State Space Model for Dense Prediction [60.80423207808076]
State Space Models (SSMs) with efficient hardware-aware designs have demonstrated significant potential in computer vision tasks. These models have been constrained by three key challenges: insufficient inductive bias, long-range forgetting, and low-resolution output representation. We introduce the Dynamic Visual State Space (DVSS) block, which employs deformable convolution to mitigate the long-range forgetting problem. We also introduce High-Resolution Visual State Space Model (HRVMamba) based on the DVSS block, which preserves high-resolution representations throughout the entire process.
arXiv Detail & Related papers (2024-10-04T06:19:29Z)
A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance. We propose a simple yet effective data augmentation approach by leveraging advancements in generative models. Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
VIRL: Volume-Informed Representation Learning towards Few-shot Manufacturability Estimation [0.0]
This work introduces VIRL, a Volume-Informed Representation Learning approach to pre-train a 3D geometric encoder. The model pre-trained by VIRL shows substantial enhancements on demonstrating improved generalizability with limited data.
arXiv Detail & Related papers (2024-06-18T05:30:26Z)
Adaptive Affinity-Based Generalization For MRI Imaging Segmentation Across Resource-Limited Settings [1.5703963908242198]
This paper introduces a novel relation-based knowledge framework by seamlessly combining adaptive affinity-based and kernel-based distillation. To validate our innovative approach, we conducted experiments on publicly available multi-source prostate MRI data.
arXiv Detail & Related papers (2024-04-03T13:35:51Z)
Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes [7.765333471208582]
In Visual Inductive Priors challenge (VIPriors2023), participants must train a model capable of precisely locating individuals on a basketball court. We propose memory effIciency inStance framework based on visual inductive prior flow propagation. Experiments demonstrate our model promising performance even under limited data and memory constraints.
arXiv Detail & Related papers (2024-03-18T08:44:40Z)
CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation [128.00940554196976]
Vision-Language Continual Pretraining (VLCP) has shown impressive results on diverse downstream tasks by offline training on large-scale datasets. To support the study of Vision-Language Continual Pretraining (VLCP), we first contribute a comprehensive and unified benchmark dataset P9D. The data from each industry as an independent task supports continual learning and conforms to the real-world long-tail nature to simulate pretraining on web data.
arXiv Detail & Related papers (2023-08-14T13:53:18Z)
ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP) ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective. We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization [3.6393183544320236]
Speech recognition has become an important challenge when using deep learning (DL) It requires large-scale training datasets and high computational and storage resources. Deep transfer learning (DTL) has been introduced to overcome these issues.
arXiv Detail & Related papers (2023-04-27T21:08:05Z)
Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain [25.84756140221655]
ConSecutive PreTraining (CSPT) is proposed based on the idea of not stopping pretraining in natural language processing (NLP) The proposed CSPT also can release the huge potential of unlabeled data for task-aware model training. The results show that by utilizing the proposed CSPT for task-aware model training, almost all downstream tasks in RSD can outperform the previous method of supervised pretraining-then-fine-tuning.
arXiv Detail & Related papers (2022-07-08T12:32:09Z)
DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference [85.02494022662505]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference. It integrates automated data slimming which adaptively downsamples/drops input images and controls their corresponding contribution to the training loss guided by the images' spatial complexity. Experiments and ablating studies demonstrate that DANCE can achieve "all-win" towards efficient segmentation.
arXiv Detail & Related papers (2021-07-16T04:58:58Z)
Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer [72.5190560787569]
In computer vision, learning from long tailed datasets is a recurring theme, especially for natural image datasets. Our proposal posits a meta-distributional scenario, where the data generating mechanism is invariant across the label-conditional feature distributions. This allows us to leverage a causal data inflation procedure to enlarge the representation of minority classes.
arXiv Detail & Related papers (2020-11-25T00:13:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.