Deep Learning Techniques for Visual Counting
- URL: http://arxiv.org/abs/2206.03033v2
- Date: Wed, 8 Jun 2022 16:29:22 GMT
- Title: Deep Learning Techniques for Visual Counting
- Authors: Luca Ciampi
- Abstract summary: We investigated and enhanced Deep Learning (DL) techniques for counting objects in still images or video frames.
In particular, we tackled the challenge related to the lack of data needed for training current DL-based solutions.
- Score: 0.13537117504260618
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this dissertation, we investigated and enhanced Deep Learning (DL)
techniques for counting objects, like pedestrians, cells or vehicles, in still
images or video frames. In particular, we tackled the challenge related to the
lack of data needed for training current DL-based solutions. Given that the
budget for labeling is limited, data scarcity still represents an open problem
that prevents the scalability of existing solutions based on the supervised
learning of neural networks and that is responsible for a significant drop in
performance at inference time when new scenarios are presented to these
algorithms. We introduced solutions addressing this issue from several
complementary sides: collecting automatically labeled datasets from virtual
environments, proposing Domain Adaptation strategies that mitigate the domain
gap between the training and test data distributions, and presenting a
counting strategy for a weakly labeled data scenario, i.e., in the presence of
non-negligible disagreement between multiple annotators. Moreover, we tackled
the non-trivial engineering challenges arising from the adoption of
Convolutional Neural Network-based techniques in environments with limited
power resources, introducing solutions for counting vehicles and pedestrians
directly onboard embedded vision systems, i.e., devices with constrained
computational capabilities that can capture images and process them.
Related papers
- OpenGU: A Comprehensive Benchmark for Graph Unlearning [24.605943688948038]
Graph Unlearning (GU) has emerged as a critical solution for privacy-sensitive applications.
We present OpenGU, the first GU benchmark, where 16 SOTA GU algorithms and 37 multi-domain datasets are integrated.
We draw 8 crucial conclusions about existing GU methods, while also gaining valuable insights into their limitations.
arXiv Detail & Related papers (2025-01-06T02:57:32Z)
- Learning for Cross-Layer Resource Allocation in MEC-Aided Cell-Free Networks [71.30914500714262]
Cross-layer resource allocation over mobile edge computing (MEC)-aided cell-free networks can fully exploit transmission and computing resources to improve the data rate.
Joint subcarrier allocation and beamforming optimization are investigated for the MEC-aided cell-free network from the perspective of deep learning.
arXiv Detail & Related papers (2024-12-21T10:18:55Z)
- Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
- A Proper Orthogonal Decomposition approach for parameters reduction of Single Shot Detector networks [0.0]
We propose a dimensionality reduction framework based on Proper Orthogonal Decomposition, a classical model order reduction technique.
We applied this framework to the SSD300 architecture using the PASCAL VOC dataset, demonstrating a reduction of the network dimension and a remarkable speedup in the fine-tuning of the network in a transfer learning context.
arXiv Detail & Related papers (2022-07-27T14:43:14Z)
- Federated Deep Learning Meets Autonomous Vehicle Perception: Design and Verification [168.67190934250868]
The federated learning-empowered connected autonomous vehicle (FLCAV) has been proposed.
FLCAV preserves privacy while reducing communication and annotation costs.
It is challenging to determine the network resources and road sensor poses for multi-stage training.
arXiv Detail & Related papers (2022-06-03T23:55:45Z)
- Exploring Data Aggregation and Transformations to Generalize across Visual Domains [0.0]
This thesis contributes to research on Domain Generalization (DG), Domain Adaptation (DA) and their variations.
We propose new frameworks for Domain Generalization and Domain Adaptation which make use of feature aggregation strategies and visual transformations.
We show how our proposed solutions outperform competitive state-of-the-art approaches in established DG and DA benchmarks.
arXiv Detail & Related papers (2021-08-20T14:58:14Z)
- Visual Domain Adaptation for Monocular Depth Estimation on Resource-Constrained Hardware [3.7399856406582086]
We address the problem of training deep neural networks on resource-constrained hardware in the context of visual domain adaptation.
We present an adversarial learning approach that is adapted for training on the device with limited resources.
Our experiments show that visual domain adaptation is relevant only for efficient network architectures and training sets.
arXiv Detail & Related papers (2021-08-05T15:10:00Z)
- Learning a Domain-Agnostic Visual Representation for Autonomous Driving via Contrastive Loss [25.798361683744684]
Domain-Agnostic Contrastive Learning (DACL) is a two-stage unsupervised domain adaptation framework with cyclic adversarial training and contrastive loss.
Our proposed approach achieves better performance in the monocular depth estimation task compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-10T07:06:03Z)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z)
- Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
The binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network.
We present a survey of these algorithms, mainly categorized into the native solutions directly conducting binarization, and the optimized ones using techniques like minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.