UESegNet: Context Aware Unconstrained ROI Segmentation Networks for Ear
Biometric
- URL: http://arxiv.org/abs/2010.03990v1
- Date: Thu, 8 Oct 2020 14:05:15 GMT
- Title: UESegNet: Context Aware Unconstrained ROI Segmentation Networks for Ear
Biometric
- Authors: Aman Kamboj, Rajneesh Rani, Aditya Nigam, Ranjeet Ranjan Jha
- Abstract summary: Ear biometrics poses considerable difficulties in unconstrained environments.
To address the problem of ear localization in the wild, we propose two high-performance region of interest (ROI) segmentation models, UESegNet-1 and UESegNet-2.
To test the models' generalization, they are evaluated on six different benchmark datasets.
- Score: 8.187718963808484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biometric-based personal authentication systems have seen a strong demand
mainly due to the increasing concern in various privacy and security
applications. Although the use of each biometric trait is problem dependent,
the human ear has been found to have enough discriminating characteristics to
allow its use as a strong biometric measure. Locating an ear in a 2D side-face
image is a challenging task. Numerous existing approaches have achieved
significant performance, but the majority of studies are based on the
constrained environment. Ear biometrics, however, poses considerable
difficulties in the unconstrained environment, where pose, scale, occlusion,
illumination, background clutter, etc. vary to a great extent. To address the
problem of ear localization in the wild, we have proposed two high-performance
region of interest (ROI) segmentation models, UESegNet-1 and UESegNet-2, which
are fundamentally based on deep convolutional neural networks and primarily
use contextual information to localize the ear in the unconstrained environment.
Additionally, we have applied state-of-the-art deep learning models, viz. FRCNN
(Faster Region-based Convolutional Neural Network) and SSD (Single Shot
MultiBox Detector), to the ear localization task. To test the models'
generalization, they are evaluated on six different benchmark datasets, viz.
IITD, IITK, USTB-DB3, UND-E, UND-J2 and
UBEAR, all of which contain challenging images. The performance of the models
is compared on the basis of object detection performance measure parameters
such as IOU (Intersection Over Union), Accuracy, Precision, Recall, and
F1-Score. It has been observed that the proposed models UESegNet-1 and
UESegNet-2 outperform FRCNN and SSD at higher IOU values, i.e., an
accuracy of 100% is achieved at IOU 0.5 on the majority of the databases.
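The measures named in the abstract (IOU, Precision, Recall, F1-Score) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the abstract does not describe the matching protocol, so the sketch assumes one ground-truth ear box per image, at most one predicted box per image, and boxes given as `(x_min, y_min, x_max, y_max)`. The names `iou` and `detection_scores` are hypothetical, as is the convention that a prediction below the IOU threshold counts as both a false positive and a false negative.

```python
from typing import Dict, Optional, Sequence, Tuple

# Assumed box format (not taken from the paper): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    preds: Sequence[Optional[Box]],
    truths: Sequence[Box],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Precision, recall and F1 at one IOU threshold.

    Assumption: one ground-truth box per image and at most one prediction
    (None means the detector returned nothing for that image).
    """
    tp = fp = fn = 0
    for pred, truth in zip(preds, truths):
        if pred is None:
            fn += 1  # ear missed entirely
        elif iou(pred, truth) >= threshold:
            tp += 1  # localized well enough at this threshold
        else:
            fp += 1  # predicted box is wrong ...
            fn += 1  # ... and the true ear is still unmatched
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    preds = [(0, 0, 2, 2), (0, 0, 2, 2), None]
    truths = [(0, 0, 2, 2), (1, 1, 3, 3), (0, 0, 1, 1)]
    print(detection_scores(preds, truths, threshold=0.5))
```

Raising `threshold` toward 1.0 demands tighter localization, which is the sense in which the abstract compares models "at higher IOU values".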
Related papers
- EMWaveNet: Physically Explainable Neural Network Based on Microwave Propagation for SAR Target Recognition [4.251056028888424]
This study proposes a physically explainable framework for complex-valued SAR image recognition.
The network architecture is fully parameterized, all learnable parameters have clear physical meanings, and the computational process is completed entirely in the frequency domain.
The results demonstrate that the proposed method possesses a strong physical decision logic, high physical explainability and robustness, as well as excellent dealiasing capabilities.
arXiv Detail & Related papers (2024-10-13T07:04:49Z) - Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z) - FSD: Fully-Specialized Detector via Neural Architecture Search [2.149718433100702]
We first propose and examine a fully-automatic pipeline to design a fully-specialized detector (FSD).
On the DeepLesion dataset, extensive results show that FSD can achieve a 3.1 mAP gain while using approximately 40% fewer parameters on the binary lesion detection task.
arXiv Detail & Related papers (2023-05-26T05:41:20Z) - SAN: a robust end-to-end ASR model architecture [0.0]
Siamese Adversarial Network (SAN) architecture for automatic speech recognition.
SAN constructs two sub-networks to differentiate the audio feature input and then introduces a loss to unify the output distribution of these sub-networks.
We conduct numerical experiments with the SAN model on several datasets for the automatic speech recognition task.
arXiv Detail & Related papers (2022-10-27T09:36:25Z) - Deep Multi-Scale U-Net Architecture and Noise-Robust Training Strategies
for Histopathological Image Segmentation [6.236433671063744]
We propose to explicitly add multi-scale feature maps in each convolutional module of the U-Net encoder to improve segmentation of histology images.
In experiments on a private dataset of breast cancer lymph nodes, we observed substantial improvement over a U-Net baseline based on the two proposed augmentations.
arXiv Detail & Related papers (2022-05-03T21:00:44Z) - Priming Cross-Session Motor Imagery Classification with A Universal Deep
Domain Adaptation Framework [3.6824205556465834]
Motor imagery (MI) is a common brain computer interface (BCI) paradigm.
We propose a Siamese deep domain adaptation (SDDA) framework for cross-session MI classification based on mathematical models in domain adaptation theory.
The proposed framework can be easily applied to most existing artificial neural networks without altering the network structure.
arXiv Detail & Related papers (2022-02-19T09:30:08Z) - AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical one among which lies in foreground-background imbalance.
We propose Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ has significantly improved the accuracy on three widely used aerial benchmarks, as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z) - Efficient Person Search: An Anchor-Free Approach [86.45858994806471]
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images.
To achieve this goal, state-of-the-art models typically add a re-id branch upon two-stage detectors like Faster R-CNN.
In this work, we present an anchor-free approach to efficiently tackling this challenging task, by introducing the following dedicated designs.
arXiv Detail & Related papers (2021-09-01T07:01:33Z) - UniCon: Unified Context Network for Robust Active Speaker Detection [111.90529347692723]
We introduce a new efficient framework, the Unified Context Network (UniCon), for robust active speaker detection (ASD).
Our solution is a novel, unified framework that focuses on jointly modeling multiple types of contextual information.
A thorough ablation study is performed on several challenging ASD benchmarks under different settings.
arXiv Detail & Related papers (2021-08-05T13:25:44Z) - KiU-Net: Towards Accurate Segmentation of Biomedical Images using
Over-complete Representations [59.65174244047216]
We propose an over-complete architecture (Ki-Net) which involves projecting the data onto higher dimensions.
This network, when augmented with U-Net, results in significant improvements in the case of segmenting small anatomical landmarks.
We evaluate the proposed method on the task of brain anatomy segmentation from 2D Ultrasound of preterm neonates.
arXiv Detail & Related papers (2020-06-08T18:59:24Z) - Deep Speaker Embeddings for Far-Field Speaker Recognition on Short
Utterances [53.063441357826484]
Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions.
Speaker verification on short utterances in uncontrolled noisy environment conditions is one of the most challenging and highly demanded tasks.
This paper presents approaches aimed at two goals: a) improve the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and b) reduce the system quality degradation for short utterances.
arXiv Detail & Related papers (2020-02-14T13:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.