End-to-end Evaluation of Practical Video Analytics Systems for Face
Detection and Recognition
- URL: http://arxiv.org/abs/2310.06945v1
- Date: Tue, 10 Oct 2023 19:06:10 GMT
- Title: End-to-end Evaluation of Practical Video Analytics Systems for Face
Detection and Recognition
- Authors: Praneet Singh, Edward J. Delp, Amy R. Reibman
- Abstract summary: Video analytics systems are deployed in bandwidth-constrained environments like autonomous vehicles.
In an end-to-end face analytics system, inputs are first compressed using popular video codecs like HEVC.
We demonstrate how independent task evaluations, dataset imbalances, and inconsistent annotations can lead to incorrect system performance estimates.
- Score: 9.942007083253479
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Practical video analytics systems that are deployed in bandwidth-constrained
environments like autonomous vehicles perform computer vision tasks such as
face detection and recognition. In an end-to-end face analytics system, inputs
are first compressed using popular video codecs like HEVC and then passed on to
modules that perform face detection, alignment, and recognition sequentially.
Typically, the modules of these systems are evaluated independently using
task-specific imbalanced datasets that can misconstrue performance estimates.
In this paper, we perform a thorough end-to-end evaluation of a face analytics
system using a driving-specific dataset, which enables meaningful
interpretations. We demonstrate how independent task evaluations, dataset
imbalances, and inconsistent annotations can lead to incorrect system
performance estimates. We propose strategies to create balanced evaluation
subsets of our dataset and to make its annotations consistent across multiple
analytics tasks and scenarios. We then evaluate the end-to-end system
performance sequentially to account for task interdependencies. Our experiments
show that our approach provides consistent, accurate, and interpretable
estimates of the system's performance, which is critical for real-world
applications.
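The gap between independent and sequential evaluation described in the abstract can be illustrated with a minimal sketch. Everything here is hypothetical: the `FaceResult` record, the two metric functions, and the toy outcomes are assumptions made for illustration and are not taken from the paper's code or data. The sketch only shows why scoring recognition on ground-truth crops can overstate what the full pipeline delivers when detection misses or degrades some faces.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FaceResult:
    """Hypothetical per-face outcome for one annotated face in a compressed frame."""
    detected: bool           # detector found the face in the compressed frame
    recognized_on_gt: bool   # recognition correct when fed the ground-truth crop
    recognized_on_det: bool  # recognition correct when fed the detector's own crop


def independent_recognition_accuracy(results: List[FaceResult]) -> float:
    """Recognition scored in isolation, ignoring whether detection succeeded."""
    return sum(r.recognized_on_gt for r in results) / len(results)


def end_to_end_accuracy(results: List[FaceResult]) -> float:
    """Recognition scored sequentially: a face counts only if it was
    detected and then recognized from the detector's crop."""
    return sum(r.detected and r.recognized_on_det for r in results) / len(results)


# Toy outcomes (invented for illustration only).
toy_results = [
    FaceResult(detected=True, recognized_on_gt=True, recognized_on_det=True),
    FaceResult(detected=True, recognized_on_gt=True, recognized_on_det=False),
    FaceResult(detected=False, recognized_on_gt=True, recognized_on_det=False),
    FaceResult(detected=True, recognized_on_gt=False, recognized_on_det=False),
]

print(independent_recognition_accuracy(toy_results))  # 0.75
print(end_to_end_accuracy(toy_results))               # 0.25
```

In this toy case the independent estimate is three times the sequential one, because one face is never detected and another is recognized only from the ground-truth crop. Real systems would also need the balanced subsets and consistent annotations the abstract calls for, which this sketch does not model.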
Related papers
- Stochastic Encodings for Active Feature Acquisition [100.47043816019888]
Active Feature Acquisition is an instance-wise, sequential decision making problem. The aim is to dynamically select which feature to measure based on current observations, independently for each test instance. Common approaches either use Reinforcement Learning, which experiences training difficulties, or greedily maximize the conditional mutual information of the label and unobserved features, which makes myopic acquisitions. We introduce a latent variable model, trained in a supervised manner. Acquisitions are made by reasoning about the features across many possible unobserved realizations in a latent space.
arXiv Detail & Related papers (2025-08-03T23:48:46Z)
- Adapting Vision-Language Models for Evaluating World Models [24.813041196394582]
We present UNIVERSE, a method for adapting Vision-language Evaluator for Rollouts in Simulated Environments under data and compute constraints. We conduct a large-scale study comparing full, partial, and parameter-efficient finetuning across task formats, context lengths, sampling strategies, and data compositions. The resulting unified evaluator matches the performance of task-specific baselines using a single checkpoint.
arXiv Detail & Related papers (2025-06-22T09:53:28Z)
- Bridging Subjective and Objective QoE: Operator-Level Aggregation Using LLM-Based Comment Analysis and Network MOS Comparison [1.5817271400571666]
This paper introduces a dual-layer framework for network operator-side quality of experience (QoE) assessment. On the objective side, we develop a machine learning model trained on mean opinion scores (MOS) computed via the ITU-T P.1203 reference implementation. On the subjective side, we present a semantic filtering and scoring pipeline that processes user comments from live streams to extract performance-related feedback.
arXiv Detail & Related papers (2025-06-01T09:31:55Z)
- Model Monitoring in the Absence of Labeled Data via Feature Attributions Distributions [5.167069404528051]
This thesis explores machine learning (ML) model monitoring before the predictions impact real-world decisions or users.
The thesis is structured around two main themes: (i) AI alignment, measuring if AI models behave in a manner consistent with human values and (ii) performance monitoring, measuring if the models achieve specific accuracy goals or desires.
arXiv Detail & Related papers (2025-01-18T14:07:37Z)
- A Control-Centric Benchmark for Video Prediction [69.22614362800692]
We propose a benchmark for action-conditioned video prediction in the form of a control benchmark.
Our benchmark includes simulated environments with 11 task categories and 310 task instance definitions.
We then leverage our benchmark to study the effects of scaling model size, quantity of training data, and model ensembling.
arXiv Detail & Related papers (2023-04-26T17:59:45Z)
- Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks [76.35271072704384]
Deep learning models perform poorly when applied to videos with rare scenes or objects.
We tackle this problem from two different angles: algorithm and dataset.
We show that the debiased representation can generalize better when transferred to other datasets and tasks.
arXiv Detail & Related papers (2022-09-20T00:30:35Z)
- Formalizing and Evaluating Requirements of Perception Systems for Automated Vehicles using Spatio-Temporal Perception Logic [25.070876549371693]
We present a logic that enables reasoning over perception data using spatial and temporal operators.
One major advantage of STPL is that it facilitates basic sanity checks on the functional performance of the perception system.
arXiv Detail & Related papers (2022-06-29T02:36:53Z)
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
- Injecting Planning-Awareness into Prediction and Detection Evaluation [42.228191984697006]
We take a step back and critically assess current evaluation metrics, proposing task-aware metrics as a better measure of performance in systems where they are deployed.
Experiments on an illustrative simulation as well as real-world autonomous driving data validate that our proposed task-aware metrics are able to account for outcome asymmetry and provide a better estimate of a model's closed-loop performance.
arXiv Detail & Related papers (2021-10-07T08:52:48Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Robust Learning Through Cross-Task Consistency [92.42534246652062]
We propose a broadly applicable and fully computational method for augmenting learning with Cross-Task Consistency.
We observe that learning with cross-task consistency leads to more accurate predictions and better generalization to out-of-distribution inputs.
arXiv Detail & Related papers (2020-06-07T09:24:33Z)
- A Revised Generative Evaluation of Visual Dialogue [80.17353102854405]
We propose a revised evaluation scheme for the VisDial dataset.
We measure consensus between answers generated by the model and a set of relevant answers.
We release these sets and code for the revised evaluation scheme as DenseVisDial.
arXiv Detail & Related papers (2020-04-20T13:26:45Z)
- A Visual Analytics Framework for Reviewing Streaming Performance Data [20.61348106852359]
We introduce a visual analytic framework comprising three modules: data management, analysis, and interactive visualization.
In particular, we introduce a set of online and progressive analysis methods for not only controlling the computational costs but also helping analysts better follow the critical aspects of the analysis results.
arXiv Detail & Related papers (2020-01-26T04:34:22Z)