Information-driven design of imaging systems
- URL: http://arxiv.org/abs/2405.20559v5
- Date: Thu, 06 Nov 2025 17:33:32 GMT
- Title: Information-driven design of imaging systems
- Authors: Henry Pinkard, Leyla Kabuli, Eric Markley, Tiffany Chien, Jiantao Jiao, Laura Waller
- Abstract summary: Imaging systems have traditionally been designed to mimic the human eye and produce visually interpretable measurements. Modern imaging systems process raw measurements computationally before or instead of human viewing. Despite the importance of measurement information content, current approaches for evaluating imaging system performance do not quantify it.
- Score: 13.123408596169172
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imaging systems have traditionally been designed to mimic the human eye and produce visually interpretable measurements. Modern imaging systems, however, process raw measurements computationally before or instead of human viewing. As a result, the information content of raw measurements matters more than their visual interpretability. Despite the importance of measurement information content, current approaches for evaluating imaging system performance do not quantify it: they instead either use alternative metrics that assess specific aspects of measurement quality or assess measurements indirectly with performance on secondary tasks. We developed the theoretical foundations and a practical method to directly quantify mutual information between noisy measurements and unknown objects. By fitting probabilistic models to measurements and their noise characteristics, our method estimates information by upper bounding its true value. By applying gradient-based optimization to these estimates, we also developed a technique for designing imaging systems called Information-Driven Encoder Analysis Learning (IDEAL). Our information estimates accurately captured system performance differences across four imaging domains (color photography, radio astronomy, lensless imaging, and microscopy). Systems designed with IDEAL matched the performance of those designed with end-to-end optimization, the prevailing approach that jointly optimizes hardware and image processing algorithms. These results establish mutual information as a universal performance metric for imaging systems that enables both computationally efficient design optimization and evaluation in real-world conditions. A video summarizing this work can be found at: https://waller-lab.github.io/EncodingInformationWebsite/
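The estimation strategy the abstract describes (fitting a probabilistic model to measurements and upper-bounding mutual information) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes additive i.i.d. Gaussian noise and fits a simple full-covariance Gaussian as the measurement model, whereas the paper fits richer probabilistic models. The function name and interface are hypothetical. It uses the identity I(O; Y) = H(Y) - H(Y|O), where the held-out cross-entropy of any fitted model q(y) upper-bounds H(Y), and H(Y|O) is analytic for Gaussian noise.

```python
import numpy as np

def mi_upper_bound(measurements, noise_sigma):
    """Sketch: upper-bound I(object; measurement) in nats.

    I(O;Y) = H(Y) - H(Y|O). H(Y) is upper-bounded by the cross-entropy
    of held-out measurements under a Gaussian fit to the data; H(Y|O)
    is the analytic entropy of i.i.d. additive Gaussian noise.
    """
    n, d = measurements.shape
    train, test = measurements[: n // 2], measurements[n // 2 :]

    # Fit a full-covariance Gaussian model q(y) to half the measurements.
    mu = train.mean(axis=0)
    cov = np.cov(train, rowvar=False) + 1e-6 * np.eye(d)  # regularize

    # Held-out cross-entropy E[-log q(y)] upper-bounds H(Y) in expectation.
    diff = test - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    quad = np.einsum("ij,jk,ik->i", diff, inv, diff)  # per-sample Mahalanobis
    h_y_upper = 0.5 * (d * np.log(2 * np.pi) + logdet + quad.mean())

    # Analytic conditional entropy H(Y|O) for additive Gaussian noise.
    h_y_given_o = 0.5 * d * np.log(2 * np.pi * np.e * noise_sigma**2)

    return h_y_upper - h_y_given_o
```

On synthetic Gaussian data, where the true mutual information is known in closed form, the bound is tight because the fitted model matches the true measurement distribution; with real measurements, a better density model gives a tighter (lower) upper bound, which is what makes the estimate usable as a design objective.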
Related papers
- How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing [56.60465182650588]
We introduce a three-level interaction hierarchy that captures deictic grounding, morphological manipulation, and causal reasoning. We propose a robust LMM-as-a-judge evaluation framework with task-specific metrics to enable scalable and fine-grained assessment. We find that proprietary models exhibit early-stage visual instruction-following capabilities and consistently outperform open-source models.
arXiv Detail & Related papers (2026-02-02T09:24:45Z) - Computationally Efficient Information-Driven Optical Design with Interchanging Optimization [2.519074131450768]
Information-Driven Encoder Analysis Learning (IDEAL) was proposed to automate this process through gradient-based optimization. IDEAL suffers from high memory usage, long runtimes, and a potentially mismatched objective function due to end-to-end differentiability requirements. We introduce IDEAL with Interchanging Optimization (IDEAL-IO), a method that scalably decouples density estimation from optical parameter optimization.
arXiv Detail & Related papers (2025-07-10T14:14:08Z) - A Comparative Study of Scanpath Models in Graph-Based Visualization [7.592272924252313]
Eye-tracking (ET) data presents challenges related to cost, privacy, and scalability. In our study, we conducted an ET experiment with 40 participants who analyzed graphs. We compared human scanpaths with synthetic ones generated by models such as DeepGaze, UMSS, and Gazeformer.
arXiv Detail & Related papers (2025-03-31T14:43:42Z) - Data Quality Matters: Quantifying Image Quality Impact on Machine Learning Performance [0.1398098625978622]
This paper presents a framework to evaluate the impact of image modifications on machine learning tasks. The LPIPS metric achieves the highest correlation between image deviation and machine learning performance.
arXiv Detail & Related papers (2025-03-28T12:28:44Z) - Picking the Cream of the Crop: Visual-Centric Data Selection with Collaborative Agents [62.616106562146776]
We propose a Visual-Centric Selection approach via Agents Collaboration (ViSA).
Our approach consists of 1) an image information quantification method via visual agents collaboration to select images with rich visual information, and 2) a visual-centric instruction quality assessment method to select high-quality instruction data related to high-quality images.
arXiv Detail & Related papers (2025-02-27T09:37:30Z) - Rewards-based image analysis in microscopy [2.906546126874626]
Analyzing imaging and hyperspectral data is crucial across scientific fields, including biology, medicine, chemistry, and physics.
Currently, this task relies on complex, human-designed iterative steps such as denoising, spatial sampling, keypoint detection, feature generation, clustering, dimensionality reduction, and physics-based deconvolutions.
The introduction of machine learning over the past decade has accelerated tasks like image segmentation and object detection via supervised learning, and dimensionality reduction via unsupervised methods.
Here, we discuss advances in reward-based image analysis, which adopts expert decision-making principles and demonstrates strong transfer learning.
arXiv Detail & Related papers (2025-02-23T19:19:38Z) - Estimating Task-based Performance Bounds for Accelerated MRI Image Reconstruction Methods by Use of Learned-Ideal Observers [7.765750378590293]
The performance of the ideal observer (IO) acting on imaging measurements has long been advocated as a figure-of-merit to guide the optimization of imaging systems. Estimation of IO performance can provide valuable guidance when designing under-sampled data-acquisition techniques.
arXiv Detail & Related papers (2025-01-16T01:09:30Z) - IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations [64.07859467542664]
Capturing geometric and material information from images remains a fundamental challenge in computer vision and graphics. Traditional optimization-based methods often require hours of computational time to reconstruct geometry, material properties, and environmental lighting from dense multi-view inputs. We introduce IDArb, a diffusion-based model designed to perform intrinsic decomposition on an arbitrary number of images under varying illuminations.
arXiv Detail & Related papers (2024-12-16T18:52:56Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z) - Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z) - End-to-end Evaluation of Practical Video Analytics Systems for Face Detection and Recognition [9.942007083253479]
Video analytics systems are deployed in bandwidth constrained environments like autonomous vehicles.
In an end-to-end face analytics system, inputs are first compressed using popular video codecs like HEVC.
We demonstrate how independent task evaluations, dataset imbalances, and inconsistent annotations can lead to incorrect system performance estimates.
arXiv Detail & Related papers (2023-10-10T19:06:10Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z) - Combining Variational Autoencoders and Physical Bias for Improved Microscopy Data Analysis [0.0]
We present a physics augmented machine learning method which disentangles factors of variability within the data.
Our method is applied to various materials, including NiO-LSMO, BiFeO3, and graphene.
The results demonstrate the effectiveness of our approach in extracting meaningful information from large volumes of imaging data.
arXiv Detail & Related papers (2023-02-08T17:35:38Z) - Monocular Depth Estimation Using Cues Inspired by Biological Vision Systems [22.539300644593936]
Monocular depth estimation (MDE) aims to transform an RGB image of a scene into a pixelwise depth map from the same camera view.
Part of the MDE task is to learn which visual cues in the image can be used for depth estimation, and how.
We demonstrate that explicitly injecting visual cue information into the model is beneficial for depth estimation.
arXiv Detail & Related papers (2022-04-21T19:42:36Z) - A workflow for segmenting soil and plant X-ray CT images with deep learning in Google's Colaboratory [45.99558884106628]
We develop a modular workflow for applying convolutional neural networks to X-ray microCT images.
We show how parameters can be optimized to achieve best results using example scans from walnut leaves, almond flower buds, and a soil aggregate.
arXiv Detail & Related papers (2022-03-18T00:47:32Z) - Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information-theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z) - Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else? [93.91375268580806]
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms.
Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications.
By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system.
arXiv Detail & Related papers (2021-11-09T13:30:34Z) - DONet: Learning Category-Level 6D Object Pose and Size Estimation from Depth Observation [53.55300278592281]
We propose a method of Category-level 6D Object Pose and Size Estimation (COPSE) from a single depth image.
Our framework makes inferences based on the rich geometric information of the object in the depth channel alone.
Our framework competes with state-of-the-art approaches that require labeled real-world images.
arXiv Detail & Related papers (2021-06-27T10:41:50Z) - A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation of propagation distance, position errors, and partial coherence frequently threatens experiment viability.
A modern Deep Learning framework is used to correct autonomously the setup incoherences, thus improving the quality of a ptychography reconstruction.
We tested our system on both synthetic datasets and also on real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z) - Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types [50.1843146606122]
A simple form of transfer learning is common in current state-of-the-art computer vision models.
Previous systematic studies of transfer learning have been limited and the circumstances in which it is expected to work are not fully understood.
In this paper we carry out an extensive experimental exploration of transfer learning across vastly different image domains.
arXiv Detail & Related papers (2021-03-24T16:24:20Z) - Exploiting Raw Images for Real-Scene Super-Resolution [105.18021110372133]
We study the problem of real-scene single image super-resolution to bridge the gap between synthetic data and real captured images.
We propose a method to generate more realistic training data by mimicking the imaging process of digital cameras.
We also develop a two-branch convolutional neural network to exploit the radiance information originally recorded in raw images.
arXiv Detail & Related papers (2021-02-02T16:10:15Z) - A Survey on Deep Learning Methods for Semantic Image Segmentation in Real-Time [0.0]
In many areas, such as robotics and autonomous vehicles, semantic image segmentation is crucial.
The success of medical diagnosis and treatment relies on the extremely accurate understanding of the data under consideration.
Recent developments in deep learning have provided a host of tools to tackle this problem efficiently and with increased accuracy.
arXiv Detail & Related papers (2020-09-27T20:30:10Z) - Fed-Sim: Federated Simulation for Medical Imaging [131.56325440976207]
We introduce a physics-driven generative approach that consists of two learnable neural modules.
We show that our data synthesis framework improves the downstream segmentation performance on several datasets.
arXiv Detail & Related papers (2020-09-01T19:17:46Z) - Online Graph Completion: Multivariate Signal Recovery in Computer Vision [29.89364298411089]
We study the "completion" problem defined on graphs, where requests for additional measurements must be made sequentially.
We design the optimization model in the Fourier domain of the graph and describe how ideas based on adaptive submodularity provide algorithms that work well in practice.
On a large set of images collected from Imgur, we see promising results on images that are otherwise difficult to categorize.
arXiv Detail & Related papers (2020-08-12T01:34:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.