BigSmall: Efficient Multi-Task Learning for Disparate Spatial and
Temporal Physiological Measurements
- URL: http://arxiv.org/abs/2303.11573v2
- Date: Fri, 17 Nov 2023 09:33:22 GMT
- Title: BigSmall: Efficient Multi-Task Learning for Disparate Spatial and
Temporal Physiological Measurements
- Authors: Girish Narayanswamy, Yujia Liu, Yuzhe Yang, Chengqian Ma, Xin Liu,
Daniel McDuff, Shwetak Patel
- Abstract summary: We present BigSmall, an efficient architecture for physiological and behavioral measurement.
We propose a multi-branch network with wrapping temporal shift modules that yields both accuracy and efficiency gains.
- Score: 28.573472322978507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding of human visual perception has historically inspired the design
of computer vision architectures. As an example, perception occurs at different
scales both spatially and temporally, suggesting that the extraction of salient
visual information may be made more effective by paying attention to specific
features at varying scales. Visual changes in the body due to physiological
processes also occur at different scales and with modality-specific
characteristic properties. Inspired by this, we present BigSmall, an efficient
architecture for physiological and behavioral measurement. We present the first
joint camera-based facial action, cardiac, and pulmonary measurement model. We
propose a multi-branch network with wrapping temporal shift modules that yields
both accuracy and efficiency gains. We observe that fusing low-level features
leads to suboptimal performance, but that fusing high-level features enables
efficiency gains with negligible loss in accuracy. Experimental results
demonstrate that BigSmall significantly reduces the computational costs.
Furthermore, compared to existing task-specific models, BigSmall achieves
comparable or better results on multiple physiological measurement tasks
simultaneously with a unified model.
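The abstract names "wrapping temporal shift modules" without describing them. The following is a minimal sketch of one plausible reading, assuming the module behaves like a standard temporal shift in which frames shifted past one end of the clip wrap around to the other end instead of being zero-filled. The function name `wrapping_temporal_shift`, the `fold_div` parameter, and the choice of which channels shift in which direction are illustrative assumptions, not taken from the paper.

```python
def wrapping_temporal_shift(frames, fold_div=3):
    """Circularly shift a fraction of channels along the time axis.

    frames: list of T frames, each a list of C per-channel features.
    fold_div: hypothetical parameter; C // fold_div channels shift each way.
    """
    t = len(frames)
    c = len(frames[0])
    fold = c // fold_div
    # Start from a copy so the remaining channels stay unshifted.
    out = [list(frame) for frame in frames]
    for i in range(t):
        # First group: take features from the previous frame.
        # The first frame wraps around and reads from the last frame.
        for ch in range(fold):
            out[i][ch] = frames[(i - 1) % t][ch]
        # Second group: take features from the next frame.
        # The last frame wraps around and reads from the first frame.
        for ch in range(fold, 2 * fold):
            out[i][ch] = frames[(i + 1) % t][ch]
    return out


# Three frames, three channels: one channel shifts each way, one stays put.
clip = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
shifted = wrapping_temporal_shift(clip, fold_div=3)
```

Under this assumption, no shifted position is left empty, which matters most when the clip contains only a few frames.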
Related papers
- When Medical Imaging Met Self-Attention: A Love Story That Didn't Quite Work Out [8.113092414596679]
We extend two widely adopted convolutional architectures with different self-attention variants on two different medical datasets.
We observe no significant improvement in balanced accuracy over fully convolutional models.
We also find that important features, such as dermoscopic structures in skin lesion images, are still not learned by employing self-attention.
arXiv Detail & Related papers (2024-04-18T16:18:41Z)
- What Matters When Repurposing Diffusion Models for General Dense Perception Tasks? [49.84679952948808]
Recent works show promising results by simply fine-tuning T2I diffusion models for dense perception tasks.
We conduct a thorough investigation into critical factors that affect transfer efficiency and performance when using diffusion priors.
Our work culminates in the development of GenPercept, an effective deterministic one-step fine-tuning paradigm tailored for dense visual perception tasks.
arXiv Detail & Related papers (2024-03-10T04:23:24Z)
- Understanding Self-attention Mechanism via Dynamical System Perspective [58.024376086269015]
Self-attention mechanism (SAM) is widely used in various fields of artificial intelligence.
We show that the intrinsic stiffness phenomenon (SP) in the high-precision solution of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NN).
We show that the SAM is also a stiffness-aware step size adaptor that can enhance the model's representational ability to measure intrinsic SP.
arXiv Detail & Related papers (2023-08-19T08:17:41Z)
- Predicting Biomedical Interactions with Probabilistic Model Selection for Graph Neural Networks [5.156812030122437]
Current biological networks are noisy, sparse, and incomplete. Experimental identification of such interactions is both time-consuming and expensive.
Deep graph neural networks have shown their effectiveness in modeling graph-structured data and achieved good performance in biomedical interaction prediction.
Our proposed method enables the graph convolutional networks to dynamically adapt their depths to accommodate an increasing number of interactions.
arXiv Detail & Related papers (2022-11-22T20:44:28Z)
- Impact of spiking neurons leakages and network recurrences on event-based spatio-temporal pattern recognition [0.0]
Spiking neural networks coupled with neuromorphic hardware and event-based sensors are getting increased interest for low-latency and low-power inference at the edge.
We explore the impact of synaptic and membrane leakages in spiking neurons.
arXiv Detail & Related papers (2022-11-14T21:34:02Z)
- Perception Over Time: Temporal Dynamics for Robust Image Understanding [5.584060970507506]
Deep learning surpasses human-level performance in narrow and specific vision tasks.
Human visual perception is orders of magnitude more robust to changes in the input stimulus.
We introduce a novel method of incorporating temporal dynamics into static image understanding.
arXiv Detail & Related papers (2022-03-11T21:11:59Z)
- Deep Collaborative Multi-Modal Learning for Unsupervised Kinship Estimation [53.62256887837659]
Kinship verification is a long-standing research challenge in computer vision.
We propose a novel deep collaborative multi-modal learning (DCML) to integrate the underlying information presented in facial properties.
Our DCML method is always superior to some state-of-the-art kinship verification methods.
arXiv Detail & Related papers (2021-09-07T01:34:51Z)
- On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading.
We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions.
Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)
- Emergent Hand Morphology and Control from Optimizing Robust Grasps of Diverse Objects [63.89096733478149]
We introduce a data-driven approach where effective hand designs naturally emerge for the purpose of grasping diverse objects.
We develop a novel Bayesian Optimization algorithm that efficiently co-designs the morphology and grasping skills jointly.
We demonstrate the effectiveness of our approach in discovering robust and cost-efficient hand morphologies for grasping novel objects.
arXiv Detail & Related papers (2020-12-22T17:52:29Z)
- Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features from non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and rPPG signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z)