Applicability of scaling laws to vision encoding models
- URL: http://arxiv.org/abs/2308.00678v1
- Date: Tue, 1 Aug 2023 17:31:14 GMT
- Title: Applicability of scaling laws to vision encoding models
- Authors: Takuya Matsuyama, Kota S Sasaki, Shinji Nishimoto
- Abstract summary: We investigated how to build a high-performance vision encoding model to predict brain activity as part of our participation in the Algonauts Project 2023 Challenge.
The challenge provided brain activity recorded by functional MRI (fMRI) while participants viewed images.
Several vision models with parameter sizes ranging from 86M to 4.3B were used to build predictive models.
- Score: 0.7734726150561089
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigated how to build a high-performance vision
encoding model to predict brain activity as part of our participation in the
Algonauts Project 2023 Challenge. The challenge provided brain activity
recorded by functional MRI (fMRI) while participants viewed images. Several
vision models with parameter sizes ranging from 86M to 4.3B were used to build
predictive models. To build highly accurate models, we focused our analysis on
two main aspects: (1) How does the sample size of the fMRI training set change
the prediction accuracy? (2) How does the prediction accuracy across the visual
cortex vary with the parameter size of the vision models? The results show that
as the sample size used during training increases, the prediction accuracy
improves according to the scaling law. Similarly, we found that as the
parameter size of the vision models increases, the prediction accuracy improves
according to the scaling law. These results suggest that increasing the sample
size of the fMRI training set and the parameter size of visual models may
contribute to more accurate visual models of the brain and lead to a better
understanding of visual neuroscience.
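The scaling-law relationship reported above can be illustrated with a short, self-contained sketch (not the authors' code): fit a log-linear scaling curve, one common functional form for such trends, to hypothetical (training-set size, prediction accuracy) pairs; the same fit applies unchanged to model parameter counts. All numbers below are invented for illustration.

    # Illustrative sketch only; data values are hypothetical, not taken from the paper.
    import numpy as np

    # Hypothetical measurements: fMRI training-set size and mean voxel-wise
    # prediction accuracy (Pearson r).
    n_samples = np.array([500.0, 1000.0, 2000.0, 4000.0, 8000.0])
    accuracy = np.array([0.18, 0.22, 0.26, 0.30, 0.34])

    # A log-linear scaling law, acc ~ intercept + slope * log(N), is a straight
    # line against log(N), so ordinary least squares recovers its coefficients.
    slope, intercept = np.polyfit(np.log(n_samples), accuracy, 1)
    print(f"accuracy ~ {intercept:.3f} + {slope:.3f} * log(N)")

    # Extrapolate to a doubled training set (illustration only).
    n_new = 16000.0
    print(f"predicted accuracy at N={n_new:.0f}: {intercept + slope * np.log(n_new):.3f}")

Fitting the same curve with model parameter count on the x-axis would summarize the second finding, that accuracy also improves as the vision models grow from 86M to 4.3B parameters.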
Related papers
- Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream [3.4526439922541705]
We evaluate scaling laws for modeling the primate visual ventral stream (VVS)
We observe that while behavioral alignment continues to scale with larger models, neural alignment saturates.
Increased scaling is especially beneficial for higher-level visual areas, where small models trained on few samples exhibit only poor alignment.
arXiv Detail & Related papers (2024-11-08T17:13:53Z)
- Scaling Laws For Dense Retrieval [22.76001461620846]
We investigate whether the performance of dense retrieval models follows the same scaling laws as other neural models.
Results indicate that, under our settings, the performance of dense retrieval models follows a precise power-law scaling related to the model size and the number of annotations.
arXiv Detail & Related papers (2024-03-27T15:27:36Z)
- Understanding Calibration of Deep Neural Networks for Medical Image Classification [3.461503547789351]
This study explores model performance and calibration under different training regimes.
We consider fully supervised training, as well as a rotation-based self-supervised method with and without transfer learning.
Our study reveals that factors such as weight distributions and the similarity of learned representations correlate with the calibration trends observed in the models.
arXiv Detail & Related papers (2023-09-22T18:36:07Z)
- Scaling laws for language encoding models in fMRI [47.498241053872924]
We tested whether larger open-source models are better at predicting brain responses recorded using fMRI.
Similar logarithmic behavior was observed when scaling the size of the fMRI training set.
These results suggest that increasing scale in both models and data will yield incredibly effective models of language processing in the brain.
arXiv Detail & Related papers (2023-05-19T17:53:03Z)
- Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model [97.9548609175831]
We resort to plain vision transformers with about 100 million parameters and make the first attempt to propose large vision models customized for remote sensing tasks.
Specifically, to handle the large image size and objects of various orientations in RS images, we propose a new rotated varied-size window attention.
Experiments on detection tasks demonstrate the superiority of our model over all state-of-the-art models, achieving 81.16% mAP on the DOTA-V1.0 dataset.
arXiv Detail & Related papers (2022-08-08T09:08:40Z)
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models achieve superior performance on most NLP tasks thanks to their large parameter capacity, but this capacity also incurs a huge computation cost.
We explore accelerating large-model inference through conditional computation based on the sparse activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
arXiv Detail & Related papers (2021-10-05T02:14:38Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
- Are Visual Explanations Useful? A Case Study in Model-in-the-Loop Prediction [49.254162397086006]
We study explanations based on visual saliency in an image-based age prediction task.
We find that presenting model predictions improves human accuracy.
However, explanations of various kinds fail to significantly alter human accuracy or trust in the model.
arXiv Detail & Related papers (2020-07-23T20:39:40Z)
- Modelling the Distribution of 3D Brain MRI using a 2D Slice VAE [66.63629641650572]
We propose a method to model the distribution of 3D MR brain volumes by combining a 2D slice VAE with a Gaussian model that captures the relationships between slices.
We also introduce a novel evaluation method for generated volumes that quantifies how well their segmentations match those of true brain anatomy.
arXiv Detail & Related papers (2020-07-09T13:23:15Z)
- Anatomical Predictions using Subject-Specific Medical Data [7.635279671482444]
We present a method that predicts how a brain MRI for an individual will change over time.
Given a predicted deformation field, a baseline scan can be warped to give a prediction of the brain scan at a future time.
arXiv Detail & Related papers (2020-05-29T21:30:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.