Do Deep Neural Networks Always Perform Better When Eating More Data?
- URL: http://arxiv.org/abs/2205.15187v1
- Date: Mon, 30 May 2022 15:40:33 GMT
- Title: Do Deep Neural Networks Always Perform Better When Eating More Data?
- Authors: Jiachen Yang, Zhuo Zhang, Yicheng Gong, Shukun Ma, Xiaolan Guo, Yue
Yang, Shuai Xiao, Jiabao Wen, Yang Li, Xinbo Gao, Wen Lu and Qinggang Meng
- Abstract summary: We design experiments under Independent and Identically Distributed (IID) and Out-of-Distribution (OOD) conditions.
Under the IID condition, the amount of information determines the effectiveness of each sample; the contribution of samples and the difference between classes determine the amount of class information.
Under the OOD condition, the cross-domain degree of samples determines their contribution, and the bias-fitting caused by irrelevant elements is a significant factor in the cross-domain setting.
- Score: 82.6459747000664
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data has now become a shortcoming of deep learning. Researchers in their own
fields share the thinking that "deep neural networks might not always perform
better when they eat more data," which still lacks experimental validation and
a convincing guiding theory. To fill this gap, we design experiments under
Independent and Identically Distributed (IID) and Out-of-Distribution (OOD)
conditions, which give powerful answers. For the purpose of guidance, based on
the discussion of results, two theories are proposed: under the IID condition,
the amount of information determines the effectiveness of each sample, and the
contribution of samples and the difference between classes determine the amount
of sample information and the amount of class information; under the OOD
condition, the cross-domain degree of samples determines their contribution,
and the bias-fitting caused by irrelevant elements is a significant factor in
the cross-domain setting. The above theories provide guidance from the perspective of data,
which can promote a wide range of practical applications of artificial
intelligence.