Ethical Considerations and Statistical Analysis of Industry Involvement
in Machine Learning Research
- URL: http://arxiv.org/abs/2006.04541v2
- Date: Mon, 19 Oct 2020 13:01:51 GMT
- Title: Ethical Considerations and Statistical Analysis of Industry Involvement
in Machine Learning Research
- Authors: Thilo Hagendorff, Kristof Meding
- Abstract summary: We have examined all papers of the main ML conferences NeurIPS, CVPR, and ICML of the last 5 years.
Our statistical approach focuses on conflicts of interest, innovation and gender equality.
- Score: 3.4773470589069473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Industry involvement in the machine learning (ML) community seems to be
increasing. However, the quantitative scale and ethical implications of this
influence remain largely unknown. To address this gap, we have not only carried out an
informed ethical analysis of the field, but have inspected all papers of the
main ML conferences NeurIPS, CVPR, and ICML of the last 5 years - almost 11,000
papers in total. Our statistical approach focuses on conflicts of interest,
innovation and gender equality. We have obtained four main findings: (1)
Academic-corporate collaborations are growing in numbers. At the same time, we
found that conflicts of interest are rarely disclosed. (2) Industry publishes
papers about trending ML topics on average two years earlier than academia
does. (3) Industry papers are not lagging behind academic papers in regard to
social impact considerations. (4) Finally, we demonstrate that industrial
papers fall short of their academic counterparts with respect to the ratio of
gender diversity. We believe that this work is a starting point for an informed
debate within and outside of the ML community.
Related papers
- Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals.
Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
arXiv Detail & Related papers (2024-04-01T17:45:15Z)
- Position: AI/ML Influencers Have a Place in the Academic Process [82.2069685579588]
We investigate the role of social media influencers in enhancing the visibility of machine learning research.
We have compiled a comprehensive dataset of over 8,000 papers, spanning tweets from December 2018 to October 2023.
Our statistical and causal inference analysis reveals a significant increase in citations for papers endorsed by these influencers.
arXiv Detail & Related papers (2024-01-24T20:05:49Z)
- How should the advent of large language models affect the practice of science? [51.62881233954798]
How should the advent of large language models affect the practice of science?
We have invited four diverse groups of scientists to reflect on this query, sharing their perspectives and engaging in debate.
arXiv Detail & Related papers (2023-12-05T10:45:12Z)
- Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers [1.5362868418787874]
Large language models (LLMs) are dramatically influencing AI research, spurring discussions on what has changed so far and how to shape the field's future.
To clarify such questions, we analyze a new dataset of 16,979 LLM-related arXiv papers, focusing on recent trends in 2023 vs. 2018-2022.
An influx of new authors -- half of all first authors in 2023 -- are entering from non-NLP fields of AI, driving disciplinary expansion.
Surprisingly, industry accounts for a smaller publication share in 2023, largely due to reduced output from Google and other Big Tech companies.
arXiv Detail & Related papers (2023-07-20T08:45:00Z)
- The Technological Emergence of AutoML: A Survey of Performant Software and Applications in the Context of Industry [72.10607978091492]
Automated/Autonomous Machine Learning (AutoML/AutonoML) is a relatively young field.
This review makes two primary contributions to knowledge around this topic.
It provides the most up-to-date and comprehensive survey of existing AutoML tools, both open-source and commercial.
arXiv Detail & Related papers (2022-11-08T10:42:08Z)
- Fairness in Recommender Systems: Research Landscape and Future Directions [119.67643184567623]
We review the concepts and notions of fairness that were put forward in the area in the recent past.
We present an overview of how research in this field is currently operationalized.
Overall, our analysis of recent works points to certain research gaps.
arXiv Detail & Related papers (2022-05-23T08:34:25Z)
- Exploring ML testing in practice -- Lessons learned from an interactive rapid review with Axis Communications [4.875319458066472]
There is a growing interest in industry and academia in machine learning (ML) testing.
We believe that industry and academia need to learn together to produce rigorous and relevant knowledge.
arXiv Detail & Related papers (2022-03-30T12:01:43Z)
- Industry and Academic Research in Computer Vision [5.634825161148484]
This work aims to study the dynamic between research in the industry and academia in computer vision.
The results are demonstrated on a set of top-5 vision conferences that are representative of the field.
arXiv Detail & Related papers (2021-07-10T20:09:52Z)
- The Values Encoded in Machine Learning Research [6.11644847221881]
We analyze 100 highly cited machine learning papers published at premier conferences, ICML and NeurIPS.
We identify 67 values that are uplifted in machine learning research.
We find increasingly close ties between these highly cited papers and tech companies and elite universities.
arXiv Detail & Related papers (2021-06-29T17:24:14Z)
- Gender bias in magazines oriented to men and women: a computational approach [58.720142291102135]
We compare the content of a women-oriented magazine with that of a men-oriented one, both produced by the same editorial group over a decade.
With Topic Modelling techniques we identify the main themes discussed in the magazines and quantify how much the presence of these topics differs between magazines over time.
Our results show that the frequency of appearance of the topics Family, Business and Women as sex objects, present an initial bias that tends to disappear over time.
arXiv Detail & Related papers (2020-11-24T14:02:49Z)
- (Non)-neutrality of science and algorithms: Machine Learning between fundamental physics and society [0.0]
We will deal with different aspects of the issue, from a bibliometric analysis of the publications to a detailed discussion of the literature.
The analysis will be conducted on the basis of three key elements: the non-neutrality of science, understood as its intrinsic relationship with history and society.
The deconstruction of the presumed universality of scientific thought from the inside becomes in this perspective a necessary first step also for any social and political discussion.
arXiv Detail & Related papers (2020-05-27T09:43:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.