Position: Embracing Negative Results in Machine Learning
- URL: http://arxiv.org/abs/2406.03980v1
- Date: Thu, 6 Jun 2024 11:51:12 GMT
- Title: Position: Embracing Negative Results in Machine Learning
- Authors: Florian Karl, Lukas Malte Kemeter, Gabriel Dax, Paulina Sierak,
- Abstract summary: We argue that predictive performance alone is not a good indicator for the worth of a publication.
We present the advantages of publishing negative results and provide concrete measures for the community to move towards a paradigm where their publication is normalized.
- Score: 0.7499722271664147
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Publications proposing novel machine learning methods are often primarily rated by exhibited predictive performance on selected problems. In this position paper we argue that predictive performance alone is not a good indicator for the worth of a publication. Used this way, it even fosters problems such as inefficiency across the machine learning research community as a whole and sets the wrong incentives for researchers. We therefore put out a call for the publication of "negative" results, which can help alleviate some of these problems and improve the scientific output of the machine learning research community. To substantiate our position, we present the advantages of publishing negative results and provide concrete measures for the community to move towards a paradigm where their publication is normalized.
Related papers
- Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori.
In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty.
We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
arXiv Detail & Related papers (2024-04-29T08:16:30Z) - Too Good To Be True: performance overestimation in (re)current practices
for Human Activity Recognition [49.1574468325115]
Sliding windows for data segmentation followed by standard random k-fold cross-validation produce biased results.
It is important to raise awareness in the scientific community about this problem, whose negative effects are being overlooked.
Several experiments with different types of datasets and different types of classification models allow us to exhibit the problem and show it persists independently of the method or dataset.
arXiv Detail & Related papers (2023-10-18T13:24:05Z) - In Search of Insights, Not Magic Bullets: Towards Demystification of the
Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria.
We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z) - Generating Summaries for Scientific Paper Review [29.12631698162247]
The increase in submissions to top venues in machine learning and NLP has placed an excessive burden on reviewers.
An automatic system for assisting with the reviewing process could help ameliorate the problem.
In this paper, we explore automatic review summary generation for scientific papers.
arXiv Detail & Related papers (2021-09-28T21:43:53Z) - DeepZensols: Deep Natural Language Processing Framework [23.56171046067646]
This work presents a framework for reproducing consistent results.
It provides a means of easily creating, training, and evaluating natural language processing (NLP) deep learning (DL) models.
arXiv Detail & Related papers (2021-09-08T01:16:05Z) - Unsupervised Learning of Debiased Representations with Pseudo-Attributes [85.5691102676175]
We propose a simple but effective debiasing technique in an unsupervised manner.
We perform clustering in the feature embedding space and identify pseudo-attributes by taking advantage of the clustering results.
We then employ a novel cluster-based reweighting scheme for learning debiased representations.
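A minimal sketch of the idea, under assumed details (the toy embeddings, KMeans as the clustering step, and inverse-frequency weighting are illustrative choices, not the paper's exact scheme): cluster the embeddings, treat the cluster assignment as a pseudo-attribute, and weight samples inversely to cluster size so minority clusters are not drowned out during training.

```python
# Minimal sketch (assumed setup, not the paper's exact scheme): pseudo-
# attributes via clustering, then cluster-size-based sample reweighting.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy "embeddings": a large majority mode and a small minority mode.
X = np.vstack([rng.normal(0.0, 1.0, (900, 8)),
               rng.normal(4.0, 1.0, (100, 8))])
y = np.concatenate([np.zeros(900), np.ones(100)])

# Pseudo-attributes from clustering in the embedding space.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Cluster-based reweighting: inverse-frequency weights, normalized to mean 1.
counts = np.bincount(clusters)
weights = 1.0 / counts[clusters]
weights *= len(weights) / weights.sum()

clf = LogisticRegression().fit(X, y, sample_weight=weights)
```

Samples in the small cluster receive proportionally larger weights, so the downstream model cannot minimize its loss by fitting only the majority cluster.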
arXiv Detail & Related papers (2021-08-06T05:20:46Z) - Some Ethical Issues in the Review Process of Machine Learning
Conferences [0.38073142980733]
Recent successes in the Machine Learning community have led to a steep increase in the number of papers submitted to conferences.
This increase made more prominent some of the issues that affect the current review process used by these conferences.
We study the problem of reviewers' recruitment, infringements of the double-blind process, fraudulent behaviors, biases in numerical ratings, and the appendix phenomenon.
arXiv Detail & Related papers (2021-06-01T21:22:41Z) - A Measure of Research Taste [91.3755431537592]
We present a citation-based measure that rewards both productivity and taste.
The presented measure, CAP, balances the impact of publications and their quantity.
We analyze the characteristics of CAP for highly-cited researchers in biology, computer science, economics, and physics.
arXiv Detail & Related papers (2021-05-17T18:01:47Z) - Individual Explanations in Machine Learning Models: A Survey for
Practitioners [69.02688684221265]
The use of sophisticated statistical models that influence decisions in domains of high societal relevance is on the rise.
Many governments, institutions, and companies are reluctant to adopt them, as their output is often difficult to explain in human-interpretable ways.
Recently, the academic literature has proposed a substantial number of methods for providing interpretable explanations of machine learning models.
arXiv Detail & Related papers (2021-04-09T01:46:34Z) - Poincare: Recommending Publication Venues via Treatment Effect
Estimation [40.60905158071766]
We use a bias correction method to estimate the potential impact of choosing a publication venue effectively.
We highlight the effectiveness of our method using paper data from computer science conferences.
arXiv Detail & Related papers (2020-10-19T00:50:48Z) - Evolving Methods for Evaluating and Disseminating Computing Research [4.0318506932466445]
Social and technical trends have significantly changed methods for evaluating and disseminating computing research.
Traditional venues for reviewing and publishing, such as conferences and journals, worked effectively in the past.
Many conferences have seen large increases in the number of submissions.
Dissemination of research ideas has become dramatically faster through publication venues such as arXiv.org and social media networks.
arXiv Detail & Related papers (2020-07-02T16:50:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.