Comments on Leo Breiman's paper 'Statistical Modeling: The Two Cultures'
(Statistical Science, 2001, 16(3), 199-231)
- URL: http://arxiv.org/abs/2103.11327v1
- Date: Sun, 21 Mar 2021 07:40:37 GMT
- Title: Comments on Leo Breiman's paper 'Statistical Modeling: The Two Cultures'
(Statistical Science, 2001, 16(3), 199-231)
- Authors: Jelena Bradic and Yinchu Zhu
- Abstract summary: Breiman challenged statisticians to think more broadly, to step into the unknown, model-free learning world.
A new frontier has emerged: one where the role, impact, or stability of the learning algorithms is no longer measured by prediction quality but by inferential quality.
- Score: 1.2183405753834562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Breiman challenged statisticians to think more broadly, to step into the
unknown, model-free learning world, with him paving the way forward. The
statistics community responded with slight optimism, some skepticism, and
plenty of disbelief. Today, we stand at the same crossroads anew. Faced with
the enormous practical success of model-free, deep, and machine learning, we
are naturally inclined to think that everything is resolved. A new frontier
has emerged: one where the role, impact, or stability of the learning
algorithms is no longer measured by prediction quality but by an inferential
one -- the questions of "why" and "if" can no longer be safely ignored.
Related papers
- A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning [0.0]
This dissertation tries to further the quest for a world where everyone is aware of uncertainty, of how important it is, and of how to embrace it instead of fearing it.
A specific, though general, framework that allows anyone to obtain accurate uncertainty estimates is singled out and analysed.
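The abstract does not name the framework, but split conformal prediction is the canonical instance of such a general recipe. Below is a minimal Python sketch, assuming a fitted regressor with a scikit-learn-style .predict interface; the function name and arguments are illustrative, not taken from the dissertation.

    import numpy as np

    def split_conformal_interval(model, X_cal, y_cal, X_test, alpha=0.1):
        """Distribution-free prediction intervals with roughly (1 - alpha)
        marginal coverage, valid under exchangeability of the data."""
        # Nonconformity scores on a held-out calibration set.
        scores = np.abs(y_cal - model.predict(X_cal))
        n = len(scores)
        # Finite-sample-corrected empirical quantile of the scores.
        level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
        q = np.quantile(scores, level)
        preds = model.predict(X_test)
        return preds - q, preds + q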
arXiv Detail & Related papers (2024-05-03T13:19:33Z) - R-Tuning: Instructing Large Language Models to Say `I Don't Know' [66.11375475253007]
Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face challenges.
Previous instruction tuning methods force the model to complete a sentence regardless of whether it actually possesses the relevant knowledge.
We present a new approach called Refusal-Aware Instruction Tuning (R-Tuning).
Experimental results demonstrate R-Tuning effectively improves a model's ability to answer known questions and refrain from answering unknown questions.
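As the abstract describes it, the key step is deciding, per training example, whether the model already knows the answer, and attaching a refusal-style target when it does not. A minimal sketch of that data-construction idea follows; model.generate is an assumed interface and the refusal string is illustrative, not the paper's exact template.

    def build_refusal_aware_data(model, qa_pairs):
        # Keep the original target when the model answers correctly;
        # otherwise teach it to refuse rather than guess.
        tuned = []
        for question, answer in qa_pairs:
            predicted = model.generate(question)  # assumed API
            if predicted.strip() == answer.strip():
                tuned.append((question, answer))
            else:
                tuned.append((question, "I don't know."))
        return tuned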
arXiv Detail & Related papers (2023-11-16T08:45:44Z) - Improving the Reliability of Large Language Models by Leveraging
Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination".
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z) - Overcoming Overconfidence for Active Learning [1.2776312584227847]
We present two novel methods to address the problem of overconfidence that arises in the active learning scenario.
The first is an augmentation strategy named Cross-Mix-and-Mix (CMaM), which aims to calibrate the model by expanding the limited training distribution.
The second is a selection strategy named Ranked Margin Sampling (RankedMS), which prevents choosing data that leads to overly confident predictions.
arXiv Detail & Related papers (2023-08-21T09:04:54Z) - Sources of Uncertainty in Machine Learning -- A Statisticians' View [3.1498833540989413]
The paper aims to formalize the two types of uncertainty associated with machine learning.
Drawing parallels between statistical concepts and uncertainty in machine learning, we also demonstrate the role of data and their influence on uncertainty.
arXiv Detail & Related papers (2023-05-26T07:44:19Z) - Do Not Trust a Model Because It is Confident: Uncovering and
Characterizing Unknown Unknowns to Student Success Predictors in Online-Based
Learning [10.120425915106727]
Student success models might be prone to developing weak spots, i.e., examples that are hard to classify accurately.
This weakness is one of the main factors undermining users' trust, since model predictions could, for instance, lead an instructor not to intervene with a student in need.
In this paper, we unveil the need to detect and characterize unknown unknowns in student success prediction.
arXiv Detail & Related papers (2022-12-16T15:32:49Z) - What killed the Convex Booster? [70.04715330065275]
A landmark negative result of Long and Servedio established a worst-case spectacular failure of a supervised learning trio.
We argue that the source of the negative result lies in the dark side of a pervasive -- and otherwise prized -- aspect of ML: parameterisation.
arXiv Detail & Related papers (2022-05-19T15:42:20Z) - Efficient First-Order Contextual Bandits: Prediction, Allocation, and
Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
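For reference, the triangular discrimination is the f-divergence generated by f(t) = (t - 1)^2 / (t + 1); for discrete distributions P and Q it is (a standard definition, not the paper's own notation):

    \Delta(P, Q) = \sum_{x} \frac{(P(x) - Q(x))^2}{P(x) + Q(x)}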
arXiv Detail & Related papers (2021-07-05T19:20:34Z) - Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware
Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered, and the uncertainty is modeled by modifying the output space to a certain family of probability distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
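A common instance of that traditional formulation is a Gaussian output head trained with the negative log-likelihood, so the model predicts a mean and a variance per input. A minimal numpy sketch, offered as an assumption-laden illustration rather than the paper's method:

    import numpy as np

    def gaussian_nll(y, mu, var, eps=1e-6):
        # Negative log-likelihood of y under N(mu, var); predicting
        # (mu, var) turns point regression into uncertainty-aware regression.
        var = np.maximum(var, eps)  # keep variances strictly positive
        return 0.5 * np.mean(np.log(2.0 * np.pi * var) + (y - mu) ** 2 / var)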
arXiv Detail & Related papers (2021-03-25T06:56:09Z) - Constrained Learning with Non-Convex Losses [119.8736858597118]
Though learning has become a core technology of modern information processing, there is now ample evidence that it can lead to biased, unsafe, and prejudiced solutions.
arXiv Detail & Related papers (2021-03-08T23:10:33Z) - Bridging Breiman's Brook: From Algorithmic Modeling to Statistical
Learning [6.837936479339647]
In 2001, Leo Breiman wrote of a divide between "data modeling" and "algorithmic modeling" cultures.
Twenty years later this division feels far more ephemeral, both in terms of assigning individuals to camps, and in terms of intellectual boundaries.
We argue that this is largely due to the "data modelers" incorporating algorithmic methods into their toolbox.
arXiv Detail & Related papers (2021-02-23T03:38:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.