Use and Misuse of Machine Learning in Anthropology
- URL: http://arxiv.org/abs/2209.02811v1
- Date: Tue, 6 Sep 2022 20:32:24 GMT
- Title: Use and Misuse of Machine Learning in Anthropology
- Authors: Jeff Calder, Reed Coil, Annie Melton, Peter J. Olver, Gilbert Tostevin, Katrina Yezzi-Woodley
- Abstract summary: We will focus on the field of paleoanthropology, which seeks to understand the evolution of the human species based on biological and cultural evidence.
The aim of this paper is to provide a brief introduction to some of the ways in which ML has been applied within paleoanthropology.
We discuss a series of missteps, errors, and violations of correct protocols of ML methods that appear disconcertingly often within the accumulating body of anthropological literature.
- Score: 0.9786690381850356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML), being now widely accessible to the research community
at large, has fostered a proliferation of new and striking applications of
these emergent mathematical techniques across a wide range of disciplines. In
this paper, we will focus on a particular case study: the field of
paleoanthropology, which seeks to understand the evolution of the human species
based on biological and cultural evidence. As we will show, the easy
availability of ML algorithms and the lack of expertise in their proper use among
the anthropological research community have led to foundational misapplications
that have appeared throughout the literature. The resulting unreliable results
not only undermine efforts to legitimately incorporate ML into anthropological
research, but produce potentially faulty understandings about our human
evolutionary and behavioral past.
The aim of this paper is to provide a brief introduction to some of the ways
in which ML has been applied within paleoanthropology; we also include a survey
of some basic ML algorithms for those who are not fully conversant with the
field, which remains under active development. We discuss a series of missteps,
errors, and violations of correct protocols of ML methods that appear
disconcertingly often within the accumulating body of anthropological
literature. These mistakes include use of outdated algorithms and practices;
inappropriate train/test splits, sample composition, and textual explanations;
as well as an absence of transparency due to the lack of data/code sharing, and
the subsequent limitations imposed on independent replication. We assert that
expanding samples, sharing data and code, re-evaluating approaches to peer
review, and, most importantly, developing interdisciplinary teams that include
experts in ML are all necessary for progress in future research incorporating
ML within anthropology.
Related papers
- Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval-enhancement can be extended to a broader spectrum of machine learning (ML).
This work introduces a formal framework of this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature in various domains in ML with consistent notation, which is missing from the current literature.
The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
arXiv Detail & Related papers (2024-07-17T20:01:21Z)
- Ontology Embedding: A Survey of Methods, Applications and Resources [54.3453925775069]
Ontologies are widely used for representing domain knowledge and meta data.
One straightforward solution is to integrate statistical analysis and machine learning.
Numerous papers have been published on embedding, but a lack of systematic reviews hinders researchers from gaining a comprehensive understanding of this field.
arXiv Detail & Related papers (2024-06-16T14:49:19Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks.
Despite their notable performance, these models are prone to certain limitations such as misunderstanding human instructions, generating potentially biased content, or factually incorrect information.
This survey presents a comprehensive overview of these alignment technologies, including the following aspects.
arXiv Detail & Related papers (2023-07-24T17:44:58Z)
- A Survey on Few-Shot Class-Incremental Learning [11.68962265057818]
Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks.
This paper provides a comprehensive survey on FSCIL.
FSCIL has achieved impressive results in various fields of computer vision.
arXiv Detail & Related papers (2023-04-17T10:15:08Z)
- Knowledge-augmented Graph Machine Learning for Drug Discovery: A Survey [6.288056740658763]
Graph Machine Learning (GML) has gained considerable attention for its exceptional ability to model graph-structured biomedical data.
Recent studies have proposed integrating external biomedical knowledge into the GML pipeline to realise more precise and interpretable drug discovery.
arXiv Detail & Related papers (2023-02-16T12:38:01Z)
- Lost in Translation: Reimagining the Machine Learning Life Cycle in Education [12.802237736747077]
Machine learning (ML) techniques are increasingly prevalent in education.
There is a pressing need to investigate how ML techniques support long-standing education principles and goals.
In this work, we shed light on this complex landscape drawing on qualitative insights from interviews with education experts.
arXiv Detail & Related papers (2022-09-08T17:14:01Z)
- The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning [17.336655978572583]
Recent concerns that machine learning (ML) may be facing a misdiagnosis and replication crisis suggest that some published claims in ML research cannot be taken at face value.
A deeper understanding of what concerns in research in supervised ML have in common with the replication crisis in experimental science can put the new concerns in perspective.
arXiv Detail & Related papers (2022-03-12T18:26:24Z)
- Understanding the Usability Challenges of Machine Learning In High-Stakes Decision Making [67.72855777115772]
Machine learning (ML) is being applied to a diverse and ever-growing set of domains.
In many cases, domain experts -- who often have no expertise in ML or data science -- are asked to use ML predictions to make high-stakes decisions.
We investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners.
arXiv Detail & Related papers (2021-03-02T22:50:45Z)
- Interpretable Machine Learning -- A Brief History, State-of-the-Art and Challenges [0.8029049649310213]
We present a brief history of the field of interpretable machine learning (IML), give an overview of state-of-the-art interpretation methods, and discuss challenges.
As young as the field is, its roots go back more than 200 years in regression modeling, and to the 1960s in rule-based machine learning.
Many new IML methods have been proposed, many of them model-agnostic, but also interpretation techniques specific to deep learning and tree-based ensembles.
arXiv Detail & Related papers (2020-10-19T09:20:03Z)
- Heterogeneous Representation Learning: A Review [66.12816399765296]
Heterogeneous Representation Learning (HRL) brings some unique challenges.
We present a unified learning framework which is able to model most existing learning settings with the heterogeneous inputs.
We highlight the challenges that are less-touched in HRL and present future research directions.
arXiv Detail & Related papers (2020-04-28T05:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.