Data Science Education in Undergraduate Physics: Lessons Learned from a Community of Practice
- URL: http://arxiv.org/abs/2403.00961v2
- Date: Sun, 16 Jun 2024 16:47:56 GMT
- Title: Data Science Education in Undergraduate Physics: Lessons Learned from a Community of Practice
- Authors: Karan Shah, Julie Butler, Alexis Knaub, Anıl Zenginoğlu, William Ratcliff, Mohammad Soltanieh-ha,
- Abstract summary: We present insights and experiences from the Data Science Education Community of Practice (DSECOP).
DSECOP brings together graduate students and physics educators from different institutions to share best practices and lessons learned from integrating data science into undergraduate physics education.
Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.
- Score: 0.6597195879147557
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is becoming increasingly important that physics educators equip their students with the skills to work with data effectively. However, many educators may lack the necessary training and expertise in data science to teach these skills. To address this gap, we created the Data Science Education Community of Practice (DSECOP), bringing together graduate students and physics educators from different institutions and backgrounds to share best practices and lessons learned from integrating data science into undergraduate physics education. In this article we present insights and experiences from this community of practice, highlighting key strategies and challenges in incorporating data science into the introductory physics curriculum. Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.
Related papers
- Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models [153.14575887549088]
We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs).
GLAN exclusively utilizes a pre-curated taxonomy of human knowledge and capabilities as input and generates large-scale synthetic instruction data across all disciplines.
With the fine-grained key concepts detailed in every class session of the syllabus, we are able to generate diverse instructions with a broad coverage across the entire spectrum of human knowledge and skills.
arXiv Detail & Related papers (2024-02-20T15:00:35Z)
- SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models [57.96527452844273]
We introduce SciInstruct, a suite of scientific instructions for training scientific language models capable of college-level scientific reasoning.
We curated a diverse and high-quality dataset encompassing physics, chemistry, math, and formal proofs.
To verify the effectiveness of SciInstruct, we fine-tuned different language models with SciInstruct, i.e., ChatGLM3 (6B and 32B), Llama3-8B-Instruct, and Mistral-7B: MetaMath.
arXiv Detail & Related papers (2024-01-15T20:22:21Z)
- Motivation, inclusivity, and realism should drive data science education [0.0]
Data science education provides tremendous opportunities but remains inaccessible to many communities.
Increasing the accessibility of data science to these communities not only benefits the individuals entering data science, but also increases the field's innovation and potential impact as a whole.
Our group has led education efforts for a variety of audiences: from professional scientists to high school students to lay audiences.
arXiv Detail & Related papers (2023-05-09T17:46:41Z)
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z)
- Motivating Data Science Students to Participate and Learn [0.0]
Data science education is increasingly involving human subjects and societal issues such as privacy, ethics, and fairness.
In this paper, we offer insights into how to structure our data science classes so that they motivate students to deeply engage with material about societal context.
We describe a novel assessment tool called participation portfolios, which is motivated by a framework that promotes student autonomy, self reflection, and the building of a learning community.
arXiv Detail & Related papers (2022-04-28T01:26:16Z)
- Opinionated practices for teaching reproducibility: motivation, guided instruction and practice [0.0]
Predictive modelling is often one of the most interesting topics to novices in data science.
By contrast, students are not as intrinsically motivated to learn reproducibility, and it is not an easy topic for them to learn.
Providing extra motivation, guided instruction and lots of practice are key to effectively teaching this topic.
arXiv Detail & Related papers (2021-09-17T19:15:41Z)
- Advanced Multi-Variate Analysis Methods for New Physics Searches at the Large Hadron Collider [72.34476433304168]
"AMVA4NewPhysics" studied the customization and application of advanced multivariate analysis methods and statistical learning tools to high-energy physics problems.
Many of those methods were successfully used to improve the sensitivity of data analyses performed by the ATLAS and CMS experiments at CERN.
Several others, still in the testing phase, promise to further improve the precision of measurements of fundamental physics parameters and the reach of searches for new phenomena.
arXiv Detail & Related papers (2021-05-16T22:20:30Z)
- Interleaving Computational and Inferential Thinking: Data Science for Undergraduates at Berkeley [81.01051375191828]
The undergraduate data science curriculum at the University of California, Berkeley is anchored in five new courses.
These courses emphasize computational thinking, inferential thinking, and working on real-world problems.
These courses have become some of the most popular on campus and have led to a surging interest in a new undergraduate major and minor program in data science.
arXiv Detail & Related papers (2021-02-13T22:51:24Z)
- Data Science for Engineers: A Teaching Ecosystem [59.00739310930656]
We describe an ecosystem for teaching data science to engineers at the Faculty of Physical and Mathematical Sciences, Universidad de Chile.
This initiative has been motivated by the increasing demand for DS qualifications both from academic and professional environments.
By sharing our teaching principles and the innovative components of our approach to teaching DS, we hope our experience can be useful to those developing their own DS programmes and ecosystems.
arXiv Detail & Related papers (2021-01-14T14:17:57Z)
- Computational Skills by Stealth in Secondary School Data Science [16.960800464621993]
We discuss a proposal for the stealth development of computational skills in students' first exposure to data science.
The intent of this approach is to support students, regardless of interest and self-efficacy in coding, in becoming data-driven learners.
arXiv Detail & Related papers (2020-10-08T09:11:51Z)
- A fresh look at introductory data science [0.0]
We present a case study of an introductory undergraduate course in data science that is designed to address these needs.
This course has no pre-requisites and serves a wide audience of aspiring statistics and data science majors as well as humanities, social sciences, and natural sciences students.
We discuss the unique set of challenges posed by offering such a course and, in light of these challenges, present a detailed discussion of the pedagogical design elements, content, structure, computational infrastructure, and assessment methodology of the course.
arXiv Detail & Related papers (2020-08-01T18:39:34Z)