Effective Data Stewardship in Higher Education: Skills, Competences, and the Emerging Role of Open Data Stewards
- URL: http://arxiv.org/abs/2410.20361v1
- Date: Sun, 27 Oct 2024 07:37:36 GMT
- Title: Effective Data Stewardship in Higher Education: Skills, Competences, and the Emerging Role of Open Data Stewards
- Authors: Panos Fitsilis, Vyron Damasiotis, Charalampos Dervenis, Vasileios Kyriatzis, Paraskevi Tsoutsa,
- Abstract summary: The significance of open data in higher education stems from the changing tendencies towards open science.
The paper proposes a structured training framework and comprehensive curriculum for data stewardship.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The significance of open data in higher education stems from the changing tendencies towards open science, and open research in higher education encourages new ways of making scientific inquiry more transparent, collaborative and accessible. This study focuses on the critical role of open data stewards in this transition, who are essential for managing and disseminating research data effectively in universities, and it highlights the increasing demand for structured training and professional policies for data stewards in academic settings. Building upon this context, the paper investigates the essential skills and competences required for effective data stewardship in higher education institutions. A critical literature review, coupled with practical engagement in open data stewardship at universities, provided insights into the roles and responsibilities of data stewards. In response to these needs and to the gaps identified in the literature, the paper proposes a structured training framework and comprehensive curriculum for data stewardship. It addresses five key competence categories for open data stewards, aligning them with current trends and essential skills and knowledge in the field. By advocating for a structured approach to data stewardship education, this work sets the foundation for improved data management in universities and serves as a critical step towards professionalizing the role of data stewards in higher education. The emphasis on the role of open data stewards is expected to advance data accessibility and sharing practices, fostering increased transparency, collaboration, and innovation in academic research. This approach contributes to the evolution of universities into open ecosystems, where data flows freely for global education and research advancement.
Related papers
- A Data Literacy Competence Model for Higher Education and Research [0.0]
The Data Literacy Initiative (DaLI) at TH Köln develops a competence model for promoting data literacy in higher education.
Based on interdisciplinary collaboration and empirical research, the DaLI model defines seven overarching competence areas.
Intended for use across disciplines, the model supports the strategic integration of data literacy into university programs.
arXiv Detail & Related papers (2025-04-22T08:14:23Z)
- Data Stewardship Decoded: Mapping Its Diverse Manifestations and Emerging Relevance at a time of AI [0.21756081703275998]
Data stewardship has become a critical component of modern data governance, especially with the growing use of artificial intelligence (AI).
Despite its increasing importance, the concept of data stewardship remains ambiguous and varies in its application.
This paper explores four distinct manifestations of data stewardship to clarify its emerging position in the data governance landscape.
arXiv Detail & Related papers (2025-01-20T16:24:22Z)
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- A Survey on Data Selection for Language Models [148.300726396877]
Data selection methods aim to determine which data points to include in a training dataset.
Deep learning is mostly driven by empirical evidence, and experimentation on large-scale data is expensive.
Few organizations have the resources for extensive data selection research.
arXiv Detail & Related papers (2024-02-26T18:54:35Z)
- Open Datasheets: Machine-readable Documentation for Open Datasets and Responsible AI Assessments [9.125552623625806]
This paper introduces a no-code, machine-readable documentation framework for open datasets.
The framework aims to improve the comprehensibility and usability of open datasets.
The framework is expected to enhance the quality and reliability of data used in research and decision-making.
arXiv Detail & Related papers (2023-12-11T06:41:14Z)
- Data Management For Training Large Language Models: A Survey [64.18200694790787]
Data plays a fundamental role in training Large Language Models (LLMs).
This survey aims to provide a comprehensive overview of current research in data management within both the pretraining and supervised fine-tuning stages of LLMs.
arXiv Detail & Related papers (2023-12-04T07:42:16Z)
- On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
- Assessing Scientific Contributions in Data Sharing Spaces [64.16762375635842]
This paper introduces the SCIENCE-index, a blockchain-based metric measuring a researcher's scientific contributions.
To incentivize researchers to share their data, the SCIENCE-index is augmented to include a data-sharing parameter.
Our model is evaluated by comparing the distribution of its output for geographically diverse researchers to that of the h-index.
arXiv Detail & Related papers (2023-03-18T19:17:47Z)
- Big Data and Analytics Implementation in Tertiary Institutions to Predict Students Performance in Nigeria [0.0]
The term Big Data has been coined to refer to the gargantuan bulk of data that cannot be dealt with by traditional data-handling techniques.
This paper explores the attributes of big data that are relevant to educational institutions.
It investigates the factors influencing the adoption of big data and analytics in learning institutions.
arXiv Detail & Related papers (2022-07-29T13:52:24Z)
- Data Governance in the Age of Large-Scale Data-Driven Language Technology [79.92626780294258]
This work proposes an approach to global language data governance that attempts to organize data management amongst stakeholders, values, and rights.
The framework we present is a multi-party international governance structure focused on language data, incorporating the technical and organizational tools needed to support its work.
arXiv Detail & Related papers (2022-05-04T00:44:35Z)
- We Need to Talk About Data: The Importance of Data Readiness in Natural Language Processing [3.096615629099618]
We argue that there is a gap between academic research in NLP and its application to problems outside academia.
We propose a method for improving the communication between researchers and external stakeholders regarding the accessibility, validity, and utility of data.
arXiv Detail & Related papers (2021-10-11T17:55:07Z)
- A fresh look at introductory data science [0.0]
We present a case study of an introductory undergraduate course in data science that is designed to address these needs.
This course has no pre-requisites and serves a wide audience of aspiring statistics and data science majors as well as humanities, social sciences, and natural sciences students.
We discuss the unique set of challenges posed by offering such a course and, in light of these challenges, present a detailed discussion of the pedagogical design elements, content, structure, computational infrastructure, and assessment methodology of the course.
arXiv Detail & Related papers (2020-08-01T18:39:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.