Unlocking the Potential of Open Government Data: Exploring the Strategic, Technical, and Application Perspectives of High-Value Datasets Opening in Taiwan
- URL: http://arxiv.org/abs/2403.09216v1
- Date: Thu, 14 Mar 2024 09:31:20 GMT
- Title: Unlocking the Potential of Open Government Data: Exploring the Strategic, Technical, and Application Perspectives of High-Value Datasets Opening in Taiwan
- Authors: Hsien-Lee Tseng, Anastasija Nikiforova
- Abstract summary: The aim of the paper is to understand and evaluate the lifecycle of high-value dataset publishing in one of the world's leading producers of information and communication technology (ICT) products - Taiwan.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Today, data has unprecedented value, as it forms the basis for data-driven decision-making, including serving as input for AI models, which are highly dependent on data availability. However, the mere availability of data in an open data format creates little added value; what matters is the value of these data, i.e., their relevance to the real needs of the end user. This is where the concept of the high-value dataset (HVD) comes into play, which has become popular in recent years. Defining and opening HVD is an ongoing process consisting of a set of interrelated steps, the implementation of which may vary from one country or region to another. Therefore, there has recently been a call to conduct research in a country or region setting considered to be of greatest national value. So far, only a few studies have been conducted at the regional or national level, most of which consider only one step of the process, such as identifying HVD or measuring their impact. With this study, we answer this call and examine the national case of Taiwan by exploring the entire lifecycle of HVD opening. The aim of the paper is to understand and evaluate the lifecycle of high-value dataset publishing in one of the world's leading producers of information and communication technology (ICT) products - Taiwan. To do this, we conduct a qualitative study based on exploratory interviews with representatives of the government agencies in Taiwan responsible for HVD opening, covering the whole HVD opening lifecycle. As such, we examine (1) strategic aspects related to the HVD determination process, (2) technical aspects, and (3) application aspects.
Related papers
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z) - Automating the Identification of High-Value Datasets in Open Government Data Portals [0.0]
High-Value datasets (HVDs) play a crucial role in the broader Open Government Data (OGD) movement.
Identifying HVDs on OGD portals presents a resource-intensive and complex challenge due to the nuanced nature of data value.
Our proposal aims to automate the identification of HVDs on OGD portals using a quantitative approach based on a detailed analysis of user interest.
arXiv Detail & Related papers (2024-06-15T07:54:37Z) - A Survey on Deep Active Learning: Recent Advances and New Frontiers [27.07154361976248]
This work aims to serve as a useful and quick guide for researchers in overcoming difficulties in deep learning-based active learning (DAL).
This technique has gained increasing popularity due to its broad applicability, yet its survey papers, especially for deep learning-based active learning (DAL), remain scarce.
arXiv Detail & Related papers (2024-05-01T05:54:33Z) - Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data.
We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z) - Federated Learning for Generalization, Robustness, Fairness: A Survey and Benchmark [55.898771405172155]
Federated learning has emerged as a promising paradigm for privacy-preserving collaboration among different parties.
We provide a systematic overview of the important and recent developments of research on federated learning.
arXiv Detail & Related papers (2023-11-12T06:32:30Z) - Towards High-Value Datasets determination for data-driven development: a systematic literature review [0.0]
The 'high-value dataset' (HVD) was recognized as a key trend in the Open Data Directive area in 2022.
There is no standardized approach to assist chief data officers in determining HVDs.
arXiv Detail & Related papers (2023-05-17T14:22:02Z) - Mapping Climate Change Research via Open Repositories & AI: advantages and limitations for an evidence-based R&D policy-making [0.0]
In the last few years, several initiatives have started to offer open access to research output data and metadata.
These platforms are opening up scientific production to the wider public and they can be an invaluable asset for evidence-based policy-making.
To gain a comprehensive view of entire STI ecosystems, the information provided by each of these resources should be combined and analysed accordingly.
Here, we study whether this is the case for mapping Climate Action research in the whole Denmark STI ecosystem, by using 4 popular open access STI data sources.
arXiv Detail & Related papers (2022-09-19T12:56:30Z) - Towards Complex Document Understanding By Discrete Reasoning [77.91722463958743]
Document Visual Question Answering (VQA) aims to understand visually-rich documents to answer questions in natural language.
We introduce a new Document VQA dataset, named TAT-DQA, which consists of 3,067 document pages and 16,558 question-answer pairs.
We develop a novel model named MHST that takes into account the information in multi-modalities, including text, layout and visual image, to intelligently address different types of questions.
arXiv Detail & Related papers (2022-07-25T01:43:19Z) - A Survey of Knowledge Tracing: Models, Variants, and Applications [70.69281873057619]
Knowledge Tracing is one of the fundamental tasks for student behavioral data analysis.
We present three types of fundamental KT models with distinct technical routes.
We discuss potential directions for future research in this rapidly growing field.
arXiv Detail & Related papers (2021-05-06T13:05:55Z) - Domain Generalization: A Survey [146.68420112164577]
Domain generalization (DG) aims to achieve OOD generalization by only using source domain data for model learning.
For the first time, a comprehensive literature review is provided to summarize the ten-year development in DG.
arXiv Detail & Related papers (2021-03-03T16:12:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.