Data Science: Nature and Pitfalls
- URL: http://arxiv.org/abs/2006.16964v1
- Date: Sun, 28 Jun 2020 02:06:54 GMT
- Title: Data Science: Nature and Pitfalls
- Authors: Longbing Cao
- Abstract summary: A critical matter for the healthy development of data science in its early stages is to deeply understand the nature of data and data science.
These important issues motivate the discussions in this article.
- Score: 42.98602883069444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data science is creating very exciting trends as well as significant
controversy. A critical matter for the healthy development of data science in
its early stages is to deeply understand the nature of data and data science,
and to discuss the various pitfalls. These important issues motivate the
discussions in this article.
Related papers
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z)
- Opinionated practices for teaching reproducibility: motivation, guided instruction and practice [0.0]
Predictive modelling is often one of the most interesting topics to novices in data science.
Students are not as intrinsically motivated to learn reproducibility, and it is not an easy topic for them to learn.
Providing extra motivation, guided instruction and lots of practice is key to teaching it effectively.
arXiv Detail & Related papers (2021-09-17T19:15:41Z)
- Biases in Data Science Lifecycle [0.0]
The aim of this study is to provide a practical guideline to data scientists and increase their awareness.
In this work, we reviewed different sources of biases and grouped them under different stages of the data science lifecycle.
arXiv Detail & Related papers (2020-09-10T13:41:48Z)
- A Survey on Data Pricing: from Economics to Data Science [61.72030615854597]
We examine various motivations behind data pricing and understand the economics of data pricing.
We discuss both digital products and data products.
We consider a series of challenges and directions for future work.
arXiv Detail & Related papers (2020-09-09T19:31:38Z)
- Data Science: A Comprehensive Overview [42.98602883069444]
The twenty-first century has ushered in the age of big data and data economy, in which data DNA has become an intrinsic constituent of all data-based organisms.
An appropriate understanding of data DNA and its organisms relies on the new field of data science and its keystone, analytics.
This article is the first in the field to draw a comprehensive big picture, in addition to offering rich observations, lessons and thinking about data science and analytics.
arXiv Detail & Related papers (2020-07-01T02:33:58Z)
- Data Science: Challenges and Directions [42.98602883069444]
We review hundreds of pieces of literature which include data science in their titles.
We find that the majority of the discussions essentially concern statistics, data mining, machine learning, big data, or broadly data analytics.
We focus on the research and innovation challenges inspired by the nature of data science problems as complex systems.
arXiv Detail & Related papers (2020-06-28T01:49:00Z)
- A Data Scientist's Guide to Streamflow Prediction [55.22219308265945]
We focus on the element of hydrologic rainfall-runoff models and their application to forecast floods and predict streamflow.
This guide aims to help interested data scientists gain an understanding of the problem, the hydrologic concepts involved, and the details that come up along the way.
arXiv Detail & Related papers (2020-06-05T08:04:37Z)
- Ten Research Challenge Areas in Data Science [4.670305538969914]
Data science builds on knowledge from computer science, mathematics, statistics, and other disciplines.
This article starts with meta-questions about data science as a discipline and then elaborates on ten ideas for the basis of a research agenda for data science.
arXiv Detail & Related papers (2020-01-27T21:39:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.