Predictive analytics using Social Big Data and machine learning
- URL: http://arxiv.org/abs/2104.12591v1
- Date: Wed, 21 Apr 2021 19:30:45 GMT
- Title: Predictive analytics using Social Big Data and machine learning
- Authors: Bilal Abu-Salih, Pornpit Wongthongtham, Dengya Zhu, Kit Yan Chan, Amit Rudra
- Abstract summary: This chapter sheds light on the core aspects that lay the foundations for social big data analytics.
Various predictive analytical algorithms are introduced, together with their usage in several important applications and top-tier tools and APIs.
- Score: 6.142272540492935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ever-increasing quality and quantity of data generated by
day-to-day business operations, together with continuously imported related
social data, have made traditional statistical approaches inadequate for
handling such data floods. This has driven researchers to design and develop
advanced and sophisticated analytics that can be incorporated to gain valuable
insights that benefit the business domain. This chapter sheds light on the core
aspects that lay the foundations for social big data (SBD) analytics. In
particular, the significance of predictive analytics in the context of SBD is
discussed, along with a framework for SBD predictive analytics. Then,
various predictive analytical algorithms are introduced, together with their
usage in several important applications and top-tier tools and APIs. A case
study on applying predictive analytics to social data is provided, supported by
experiments that substantiate the significance and utility of predictive
analytics.
Related papers
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improves generalization, with the excess risk decreasing as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- Benchmarking Data Science Agents [11.582116078653968]
Large Language Models (LLMs) have emerged as promising aids as data science agents, assisting humans in data analysis and processing.
Yet their practical efficacy remains constrained by the varied demands of real-world applications and complicated analytical processes.
We introduce DSEval -- a novel evaluation paradigm, as well as a series of innovative benchmarks tailored for assessing the performance of these agents.
arXiv Detail & Related papers (2024-02-27T03:03:06Z)
- Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data.
We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z)
- rTisane: Externalizing conceptual models for data analysis increases engagement with domain knowledge and improves statistical model quality [11.156807472212165]
Statistical models should accurately reflect analysts' domain knowledge about variables and their relationships.
Recent tools let analysts express these assumptions and use them to produce a resulting statistical model.
It remains unclear what analysts want to express and how externalization impacts statistical model quality.
arXiv Detail & Related papers (2023-10-25T00:32:52Z)
- Analytical Engines With Context-Rich Processing: Towards Efficient Next-Generation Analytics [12.317930859033149]
We envision an analytical engine co-optimized with components that enable context-rich analysis.
We aim for a holistic pipeline cost- and rule-based optimization across relational and model-based operators.
arXiv Detail & Related papers (2022-12-14T21:46:33Z)
- Towards a Taxonomy for the Use of Synthetic Data in Advanced Analytics [0.0]
We present a taxonomy highlighting the various facets of deploying synthetic data for advanced analytics systems.
We identify typical application scenarios for synthetic data to assess the current state of adoption.
arXiv Detail & Related papers (2022-12-05T22:13:58Z)
- A Prescriptive Learning Analytics Framework: Beyond Predictive Modelling and onto Explainable AI with Prescriptive Analytics and ChatGPT [0.0]
This study proposes a novel framework that unifies transparent machine learning with techniques for enabling prescriptive analytics.
This work practically demonstrates the proposed framework using predictive models for identifying learners at risk of programme non-completion.
arXiv Detail & Related papers (2022-08-31T00:57:17Z)
- Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
- Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review [62.997667081978825]
This review aims at providing a comprehensive vision of the main state-of-the-art libraries and frameworks for machine learning and data analytics available today.
The main simulation, emulation, deployment systems, and testbeds for experimental research on the Edge-to-Cloud Continuum available today are also surveyed.
arXiv Detail & Related papers (2022-04-29T08:06:05Z)
- Topology-based Clusterwise Regression for User Segmentation and Demand Forecasting [63.78344280962136]
Using a public and a novel proprietary data set of commercial data, this research shows that the proposed system enables analysts to both cluster their user base and plan demand at a granular level.
This work seeks to introduce TDA-based clustering of time series and clusterwise regression with matrix factorization methods as viable tools for the practitioner.
arXiv Detail & Related papers (2020-09-08T12:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.