Related papers: Predictive analytics using Social Big Data and machine learning

URL: http://arxiv.org/abs/2104.12591v1
Date: Wed, 21 Apr 2021 19:30:45 GMT
Title: Predictive analytics using Social Big Data and machine learning
Authors: Bilal Abu-Salih, Pornpit Wongthongtham, Dengya Zhu, Kit Yan Chan, Amit Rudra
Abstract summary: This chapter sheds light on core aspects that lay the foundations for social big data analytics. Various predictive analytical algorithms are introduced, together with their usage in several important applications and top-tier tools and APIs.
Score: 6.142272540492935
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The ever-increasing quality and quantity of data generated from day-to-day business operations, in conjunction with continuously imported related social data, have made traditional statistical approaches inadequate to tackle such data floods. This has led researchers to design and develop advanced and sophisticated analytics that can be incorporated to gain valuable insights that benefit the business domain. This chapter sheds light on core aspects that lay the foundations for social big data (SBD) analytics. In particular, the significance of predictive analytics in the context of SBD is discussed, along with a framework for SBD predictive analytics. Then, various predictive analytical algorithms are introduced, together with their usage in several important applications and top-tier tools and APIs. A case study on applying predictive analytics to social data is provided, supported by experiments that substantiate the significance and utility of predictive analytics.

Related papers

Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study [55.09905978813599]
Large Language Models (LLMs) hold promise in automating data analysis tasks. Yet open-source models face significant limitations in these kinds of reasoning-intensive scenarios. In this work, we investigate strategies to enhance the data analysis capabilities of open-source LLMs.
arXiv Detail & Related papers (2025-06-24T17:04:23Z)
A Novel, Human-in-the-Loop Computational Grounded Theory Framework for Big Social Data [8.695136686770772]
We argue that confidence in the credibility and robustness of results depends on adopting a 'human-in-the-loop' methodology. We propose a novel methodological framework for Computational Grounded Theory (CGT) that supports the analysis of large qualitative datasets.
arXiv Detail & Related papers (2025-06-06T13:43:12Z)
Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters. Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
Achieving Fairness in Predictive Process Analytics via Adversarial Learning [50.31323204077591]
This paper addresses the challenge of integrating a debiasing phase into predictive business process analytics. Our framework, which leverages adversarial debiasing, is evaluated on four case studies, showing a significant reduction in the contribution of biased variables to the predicted value.
arXiv Detail & Related papers (2024-10-03T15:56:03Z)
Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
Visual Data Diagnosis and Debiasing with Concept Graphs [50.84781894621378]
We present ConBias, a framework for diagnosing and mitigating Concept co-occurrence Biases in visual datasets. We show that by employing a novel clique-based concept balancing strategy, we can mitigate these imbalances, leading to enhanced performance on downstream tasks.
arXiv Detail & Related papers (2024-09-26T16:59:01Z)
Benchmarking Data Science Agents [11.582116078653968]
Large Language Models (LLMs) have emerged as promising data science agents, assisting humans in data analysis and processing. Yet their practical efficacy remains constrained by the varied demands of real-world applications and complicated analytical processes. We introduce DSEval -- a novel evaluation paradigm, as well as a series of innovative benchmarks tailored for assessing the performance of these agents.
arXiv Detail & Related papers (2024-02-27T03:03:06Z)
Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data. We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z)
rTisane: Externalizing conceptual models for data analysis increases engagement with domain knowledge and improves statistical model quality [11.156807472212165]
Statistical models should accurately reflect analysts' domain knowledge about variables and their relationships. Recent tools let analysts express these assumptions and use them to produce a resulting statistical model. It remains unclear what analysts want to express and how externalization impacts statistical model quality.
arXiv Detail & Related papers (2023-10-25T00:32:52Z)
Analytical Engines With Context-Rich Processing: Towards Efficient Next-Generation Analytics [12.317930859033149]
We envision an analytical engine co-optimized with components that enable context-rich analysis. We aim for a holistic pipeline cost- and rule-based optimization across relational and model-based operators.
arXiv Detail & Related papers (2022-12-14T21:46:33Z)
Towards a Taxonomy for the Use of Synthetic Data in Advanced Analytics [0.0]
We present a taxonomy highlighting the various facets of deploying synthetic data for advanced analytics systems. We identify typical application scenarios for synthetic data to assess the current state of adoption.
arXiv Detail & Related papers (2022-12-05T22:13:58Z)
A Prescriptive Learning Analytics Framework: Beyond Predictive Modelling and onto Explainable AI with Prescriptive Analytics and ChatGPT [0.0]
This study proposes a novel framework that unifies both transparent machine learning as well as techniques for enabling prescriptive analytics. This work practically demonstrates the proposed framework using predictive models for identifying at-risk learners of programme non-completion.
arXiv Detail & Related papers (2022-08-31T00:57:17Z)
Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review [62.997667081978825]
This review aims at providing a comprehensive vision of the main state-of-the-art libraries and frameworks for machine learning and data analytics available today. The main simulation, emulation, deployment systems, and testbeds for experimental research on the Edge-to-Cloud Continuum available today are also surveyed.
arXiv Detail & Related papers (2022-04-29T08:06:05Z)
Topology-based Clusterwise Regression for User Segmentation and Demand Forecasting [63.78344280962136]
Using a public and a novel proprietary data set of commercial data, this research shows that the proposed system enables analysts to both cluster their user base and plan demand at a granular level. This work seeks to introduce TDA-based clustering of time series and clusterwise regression with matrix factorization methods as viable tools for the practitioner.
arXiv Detail & Related papers (2020-09-08T12:10:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.