DeepCausalMMM: A Deep Learning Framework for Marketing Mix Modeling with Causal Inference
- URL: http://arxiv.org/abs/2510.13087v2
- Date: Thu, 23 Oct 2025 02:49:44 GMT
- Title: DeepCausalMMM: A Deep Learning Framework for Marketing Mix Modeling with Causal Inference
- Authors: Aditya Puttaparthi Tirumala,
- Abstract summary: DeepCausalMMM is a Python package that addresses limitations by combining deep learning, causal inference, and advanced marketing science.<n>The package uses Gated Recurrent Units (GRUs) to automatically learn temporal patterns such as adstock (carryover effects) and lag.<n>It also implements Hill equation-based saturation curves to model diminishing returns and optimize budget allocation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Marketing Mix Modeling (MMM) is a statistical technique used to estimate the impact of marketing activities on business outcomes such as sales, revenue, or customer visits. Traditional MMM approaches often rely on linear regression or Bayesian hierarchical models that assume independence between marketing channels and struggle to capture complex temporal dynamics and non-linear saturation effects [@Chan2017; @Hanssens2005; @Ng2021Bayesian]. **DeepCausalMMM** is a Python package that addresses these limitations by combining deep learning, causal inference, and advanced marketing science. The package uses Gated Recurrent Units (GRUs) to automatically learn temporal patterns such as adstock (carryover effects) and lag, while simultaneously learning statistical dependencies and potential causal structures between marketing channels through Directed Acyclic Graph (DAG) learning [@Zheng2018NOTEARS; @Gong2024CausalMMM]. Additionally, it implements Hill equation-based saturation curves to model diminishing returns and optimize budget allocation. Key features include: (1) a data-driven design where hyperparameters and transformations (e.g., adstock decay, saturation curves) are learned or estimated from data with sensible defaults, rather than requiring fixed heuristics or manual specification, (2) multi-region modeling with both shared and region-specific parameters, (3) robust statistical methods including Huber loss and advanced regularization, (4) comprehensive response curve analysis for understanding channel saturation.
Related papers
- Model Merging via Multi-Teacher Knowledge Distillation [11.543771846135021]
We introduce a novel flatness-aware PAC-Bayes generalization bound specifically for the model merging setting.<n>We frame model merging as multi-teacher knowledge distillation on scarce, unlabeled data.<n>We formally demonstrate that minimizing the student-teacher Kullback-Leibler divergence directly tightens the upper bound on the merged model's excess risk.
arXiv Detail & Related papers (2025-12-24T17:10:44Z) - Adapformer: Adaptive Channel Management for Multivariate Time Series Forecasting [49.40321003932633]
Adapformer is an advanced Transformer-based framework that merges the benefits of CI and CD methodologies through effective channel management.<n>Adapformer achieves superior performance over existing models, enhancing both predictive accuracy and computational efficiency.
arXiv Detail & Related papers (2025-11-18T16:24:05Z) - MaGNet: A Mamba Dual-Hypergraph Network for Stock Prediction via Temporal-Causal and Global Relational Learning [3.2859360081297715]
This work introduces MaGNet, a novel Mamba dual-hyperGraph Network for stock prediction.<n>MaGNet integrates a MAGE block, Feature-wise and Stock-wise 2D Spatiotemporal Attention modules, and a dual hypergraph framework.<n>Experiments on six major stock indices demonstrate MaGNet outperforms state-of-the-art methods in both superior predictive performance and exceptional investment returns.
arXiv Detail & Related papers (2025-10-29T20:47:16Z) - Nonparametric Data Attribution for Diffusion Models [57.820618036556084]
Data attribution for generative models seeks to quantify the influence of individual training examples on model outputs.<n>We propose a nonparametric attribution method that operates entirely on data, measuring influence via patch-level similarity between generated and training images.
arXiv Detail & Related papers (2025-10-16T03:37:16Z) - Confounder-Free Continual Learning via Recursive Feature Normalization [8.644711503479988]
Confounders are extraneous variables that affect both the input and the target, resulting in spurious correlations and biased predictions.<n>We introduce the Recursive MDN layer, which can be integrated into any deep learning architecture.
arXiv Detail & Related papers (2025-07-11T21:25:31Z) - Learning Time-Aware Causal Representation for Model Generalization in Evolving Domains [50.66049136093248]
We develop a time-aware structural causal model (SCM) that incorporates dynamic causal factors and the causal mechanism drifts.<n>We show that our method can yield the optimal causal predictor for each time domain.<n>Results on both synthetic and real-world datasets exhibit that SYNC can achieve superior temporal generalization performance.
arXiv Detail & Related papers (2025-06-21T14:05:37Z) - Reinforced Model Merging [53.84354455400038]
We present an innovative framework termed Reinforced Model Merging (RMM), which encompasses an environment and agent tailored for merging tasks.<n>By utilizing data subsets during the evaluation process, we addressed the bottleneck in the reward feedback phase, thereby accelerating RMM by up to 100 times.
arXiv Detail & Related papers (2025-03-27T08:52:41Z) - Spatial Reasoning with Denoising Models [49.83744014336816]
We introduce a framework to perform reasoning over sets of continuous variables via denoising generative models.<n>For the first time, that order of generation can successfully be predicted by the denoising network itself.<n>Using these findings, we can increase the accuracy of specific reasoning tasks from 1% to >50%.
arXiv Detail & Related papers (2025-02-28T14:08:30Z) - MCI-GRU: Stock Prediction Model Based on Multi-Head Cross-Attention and Improved GRU [29.760324699979417]
This paper proposes a stock prediction model, MCI-GRU, based on a multi-head cross-attention mechanism and an improved GRU.<n> Experiments on four main stock markets show that the proposed method outperforms SOTA techniques across multiple metrics.
arXiv Detail & Related papers (2024-09-25T14:37:49Z) - Reprogramming Foundational Large Language Models(LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning [0.0]
patio-temporal forecasting plays a crucial role in various sectors such as transportation systems, logistics, and supply chain management.
We introduce a hybrid approach that combines the strengths of open-source large and small-scale language models (LLMs and LMs) with traditional forecasting methods.
arXiv Detail & Related papers (2024-08-26T16:11:53Z) - SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL [57.202733701029594]
We propose Decision Mamba, a novel multi-grained state space model (SSM) with a self-evolving policy learning strategy.<n>To address these challenges, we propose Decision Mamba, a novel multi-grained state space model (SSM) with a self-evolving policy learning strategy.<n>To mitigate the overfitting issue on noisy trajectories, a self-evolving policy is proposed by using progressive regularization.
arXiv Detail & Related papers (2024-06-08T10:12:00Z) - CMamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting [18.50360049235537]
Mamba, a state space model, has emerged with robust sequence and feature mixing capabilities.
Capturing cross-channel dependencies is critical in enhancing performance of time series prediction.
We introduce a refined Mamba variant tailored for time series forecasting.
arXiv Detail & Related papers (2024-06-08T01:32:44Z) - Federated Bayesian Deep Learning: The Application of Statistical Aggregation Methods to Bayesian Models [0.9940108090221528]
Aggregation strategies have been developed to pool or fuse the weights and biases of distributed deterministic models.
We show that simple application of the aggregation methods associated with FL schemes for deterministic models is either impossible or results in sub-optimal performance.
arXiv Detail & Related papers (2024-03-22T15:02:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.