HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting
- URL: http://arxiv.org/abs/2412.00316v2
- Date: Wed, 22 Jan 2025 08:43:28 GMT
- Title: HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting
- Authors: Shaohan Yu, Pan Deng, Yu Zhao, Junting Liu, Zi'ang Wang,
- Abstract summary: We propose a novel Heterogeneity-informed Mixture-of-Experts (HiMoE) for fair spatial-temporal forecasting.
HiMoE achieves state-of-the-art performance, outperforming the best baseline by at least 9.22% in all metrics.
- Score: 8.055360119228606
- Abstract: Achieving fair prediction performance across nodes is crucial in the spatial-temporal domain, as it ensures the validity and reliability of forecasting outcomes. However, existing models focus primarily on improving the overall accuracy of the prediction, often neglecting the goal of achieving uniformity in the predictions. This task becomes particularly challenging due to the inherent spatial-temporal heterogeneity of the nodes. To address this issue, we propose a novel Heterogeneity-informed Mixture-of-Experts (HiMoE) for fair spatial-temporal forecasting. In particular, we design the Heterogeneity-Informed Graph Convolutional Network (HiGCN), which leverages the fusion of multi-graph and edge masking to flexibly model spatial dependencies. Moreover, we introduce the Node-wise Mixture-of-Experts (NMoE), which allocates prediction tasks of different nodes to suitable experts through graph decoupling routing. To further improve the model, fairness-aware loss and evaluation functions are proposed, optimizing the model with fairness and accuracy as objectives. Experiments on four datasets from different real-world scenarios demonstrate that HiMoE achieves state-of-the-art performance, outperforming the best baseline by at least 9.22% in all metrics.
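The abstract does not give the form of the fairness-aware loss, so the following is only an illustrative sketch of the general idea of optimizing accuracy and fairness together. It assumes a hypothetical loss made of the mean per-node MAE plus a weighted variance of the per-node MAE; the function names, the weight `lam`, and the toy numbers are all assumptions and are not taken from the paper.

```python
from statistics import mean, pvariance


def per_node_mae(y_true, y_pred):
    """Mean absolute error for each node; inputs are [node][time] nested lists."""
    return [
        mean(abs(t - p) for t, p in zip(true_row, pred_row))
        for true_row, pred_row in zip(y_true, y_pred)
    ]


def fairness_aware_loss(y_true, y_pred, lam=1.0):
    """Hypothetical loss: mean per-node MAE plus lam times its variance across nodes."""
    node_errors = per_node_mae(y_true, y_pred)
    accuracy_term = mean(node_errors)       # overall accuracy
    fairness_term = pvariance(node_errors)  # spread of error across nodes
    return accuracy_term + lam * fairness_term


# Two nodes, three time steps each (made-up numbers).
y_true = [[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]]
uniform = [[2.0, 3.0, 4.0], [11.0, 11.0, 11.0]]  # every node is off by 1
skewed = [[1.0, 2.0, 3.0], [12.0, 12.0, 12.0]]   # node 0 exact, node 1 off by 2

print(fairness_aware_loss(y_true, uniform))  # same average error, evenly spread
print(fairness_aware_loss(y_true, skewed))   # same average error, unevenly spread
```

Both toy predictions have the same average error, but the second concentrates it on one node, so the variance term penalizes it. With `lam=0` the two cases are indistinguishable, which is the situation the abstract attributes to accuracy-only objectives.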
Related papers
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- SFANet: Spatial-Frequency Attention Network for Weather Forecasting [54.470205739015434]
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management.
Traditional methods often struggle to capture the complex dynamics of meteorological systems.
We propose a novel framework designed to address these challenges and enhance the accuracy of weather prediction.
arXiv Detail & Related papers (2024-05-29T08:00:15Z)
- From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks [0.0]
We reinvigorate maximum likelihood estimation (MLE) for macroeconomic density forecasting through a novel neural network architecture with dedicated mean and variance hemispheres.
Our Hemisphere Neural Network (HNN) provides proactive volatility forecasts based on leading indicators when it can, and reactive volatility based on the magnitude of previous prediction errors when it must.
arXiv Detail & Related papers (2023-11-27T21:37:50Z)
- Fairer and More Accurate Tabular Models Through NAS [14.147928131445852]
We propose using multi-objective Neural Architecture Search (NAS) and Hyperparameter Optimization (HPO) in the first application to the very challenging domain of tabular data.
We show that models optimized solely for accuracy with NAS often fail to inherently address fairness concerns.
We produce architectures that consistently dominate state-of-the-art bias mitigation methods either in fairness, accuracy or both.
arXiv Detail & Related papers (2023-10-18T17:56:24Z)
- Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows [54.050498411883495]
We develop a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows.
We show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the PR-divergences.
Our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
arXiv Detail & Related papers (2023-05-30T10:07:17Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Efficient Graph Neural Network Inference at Large Scale [54.89457550773165]
Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications.
Existing scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedure.
We propose a novel adaptive propagation order approach that generates the personalized propagation order for each node based on its topological information.
arXiv Detail & Related papers (2022-11-01T14:38:18Z)
- Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z)
- Bayesian Spatial Predictive Synthesis [8.66529877559667]
Spatial dependence is a prevalent and critical issue in spatial data analysis and prediction.
We propose a novel Bayesian ensemble methodology that captures spatially-varying model uncertainty.
We show that our method provides a finite sample theoretical guarantee for its predictive performance.
arXiv Detail & Related papers (2022-03-10T07:16:29Z)
- HYPER: Learned Hybrid Trajectory Prediction via Factored Inference and Adaptive Sampling [27.194900145235007]
We introduce HYPER, a general and expressive hybrid prediction framework.
By modeling traffic agents as a hybrid discrete-continuous system, our approach is capable of predicting discrete intent changes over time.
We train and validate our model on the Argoverse dataset, and demonstrate its effectiveness through comprehensive ablation studies and comparisons with state-of-the-art models.
arXiv Detail & Related papers (2021-10-05T20:20:10Z)
- GraphTCN: Spatio-Temporal Interaction Modeling for Human Trajectory Prediction [5.346782918364054]
We propose a novel CNN-based spatial-temporal graph framework GraphTCN to support more efficient and accurate trajectory predictions.
In contrast to conventional models, both the spatial and temporal modeling of our model are computed within each local time window.
Our model achieves better performance in terms of both efficiency and accuracy as compared with state-of-the-art models on various trajectory prediction benchmark datasets.
arXiv Detail & Related papers (2020-03-16T12:56:12Z)