Related papers: Meta-Statistical Learning: Supervised Learning of Statistical Inference

URL: http://arxiv.org/abs/2502.12088v2
Date: Wed, 19 Feb 2025 22:12:49 GMT
Title: Meta-Statistical Learning: Supervised Learning of Statistical Inference
Authors: Maxime Peyrard, Kyunghyun Cho
Abstract summary: This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks. We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
Score: 59.463430294611626
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks, where the goal is to predict properties of the data-generating distribution rather than labels for individual datapoints. These tasks encompass statistical inference problems such as parameter estimation, hypothesis testing, or mutual information estimation. Framing these tasks within traditional machine learning pipelines is challenging, as supervision is typically tied to individual datapoints. We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems. In this approach, entire datasets are treated as single inputs to neural networks, which predict distribution-level parameters. Transformer-based architectures, without positional encoding, provide a natural fit due to their permutation-invariance properties. By training on large-scale synthetic datasets, meta-statistical models can leverage the scalability and optimization infrastructure of Transformer-based LLMs. We demonstrate the framework's versatility with applications in hypothesis testing and mutual information estimation, showing strong performance, particularly for small datasets where traditional neural methods struggle.
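The permutation-invariance claim in the abstract can be illustrated with a small sketch. This is not the authors' architecture: it is a single self-attention layer without positional encoding, followed by mean pooling, written in plain Python. All names (`self_attention`, `dataset_embedding`, the weight matrices, the sizes `d_in` and `d_model`) are assumptions made for illustration. It checks that shuffling the rows of a dataset leaves the pooled embedding unchanged up to floating-point rounding.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(points, Wq, Wk, Wv):
    """Single-head self-attention over a set of points.

    No positional encoding is added, so each output row depends only on
    the content of the points, not on where they sit in the list.
    """
    qs = [matvec(Wq, p) for p in points]
    ks = [matvec(Wk, p) for p in points]
    vs = [matvec(Wv, p) for p in points]
    scale = math.sqrt(len(qs[0]))
    out = []
    for q in qs:
        weights = softmax([dot(q, k) / scale for k in ks])
        out.append([sum(w * v[j] for w, v in zip(weights, vs))
                    for j in range(len(vs[0]))])
    return out


def dataset_embedding(points, Wq, Wk, Wv):
    """Mean-pool the attended points into one vector for the whole dataset."""
    attended = self_attention(points, Wq, Wk, Wv)
    n = len(attended)
    return [sum(row[j] for row in attended) / n
            for j in range(len(attended[0]))]


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_in, d_model = 2, 4  # hypothetical sizes, chosen only for the demo
Wq = random_matrix(rng, d_model, d_in)
Wk = random_matrix(rng, d_model, d_in)
Wv = random_matrix(rng, d_model, d_in)

# A toy "dataset": 8 two-dimensional samples, and a shuffled copy of it.
dataset = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(8)]
shuffled = dataset[:]
rng.shuffle(shuffled)

emb_a = dataset_embedding(dataset, Wq, Wk, Wv)
emb_b = dataset_embedding(shuffled, Wq, Wk, Wv)

# The two embeddings agree up to floating-point summation order.
max_diff = max(abs(a - b) for a, b in zip(emb_a, emb_b))
print(max_diff < 1e-9)
```

In a meta-statistical setup as described in the abstract, a pooled vector of this kind would be passed to a prediction head that outputs a distribution-level quantity; that head is omitted here.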

Related papers

Interpretable Feature Interaction via Statistical Self-supervised Learning on Tabular Data [22.20955211690874]
Spofe is a novel self-supervised machine learning pipeline that captures principled representation to achieve clear interpretability with statistical rigor. Underpinning our approach is a robust theoretical framework that delivers precise error bounds and rigorous false discovery rate (FDR) control. Experiments on diverse real-world datasets demonstrate the effectiveness of Spofe.
arXiv Detail & Related papers (2025-03-23T12:27:42Z)
Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters. We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data. For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
Statistical inference using machine learning and classical techniques based on accumulated local effects (ALE) [0.0]
Accumulated Local Effects (ALE) is a model-agnostic approach for global explanations of machine learning algorithms. There are at least three challenges with conducting statistical inference based on ALE. We introduce innovative tools and techniques for statistical inference using ALE.
arXiv Detail & Related papers (2023-10-15T16:17:21Z)
Multi-Task Learning with Summary Statistics [4.871473117968554]
We propose a flexible multi-task learning framework utilizing summary statistics from various sources. We also present an adaptive parameter selection approach based on a variant of Lepski's method. This work offers a more flexible tool for training related models across various domains, with practical implications in genetic risk prediction.
arXiv Detail & Related papers (2023-07-05T15:55:23Z)
An Entropy-Based Model for Hierarchical Learning [3.1473798197405944]
A common feature among real-world datasets is that data domains are multiscale. We propose a learning model that exploits this multiscale data structure. The hierarchical learning model is inspired by the logical and progressive easy-to-hard learning mechanism of human beings.
arXiv Detail & Related papers (2022-12-30T13:14:46Z)
Learning Prototype-oriented Set Representations for Meta-Learning [85.19407183975802]
Learning from set-structured data is a fundamental problem that has recently attracted increasing attention. This paper provides a novel optimal transport based way to improve existing summary networks. We further instantiate it to the cases of few-shot classification and implicit meta generative modeling.
arXiv Detail & Related papers (2021-10-18T09:49:05Z)
Learning Neural Models for Natural Language Processing in the Face of Distributional Shift [10.990447273771592]
The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications. It builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. It is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime.
arXiv Detail & Related papers (2021-09-03T14:29:20Z)
Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples. We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries. We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
Meta-learning framework with applications to zero-shot time-series forecasting [82.61728230984099]
This work provides positive evidence using a broad meta-learning framework. Residual connections act as a meta-learning adaptation mechanism. We show that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining.
arXiv Detail & Related papers (2020-02-07T16:39:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.