Tri-Level Navigator: LLM-Empowered Tri-Level Learning for Time Series OOD Generalization
- URL: http://arxiv.org/abs/2410.07018v2
- Date: Sat, 2 Nov 2024 00:58:53 GMT
- Title: Tri-Level Navigator: LLM-Empowered Tri-Level Learning for Time Series OOD Generalization
- Authors: Chengtao Jian, Kai Yang, Yang Jiao
- Abstract summary: We investigate time series OOD generalization via pre-trained Large Language Models.
We first propose a novel Tri-level learning framework for Time Series OOD generalization, termed TTSO.
We develop a stratified localization algorithm tailored for this tri-level optimization problem, theoretically demonstrating the guaranteed convergence of the proposed algorithm.
- Score: 9.95894026392039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Out-of-Distribution (OOD) generalization in machine learning is a burgeoning area of study. Its primary goal is to enhance the adaptability and resilience of machine learning models when faced with new, unseen, and potentially adversarial data that significantly diverges from their original training datasets. In this paper, we investigate time series OOD generalization via pre-trained Large Language Models (LLMs). We first propose a novel \textbf{T}ri-level learning framework for \textbf{T}ime \textbf{S}eries \textbf{O}OD generalization, termed TTSO, which considers both sample-level and group-level uncertainties. This formulation offers a fresh theoretical perspective for formulating and analyzing the OOD generalization problem. In addition, we provide a theoretical analysis to justify that this method is well motivated. We then develop a stratified localization algorithm tailored for this tri-level optimization problem, theoretically demonstrating the guaranteed convergence of the proposed algorithm. Our analysis also reveals that the iteration complexity to obtain an $\epsilon$-stationary point is bounded by $O(\frac{1}{\epsilon^{2}})$. Extensive experiments on real-world datasets have been conducted to elucidate the effectiveness of the proposed method.
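To make the tri-level structure concrete, here is a minimal PyTorch sketch of one optimization step in the spirit of TTSO: the inner level perturbs samples (sample-level uncertainty), the middle level re-weights groups (group-level uncertainty), and the outer level updates the model. This is an illustrative assumption about the structure, not the authors' stratified localization algorithm; `eps`, `rho`, and `inner_steps` are hypothetical hyper-parameters.

```python
import torch

def tri_level_step(model, loss_fn, groups, opt, inner_steps=5, eps=0.1, rho=1.0):
    """One tri-level update: adversarial samples -> group re-weighting -> model step."""
    group_losses = []
    for x, y in groups:  # one (x, y) mini-batch per group
        # Inner level (sample-level uncertainty): ascend the loss within a
        # small L2 budget eps (batch-wise norm, for brevity).
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(inner_steps):
            adv_loss = loss_fn(model(x + delta), y)
            grad, = torch.autograd.grad(adv_loss, delta)
            with torch.no_grad():
                delta += (eps / inner_steps) * grad / (grad.norm() + 1e-12)
        group_losses.append(loss_fn(model(x + delta.detach()), y))
    # Middle level (group-level uncertainty): up-weight the worst groups via
    # a softmax re-weighting, as in Group DRO, with temperature rho.
    losses = torch.stack(group_losses)
    weights = torch.softmax(rho * losses.detach(), dim=0)
    # Outer level: descend the re-weighted robust objective.
    opt.zero_grad()
    (weights * losses).sum().backward()
    opt.step()
    return losses.detach()
```

Here `groups` could be, for example, time-series batches grouped by domain or source; the paper's stratified localization machinery and convergence constants are not reproduced.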
Related papers
- Learning via Surrogate PAC-Bayes [13.412960492870996]
PAC-Bayes learning is a comprehensive setting for studying the generalisation ability of learning algorithms.
We introduce a novel principled strategy for building an iterative learning algorithm via the optimisation of a sequence of surrogate training objectives.
On top of providing that generic recipe for learning via surrogate PAC-Bayes bounds, we contribute theoretical results establishing that iteratively optimising our surrogates implies the optimisation of the original generalisation bounds.
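As a rough illustration of learning via a sequence of surrogate objectives, the sketch below re-centres a KL-penalized surrogate at the previous posterior each round. The Gaussian mean-field posterior, the linear model, and `beta` are assumptions for illustration, not the paper's construction.

```python
import torch

def kl_gauss(mu_q, logvar_q, mu_p, logvar_p):
    """KL divergence between diagonal Gaussians q and p."""
    return 0.5 * ((logvar_p - logvar_q)
                  + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                  - 1).sum()

def surrogate_pac_bayes(X, y, rounds=5, steps=200, beta=0.1, lr=1e-2):
    """Each round minimizes risk + beta * KL(q || previous posterior)."""
    d = X.shape[1]
    mu = torch.zeros(d, requires_grad=True)
    logvar = torch.zeros(d, requires_grad=True)
    mu_p, logvar_p = torch.zeros(d), torch.zeros(d)  # initial prior N(0, I)
    for _ in range(rounds):
        opt = torch.optim.Adam([mu, logvar], lr=lr)
        for _ in range(steps):
            # Reparameterized sample of the stochastic predictor's weights.
            w = mu + (0.5 * logvar).exp() * torch.randn(d)
            risk = torch.nn.functional.mse_loss(X @ w, y)
            loss = risk + beta * kl_gauss(mu, logvar, mu_p, logvar_p)
            opt.zero_grad(); loss.backward(); opt.step()
        # The next surrogate is centred at the current posterior.
        mu_p, logvar_p = mu.detach().clone(), logvar.detach().clone()
    return mu.detach(), logvar.detach()
```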
arXiv Detail & Related papers (2024-10-14T07:45:50Z)
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning.
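For reference, a minimal PyTorch implementation of the standard InfoNCE contrastive loss the analysis targets (in-batch negatives and a temperature parameter are the usual conventions assumed here):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE with in-batch negatives: row i of z1 matches row i of z2."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                    # (N, N) similarities
    targets = torch.arange(z1.size(0), device=z1.device)  # positives on diagonal
    return F.cross_entropy(logits, targets)
```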
arXiv Detail & Related papers (2024-10-11T18:02:46Z)
- Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation [105.23631749213729]
We propose a novel method for unsupervised pre-training in low-data regimes.
Inspired by the recently successful prompting technique, we introduce a new method, Unsupervised Pre-training with Language-Vision Prompts.
We show that our method can converge faster and perform better than CNN-based models in low-data regimes.
arXiv Detail & Related papers (2024-05-22T06:48:43Z)
- Understanding Generalization of Federated Learning via Stability: Heterogeneity Matters [1.4502611532302039]
Generalization performance is a key metric in evaluating machine learning models when applied to real-world applications.
arXiv Detail & Related papers (2023-06-06T16:12:35Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
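A hedged sketch of the multi-step inverse-kinematics idea: train an encoder so that the first action can be predicted from the current observation and a k-step-ahead observation, for several horizons k. The network sizes and `horizons` are illustrative assumptions; MusIK itself interleaves such an objective with systematic exploration, which this sketch omits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiStepInverseModel(nn.Module):
    """Encoder trained to predict the first action from (x_t, x_{t+k}) pairs."""
    def __init__(self, obs_dim, latent_dim, num_actions, horizons=(1, 2, 4)):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        # One action classifier per prediction horizon k.
        self.heads = nn.ModuleDict(
            {str(k): nn.Linear(2 * latent_dim, num_actions) for k in horizons})

    def loss(self, obs, actions):
        # obs: (T, obs_dim) trajectory; actions: (T,) integer action indices.
        phi = self.encoder(obs)
        losses = []
        for k_str, head in self.heads.items():
            k = int(k_str)
            pair = torch.cat([phi[:-k], phi[k:]], dim=1)  # (phi(x_t), phi(x_{t+k}))
            losses.append(F.cross_entropy(head(pair), actions[:-k]))
        return torch.stack(losses).mean()
```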
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- On the Stability and Generalization of Triplet Learning [55.75784102837832]
Triplet learning, i.e. learning from triplet data, has attracted much attention in computer vision tasks.
This paper investigates the generalization guarantees of triplet learning by leveraging the stability analysis.
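The triplet objective in question is the standard margin loss; a minimal PyTorch version follows.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()
```

This matches `torch.nn.TripletMarginLoss` with its defaults; stability analyses of the kind in the paper study how such a loss behaves under perturbations of the training triplets.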
arXiv Detail & Related papers (2023-02-20T07:32:50Z)
- Deep Active Ensemble Sampling For Image Classification [8.31483061185317]
Active learning frameworks aim to reduce the cost of data annotation by actively requesting the labeling for the most informative data points.
Proposed approaches include uncertainty-based techniques, geometric methods, and implicit combinations of the two.
We present an innovative integration of recent progress in both uncertainty-based and geometric frameworks to enable an efficient exploration/exploitation trade-off in sample selection strategy.
Our framework provides two advantages: (1) accurate posterior estimation, and (2) a tunable trade-off between computational overhead and accuracy.
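As one concrete instance of the uncertainty-based side of such a framework, the sketch below scores a pool by the predictive entropy of an ensemble's averaged softmax and selects the top-`budget` points; this is an assumed illustration, not the paper's exact ensemble scheme.

```python
import torch

def select_by_entropy(ensemble, pool_x, budget):
    """Pick the `budget` unlabeled points with the highest ensemble entropy."""
    with torch.no_grad():
        # Average the softmax predictions of all ensemble members: (N, C).
        probs = torch.stack([m(pool_x).softmax(dim=1) for m in ensemble]).mean(0)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return entropy.topk(budget).indices
```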
arXiv Detail & Related papers (2022-10-11T20:20:20Z)
- An Information-Theoretic Framework for Unifying Active Learning Problems [44.758281991246825]
This paper presents an information-theoretic framework for unifying active learning problems.
We first introduce a novel active learning criterion that subsumes an existing level-set estimation (LSE) algorithm.
By exploiting the relationship between LSE and Bayesian optimization (BO), we design a competitive information-theoretic acquisition function for BO.
arXiv Detail & Related papers (2020-12-19T14:22:48Z)
- A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton [49.23948907229656]
Bi-level Descent Aggregation (BDA) is a flexible and modularized algorithmic framework for generic bi-level optimization.
We derive a new methodology to prove the convergence of BDA without the lower-level singleton (LLS) condition.
Our investigations also demonstrate that BDA is compatible with a variety of particular first-order computation modules.
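To illustrate descent aggregation, the sketch below takes inner steps along a convex combination of the lower- and upper-level gradients in y, then differentiates through the unrolled trajectory for the outer update on x. The mixing weight `alpha`, the step sizes, and the unrolling depth are illustrative assumptions rather than the exact BDA scheme.

```python
import torch

def bda_step(F_upper, f_lower, x, y0, inner_steps=10,
             alpha=0.5, lr_inner=0.1, lr_outer=0.05):
    """One outer update: unrolled inner descent on an aggregated direction."""
    y = y0
    for _ in range(inner_steps):
        # Aggregate lower- and upper-level descent directions in y.
        gf, = torch.autograd.grad(f_lower(x, y), y, create_graph=True)
        gF, = torch.autograd.grad(F_upper(x, y), y, create_graph=True)
        y = y - lr_inner * ((1 - alpha) * gf + alpha * gF)
    # Hypergradient of the upper objective through the unrolled trajectory.
    hyper_grad, = torch.autograd.grad(F_upper(x, y), x)
    with torch.no_grad():
        x -= lr_outer * hyper_grad
    return x, y.detach()
```

With toy quadratic objectives and, for example, `x = torch.tensor([0.5], requires_grad=True)` and `y0 = torch.zeros(1, requires_grad=True)`, repeated calls drive x toward the bi-level solution; both tensors must require gradients for the unrolled differentiation to work.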
arXiv Detail & Related papers (2020-06-07T05:18:50Z)