Rethinking Multi-Objective Learning through Goal-Conditioned Supervised Learning
- URL: http://arxiv.org/abs/2412.08911v2
- Date: Sat, 18 Jan 2025 00:35:16 GMT
- Title: Rethinking Multi-Objective Learning through Goal-Conditioned Supervised Learning
- Authors: Shijun Li, Hilaf Hasson, Jing Hu, Joydeep Ghosh
- Abstract summary: Multi-objective learning aims to optimize multiple objectives simultaneously with a single model.
It suffers from the difficulty of formalizing and conducting the exact learning process.
We propose a general framework for automatically learning to achieve multiple objectives based on existing sequential data.
- Score: 8.593384839118658
- Abstract: Multi-objective learning aims to optimize multiple objectives simultaneously with a single model, achieving balanced and satisfying performance on all of them. However, it suffers from the difficulty of formalizing and conducting the exact learning process, especially given possible conflicts between objectives. Existing approaches attempt to resolve this primarily in two directions: adapting the model structure or constraining the optimization under certain assumptions. A primary issue, however, is that the presuppositions underlying these designs are insufficient to guarantee their generality in real-world applications. Worse, their high space and computational complexity makes them even harder to apply in large-scale, complicated environments such as recommender systems. To address these issues, we propose a general framework for automatically learning to achieve multiple objectives based on existing sequential data. We apply the goal-conditioned supervised learning (GCSL) framework to multi-objective learning by extending the definition of goals from one-dimensional scalars to multi-dimensional vectors, which perfectly disentangles the representations of different objectives. Meanwhile, GCSL enables the model to learn to achieve each objective simultaneously in a concise supervised manner, guided simply by the existing sequences in the offline data. No additional constraints, special model structures, or complex optimization algorithms are required. Beyond that, we formally analyze the properties of goals in GCSL and then propose the first goal-generation framework to obtain achievable and reasonable goals for inference. Extensive experiments on real-world recommendation datasets demonstrate the effectiveness of the proposed method and explore the feasibility of the goal-generation strategies in GCSL.
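The recipe the abstract describes is easy to prototype. Below is a minimal, hypothetical PyTorch sketch of GCSL with vector-valued goals: offline trajectories are hindsight-relabeled with the vector of per-objective returns they actually achieved, a policy is trained by plain supervised learning to imitate the logged action conditioned on that goal vector, and at inference a goal is drawn from outcomes observed in the data so that it stays achievable. All names (GoalConditionedPolicy, relabel, generate_goal) and the quantile-based goal generator are illustrative assumptions, not the paper's implementation.

```python
# Minimal GCSL sketch with multi-dimensional goal vectors (illustrative only).
import torch
import torch.nn as nn

class GoalConditionedPolicy(nn.Module):
    """Predicts the next action from a state and a goal vector whose
    entries correspond to separate objectives (e.g. clicks, dwell time)."""

    def __init__(self, state_dim: int, goal_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        # Concatenating the goal keeps each objective's target explicit
        # and disentangled in the conditioning signal.
        return self.net(torch.cat([state, goal], dim=-1))  # action logits


def relabel(states, actions, rewards):
    """Hindsight relabeling: the goal at step t is the vector of
    per-objective returns the logged sequence achieved from t onward."""
    # rewards: [T, n_objectives] -> suffix sums along the time dimension.
    achieved = torch.flip(torch.cumsum(torch.flip(rewards, dims=[0]), dim=0), dims=[0])
    return states, actions, achieved


def train_step(policy, optimizer, states, actions, rewards):
    """One supervised update: imitate the logged action, conditioned on
    the multi-objective outcome that was actually achieved."""
    states, actions, goals = relabel(states, actions, rewards)
    loss = nn.functional.cross_entropy(policy(states, goals), actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def generate_goal(observed_goals: torch.Tensor, q: float = 0.9) -> torch.Tensor:
    """Toy goal generation for inference: take a per-objective quantile of
    outcomes seen in the offline data, so the requested goal vector stays
    achievable rather than arbitrarily optimistic. (A stand-in for the
    paper's goal-generation framework, not its actual strategy.)"""
    return torch.quantile(observed_goals, q, dim=0)
```

Under this framing, trading off objectives at inference time reduces to choosing the goal vector; the paper's formal analysis concerns which such vectors are reasonable and achievable.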
Related papers
- Benchmarking General-Purpose In-Context Learning [19.40952728849431]
In-context learning (ICL) empowers generative models to address new tasks effectively and efficiently on the fly.
In this paper, we study extending ICL to address a broader range of tasks with an extended learning horizon and higher improvement potential.
We introduce two benchmarks specifically crafted to train and evaluate general-purpose in-context learning (GPICL) capabilities.
arXiv Detail & Related papers (2024-05-27T14:50:42Z)
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- Cycle Consistency Driven Object Discovery [75.60399804639403]
We introduce a method that explicitly optimizes the constraint that each object in a scene should be associated with a distinct slot.
By integrating these consistency objectives into various existing slot-based object-centric methods, we showcase substantial improvements in object-discovery performance.
Our results suggest that the proposed approach not only improves object discovery, but also provides richer features for downstream tasks.
arXiv Detail & Related papers (2023-06-03T21:49:06Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing goals to be specified with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Unified Algorithms for RL with Decision-Estimation Coefficients: PAC, Reward-Free, Preference-Based Learning, and Beyond [28.118197762236953]
We develop a unified algorithm framework for a large class of learning goals.
Our framework handles many learning goals such as no-regret RL, PAC RL, reward-free learning, model estimation, and preference-based learning.
As applications, we propose "decouplable representation" as a natural sufficient condition for bounding generalized DECs.
arXiv Detail & Related papers (2022-09-23T17:47:24Z)
- Learning Multi-Objective Curricula for Deep Reinforcement Learning [55.27879754113767]
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
In this paper, we propose a unified automatic curriculum learning framework to create multi-objective but coherent curricula.
In addition to existing hand-designed curricula paradigms, we further design a flexible memory mechanism to learn an abstract curriculum.
arXiv Detail & Related papers (2021-10-06T19:30:25Z)
- From STL Rulebooks to Rewards [4.859570041295978]
We propose a principled approach to shaping rewards for reinforcement learning from multiple objectives.
We first equip Signal Temporal Logic (STL) with a novel quantitative semantics that allows individual requirements to be evaluated automatically.
We then develop a method for systematically combining evaluations of multiple requirements into a single reward.
arXiv Detail & Related papers (2021-10-06T14:16:59Z)
- Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single policy MORL, which learns an optimal policy given the preference of objectives.
Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process.
We propose a new algorithm called model-based envelope value iteration (EVI), which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- A Distributional View on Multi-Objective Policy Optimization [24.690800846837273]
We propose an algorithm for multi-objective reinforcement learning that enables setting desired preferences for objectives in a scale-invariant way.
We show that setting different preferences in our framework allows us to trace out the space of nondominated solutions.
arXiv Detail & Related papers (2020-05-15T13:02:17Z)