Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning
- URL: http://arxiv.org/abs/2405.00516v1
- Date: Wed, 1 May 2024 13:51:45 GMT
- Title: Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning
- Authors: Lucas-Andreï Thil, Mirela Popa, Gerasimos Spanakis
- Abstract summary: Supervised learning (SL) approaches have achieved impressive performance while utilizing significantly less training data compared to previous methods.
We propose a novel approach that combines SL and RL techniques over the MiniWoB benchmark to leverage the strengths of both methods.
Our experiments demonstrate that our approach outperforms previous SL methods on certain tasks using less data and narrows the performance gap with RL models.
- Score: 6.404122934568861
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in language models have demonstrated remarkable improvements in various natural language processing (NLP) tasks such as web navigation. Supervised learning (SL) approaches have achieved impressive performance while utilizing significantly less training data compared to previous methods. However, these SL-based models fall short when compared to reinforcement learning (RL) approaches, which have shown superior results. In this paper, we propose a novel approach that combines SL and RL techniques over the MiniWoB benchmark to leverage the strengths of both methods. We also address a critical limitation in previous models' understanding of HTML content, revealing a tendency to memorize target elements rather than comprehend the underlying structure. To rectify this, we propose methods to enhance true understanding and present a new baseline of results. Our experiments demonstrate that our approach outperforms previous SL methods on certain tasks using less data and narrows the performance gap with RL models, achieving 43.58% average accuracy in SL and 36.69% when combined with a multimodal RL approach. This study sets a new direction for future web navigation and offers insights into the limitations and potential of language modeling for computer tasks.
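To make the two-stage idea concrete, below is a minimal, hypothetical sketch of supervised pre-training (behavior cloning on demonstrations) followed by REINFORCE-style RL fine-tuning for a MiniWoB-style web agent. The `policy`, `demos`, and `env` objects and their interfaces are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch only: SL (behavior cloning) followed by RL (REINFORCE)
# for a MiniWoB-style web-navigation agent. `policy`, `demos`, and `env` are
# placeholder objects; their interfaces are assumptions, not the paper's code.
import torch
import torch.nn.functional as F


def supervised_phase(policy, demos, optimizer, epochs=3):
    """Behavior cloning on demonstration data.

    Assumes `demos` yields (obs_batch, action_batch) pairs, where obs_batch is
    a batch of encoded observations (instruction + DOM/HTML features) and
    action_batch holds gold action indices; `policy(obs_batch)` returns logits
    of shape (batch, num_actions).
    """
    for _ in range(epochs):
        for obs_batch, action_batch in demos:
            logits = policy(obs_batch)
            loss = F.cross_entropy(logits, action_batch)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()


def rl_phase(policy, env, optimizer, episodes=1000, gamma=0.99):
    """REINFORCE fine-tuning on task rewards.

    Assumes `env.reset()` returns an observation, `env.step(a)` returns
    (observation, reward, done), and `policy(obs)` returns a 1-D logits
    vector over candidate actions for a single observation.
    """
    for _ in range(episodes):
        obs, done = env.reset(), False
        log_probs, rewards = [], []
        while not done:
            dist = torch.distributions.Categorical(logits=policy(obs))
            action = dist.sample()
            log_probs.append(dist.log_prob(action))
            obs, reward, done = env.step(action.item())
            rewards.append(reward)
        # Compute discounted returns, then take a policy-gradient step.
        returns, running = [], 0.0
        for r in reversed(rewards):
            running = r + gamma * running
            returns.insert(0, running)
        returns = torch.tensor(returns)
        loss = -(torch.stack(log_probs) * returns).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

A full system would also need an HTML/DOM encoder and the benchmark's actual reward signal; those details are omitted here, and the paper's handling of HTML memorization is not reflected in this sketch.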
Related papers
- Achieving Peak Performance for Large Language Models: A Systematic Review [0.0]
Large language models (LLMs) have achieved remarkable success in natural language processing (NLP).
As models grow into the trillion-parameter range, computational and memory costs increase significantly.
This makes it difficult for many researchers to access the resources needed to train or apply these models.
arXiv Detail & Related papers (2024-09-07T13:57:41Z)
- TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning [41.28933724210434]
Language model continual learning (CL) has recently attracted significant interest for its ability to adapt large language models (LLMs) to dynamic real-world scenarios without retraining.
Existing approaches commonly utilize multiple parameter-efficient fine-tuning (PEFT) blocks to acquire task-specific knowledge, yet these methods are inefficient and fail to leverage potential knowledge transfer across tasks.
We introduce a novel CL framework for language models, named Task Skill Localization and Consolidation (TaSL), which boosts knowledge transfer without depending on memory replay.
arXiv Detail & Related papers (2024-08-09T17:44:45Z)
- Augmentations vs Algorithms: What Works in Self-Supervised Learning [9.194402355758164]
We study the relative effects of data augmentations, pretraining algorithms, and model architectures in Self-Supervised Learning (SSL).
We propose a new framework which unifies many seemingly disparate SSL methods into a single shared template.
arXiv Detail & Related papers (2024-03-08T23:42:06Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- A Survey of Large Language Models [81.06947636926638]
Language modeling has been widely studied for language understanding and generation in the past two decades.
Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora.
To distinguish models by parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size.
arXiv Detail & Related papers (2023-03-31T17:28:46Z)
- A Cohesive Distillation Architecture for Neural Language Models [0.0]
A recent trend in Natural Language Processing is the exponential growth in Language Model (LM) size.
This study investigates methods for Knowledge Distillation (KD) to provide efficient alternatives to large-scale models.
arXiv Detail & Related papers (2023-01-12T08:01:53Z)
- Implicit Offline Reinforcement Learning via Supervised Learning [83.8241505499762]
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels.
We show how implicit models can leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets.
arXiv Detail & Related papers (2022-10-21T21:59:42Z)
- Few-shot Prompting Towards Controllable Response Generation [49.479958672988566]
We first explored the combination of prompting and reinforcement learning (RL) to steer models' generation without accessing any of the models' parameters.
We apply multi-task learning to help the model generalize better to new tasks.
Experiment results show that our proposed method can successfully control several state-of-the-art (SOTA) dialogue models without accessing their parameters.
arXiv Detail & Related papers (2022-06-08T14:48:06Z)
- PaLM: Scaling Language Modeling with Pathways [180.69584031908113]
We trained a 540-billion-parameter, densely activated Transformer language model, which we call the Pathways Language Model (PaLM).
We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods.
We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks.
arXiv Detail & Related papers (2022-04-05T16:11:45Z)
- Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text by utilizing an external datastore.
We show how to achieve up to a 6x inference speed-up while retaining comparable performance (see the sketch after this list).
arXiv Detail & Related papers (2021-09-09T12:32:28Z)
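As a rough illustration of the nearest-neighbor language-model idea referenced in the last entry above, here is a minimal sketch of standard kNN-LM interpolation; the datastore contents, the interpolation weight, and the distance-to-probability conversion are illustrative assumptions, not that paper's specific speed-up techniques.

```python
# Minimal sketch of kNN-LM interpolation (the general technique the paper above
# accelerates). Datastore contents and k, lam, temp values are illustrative.
import numpy as np


def knn_lm_distribution(p_lm, query, keys, values, vocab_size, k=4, lam=0.25, temp=1.0):
    """Blend the base LM distribution with a kNN distribution from a datastore.

    p_lm:   base LM next-token distribution, shape (vocab_size,)
    query:  context representation for the current prediction, shape (d,)
    keys:   stored context representations, shape (n, d)
    values: next-token ids paired with each key, shape (n,)
    """
    # Retrieve the k nearest datastore entries by L2 distance.
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Convert negative distances into weights over the retrieved tokens.
    weights = np.exp(-dists[nearest] / temp)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for w, tok in zip(weights, values[nearest]):
        p_knn[tok] += w
    # Final next-token distribution: lam * kNN + (1 - lam) * LM.
    return lam * p_knn + (1.0 - lam) * p_lm
```

Efficiency improvements in this line of work generally target the retrieval step and the size of the datastore rather than the interpolation itself.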