Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents
- URL: http://arxiv.org/abs/2508.01858v1
- Date: Sun, 03 Aug 2025 17:17:52 GMT
- Title: Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents
- Authors: Yuhan Guo, Cong Guo, Aiwen Sun, Hongliang He, Xinyu Yang, Yue Lu, Yingji Zhang, Xuntao Guo, Dong Zhang, Jianzhuang Liu, Jiang Duan, Yijia Xiao, Liangjian Wen, Hai-Ming Xu, Yong Dai,
- Abstract summary: We decompose a web agent's capabilities into two essential stages: knowledge content learning and cognitive processes.<n>To facilitate knowledge acquisition, we construct the Web-CogDataset, a structured resource curated from 14 real-world websites.<n>Building on this foundation, we operationalize these processes through a novel knowledge-driven Chain-of-Thought (CoT) reasoning framework.
- Score: 49.88380945341337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal large-scale models have significantly advanced the development of web agents, enabling perception and interaction with digital environments akin to human cognition. In this paper, we argue that web agents must first acquire sufficient knowledge to effectively engage in cognitive reasoning. Therefore, we decompose a web agent's capabilities into two essential stages: knowledge content learning and cognitive processes. To formalize this, we propose Web-CogKnowledge Framework, categorizing knowledge as Factual, Conceptual, and Procedural. In this framework, knowledge content learning corresponds to the agent's processes of Memorizing and Understanding, which rely on the first two knowledge types, representing the "what" of learning. Conversely, cognitive processes correspond to Exploring, grounded in Procedural knowledge, defining the "how" of reasoning and action. To facilitate knowledge acquisition, we construct the Web-CogDataset, a structured resource curated from 14 real-world websites, designed to systematically instill core knowledge necessary for web agent. This dataset serves as the agent's conceptual grounding-the "nouns" upon which comprehension is built-as well as the basis for learning how to reason and act. Building on this foundation, we operationalize these processes through a novel knowledge-driven Chain-of-Thought (CoT) reasoning framework, developing and training our proposed agent, the Web-CogReasoner. Extensive experimentation reveals its significant superiority over existing models, especially in generalizing to unseen tasks where structured knowledge is decisive. To enable rigorous evaluation, we introduce the Web-CogBench, a comprehensive evaluation suite designed to assess and compare agent performance across the delineated knowledge domains and cognitive capabilities. Our code and data is open sourced at https://github.com/Gnonymous/Web-CogReasoner
Related papers
- Cognitive Duality for Adaptive Web Agents [3.0069922338220825]
We propose a principled decomposition into fast System 1 and slow System 2 cognitive processes.<n>We implement this framework in CogniWeb, a modular agent architecture that adaptively toggles between fast intuitive processing and deliberate reasoning based on task complexity.
arXiv Detail & Related papers (2025-08-07T07:05:22Z) - KnowCoder-V2: Deep Knowledge Analysis [64.63893361811968]
We propose a textbfKnowledgeable textbfDeep textbfResearch (textbfKDR) framework that empowers deep research with deep knowledge analysis capability.<n>It introduces an independent knowledge organization phase to preprocess large-scale, domain-relevant data into systematic knowledge offline.<n>It then extends deep research with an additional kind of reasoning steps that perform complex knowledge computation in an online manner.
arXiv Detail & Related papers (2025-06-07T18:01:25Z) - Enhanced Question-Answering for Skill-based learning using Knowledge-based AI and Generative AI [2.6699230340282796]
We introduce Ivy, an intelligent agent that generates explanations that embody teleological, causal, and compositional principles.<n>This can potentially ensure learners develop a comprehensive understanding of skills crucial for effective problem-solving in online environments.
arXiv Detail & Related papers (2025-04-10T05:25:52Z) - Beyond Factuality: A Comprehensive Evaluation of Large Language Models
as Knowledge Generators [78.63553017938911]
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks.
However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge.
We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
arXiv Detail & Related papers (2023-10-11T08:22:37Z) - Worth of knowledge in deep learning [3.132595571344153]
We present a framework inspired by interpretable machine learning to evaluate the worth of knowledge.
Our findings elucidate the complex relationship between data and knowledge, including dependence, synergistic, and substitution effects.
Our model-agnostic framework can be applied to a variety of common network architectures, providing a comprehensive understanding of the role of prior knowledge in deep learning models.
arXiv Detail & Related papers (2023-07-03T02:25:19Z) - Learning by Applying: A General Framework for Mathematical Reasoning via
Enhancing Explicit Knowledge Learning [47.96987739801807]
We propose a framework to enhance existing models (backbones) in a principled way by explicit knowledge learning.
In LeAp, we perform knowledge learning in a novel problem-knowledge-expression paradigm.
We show that LeAp improves all backbones' performances, learns accurate knowledge, and achieves a more interpretable reasoning process.
arXiv Detail & Related papers (2023-02-11T15:15:41Z) - Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
It is theoretically analyzed that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z) - A Data-Driven Study of Commonsense Knowledge using the ConceptNet
Knowledge Base [8.591839265985412]
Acquiring commonsense knowledge and reasoning is recognized as an important frontier in achieving general Artificial Intelligence (AI)
In this paper, we propose and conduct a systematic study to enable a deeper understanding of commonsense knowledge by doing an empirical and structural analysis of the ConceptNet knowledge base.
Detailed experimental results on three carefully designed research questions, using state-of-the-art unsupervised graph representation learning ('embedding') and clustering techniques, reveal deep substructures in ConceptNet relations.
arXiv Detail & Related papers (2020-11-28T08:08:25Z) - Online Structured Meta-learning [137.48138166279313]
Current online meta-learning algorithms are limited to learn a globally-shared meta-learner.
We propose an online structured meta-learning (OSML) framework to overcome this limitation.
Experiments on three datasets demonstrate the effectiveness and interpretability of our proposed framework.
arXiv Detail & Related papers (2020-10-22T09:10:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.