Information Theory in Open-world Machine Learning: Foundations, Frameworks, and Future Directions
- URL: http://arxiv.org/abs/2510.15422v1
- Date: Fri, 17 Oct 2025 08:20:56 GMT
- Title: Information Theory in Open-world Machine Learning: Foundations, Frameworks, and Future Directions
- Authors: Lin Wang
- Abstract summary: Open-world Machine Learning (OWML) aims to develop intelligent systems capable of recognizing known categories, rejecting unknown samples, and continually learning from novel information. Despite significant progress in open set recognition, novelty detection, and continual learning, the field still lacks a unified theoretical foundation. This paper presents a comprehensive review of information-theoretic approaches in open-world machine learning.
- Score: 4.049865011707225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-world Machine Learning (OWML) aims to develop intelligent systems capable of recognizing known categories, rejecting unknown samples, and continually learning from novel information. Despite significant progress in open set recognition, novelty detection, and continual learning, the field still lacks a unified theoretical foundation that can quantify uncertainty, characterize information transfer, and explain learning adaptability in dynamic, nonstationary environments. This paper presents a comprehensive review of information-theoretic approaches in open-world machine learning, emphasizing how core concepts such as entropy, mutual information, and Kullback-Leibler divergence provide a mathematical language for describing knowledge acquisition, uncertainty suppression, and risk control under open-world conditions. We synthesize recent studies into three major research axes: information-theoretic open set recognition enabling safe rejection of unknowns, information-driven novelty discovery guiding new concept formation, and information-retentive continual learning ensuring stable long-term adaptation. Furthermore, we discuss theoretical connections between information theory and provable learning frameworks, including PAC-Bayes bounds, open-space risk theory, and causal information flow, to establish a pathway toward provable and trustworthy open-world intelligence. Finally, the review identifies key open problems and future research directions, such as the quantification of information risk, development of dynamic mutual-information bounds, multimodal information fusion, and integration of information theory with causal reasoning and world model learning.
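As a concrete illustration of the entropy-based "safe rejection of unknowns" the abstract describes, the following is a minimal sketch (not the paper's method): a classifier's predictive entropy is thresholded, and high-entropy inputs are abstained on as unknown. The threshold value here is an illustrative assumption that would normally be calibrated on known-class validation data.

```python
import numpy as np

def predictive_entropy(probs: np.ndarray) -> float:
    """Shannon entropy H(p) = -sum_i p_i * log(p_i) of a predictive distribution."""
    p = np.clip(probs, 1e-12, 1.0)  # guard against log(0)
    return float(-np.sum(p * np.log(p)))

def classify_or_reject(probs: np.ndarray, threshold: float = 1.0):
    """Entropy-thresholded open-set decision: abstain when uncertainty is high.

    `threshold` is an illustrative assumption; in practice it would be
    calibrated on held-out known-class data.
    """
    h = predictive_entropy(probs)
    if h > threshold:
        return "unknown", h          # reject: open-space risk control
    return int(np.argmax(probs)), h  # accept: confident known-class prediction

# A peaked prediction is accepted; a near-uniform one is rejected as unknown.
print(classify_or_reject(np.array([0.95, 0.03, 0.02])))  # (0, ~0.23)
print(classify_or_reject(np.array([0.34, 0.33, 0.33])))  # ('unknown', ~1.10)
```

Predictive entropy is maximal for a uniform distribution (ln K for K classes), which is what makes it a natural uncertainty score for open-set decisions.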
Related papers
- Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation [77.10390725623125]
Retrieval-augmented generation (RAG) is widely employed to expand the knowledge scope of large language models (LLMs). Since RAG has shown promise in knowledge-intensive tasks like open-domain question answering, its broader application to complex tasks and intelligent assistants has further advanced its utility. We present a systematic investigation of the intrinsic mechanisms by which RAG systems integrate internal (parametric) and external (retrieved) knowledge.
arXiv Detail & Related papers (2025-05-17T13:13:13Z) - How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training [92.88889953768455]
A critical gap remains in understanding how Large Language Models (LLMs) internalize new knowledge. We identify computational subgraphs that facilitate knowledge storage and processing.
arXiv Detail & Related papers (2025-02-16T16:55:43Z) - A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) are open to novel perspectives.
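To make the "integrated without external solvers" claim concrete, here is a minimal sketch of advancing a neural state governed by an ODE with a hand-rolled explicit Euler step; the linear dynamics and step size are illustrative assumptions, not the paper's actual Hamiltonian equations.

```python
import numpy as np

def euler_integrate(f, state: np.ndarray, dt: float, steps: int) -> np.ndarray:
    """Advance d(state)/dt = f(state) by explicit Euler steps;
    no external ODE solver is required."""
    for _ in range(steps):
        state = state + dt * f(state)
    return state

# Illustrative energy-preserving linear dynamics standing in for the
# paper's learning equations (an assumption, not the actual framework).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])  # rotation vector field
final = euler_integrate(lambda s: A @ s, np.array([1.0, 0.0]), dt=0.01, steps=100)
print(final)  # approx. [cos(1), -sin(1)]; explicit Euler drifts slightly outward
```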
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - Open-world machine learning: A review and new outlooks [117.33922838201993]
The article presents a holistic view of open-world machine learning. It investigates unknown rejection, novelty discovery, and continual learning, and aims to help researchers build more powerful AI systems in their respective fields.
arXiv Detail & Related papers (2024-03-04T06:25:26Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators [78.63553017938911]
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks.
However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge.
We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
arXiv Detail & Related papers (2023-10-11T08:22:37Z) - Detecting and Learning Out-of-Distribution Data in the Open world: Algorithm and Theory [15.875140867859209]
This thesis contributes to machine learning in open-world scenarios.
The research investigates two intertwined steps essential for open-world machine learning: out-of-distribution (OOD) detection and open-world representation learning (ORL).
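As a point of reference for the OOD detection step, here is a minimal sketch of a standard baseline score, maximum softmax probability; this is a common baseline, not necessarily the method developed in the thesis, and the threshold is an illustrative assumption.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def msp_score(logits: np.ndarray) -> float:
    """Maximum softmax probability: low values suggest an OOD input."""
    return float(softmax(logits).max())

def is_ood(logits: np.ndarray, tau: float = 0.5) -> bool:
    """Flag a sample as OOD when confidence falls below `tau`
    (an illustrative threshold, normally tuned on in-distribution data)."""
    return msp_score(logits) < tau

print(is_ood(np.array([5.0, 0.1, -1.0])))  # False: peaked logits, in-distribution
print(is_ood(np.array([0.2, 0.1, 0.15])))  # True: flat logits, likely OOD
```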
arXiv Detail & Related papers (2023-10-10T00:25:21Z) - To Compress or Not to Compress - Self-Supervised Learning and Information Theory: A Review [30.87092042943743]
Deep neural networks excel in supervised learning tasks but are constrained by the need for extensive labeled data.
Self-supervised learning emerges as a promising alternative, allowing models to learn without explicit labels.
Information theory, and notably the information bottleneck principle, has been pivotal in shaping deep neural networks.
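For context, the information bottleneck principle referenced here is conventionally written as a trade-off between compressing the input and preserving label-relevant information (the standard formulation, not one specific to this review):

```latex
% Information bottleneck: learn a representation Z of input X that is
% maximally compressed while staying predictive of the target Y.
\min_{p(z \mid x)} \; I(X; Z) - \beta \, I(Z; Y), \qquad \beta > 0
```

Here I(·;·) denotes mutual information, and β controls how much predictive information about Y is retained relative to the compression of X into the representation Z.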
arXiv Detail & Related papers (2023-04-19T00:33:59Z) - Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity [27.84415856657607]
We study how and why domain knowledge benefits the performance of informed learning.
We propose a generalized informed training objective that better exploits the benefits of knowledge and balances label and knowledge imperfectness.
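The summary does not spell the objective out; a generic form of such a knowledge-balanced objective, written in illustrative notation that is not the paper's, would be:

```latex
% Generic knowledge-balanced objective (illustrative notation):
% a supervised loss and a domain-knowledge consistency loss,
% weighted to reflect label vs. knowledge imperfectness.
\mathcal{L}(\theta) = (1 - \lambda)\,\mathcal{L}_{\mathrm{label}}(\theta)
                    + \lambda\,\mathcal{L}_{\mathrm{knowledge}}(\theta),
\qquad \lambda \in [0, 1]
```

The weight λ would be tuned to reflect the relative reliability of the labels versus the encoded domain knowledge.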
arXiv Detail & Related papers (2022-07-02T06:28:25Z) - A Critical Review of Information Bottleneck Theory and its Applications to Deep Learning [0.0]
Deep neural networks have seen unparalleled improvements that continue to impact every aspect of today's society.
The information bottleneck (IB) theory has emerged as a promising approach to better understand the learning dynamics of neural networks.
The goal of this survey is to provide a comprehensive review of IB theory, covering its information-theoretic roots and the recently proposed applications to understanding deep learning models.
arXiv Detail & Related papers (2021-05-07T14:16:38Z)