Related papers: Fence Theorem: Towards Dual-Objective Semantic-Structure Isolation in Preprocessing Phase for 3D Anomaly Detection

Related papers

WavePhaseNet: A DFT-Based Method for Constructing Semantic Conceptual Hierarchy Structures (SCHS) [0.0]
This paper reformulates Transformer/Attention mechanisms in Large Language Models.<n> Dimensionality Reduction GPT-4's 24,576-dimensional embedding space exhibits a 1/f spectral structure based on language self-similarity and Zipf's law.<n>Cohomological Consistency Control The reduced embedding space, constructed via cohomological regularization over overlapping local windows, allows defining a graph structure and cochain complex.
arXiv Detail & Related papers (2026-02-16T03:07:41Z)
ELROND: Exploring and decomposing intrinsic capabilities of diffusion models [3.656403721249365]
A single text prompt passed to a diffusion model often yields a wide range of visual outputs determined solely by process.<n>We propose a framework to disentangle these semantic directions directly within the input embedding.
arXiv Detail & Related papers (2026-02-10T19:07:15Z)
Dynamics Within Latent Chain-of-Thought: An Empirical Study of Causal Structure [58.89643769707751]
We study latent chain-of-thought as a manipulable causal process in representation space.<n>We find that latent-step budgets behave less like homogeneous extra depth and more like staged functionality with non-local routing.<n>These results motivate mode-conditional and stability-aware analyses as more reliable tools for interpreting and improving latent reasoning systems.
arXiv Detail & Related papers (2026-02-09T15:25:12Z)
Universal Latent Homeomorphic Manifolds: Cross-Domain Representation Learning via Homeomorphism Verification [4.509161738293017]
We present a framework that unifies semantic representations and observation-driven machine representations into a single latent structure.<n>We use emphhomeomorphism as a criterion for determining when latent manifold induced by different semantic-observation pairs can be rigorously unified.<n>This criterion provides theoretical guarantees for three critical applications: (1) semantic-guided sparse recovery from incomplete observations, (2) cross-domain transfer learning with verified structural compatibility, and (3) zero-shot compositional learning via valid transfer from semantic to observation space.
arXiv Detail & Related papers (2026-01-13T23:08:16Z)
Foundations of Diffusion Models in General State Spaces: A Self-Contained Introduction [54.95522167029998]
This article is a self-contained primer on diffusion over general state spaces.<n>We develop the discrete-time view (forward noising via Markov kernels and learned reverse dynamics) alongside its continuous-time limits.<n>A common variational treatment yields the ELBO that underpins standard training losses.
arXiv Detail & Related papers (2025-12-04T18:55:36Z)
Geometrically-Constrained Agent for Spatial Reasoning [53.93718394870856]
Vision Language Models exhibit a fundamental semantic-to-geometric gap in spatial reasoning.<n>Current paradigms fail to bridge this gap.<n>We propose a training-free agentic paradigm that resolves this gap by introducing a formal task constraint.
arXiv Detail & Related papers (2025-11-27T17:50:37Z)
Adaptive Dual Uncertainty Optimization: Boosting Monocular 3D Object Detection under Test-Time Shifts [80.32933059529135]
Test-Time Adaptation (TTA) methods have emerged to adapt to target distributions during inference.<n>We propose Dual Uncertainty Optimization (DUO), the first TTA framework designed to jointly minimize both uncertainties for robust M3OD.<n>In parallel, we design a semantic-aware normal field constraint that preserves geometric coherence in regions with clear semantic cues.
arXiv Detail & Related papers (2025-08-28T07:09:21Z)
From Entanglement to Alignment: Representation Space Decomposition for Unsupervised Time Series Domain Adaptation [18.100665738436398]
We introduce DARSD, a novel UDA framework with theoretical explainability that explicitly realizes UDA tasks from the perspective of representation space decomposition.<n>DarSD consists of three synergistic components: (I) An adversarial learnable common invariant basis that projects original features into a domain-invariant subspace while preserving semantic content; (II) A pseudo-labeling mechanism that dynamically separates target features based on confidence, hindering error accumulation; (III) A hybrid contrastive optimization strategy that simultaneously enforces feature clustering and consistency while mitigating emerging distribution gaps.
arXiv Detail & Related papers (2025-07-28T16:26:28Z)
Transfinite Fixed Points in Alpay Algebra as Ordinal Game Equilibria in Dependent Type Theory [0.0]
This paper contributes to the Alpay Algebra by demonstrating that the stable outcome of a self referential process is identical to the unique equilibrium of an unbounded revision dialogue between a system and its environment.<n>By unifying concepts from fixed point theory, game semantics, ordinal analysis, and type theory, this research establishes a broadly accessible yet formally rigorous foundation for reasoning about infinite self referential systems.
arXiv Detail & Related papers (2025-07-25T13:12:55Z)
Escaping Plato's Cave: JAM for Aligning Independently Trained Vision and Language Models [29.59537209390697]
We introduce a framework that trains modality-specific autoencoders on latent representations of single modality models.<n>By analogy, this framework serves as a method to escape Plato's Cave, enabling the emergence of shared structure from disjoint inputs.
arXiv Detail & Related papers (2025-07-01T21:43:50Z)
The Unified Cognitive Consciousness Theory for Language Models: Anchoring Semantics, Thresholds of Activation, and Emergent Reasoning [2.0800882594868293]
Unified Cognitive Consciousness Theory (UCCT) casts them as vast unconscious pattern repositories.<n>UCCT formalizes this process as Bayesian competition between statistical priors learned in pre-training and context-driven target patterns.<n>We ground the theory in three principles: threshold crossing, modality, and density-distance predictive power.
arXiv Detail & Related papers (2025-06-02T18:12:43Z)
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding [58.38294408121273]
We propose Cross-modal and Uncertainty-aware Agglomeration for Open-vocabulary 3D Scene Understanding dubbed CUA-O3D. Our method addresses two key challenges: (1) incorporating semantic priors from VLMs alongside the geometric knowledge of spatially-aware vision foundation models, and (2) using a novel deterministic uncertainty estimation to capture model-specific uncertainties.
arXiv Detail & Related papers (2025-03-20T20:58:48Z)
ARC-Flow : Articulated, Resolution-Agnostic, Correspondence-Free Matching and Interpolation of 3D Shapes Under Flow Fields [4.706075725469252]
This work presents a unified framework for the unsupervised prediction of physically plausibles between two 3D articulated shapes. Interpolation is modelled as a diffeomorphic transformation using a smooth, time-varying flow field governed by Neural Ordinary Differential Equations (ODEs) Correspondence is recovered using an efficient Varifold formulation, that is effective on high-fidelity surfaces with differing parameterisations.
arXiv Detail & Related papers (2025-03-04T13:28:05Z)
3D Human Pose Analysis via Diffusion Synthesis [65.268245109828]
PADS represents the first diffusion-based framework for tackling general 3D human pose analysis within the inverse problem framework. Its performance has been validated on different benchmarks, signaling the adaptability and robustness of this pipeline.
arXiv Detail & Related papers (2024-01-17T02:59:34Z)
Exploiting Diffusion Prior for Generalizable Dense Prediction [85.4563592053464]
Recent advanced Text-to-Image (T2I) diffusion models are sometimes too imaginative for existing off-the-shelf dense predictors to estimate. We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks. Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-30T18:59:44Z)
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space. However, the predictions are often unreliable due to the Aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts. We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arisen from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment. Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
arXiv Detail & Related papers (2023-03-10T14:38:49Z)
Let us Build Bridges: Understanding and Extending Diffusion Generative Models [19.517597928769042]
Diffusion-based generative models have achieved promising results recently, but raise an array of open questions. This work tries to re-exam the overall framework in order to gain better theoretical understandings. We present 1) a first theoretical error analysis for learning diffusion generation models, and 2) a simple and unified approach to learning on data from different discrete and constrained domains.
arXiv Detail & Related papers (2022-08-31T08:58:10Z)
Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations. We derive suitable measures to quantify prediction uncertainty at both pose and joint level. We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
Shape Consistent 2D Keypoint Estimation under Domain Shift [35.15266729401601]
We present a novel deep adaptation framework for estimating keypoints under domain shift. Our method seamlessly combines three different components: feature alignment, adversarial training and self-supervision. Our approach outperforms state-of-the-art domain adaptation methods in the 2D keypoint prediction task.
arXiv Detail & Related papers (2020-08-04T14:32:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.