Feature-Based Lie Group Transformer for Real-World Applications
- URL: http://arxiv.org/abs/2506.04668v3
- Date: Mon, 09 Jun 2025 12:10:31 GMT
- Title: Feature-Based Lie Group Transformer for Real-World Applications
- Authors: Takayuki Komatsu, Yoshiyuki Ohmura, Kayato Nishitsunoi, Yasuo Kuniyoshi
- Abstract summary: The main goal of representation learning is to acquire meaningful representations from real-world sensory inputs without supervision. We propose a new method using group decomposition in Galois algebra theory. Although this method is promising for defining a more general representation, it assumes pixel-to-pixel translation without feature extraction. We provide a method to apply our group decomposition theory to a more realistic scenario by combining feature extraction and object segmentation.
- Score: 3.1936317340169817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The main goal of representation learning is to acquire meaningful representations from real-world sensory inputs without supervision. Representation learning explains some aspects of human development. Various neural network (NN) models have been proposed that acquire empirically good representations. However, the formulation of a good representation has not been established. We recently proposed a method for categorizing changes between a pair of sensory inputs. A unique feature of this approach is that transformations between two sensory inputs are learned to satisfy algebraic structural constraints. Conventional representation learning often assumes that disentangled independent feature axes are a good representation; however, we found that such a representation cannot account for conditional independence. To overcome this problem, we proposed a new method using group decomposition in Galois algebra theory. Although this method is promising for defining a more general representation, it assumes pixel-to-pixel translation without feature extraction, and can only process low-resolution images with no background, which prevents real-world application. In this study, we provide a simple method to apply our group decomposition theory to a more realistic scenario by combining feature extraction and object segmentation. We replace pixel translation with feature translation and formulate object segmentation as grouping features under the same transformation. We validated the proposed method on a practical dataset containing both real-world objects and backgrounds. We believe that our model will lead to a better understanding of human development of object recognition in the real world.
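The idea of "grouping features under the same transformation" can be illustrated with a toy sketch. This is not the authors' model: it assumes the per-feature translations between two inputs are already known, and it simply gives feature locations that share the same translation the same segment label. The function name `group_by_translation` and the example grid are hypothetical.

```python
import numpy as np


def group_by_translation(displacements: np.ndarray) -> np.ndarray:
    """Label each feature location by its (rounded) translation.

    displacements: array of shape (H, W, 2) holding an assumed, already
    estimated (dy, dx) translation for every feature location.
    Returns an (H, W) integer array; equal labels mean equal translation.
    """
    h, w, _ = displacements.shape
    flat = np.rint(displacements).astype(int).reshape(-1, 2)
    # Each distinct translation row becomes one group; the inverse index
    # tells us which group every feature location belongs to.
    _, inverse = np.unique(flat, axis=0, return_inverse=True)
    return np.asarray(inverse).reshape(h, w)


# Toy example: a static background and a 2x2 "object" that moves by (1, 2).
disp = np.zeros((4, 4, 2))
disp[1:3, 1:3] = (1, 2)
labels = group_by_translation(disp)
print(labels)
```

In this sketch the background and the moving block end up in two different groups, which is the sense in which a shared transformation can act as a segmentation cue. The paper learns both the features and the transformations, which this example does not attempt.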
Related papers
- Efficient Fairness-Performance Pareto Front Computation [51.558848491038916]
We show that optimal fair representations possess several useful structural properties.
We then show that these approximating problems can be solved efficiently via concave programming methods.
arXiv Detail & Related papers (2024-09-26T08:46:48Z)
- Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations.
We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z)
- Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z)
- Subspace Nonnegative Matrix Factorization for Feature Representation [14.251799988700558]
Nonnegative matrix factorization (NMF) learns a new feature representation on the whole data space, which means treating all features equally.
This paper proposes a new NMF method by introducing adaptive weights to identify key features in the original space so that only a subspace involves generating the new representation.
Experimental results on several real-world datasets demonstrated that the proposed methods can generate a more accurate feature representation than existing methods.
arXiv Detail & Related papers (2022-04-18T16:07:06Z)
- Leveraging Equivariant Features for Absolute Pose Regression [9.30597356471664]
We show that a translation and rotation equivariant Convolutional Neural Network directly induces representations of camera motions into the feature space.
We then show that this geometric property allows for implicitly augmenting the training data under a whole group of image plane-preserving transformations.
arXiv Detail & Related papers (2022-04-05T12:44:20Z)
- Fair Interpretable Representation Learning with Correction Vectors [60.0806628713968]
We propose a new framework for fair representation learning that is centered around the learning of "correction vectors".
We show experimentally that several fair representation learning models constrained in such a way do not exhibit losses in ranking or classification performance.
arXiv Detail & Related papers (2022-02-07T11:19:23Z)
- Fair Interpretable Learning via Correction Vectors [68.29997072804537]
We propose a new framework for fair representation learning centered around the learning of "correction vectors".
The corrections are then simply summed up to the original features, and can therefore be analyzed as an explicit penalty or bonus to each feature.
We show experimentally that a fair representation learning problem constrained in such a way does not impact performance.
arXiv Detail & Related papers (2022-01-17T10:59:33Z)
- Self-Supervised Learning Disentangled Group Representation as Feature [82.07737719232972]
We show that existing Self-Supervised Learning (SSL) only disentangles simple augmentation features such as rotation and colorization.
We propose an iterative SSL algorithm: Iterative Partition-based Invariant Risk Minimization (IP-IRM).
We prove that IP-IRM converges to a fully disentangled representation and show its effectiveness on various benchmarks.
arXiv Detail & Related papers (2021-10-28T16:12:33Z)
- GENESIS-V2: Inferring Unordered Object Representations without Iterative Refinement [26.151968529063762]
We develop a new model, GENESIS-V2, which can infer a variable number of object representations without using RNNs or iterative refinement.
We show that GENESIS-V2 outperforms previous methods for unsupervised image segmentation and object-centric scene generation on established synthetic datasets.
arXiv Detail & Related papers (2021-04-20T14:59:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.