Related papers: On knot detection via picture recognition

URL: http://arxiv.org/abs/2510.06284v1
Date: Mon, 06 Oct 2025 22:36:10 GMT
Title: On knot detection via picture recognition
Authors: Anne Dranowski, Yura Kabkov, Daniel Tubbenhauer
Abstract summary: We explain a strategy for approaching automatic knot recognition from photos, using a mixture of modern machine learning methods and traditional algorithms. We present simple baselines that predict crossing number directly from images, showing that even lightweight CNN and transformer architectures can recover meaningful structural information. The longer-term aim is to combine these perception modules with symbolic reconstruction into planar diagram (PD) codes, enabling downstream invariant computation for robust knot classification.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Our goal is to one day take a photo of a knot and have a phone automatically recognize it. In this expository work, we explain a strategy to approximate this goal, using a mixture of modern machine learning methods (in particular convolutional neural networks and transformers for image recognition) and traditional algorithms (to compute quantum invariants like the Jones polynomial). We present simple baselines that predict crossing number directly from images, showing that even lightweight CNN and transformer architectures can recover meaningful structural information. The longer-term aim is to combine these perception modules with symbolic reconstruction into planar diagram (PD) codes, enabling downstream invariant computation for robust knot classification. This two-stage approach highlights the complementarity between machine learning, which handles noisy visual data, and invariants, which enforce rigorous topological distinctions.
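As a rough illustration of the symbolic stage mentioned in the abstract, the following Python sketch shows the kind of sanity checks one could run on a reconstructed PD code before handing it to an invariant computation. It is not the authors' code. It assumes the common convention in which each crossing is a 4-tuple of arc labels (a, b, c, d) where a and c lie on one strand and b and d on the other. The function names and the Hopf link labelling are made up for this example. The number of crossings in a diagram is only an upper bound on the crossing number of the knot.

```python
from collections import Counter

# Trefoil in the PD form commonly listed for 3_1; the Hopf link labelling is
# a hand-written example for this sketch (both are illustrative inputs).
TREFOIL = [(1, 4, 2, 5), (3, 6, 4, 1), (5, 2, 6, 3)]
HOPF = [(1, 4, 2, 3), (3, 2, 4, 1)]


def validate_pd(pd_code):
    """Check that every crossing has 4 labels and every arc label occurs twice."""
    if any(len(crossing) != 4 for crossing in pd_code):
        return False
    counts = Counter(label for crossing in pd_code for label in crossing)
    return all(n == 2 for n in counts.values())


def count_components(pd_code):
    """Count link components by joining arcs that continue through a crossing.

    Assumes the (a, b, c, d) convention above: a continues to c, b to d.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, c, d in pd_code:
        for u, v in ((a, c), (b, d)):
            parent[find(u)] = find(v)
    return len({find(x) for x in list(parent)})


if __name__ == "__main__":
    for name, pd_code in (("trefoil", TREFOIL), ("Hopf link", HOPF)):
        print(name, validate_pd(pd_code), len(pd_code), count_components(pd_code))
```

A perception module that emits a malformed code (for example, an arc label that appears only once) would fail validate_pd, which gives a cheap filter before any more expensive invariant such as the Jones polynomial is computed.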

Related papers

Visual Explanation via Similar Feature Activation for Metric Learning [23.559106251249872]
Class activation maps (CAM) have been extensively employed to explore the interpretability of softmax-based convolutional neural networks. We propose a novel visual explanation method termed Similar Feature Activation Map (SFAM). SFAM provides highly promising interpretable visual explanations for CNN models using Euclidean distance or cosine similarity as the similarity metric.
arXiv Detail & Related papers (2025-06-02T13:14:37Z)
Understanding Transformer-based Vision Models through Inversion [0.8124699127636158]
In this study, we revisit feature inversion, introducing a novel, modular variation that enables significantly more efficient application of the technique. We demonstrate how our method can be systematically applied to the large-scale transformer-based vision models, Detection Transformer and Vision Transformer. Our analysis reveals key insights into how these models encode contextual shape and image details, how their layers correlate, and their robustness against color perturbations.
arXiv Detail & Related papers (2024-12-09T14:43:06Z)
Image segmentation with traveling waves in an exactly solvable recurrent neural network [71.74150501418039]
We show that a recurrent neural network can effectively divide an image into groups according to a scene's structural characteristics. We present a precise description of the mechanism underlying object segmentation in this network. We then demonstrate a simple algorithm for object segmentation that generalizes across inputs ranging from simple geometric objects in grayscale images to natural images.
arXiv Detail & Related papers (2023-11-28T16:46:44Z)
Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components. CNNs are used to augment the local texture information of coarse priors. DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms. We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z)
Augmenting Deep Learning Adaptation for Wearable Sensor Data through Combined Temporal-Frequency Image Encoding [4.458210211781739]
We present a novel modified-recurrent plot-based image representation that seamlessly integrates both temporal and frequency domain information. We evaluate the proposed method using accelerometer-based activity recognition data and a pretrained ResNet model, and demonstrate its superior performance compared to existing approaches.
arXiv Detail & Related papers (2023-07-03T09:29:27Z)
Modeling Image Composition for Complex Scene Generation [77.10533862854706]
We present a method that achieves state-of-the-art results on layout-to-image generation tasks. After compressing RGB images into patch tokens, we propose the Transformer with Focal Attention (TwFA) for exploring dependencies of object-to-object, object-to-patch and patch-to-patch.
arXiv Detail & Related papers (2022-06-02T08:34:25Z)
Ablation study of self-supervised learning for image classification [0.0]
This project focuses on the self-supervised training of convolutional neural networks (CNNs) and transformer networks for the task of image recognition. A simple siamese network with different backbones is used in order to maximize the similarity of two augmented transformed images from the same source image.
arXiv Detail & Related papers (2021-12-04T09:59:01Z)
Transformer Guided Geometry Model for Flow-Based Unsupervised Visual Odometry [38.20137500372927]
We propose a method consisting of two camera pose estimators that deal with the information from pairwise images. For image sequences, a Transformer-like structure is adopted to build a geometry model over a local temporal window. A Flow-to-Flow Pose Estimator (F2FPE) is proposed to exploit the relationship between pairwise images.
arXiv Detail & Related papers (2020-12-08T19:39:26Z)
Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition. Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations. These transformations also use information from both within-class and across-class representations that we extract through clustering. We demonstrate that our method is comparable to the current state of the art for smaller datasets while being able to scale up to larger datasets.
arXiv Detail & Related papers (2020-07-16T17:55:31Z)
Image-to-image Mapping with Many Domains by Sparse Attribute Transfer [71.28847881318013]
Unsupervised image-to-image translation consists of learning a pair of mappings between two domains without known pairwise correspondences between points. Current convention is to approach this task with cycle-consistent GANs. We propose an alternate approach that directly restricts the generator to performing a simple sparse transformation in a latent layer.
arXiv Detail & Related papers (2020-06-23T19:52:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.