Image-Based Multi-Survey Classification of Light Curves with a Pre-Trained Vision Transformer
- URL: http://arxiv.org/abs/2507.11711v1
- Date: Tue, 15 Jul 2025 20:30:21 GMT
- Title: Image-Based Multi-Survey Classification of Light Curves with a Pre-Trained Vision Transformer
- Authors: Daniel Moreno-Cartagena, Guillermo Cabrera-Vives, Alejandra M. Muñoz Arancibia, Pavlos Protopapas, Francisco Förster, Márcio Catelan, A. Bayo, Pablo A. Estévez, P. Sánchez-Sáez, Franz E. Bauer, M. Pavez-Herrera, L. Hernández-García, Gonzalo Rojas
- Abstract summary: We explore the use of Swin Transformer V2, a pre-trained vision Transformer, for photometric classification in a multi-survey setting. We evaluate different strategies for integrating data from the Zwicky Transient Facility (ZTF) and the Asteroid Terrestrial-impact Last Alert System (ATLAS).
- Score: 31.76431580841178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore the use of Swin Transformer V2, a pre-trained vision Transformer, for photometric classification in a multi-survey setting by leveraging light curves from the Zwicky Transient Facility (ZTF) and the Asteroid Terrestrial-impact Last Alert System (ATLAS). We evaluate different strategies for integrating data from these surveys and find that a multi-survey architecture which processes them jointly achieves the best performance. These results highlight the importance of modeling survey-specific characteristics and cross-survey interactions, and provide guidance for building scalable classifiers for future time-domain astronomy.
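Below is a minimal, hedged sketch of the image-based pipeline the abstract describes: a light curve is rendered as an RGB image and fed to a pre-trained Swin V2 backbone with a new classification head. It is not the authors' code; the rendering scheme, torchvision's swin_v2_b weights (standing in for the paper's Swin Transformer V2), and the class count are illustrative assumptions. In the paper's best-performing setup, ZTF and ATLAS inputs are processed jointly; the sketch shows a single survey for brevity.

```python
# Sketch only (assumptions noted above): render a light curve as an
# image, then classify it with a pre-trained Swin V2 backbone.
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import swin_v2_b, Swin_V2_B_Weights

def light_curve_to_image(time, mag, size_px=256):
    """Render (time, magnitude) points as an RGB tensor; details illustrative."""
    fig, ax = plt.subplots(figsize=(size_px / 100, size_px / 100), dpi=100)
    ax.scatter(time, mag, s=4, c="black")
    ax.invert_yaxis()  # smaller magnitude = brighter
    ax.axis("off")
    fig.canvas.draw()
    img = np.asarray(fig.canvas.buffer_rgba())[..., :3].copy()  # (H, W, 3) uint8
    plt.close(fig)
    return torch.from_numpy(img).permute(2, 0, 1).float() / 255.0

num_classes = 5  # assumed number of variability classes
model = swin_v2_b(weights=Swin_V2_B_Weights.IMAGENET1K_V1)   # ImageNet pre-training
model.head = nn.Linear(model.head.in_features, num_classes)  # new classifier head
model.eval()

# Toy forward pass on one synthetic periodic light curve.
t = np.sort(np.random.uniform(0, 100, 200))
m = 15.0 + 0.3 * np.sin(2 * np.pi * t / 7.3) + 0.05 * np.random.randn(t.size)
x = light_curve_to_image(t, m).unsqueeze(0)  # (1, 3, 256, 256)
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # torch.Size([1, 5])
```

Fine-tuning would replace the toy forward pass with standard supervised training on labeled light-curve images.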
Related papers
- AetherVision-Bench: An Open-Vocabulary RGB-Infrared Benchmark for Multi-Angle Segmentation across Aerial and Ground Perspectives [2.0293118701268154]
Embodied AI systems are transforming autonomous navigation for ground vehicles and drones by enhancing their perception abilities. We present AetherVision-Bench, a benchmark for multi-angle segmentation across aerial and ground perspectives. We assess state-of-the-art OVSS models on the proposed benchmark and investigate the key factors that impact the performance of zero-shot transfer models.
arXiv Detail & Related papers (2025-06-04T08:41:19Z)
- HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis [7.116403133334646]
We propose HyTAS, the first benchmark on transformer architecture search for Hyperspectral imaging.
We evaluate 12 different methods to identify the optimal transformer over 5 different datasets.
We perform an extensive factor analysis on the Hyperspectral transformer search performance.
arXiv Detail & Related papers (2024-07-23T08:18:43Z)
- Geometric Features Enhanced Human-Object Interaction Detection [11.513009304308724]
We propose a novel end-to-end Transformer-style HOI detection model, i.e., the geometric features enhanced HOI detector (GeoHOI).
One key part of the model is a new unified self-supervised keypoint learning method named UniPointNet.
GeoHOI effectively upgrades a Transformer-based HOI detector by exploiting keypoint similarities that measure the likelihood of human-object interactions.
arXiv Detail & Related papers (2024-06-26T18:52:53Z)
- Transformers Fusion across Disjoint Samples for Hyperspectral Image Classification [2.1223532600703385]
3D Swin Transformer (3D-ST) excels in capturing intricate spatial relationships within images.
SST specializes in modeling long-range dependencies through self-attention mechanisms.
This paper introduces an attentional fusion of these two transformers to significantly enhance the classification performance of Hyperspectral Images (HSIs); a toy fusion module is sketched after this entry.
arXiv Detail & Related papers (2024-05-02T08:49:01Z)
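As a concrete illustration of the attentional-fusion idea above, here is a hedged toy module (an assumed design, not the paper's exact architecture) in which tokens from a spatial branch cross-attend to tokens from a spectral branch before pooling and classification.

```python
# Hedged sketch of fusing two transformer branches with cross-attention.
import torch
import torch.nn as nn

class AttentionalFusion(nn.Module):
    """Fuse two token sequences via cross-attention, then classify."""
    def __init__(self, dim=64, heads=4, num_classes=16):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, spatial_tokens, spectral_tokens):
        # Queries come from one branch, keys/values from the other.
        fused, _ = self.cross_attn(spatial_tokens, spectral_tokens, spectral_tokens)
        fused = self.norm(fused + spatial_tokens)  # residual connection
        return self.head(fused.mean(dim=1))        # pool tokens, classify

# Toy shapes: batch of 2 patches, 9 tokens each, 64-dim features.
a = torch.randn(2, 9, 64)  # stand-in for spatial (e.g., 3D-Swin) features
b = torch.randn(2, 9, 64)  # stand-in for spectral transformer features
print(AttentionalFusion()(a, b).shape)  # torch.Size([2, 16])
```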
- A Comprehensive Survey for Hyperspectral Image Classification: The Evolution from Conventional to Transformers and Mamba Models [25.18873183963132]
Hyperspectral Image Classification (HSC) presents significant challenges owing to the high dimensionality and intricate nature of HS data.
Deep Learning (DL) techniques have emerged as robust solutions to address these challenges.
We systematically review key concepts, methodologies, and state-of-the-art approaches in DL for HSC.
arXiv Detail & Related papers (2024-04-23T12:00:20Z)
- DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion [82.2425759608975]
Infrared-visible object detection aims to achieve robust, full-day object detection by fusing the complementary information of infrared and visible images.
We propose a Dynamic Adaptive Multispectral Detection Transformer (DAMSDet) to address these challenges.
Experiments on four public datasets demonstrate significant improvements compared to other state-of-the-art methods.
arXiv Detail & Related papers (2024-03-01T07:03:27Z)
- Leveraging the Power of Data Augmentation for Transformer-based Tracking [64.46371987827312]
We propose two data augmentation methods customized for tracking.
First, we optimize existing random cropping via a dynamic search radius mechanism and simulation for boundary samples.
Second, we propose a token-level feature mixing augmentation strategy, which strengthens the model against challenges such as background interference; a toy version is sketched after this entry.
arXiv Detail & Related papers (2023-09-15T09:18:54Z)
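The token-level feature mixing idea above can be illustrated with a hedged toy function (the paper's exact mixing rule may differ): a fraction of search-region tokens is replaced with background/distractor tokens so the tracker learns to cope with interference.

```python
# Hedged illustration of token-level feature mixing for tracking robustness.
import torch

def token_mix(search_tokens: torch.Tensor,
              background_tokens: torch.Tensor,
              mix_ratio: float = 0.1) -> torch.Tensor:
    """search/background: (batch, num_tokens, dim). Returns mixed tokens."""
    b, n, _ = search_tokens.shape
    num_mix = max(1, int(n * mix_ratio))
    mixed = search_tokens.clone()
    for i in range(b):
        idx = torch.randperm(n)[:num_mix]          # tokens to perturb
        mixed[i, idx] = background_tokens[i, idx]  # inject background features
    return mixed

x = torch.randn(4, 196, 256)        # search-region tokens
bg = torch.randn(4, 196, 256)       # background/distractor tokens
print(token_mix(x, bg, 0.2).shape)  # torch.Size([4, 196, 256])
```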
- ViT-Calibrator: Decision Stream Calibration for Vision Transformer [49.60474757318486]
We propose a new paradigm dubbed Decision Stream that boosts the performance of general Vision Transformers.
We shed light on the information propagation mechanism in the learning procedure by exploring the correlation between different tokens and the relevance coefficient of multiple dimensions.
arXiv Detail & Related papers (2023-04-10T02:40:24Z)
- Demystify Transformers & Convolutions in Modern Image Deep Networks [80.16624587948368]
This paper aims to identify the real gains of popular convolution and attention operators through a detailed study. We find that the key difference among these feature transformation modules, such as attention or convolution, lies in their spatial feature aggregation approach. Various spatial token mixers (STMs) are integrated into this unified framework for comprehensive comparative analysis.
arXiv Detail & Related papers (2022-11-10T18:59:43Z)
- Structural Prior Guided Generative Adversarial Transformers for Low-Light Image Enhancement [51.22694467126883]
We propose an effective Structural Prior guided Generative Adversarial Transformer (SPGAT) for low-light image enhancement.
The generator is based on a U-shaped Transformer, which is used to explore non-local information for better restoration of clear images.
To generate more realistic images, we develop a new structural prior guided adversarial learning method by building the skip connections between the generator and discriminators.
arXiv Detail & Related papers (2022-07-16T04:05:40Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets; a toy hybrid block is sketched after this entry.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
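To make the hybrid CNN-Transformer idea above concrete, here is a hedged toy SR block (an assumed design, not the paper's network): convolutions extract local features, a transformer encoder adds long-range context over the flattened feature map, and pixel shuffle upsamples.

```python
# Hedged sketch of a hybrid CNN + transformer super-resolution block.
import torch
import torch.nn as nn

class HybridSRBlock(nn.Module):
    def __init__(self, channels=32, scale=2):
        super().__init__()
        self.cnn = nn.Sequential(                     # local feature extraction
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=4,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.upsample = nn.Sequential(
            nn.Conv2d(channels, 3 * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),                   # (B, 3, H*scale, W*scale)
        )

    def forward(self, x):
        feat = self.cnn(x)                            # local CNN features
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)      # (B, H*W, C)
        tokens = self.transformer(tokens)             # long-range modeling
        feat = feat + tokens.transpose(1, 2).reshape(b, c, h, w)  # fuse streams
        return self.upsample(feat)

lr = torch.randn(1, 3, 48, 48)
print(HybridSRBlock()(lr).shape)  # torch.Size([1, 3, 96, 96])
```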
- ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond [76.35955924137986]
We propose a Vision Transformer Advanced by Exploring intrinsic inductive bias (IB) from convolutions, i.e., ViTAE.
ViTAE has several spatial pyramid reduction modules to downsample and embed the input image into tokens with rich multi-scale context; a toy reduction module is sketched after this entry.
We obtain the state-of-the-art classification performance, i.e., 88.5% Top-1 classification accuracy on ImageNet validation set and the best 91.2% Top-1 accuracy on ImageNet real validation set.
arXiv Detail & Related papers (2022-02-21T10:40:05Z)
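A hedged toy version of a spatial pyramid reduction module in the spirit of ViTAE is sketched below; the branch count, dilation rates, and fusion scheme are assumptions, not the published design. Parallel dilated convolutions downsample the image while embedding multi-scale context into tokens.

```python
# Hedged sketch: multi-scale downsampling into tokens via dilated convs.
import torch
import torch.nn as nn

class PyramidReduction(nn.Module):
    def __init__(self, in_ch=3, dim=64, rates=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, dim, kernel_size=3, stride=2,
                      padding=r, dilation=r)          # stride-2 downsampling
            for r in rates
        ])
        self.proj = nn.Linear(dim * len(rates), dim)  # fuse scales per token

    def forward(self, x):
        feats = torch.cat([b(x) for b in self.branches], dim=1)  # (B, 3*dim, H/2, W/2)
        tokens = feats.flatten(2).transpose(1, 2)                # (B, N, 3*dim)
        return self.proj(tokens)                                 # (B, N, dim)

img = torch.randn(2, 3, 32, 32)
print(PyramidReduction()(img).shape)  # torch.Size([2, 256, 64])
```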
- Paying Attention to Astronomical Transients: Introducing the Time-series Transformer for Photometric Classification [6.586394734694152]
We develop a new architecture based on the transformer, first proposed for natural language processing.
We apply the time-series transformer to the task of photometric classification, minimising the reliance on expert domain knowledge.
We achieve a logarithmic loss of 0.507 on imbalanced data in a representative setting using data from the Photometric LSST Astronomical Time-Series Classification Challenge; the metric is sketched after this entry.
arXiv Detail & Related papers (2021-05-13T10:16:13Z)
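For reference, the logarithmic loss quoted above is the multi-class cross-entropy averaged over objects; a minimal unweighted version is sketched below (the challenge's official metric is a class-weighted variant).

```python
# Minimal multi-class log-loss, unweighted for simplicity.
import numpy as np

def log_loss(y_true: np.ndarray, y_prob: np.ndarray, eps: float = 1e-15) -> float:
    """y_true: (N,) integer labels; y_prob: (N, C) predicted probabilities."""
    p = np.clip(y_prob, eps, 1.0)
    p /= p.sum(axis=1, keepdims=True)  # renormalise after clipping
    return float(-np.mean(np.log(p[np.arange(len(y_true)), y_true])))

y = np.array([0, 2, 1])
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.2, 0.6, 0.2]])
print(round(log_loss(y, probs), 3))  # ~0.408
```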
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.