Related papers: GasTwinFormer: A Hybrid Vision Transformer for Livestock Methane Emission Segmentation and Dietary Classification in Optical Gas Imaging

URL: http://arxiv.org/abs/2508.15057v1
Date: Wed, 20 Aug 2025 20:45:10 GMT
Title: GasTwinFormer: A Hybrid Vision Transformer for Livestock Methane Emission Segmentation and Dietary Classification in Optical Gas Imaging
Authors: Toqi Tahamid Sarker, Mohamed Embaby, Taminul Islam, Amer AbuGhazaleh, Khaled R Ahmed
Abstract summary: GasTwinFormer is a hybrid vision transformer for real-time methane emission segmentation and dietary classification in optical gas imaging. We contribute the first comprehensive beef cattle methane emission dataset using OGI, containing 11,694 annotated frames across three dietary treatments. GasTwinFormer achieves 74.47% mIoU and 83.63% mF1 for segmentation while maintaining exceptional efficiency with only 3.348M parameters, 3.428G FLOPs, and 114.9 FPS inference speed.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Livestock methane emissions represent 32% of human-caused methane production, making automated monitoring critical for climate mitigation strategies. We introduce GasTwinFormer, a hybrid vision transformer for real-time methane emission segmentation and dietary classification in optical gas imaging through a novel Mix Twin encoder alternating between spatially-reduced global attention and locally-grouped attention mechanisms. Our architecture incorporates a lightweight LR-ASPP decoder for multi-scale feature aggregation and enables simultaneous methane segmentation and dietary classification in a unified framework. We contribute the first comprehensive beef cattle methane emission dataset using OGI, containing 11,694 annotated frames across three dietary treatments. GasTwinFormer achieves 74.47% mIoU and 83.63% mF1 for segmentation while maintaining exceptional efficiency with only 3.348M parameters, 3.428G FLOPs, and 114.9 FPS inference speed. Additionally, our method achieves perfect dietary classification accuracy (100%), demonstrating the effectiveness of leveraging diet-emission correlations. Extensive ablation studies validate each architectural component, establishing GasTwinFormer as a practical solution for real-time livestock emission monitoring. Please see our project page at gastwinformer.github.io.
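
The abstract reports segmentation quality as mIoU and mF1. As a minimal sketch of how such per-class averages are commonly computed from a confusion matrix (the function names and toy labels below are hypothetical and are not taken from the GasTwinFormer code, whose exact evaluation protocol is not described here):

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count (true class, predicted class) pairs over flattened pixel labels."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_mf1(cm):
    """Return (mean IoU, mean F1) averaged over classes that appear at least once."""
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class otherwise
        if tp + fp + fn == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    if not ious:
        return 0.0, 0.0
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example: 0 = background, 1 = gas plume (assumed class layout).
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
miou, mf1 = miou_and_mf1(confusion_matrix(y_true, y_pred, num_classes=2))
print(f"mIoU={miou:.4f}, mF1={mf1:.4f}")
```

For a single class, F1 equals 2·IoU / (1 + IoU), so F1 is never lower than IoU; this is consistent with the reported mF1 (83.63%) exceeding the reported mIoU (74.47%).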

Related papers

FUME: Fused Unified Multi-Gas Emission Network for Livestock Rumen Acidosis Detection [3.7515646463759698]
Ruminal acidosis is a prevalent metabolic disorder in dairy cattle causing significant economic losses and animal welfare concerns. We present FUME, the first deep learning approach for rumen acidosis detection from dual-gas optical imaging under in vitro conditions. Our work establishes the feasibility of gas emission-based livestock health monitoring, paving the way for practical, in vitro acidosis detection systems.
arXiv Detail & Related papers (2026-01-13T04:17:22Z)
PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer [54.958921946378304]
We introduce PanFoMa, a lightweight hybrid neural network that combines the strengths of Transformers and state-space models. PanFoMa consists of a front-end local-context encoder with shared self-attention layers to capture complex, order-independent gene interactions. We also construct a large-scale pan-cancer single-cell benchmark, PanFoMaBench, containing over 3.5 million high-quality cells.
arXiv Detail & Related papers (2025-12-02T08:31:31Z)
Multi-Representation Attention Framework for Underwater Bioacoustic Denoising and Recognition [0.924965746838578]
We introduce a multi-step, attention-guided framework that first segments spectrograms to generate soft masks of biologically relevant energy. Image and mask embeddings are integrated via mid-level fusion, enabling the model to focus on salient spectrogram regions. Using real-world recordings from the Saguenay St. Lawrence Marine Park Research Station in Canada, we demonstrate that segmentation-driven attention and mid-level fusion improve signal discrimination.
arXiv Detail & Related papers (2025-10-29T22:49:15Z)
Lightweight Vision Transformer with Window and Spatial Attention for Food Image Classification [1.1472801896854488]
We propose a lightweight food image classification algorithm that integrates a Window Multi-Head Attention Mechanism (WMHAM) and a Spatial Attention Mechanism (SAM). Our model achieves accuracies of 95.24% and 94.33%, respectively, while significantly reducing parameters and FLOPs compared with baseline methods.
arXiv Detail & Related papers (2025-09-23T06:23:50Z)
CarboNeXT and CarboFormer: Dual Semantic Segmentation Architectures for Detecting and Quantifying Carbon Dioxide Emissions Using Optical Gas Imaging [0.0]
Carbon dioxide (CO$_2$) emissions are critical indicators of both environmental impact and various industrial processes, including livestock management. We introduce CarboNeXT, a semantic segmentation framework for Optical Gas Imaging (OGI), designed to detect and quantify CO$_2$ emissions across diverse applications.
arXiv Detail & Related papers (2025-05-23T18:01:42Z)
Machine Learning for Methane Detection and Quantification from Space -- A survey [49.7996292123687]
Methane (CH_4) is a potent anthropogenic greenhouse gas, contributing 86 times more to global warming than Carbon Dioxide (CO_2) over 20 years. This work expands existing information on operational methane point source detection sensors in the Short-Wave Infrared (SWIR) bands. It reviews the state-of-the-art for traditional as well as Machine Learning (ML) approaches.
arXiv Detail & Related papers (2024-08-27T15:03:20Z)
Gasformer: A Transformer-based Architecture for Segmenting Methane Emissions from Livestock in Optical Gas Imaging [0.0]
Methane emissions from livestock, particularly cattle, significantly contribute to climate change. We introduce Gasformer, a novel semantic segmentation architecture for detecting low-flow rate methane emissions from livestock. We present two unique datasets captured with a FLIR GF77 OGI camera.
arXiv Detail & Related papers (2024-04-16T18:38:23Z)
Unlocking the Potential: Multi-task Deep Learning for Spaceborne Quantitative Monitoring of Fugitive Methane Plumes [0.7970333810038046]
Methane concentration inversion, plume segmentation, and emission rate estimation are three subtasks of methane emission monitoring. We introduce a novel deep learning-based framework for quantitative methane emission monitoring from remote sensing images. We train a U-Net network for methane concentration inversion, a Mask R-CNN network for methane plume segmentation, and a ResNet-50 network for methane emission rate estimation.
arXiv Detail & Related papers (2024-01-23T16:04:19Z)
Autonomous Detection of Methane Emissions in Multispectral Satellite Data Using Deep Learning [73.01013149014865]
Methane is one of the most potent greenhouse gases. Current methane emission monitoring techniques rely on approximate emission factors or self-reporting. Deep learning methods can be leveraged to automatize the detection of methane leaks in Sentinel-2 satellite multispectral data.
arXiv Detail & Related papers (2023-08-21T19:36:50Z)
Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification. Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations. In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z)
CageViT: Convolutional Activation Guided Efficient Vision Transformer [90.69578999760206]
This paper presents an efficient vision Transformer, called CageViT, that is guided by convolutional activation to reduce computation. Our CageViT, unlike current Transformers, utilizes a new encoder to handle the rearranged tokens. Experimental results demonstrate that the proposed CageViT outperforms the most recent state-of-the-art backbones by a large margin in terms of efficiency.
arXiv Detail & Related papers (2023-05-17T03:19:18Z)
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm [111.17100512647619]
This paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary algorithm (EA). We propose a novel pyramid EATFormer backbone that only contains the proposed EA-based transformer (EAT) block. Massive quantitative and qualitative experiments on image classification, downstream tasks, and explanatory experiments demonstrate the effectiveness and superiority of our approach.
arXiv Detail & Related papers (2022-06-19T04:49:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.