A Hybrid Co-Finetuning Approach for Visual Bug Detection in Video Games
- URL: http://arxiv.org/abs/2510.03591v1
- Date: Sat, 04 Oct 2025 00:43:10 GMT
- Title: A Hybrid Co-Finetuning Approach for Visual Bug Detection in Video Games
- Authors: Faliu Yi, Sherif Abdelfattah, Wei Huang, Adrian Brown
- Abstract summary: We propose a hybrid Co-FineTuning (CFT) method that effectively integrates both labeled and unlabeled data. We show that CFT maintains competitive performance even when trained with only 50% of the labeled data from the target game.
- Score: 3.5838409897789467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Manual identification of visual bugs in video games is a resource-intensive and costly process, often demanding specialized domain knowledge. While supervised visual bug detection models offer a promising solution, their reliance on extensive labeled datasets presents a significant challenge due to the infrequent occurrence of such bugs. To overcome this limitation, we propose a hybrid Co-FineTuning (CFT) method that effectively integrates both labeled and unlabeled data. Our approach leverages labeled samples from the target game and diverse co-domain games, additionally incorporating unlabeled data to enhance feature representation learning. This strategy maximizes the utility of all available data, substantially reducing the dependency on labeled examples from the specific target game. The developed framework demonstrates enhanced scalability and adaptability, facilitating efficient visual bug detection across various game titles. Our experimental results show the robustness of the proposed method for game visual bug detection, exhibiting superior performance compared to conventional baselines across multiple gaming environments. Furthermore, CFT maintains competitive performance even when trained with only 50% of the labeled data from the target game.
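The abstract describes a training objective that combines supervised learning on labeled samples (from the target game and co-domain games) with representation learning on unlabeled data. A minimal sketch of that idea, assuming a hypothetical weighted-sum loss (this is not the authors' code; the function names, the consistency term, and the weights `lam_codomain` and `lam_unsup` are illustrative assumptions):

```python
# Hypothetical sketch of a co-finetuning objective: a supervised loss over
# labeled target-game and co-domain samples, plus a self-supervised
# consistency loss over two augmented views of each unlabeled frame.
import math

def supervised_loss(logits, labels):
    """Mean binary cross-entropy over (logit, label) pairs."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(logits)

def consistency_loss(preds, preds_augmented):
    """Mean squared disagreement between predictions on two views of the
    same unlabeled frame (a stand-in for the feature-learning objective)."""
    return sum((a - b) ** 2 for a, b in zip(preds, preds_augmented)) / len(preds)

def cft_loss(target_logits, target_labels,
             codomain_logits, codomain_labels,
             unlabeled_preds, unlabeled_aug_preds,
             lam_codomain=0.5, lam_unsup=0.1):
    """Weighted sum of target-game, co-domain, and unlabeled objectives."""
    return (supervised_loss(target_logits, target_labels)
            + lam_codomain * supervised_loss(codomain_logits, codomain_labels)
            + lam_unsup * consistency_loss(unlabeled_preds, unlabeled_aug_preds))
```

The weights control how much the co-domain games and the unlabeled data contribute relative to the labeled target-game samples; in the paper's setting, leaning on the latter two terms is what allows halving the target-game labels without losing performance.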
Related papers
- Unleashing the Power of Vision-Language Models for Long-Tailed Multi-Label Visual Recognition [55.189113121465816]
We propose a novel correlation adaptation prompt network (CAPNET) for long-tailed multi-label visual recognition. CAPNET explicitly models correlations from CLIP's textual encoder. It improves generalization through test-time ensembling and realigns visual-textual modalities.
arXiv Detail & Related papers (2025-11-25T18:57:28Z) - Game-invariant Features Through Contrastive and Domain-adversarial Learning [0.0]
Foundational game-image encoders often overfit to game-specific visual styles. We present a method that combines contrastive learning and domain-adversarial training to learn game-invariant visual features.
arXiv Detail & Related papers (2025-05-22T22:45:51Z) - Cluster Aware Graph Anomaly Detection [32.791460110557104]
We propose a cluster-aware multi-view graph anomaly detection method, called CARE. Our approach captures both local and global node affinities by augmenting the graph's adjacency matrix with pseudo-labels. We show that the proposed similarity-guided loss is a variant of the contrastive learning loss.
arXiv Detail & Related papers (2024-09-15T15:41:59Z) - A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [89.92916473403108]
This paper proposes a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods. The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics. We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - CrossMatch: Enhance Semi-Supervised Medical Image Segmentation with Perturbation Strategies and Knowledge Distillation [7.6057981800052845]
CrossMatch is a novel framework that integrates knowledge distillation with dual perturbation strategies, at the image level and the feature level, to improve the model's learning from both labeled and unlabeled data.
Our method significantly surpasses other state-of-the-art techniques in standard benchmarks by effectively minimizing the gap between training on labeled and unlabeled data.
arXiv Detail & Related papers (2024-05-01T07:16:03Z) - Weak Supervision for Label Efficient Visual Bug Detection [0.0]
Traditional testing methods, limited by resources, face difficulties in addressing the plethora of potential bugs.
We propose a novel method, utilizing unlabeled gameplay and domain-specific augmentations to generate datasets & self-supervised objectives.
Our methodology uses weak-supervision to scale datasets for the crafted objectives and facilitates both autonomous and interactive weak-supervision.
arXiv Detail & Related papers (2023-09-20T06:00:02Z) - Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z) - Augment and Criticize: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection [64.65563422852568]
We improve the challenging monocular 3D object detection problem with a general semi-supervised framework.
We introduce a novel, simple, yet effective 'Augment and Criticize' framework that explores abundant informative samples from unlabeled data.
The two new detectors, dubbed 3DSeMo_DLE and 3DSeMo_FLEX, achieve state-of-the-art results with remarkable improvements of over 3.5% AP_3D/BEV (Easy) on KITTI.
arXiv Detail & Related papers (2023-03-20T16:28:15Z) - Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z) - Multi-Environment Pretraining Enables Transfer to Action Limited Datasets [129.24823721649028]
In reinforcement learning, available decision-making data is often not annotated with actions.
We propose combining large but sparsely annotated datasets from a target environment of interest with fully annotated datasets from various other source environments.
We show that utilizing even one additional environment's dataset of sequential labeled data during IDM pretraining yields substantial improvements in generating action labels for unannotated sequences.
arXiv Detail & Related papers (2022-11-23T22:48:22Z) - Distribution Alignment: A Unified Framework for Long-tail Visual Recognition [52.36728157779307]
We propose a unified distribution alignment strategy for long-tail visual recognition.
We then introduce a generalized re-weighting method in the two-stage learning to balance the class prior.
Our approach achieves the state-of-the-art results across all four recognition tasks with a simple and unified framework.
arXiv Detail & Related papers (2021-03-30T14:09:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.