M2-Net: Multi-stages Specular Highlight Detection and Removal in
Multi-scenes
- URL: http://arxiv.org/abs/2207.09965v1
- Date: Wed, 20 Jul 2022 15:18:43 GMT
- Title: M2-Net: Multi-stages Specular Highlight Detection and Removal in
Multi-scenes
- Authors: Zhaoyangfan Huang and Kun Hu and Xingjun Wang
- Abstract summary: The framework consists of three main components: a highlight feature extractor module, a coarse highlight removal module, and a refined highlight removal module.
To our knowledge, our algorithm is the first applied to video highlight removal, with promising results.
- Score: 3.312427167335527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel unified framework for highlight
detection and removal in multiple scenes, including synthetic images, face
images, natural images, and text images. The framework consists of three main
components: a highlight feature extractor module, a coarse highlight removal
module, and a refined highlight removal module. First, the highlight feature
extractor module separates highlight features from non-highlight features in
the original highlight image. A coarse highlight-removal image is then
produced by the coarse removal network. To further improve the result, a
refined highlight-removal image is finally obtained by the refinement module,
which is built on contextual highlight attention mechanisms. Extensive
experiments in multiple scenes show that the proposed framework produces
excellent visual results and achieves state-of-the-art scores on several
quantitative evaluation metrics. To our knowledge, our algorithm is also the
first applied to video highlight removal, with promising results.
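The abstract sketches a coarse-to-fine architecture but gives no implementation
detail. Below is a minimal PyTorch sketch of how such a three-stage pipeline
could be wired together; every module body, channel width, and the use of plain
multi-head self-attention in place of the paper's contextual highlight attention
are illustrative assumptions, not the published architecture.

```python
# Minimal sketch of a three-stage coarse-to-fine highlight removal
# pipeline, loosely following the structure described in the abstract.
# All layer choices and names are illustrative assumptions, not the
# architecture published in the paper.
import torch
import torch.nn as nn


class FeatureExtractor(nn.Module):
    """Stage 1 (assumed): split the input image into highlight and
    non-highlight feature maps."""
    def __init__(self, ch=32):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.highlight_head = nn.Conv2d(ch, ch, 3, padding=1)
        self.content_head = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        h = self.shared(x)
        return self.highlight_head(h), self.content_head(h)


class CoarseRemoval(nn.Module):
    """Stage 2 (assumed): predict a coarse highlight-free image from
    the two feature streams."""
    def __init__(self, ch=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, f_high, f_content):
        return self.body(torch.cat([f_high, f_content], dim=1))


class RefineRemoval(nn.Module):
    """Stage 3 (assumed): refine the coarse result with an attention
    step; generic multi-head self-attention stands in here for the
    paper's contextual highlight attention."""
    def __init__(self, ch=32, heads=4):
        super().__init__()
        self.embed = nn.Conv2d(3, ch, 3, padding=1)
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.out = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, coarse):
        f = self.embed(coarse)                      # (B, C, H, W)
        b, c, h, w = f.shape
        seq = f.flatten(2).transpose(1, 2)          # (B, H*W, C)
        ctx, _ = self.attn(seq, seq, seq)           # contextual features
        ctx = ctx.transpose(1, 2).reshape(b, c, h, w)
        return coarse + self.out(ctx)               # residual refinement


class HighlightRemovalNet(nn.Module):
    """Wire the three stages together, coarse to fine."""
    def __init__(self):
        super().__init__()
        self.extract = FeatureExtractor()
        self.coarse = CoarseRemoval()
        self.refine = RefineRemoval()

    def forward(self, x):
        f_high, f_content = self.extract(x)
        coarse = self.coarse(f_high, f_content)
        return self.refine(coarse), coarse          # refined, coarse


if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64)                  # dummy highlight image
    refined, coarse = HighlightRemovalNet()(img)
    print(refined.shape, coarse.shape)              # both (1, 3, 64, 64)
```

Returning both the coarse and refined outputs mirrors a common coarse-to-fine
training setup, where a reconstruction loss can supervise each stage
separately; whether the paper does this is not stated in the abstract.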
Related papers
- MAMS: Model-Agnostic Module Selection Framework for Video Captioning [11.442879458679144]
Existing multi-modal video captioning methods typically extract a fixed number of frames, which raises critical challenges.
This paper proposes the first model-agnostic module selection framework in video captioning.
Our experiments on three different benchmark datasets demonstrate that the proposed framework significantly improves the performance of three recent video captioning models.
arXiv Detail & Related papers (2025-01-30T11:10:18Z)
- Prompt-Aware Controllable Shadow Removal [29.674151621173856]
We introduce a novel paradigm for prompt-aware controllable shadow removal.
Unlike existing approaches, our paradigm allows for targeted shadow removal from specific subjects based on user prompts.
We propose an end-to-end learnable model, the Prompt-Aware Controllable Shadow Removal Network (PACSRNet).
arXiv Detail & Related papers (2025-01-25T02:59:00Z)
- Generalizable Entity Grounding via Assistance of Large Language Model [77.07759442298666]
We propose a novel approach to densely ground visual entities from a long caption.
We leverage a large multimodal model to extract semantic nouns, a class-agnostic segmentation model to generate entity-level segmentation, and a multi-modal feature fusion module to associate each semantic noun with its corresponding segmentation mask.
arXiv Detail & Related papers (2024-02-04T16:06:05Z)
- Towards High-Quality Specular Highlight Removal by Leveraging Large-Scale Synthetic Data [45.30068102110486]
This paper aims to remove specular highlights from a single object-level image.
We propose a three-stage network to address this task.
We present a large-scale synthetic dataset of object-level images.
arXiv Detail & Related papers (2023-09-12T15:10:23Z)
- SimpSON: Simplifying Photo Cleanup with Single-Click Distracting Object Segmentation Network [70.89436857471887]
We propose an interactive distractor selection method that is optimized to achieve the task with just a single click.
Our method surpasses the precision and recall achieved by the traditional method of running panoptic segmentation.
Our experiments demonstrate that the model can effectively and accurately segment unknown distracting objects interactively and in groups.
arXiv Detail & Related papers (2023-05-28T04:05:24Z)
- Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction [88.6585431949086]
We propose a novel Hierarchical Visual Prefix fusion NeTwork (HVPNeT) for visual-enhanced entity and relation extraction.
We regard the visual representation as a pluggable visual prefix that guides the textual representation toward error-insensitive forecasting decisions.
Experiments on three benchmark datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-05-07T02:10:55Z)
- Two-stage Visual Cues Enhancement Network for Referring Image Segmentation [89.49412325699537]
Referring Image Segmentation (RIS) aims at segmenting the target object from an image referred to by a given natural language expression.
In this paper, we tackle this problem by devising a Two-stage Visual cues enhancement Network (TV-Net).
Through the two-stage enhancement, our proposed TV-Net achieves better performance in learning fine-grained matching behaviors between the natural language expression and the image.
arXiv Detail & Related papers (2021-10-09T02:53:39Z)
- Text-Aware Single Image Specular Highlight Removal [14.624958411229862]
Existing methods typically remove specular highlights from medical images and specific-object images; however, they cannot handle images containing text.
In this paper, we first raise and study the text-aware single image specular highlight removal problem.
The core goal is to improve the accuracy of text detection and recognition by removing the highlight from text images.
arXiv Detail & Related papers (2021-08-16T03:51:53Z)
- Disassembling Object Representations without Labels [75.2215716328001]
We study a new representation-learning task, which we term disassembling object representations.
Disassembling enables category-specific modularity in the learned representations.
We propose an unsupervised approach to disassembling, named Unsupervised Disassembling Object Representation (UDOR).
arXiv Detail & Related papers (2020-04-03T08:23:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.