Weak Supervision with Arbitrary Single Frame for Micro- and Macro-expression Spotting
- URL: http://arxiv.org/abs/2403.14240v1
- Date: Thu, 21 Mar 2024 09:01:21 GMT
- Title: Weak Supervision with Arbitrary Single Frame for Micro- and Macro-expression Spotting
- Authors: Wang-Wang Yu, Xian-Shi Zhang, Fu-Ya Luo, Yijun Cao, Kai-Fu Yang, Hong-Mei Yan, Yong-Jie Li
- Abstract summary: We propose a point-level weakly-supervised expression spotting (PWES) framework, where each expression needs to be annotated with only one random frame (i.e., a point).
We show that multi-refined pseudo label generation (MPLG) generates more reliable pseudo labels by merging class-specific probabilities, attention scores, fused features, and point-level labels.
Experiments on the CAS(ME)^2, CAS(ME)^3, and SAMM-LV datasets demonstrate that PWES achieves promising performance comparable to that of recent fully-supervised methods.
- Score: 22.04975008531069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Frame-level micro- and macro-expression spotting methods require time-consuming frame-by-frame observation during annotation. Meanwhile, video-level spotting lacks sufficient information about the location and number of expressions during training, resulting in significantly inferior performance compared with fully-supervised spotting. To bridge this gap, we propose a point-level weakly-supervised expression spotting (PWES) framework, where each expression needs to be annotated with only one random frame (i.e., a point). To mitigate the issue of sparse label distribution, the prevailing solution is pseudo-label mining, which, however, introduces new problems: localizing contextual background snippets results in inaccurate boundaries, and discarding foreground snippets leads to fragmentary predictions. Therefore, we design the strategies of multi-refined pseudo label generation (MPLG) and distribution-guided feature contrastive learning (DFCL) to address these problems. Specifically, MPLG generates more reliable pseudo labels by merging class-specific probabilities, attention scores, fused features, and point-level labels. DFCL is utilized to enhance feature similarity for the same categories and feature variability for different categories while capturing global representations across entire datasets. Extensive experiments on the CAS(ME)^2, CAS(ME)^3, and SAMM-LV datasets demonstrate that PWES achieves promising performance comparable to that of recent fully-supervised methods.
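The point-level setting in the abstract can be illustrated with a minimal sketch of generic pseudo-label mining: each expression has one annotated frame, and that point is grown into an interval using per-snippet foreground scores. This is not the authors' MPLG strategy, which the abstract says also merges attention scores, fused features, and class-specific probabilities. The function name `expand_point_labels`, the score values, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def expand_point_labels(
    scores: List[float],
    points: List[int],
    threshold: float = 0.5,
) -> List[Tuple[int, int]]:
    """Grow each annotated point into an inclusive (start, end) interval.

    A neighbouring snippet joins the interval while its score stays at or
    above `threshold`; the annotated point itself is always kept.
    """
    intervals = []
    for p in points:
        start = end = p
        # Extend left while the neighbouring snippet looks like foreground.
        while start - 1 >= 0 and scores[start - 1] >= threshold:
            start -= 1
        # Extend right under the same rule.
        while end + 1 < len(scores) and scores[end + 1] >= threshold:
            end += 1
        intervals.append((start, end))
    return intervals


# Hypothetical per-snippet foreground scores for a 10-snippet video.
scores = [0.1, 0.2, 0.7, 0.9, 0.8, 0.3, 0.1, 0.6, 0.7, 0.2]
# One annotated frame (point) per expression, as in point-level supervision.
points = [3, 8]
pseudo_intervals = expand_point_labels(scores, points)
print(pseudo_intervals)  # [(2, 4), (7, 8)]
```

A fixed threshold like this is exactly where the problems named in the abstract can arise: set too low, background snippets are absorbed and boundaries become inaccurate; set too high, foreground snippets are dropped and predictions fragment.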
Related papers
- Bridge Feature Matching and Cross-Modal Alignment with Mutual-filtering for Zero-shot Anomaly Detection [25.349261412750586]
This study introduces FiSeCLIP for ZSAD with training-free CLIP, combining feature matching with cross-modal alignment.
Our approach exhibits superior performance for both anomaly classification and segmentation on anomaly detection benchmarks.
arXiv Detail & Related papers (2025-07-15T05:42:17Z) - How to characterize imprecision in multi-view clustering? [8.706415654055657]
We propose a multi-view low-rank evidential c-means based on entropy constraint (MvLRECM).
In MvLRECM, each object is allowed to belong to different clusters to characterize uncertainty during decision-making.
In addition, entropy-weighting and low-rank constraints are employed to reduce imprecision and improve accuracy.
arXiv Detail & Related papers (2024-04-07T14:20:51Z) - Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at the cluster level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z) - Weakly-supervised Micro- and Macro-expression Spotting Based on Multi-level Consistency [22.7160073059238]
Weakly-supervised expression spotting (WES) based on video-level labels can potentially mitigate the complexity of frame-level annotation.
We propose a novel and simple WES framework, MC-WES, using multi-consistency collaborative mechanisms.
We show that MC-WES is comparable to state-of-the-art fully-supervised methods.
arXiv Detail & Related papers (2023-05-04T11:14:47Z) - One Class One Click: Quasi Scene-level Weakly Supervised Point Cloud Semantic Segmentation with Active Learning [29.493759008637532]
We introduce One Class One Click (OCOC), a low cost yet informative quasi scene-level label, which encapsulates point-level and scene-level annotations.
An active weakly supervised framework is proposed to leverage scarce labels by involving weak supervision from global and local perspectives.
It considerably outperforms genuine scene-level weakly supervised methods by up to 25% in terms of average F1 score.
arXiv Detail & Related papers (2022-11-23T01:23:26Z) - Pointly-Supervised Panoptic Segmentation [106.68888377104886]
We propose a new approach to applying point-level annotations for weakly-supervised panoptic segmentation.
Instead of the dense pixel-level labels used by fully supervised methods, point-level labels only provide a single point for each target as supervision.
We formulate the problem in an end-to-end framework by simultaneously generating panoptic pseudo-masks from point-level labels and learning from them.
arXiv Detail & Related papers (2022-10-25T12:03:51Z) - Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision [63.429704654271475]
We propose a novel weakly supervised method RWSeg that only requires labeling one object with one point.
With these sparse weak labels, we introduce a unified framework with two branches to propagate semantic and instance information.
Specifically, we propose a Cross-graph Competing Random Walks (CRW) algorithm that encourages competition among different instance graphs.
arXiv Detail & Related papers (2022-08-10T02:14:39Z) - Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and Semi-Supervised Semantic Segmentation [119.009033745244]
This paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS).
SLRNet uses cross-view self-supervision, that is, it simultaneously predicts several attentive LR representations from different views of an image to learn precise pseudo-labels.
Experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that our SLRNet outperforms both state-of-the-art WSSS and SSSS methods with a variety of different settings.
arXiv Detail & Related papers (2022-03-19T09:19:55Z) - WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z) - Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification [64.37745443119942]
This paper jointly enforces visual and temporal consistency in the combination of a local one-hot classification and a global multi-class classification.
Experimental results on three large-scale ReID datasets demonstrate the superiority of the proposed method in both unsupervised and unsupervised domain adaptive ReID tasks.
arXiv Detail & Related papers (2020-07-21T14:31:27Z)