Dummy Prototypical Networks for Few-Shot Open-Set Keyword Spotting
- URL: http://arxiv.org/abs/2206.13691v1
- Date: Tue, 28 Jun 2022 01:56:24 GMT
- Title: Dummy Prototypical Networks for Few-Shot Open-Set Keyword Spotting
- Authors: Byeonggeun Kim, Seunghan Yang, Inseop Chung, Simyung Chang
- Abstract summary: We tackle few-shot open-set keyword spotting with a new benchmark setting, named splitGSC.
We propose episode-known dummy prototypes based on metric learning to better detect open-set inputs, and introduce a simple and powerful approach, Dummy Prototypical Networks (D-ProtoNets).
We also verify our method on a standard benchmark, miniImageNet, and D-ProtoNets shows the state-of-the-art open-set detection rate in FSOSR.
- Score: 6.4423565043274795
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keyword spotting is the task of detecting a keyword in streaming audio.
Conventional keyword spotting targets predefined keywords classification, but
there is growing interest in few-shot (query-by-example) keyword spotting,
e.g., N-way classification given M-shot support samples. Moreover, in
real-world scenarios, there can be utterances from unexpected categories
(open-set) which need to be rejected rather than classified as one of the N
classes. Combining the two needs, we tackle few-shot open-set keyword spotting
with a new benchmark setting, named splitGSC. We propose episode-known dummy
prototypes based on metric learning to better detect an open set and introduce
a simple and powerful approach, Dummy Prototypical Networks (D-ProtoNets). Our
D-ProtoNets shows clear margins compared to recent few-shot open-set
recognition (FSOSR) approaches in the suggested splitGSC. We also verify our
method on a standard benchmark, miniImageNet, and D-ProtoNets shows the
state-of-the-art open-set detection rate in FSOSR.
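As a rough illustration of the idea, here is a minimal PyTorch sketch of N-way M-shot prototypical classification with one added learnable dummy prototype that absorbs open-set queries. The `DummyProtoHead` module is a hypothetical simplification: the paper's dummies are episode-known and generated per episode, which is not modeled here.
```python
import torch
import torch.nn.functional as F

class DummyProtoHead(torch.nn.Module):
    """Hypothetical sketch: prototypical classification with one learned
    dummy prototype that absorbs open-set (unknown) queries."""

    def __init__(self, emb_dim: int):
        super().__init__()
        # Single learned dummy prototype; the paper's episode-known dummies
        # are produced per episode, which this sketch simplifies away.
        self.dummy = torch.nn.Parameter(torch.randn(emb_dim))

    def forward(self, support: torch.Tensor, query: torch.Tensor):
        # support: (N, M, D) embeddings for N classes, M shots each
        # query:   (Q, D) query embeddings
        protos = support.mean(dim=1)                       # (N, D) class prototypes
        protos = torch.cat([protos, self.dummy[None]], 0)  # (N+1, D) with dummy
        dists = torch.cdist(query, protos)                 # Euclidean distances
        return F.log_softmax(-dists, dim=1)                # (Q, N+1) log-probs

# Toy usage: a query that wins the dummy slot is rejected as open-set.
head = DummyProtoHead(emb_dim=64)
support = torch.randn(5, 3, 64)   # 5-way, 3-shot
query = torch.randn(8, 64)
pred = head(support, query).argmax(dim=1)
is_open_set = pred == 5           # index N (here 5) means "unknown"
```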
Related papers
- Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching [74.75284453828017]
The Open-Vocabulary Keypoint Detection (OVKD) task is designed to use text prompts to identify arbitrary keypoints across any species.
We have developed a novel framework named Open-Vocabulary Keypoint Detection with Semantic-feature Matching (KDSM).
This framework combines vision and language models, creating an interplay between language features and local keypoint visual features.
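The "semantic-feature matching" idea can be pictured as scoring each spatial location of a visual feature map against a text embedding of the keypoint name. The sketch below, with hypothetical shapes and random tensors standing in for real encoders, illustrates only that matching step, not the full KDSM framework.
```python
import torch
import torch.nn.functional as F

def keypoint_from_text(feat_map: torch.Tensor, text_emb: torch.Tensor):
    """Hypothetical sketch of semantic-feature matching.
    feat_map: (D, H, W) local visual features from a vision backbone
    text_emb: (D,) embedding of a keypoint prompt, e.g. "left eye"
    Returns the (row, col) of the best-matching location and the heatmap."""
    d, h, w = feat_map.shape
    flat = F.normalize(feat_map.reshape(d, h * w), dim=0)  # unit-norm features
    text = F.normalize(text_emb, dim=0)
    heat = (text @ flat).reshape(h, w)                     # cosine-similarity map
    idx = heat.flatten().argmax()
    return divmod(idx.item(), w), heat

# Toy usage with random features standing in for real encoders.
(row, col), heat = keypoint_from_text(torch.randn(256, 32, 32), torch.randn(256))
```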
arXiv Detail & Related papers (2023-10-08T07:42:41Z)
- Open-vocabulary Keyword-spotting with Adaptive Instance Normalization [18.250276540068047]
We propose AdaKWS, a novel method for keyword spotting in which a text encoder is trained to output keyword-conditioned normalization parameters.
We show significant improvements over recent keyword spotting and ASR baselines.
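The core mechanism, as described, is a text encoder that emits normalization parameters conditioned on the keyword. A minimal sketch of that conditioning, with a stand-in linear layer in place of the real text encoder and hypothetical module names, might look like this:
```python
import torch

class KeywordAdaIN(torch.nn.Module):
    """Hypothetical sketch: instance-normalize audio features, then scale and
    shift them with (gamma, beta) predicted from a keyword text embedding."""

    def __init__(self, text_dim: int, channels: int):
        super().__init__()
        self.norm = torch.nn.InstanceNorm1d(channels, affine=False)
        self.to_gamma_beta = torch.nn.Linear(text_dim, 2 * channels)

    def forward(self, audio_feats: torch.Tensor, keyword_emb: torch.Tensor):
        # audio_feats: (B, C, T) acoustic features; keyword_emb: (B, text_dim)
        gamma, beta = self.to_gamma_beta(keyword_emb).chunk(2, dim=-1)
        x = self.norm(audio_feats)
        return gamma.unsqueeze(-1) * x + beta.unsqueeze(-1)

# Toy usage: the same audio is normalized differently per candidate keyword.
layer = KeywordAdaIN(text_dim=128, channels=40)
out = layer(torch.randn(2, 40, 100), torch.randn(2, 128))
```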
arXiv Detail & Related papers (2023-09-13T13:49:42Z)
- To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement [58.96644066571205]
We show that existing deep keyword spotting mechanisms can be improved by Successive Refinement.
We show that, across multiple models ranging in size from 13K to 2.41M parameters, the successive refinement technique reduces false alarms (FA) by up to a factor of 8.
Our proposed approach is "plug-and-play" and can be applied to any deep keyword spotting model.
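The snippet does not spell out the mechanism, so the sketch below shows only a generic plug-and-play cascade, not the paper's specific technique: a cheap first stage proposes triggers, and a stricter refinement stage re-scores them before the device wakes up. All names and thresholds here are assumptions.
```python
import torch

def cascade_detect(audio_emb: torch.Tensor, stage1, stage2,
                   t1: float = 0.5, t2: float = 0.9) -> bool:
    """Hypothetical cascade: only candidates passing a cheap first-stage
    score are re-scored by a stronger second stage, suppressing false
    alarms while leaving most true triggers intact."""
    if torch.sigmoid(stage1(audio_emb)).item() < t1:
        return False  # cheap early reject
    return torch.sigmoid(stage2(audio_emb)).item() >= t2  # stricter re-check

stage1 = torch.nn.Linear(64, 1)   # stand-ins for real KWS models
stage2 = torch.nn.Linear(64, 1)
fired = cascade_detect(torch.randn(1, 64), stage1, stage2)
```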
arXiv Detail & Related papers (2023-04-06T23:49:29Z)
- Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection [76.5120397167247]
We present an open-set object detector, called Grounding DINO, by marrying Transformer-based detector DINO with grounded pre-training.
The key solution of open-set object detection is introducing language to a closed-set detector for open-set concept generalization.
Grounding DINO performs remarkably well on all three settings, including benchmarks on COCO, LVIS, ODinW, and RefCOCO/+/g.
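The "introduce language to a closed-set detector" recipe amounts to classifying region features by similarity to phrase embeddings instead of a fixed class head, so new categories are just new phrases. The sketch below shows only that substitution, with random tensors standing in for the detector and text encoder.
```python
import torch
import torch.nn.functional as F

def open_set_class_logits(region_feats: torch.Tensor,
                          phrase_embs: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: score detector regions against an open
    vocabulary of text phrases instead of a fixed classifier.
    region_feats: (R, D) region features; phrase_embs: (P, D) phrase embeddings."""
    r = F.normalize(region_feats, dim=-1)
    p = F.normalize(phrase_embs, dim=-1)
    return r @ p.T   # (R, P) similarity logits over the open vocabulary

logits = open_set_class_logits(torch.randn(100, 256), torch.randn(7, 256))
```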
arXiv Detail & Related papers (2023-03-09T18:52:16Z)
- Towards visually prompted keyword localisation for zero-resource spoken languages [27.696096343873215]
We formalise the task of visually prompted keyword localisation (VPKL): given an image depicting a keyword, detect and predict where in an utterance the keyword occurs.
We show that these innovations give improvements in VPKL over an existing speech-vision model.
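Concretely, VPKL can be read as query-by-image keyword spotting: embed the image into the same space as frame-level speech features, score each frame, and localise at the peak. The sketch below, with hypothetical encoders and shapes, shows only that scoring step.
```python
import torch
import torch.nn.functional as F

def locate_keyword(image_emb: torch.Tensor, speech_frames: torch.Tensor,
                   threshold: float = 0.5):
    """Hypothetical VPKL scoring: image_emb (D,) queries speech_frames (T, D).
    Detection = max frame score over a threshold; localisation = peak frame."""
    scores = F.normalize(speech_frames, dim=-1) @ F.normalize(image_emb, dim=0)
    peak = scores.argmax().item()
    return scores.max().item() >= threshold, peak  # (detected?, frame index)

detected, frame = locate_keyword(torch.randn(512), torch.randn(200, 512))
```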
arXiv Detail & Related papers (2022-10-12T14:17:34Z)
- MASKER: Masked Keyword Regularization for Reliable Text Classification [73.90326322794803]
We propose a fine-tuning method, coined masked keyword regularization (MASKER), that facilitates context-based prediction.
MASKER regularizes the model to reconstruct the keywords from the rest of the words and to make low-confidence predictions when there is not enough context.
We demonstrate that MASKER improves OOD detection and cross-domain generalization without degrading classification accuracy.
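The two ingredients named here translate naturally into auxiliary losses: reconstruct masked-out keyword tokens from their context, and push predictions toward uniform (low confidence) when only keywords remain. A rough sketch of such a combined objective, under assumed model interfaces and hypothetical loss weights, is below.
```python
import torch
import torch.nn.functional as F

def masker_style_loss(task_logits, labels, recon_logits, keyword_ids,
                      ctxless_logits, lam1: float = 0.001, lam2: float = 0.001):
    """Hypothetical MASKER-style objective.
    task_logits:    (B, C) normal classification logits
    recon_logits:   (K, V) predictions for masked-out keyword tokens
    keyword_ids:    (K,)   true ids of those tokens
    ctxless_logits: (B, C) class logits when context words are removed"""
    task = F.cross_entropy(task_logits, labels)        # usual task loss
    recon = F.cross_entropy(recon_logits, keyword_ids) # rebuild masked keywords
    logp = F.log_softmax(ctxless_logits, dim=-1)
    low_conf = -logp.mean()  # pushes keyword-only predictions toward uniform
    return task + lam1 * recon + lam2 * low_conf

loss = masker_style_loss(torch.randn(2, 3), torch.tensor([0, 2]),
                         torch.randn(4, 30522), torch.randint(0, 30522, (4,)),
                         torch.randn(2, 3))
```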
arXiv Detail & Related papers (2020-12-17T04:54:16Z)
- Detection of Adversarial Supports in Few-shot Classifiers Using Feature Preserving Autoencoders and Self-Similarity [89.26308254637702]
We propose a detection strategy to highlight adversarial support sets.
We make use of feature-preserving autoencoder filtering and the concept of self-similarity of a support set to perform this detection.
Our method is attack-agnostic and, to the best of our knowledge, the first to explore detection for few-shot classifiers.
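Of the two ingredients, self-similarity is the easier to sketch: embeddings of a clean same-class support set should be mutually consistent, so unusually low within-set similarity flags a possibly poisoned set. The toy version below covers only that check (the autoencoder filtering is omitted), and the threshold is an assumption.
```python
import torch
import torch.nn.functional as F

def support_self_similarity(support_emb: torch.Tensor) -> float:
    """Hypothetical check: mean pairwise cosine similarity of a support set.
    support_emb: (S, D). A clean, same-class set should score high; a set
    containing adversarial supports tends to be less self-similar."""
    z = F.normalize(support_emb, dim=-1)
    sim = z @ z.T                                   # (S, S) cosine similarities
    s = sim.shape[0]
    off_diag = sim[~torch.eye(s, dtype=torch.bool)] # drop the diagonal
    return off_diag.mean().item()

score = support_self_similarity(torch.randn(5, 64))
flagged = score < 0.2   # threshold would be tuned on clean episodes
```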
arXiv Detail & Related papers (2020-12-09T14:13:41Z)
- Open-set Adversarial Defense [93.25058425356694]
We show that open-set recognition systems are vulnerable to adversarial attacks.
Motivated by this observation, we emphasize the need for an Open-Set Adversarial Defense (OSAD) mechanism.
This paper proposes an Open-Set Defense Network (OSDN) as a solution to the OSAD problem.
arXiv Detail & Related papers (2020-09-02T04:35:33Z)
- Few-Shot Keyword Spotting With Prototypical Networks [3.6930948691311016]
Keyword spotting has been widely used in many voice interfaces such as Amazon's Alexa and Google Home.
We first formulate this problem as a few-shot keyword spotting and approach it using metric learning.
We then propose a solution to the few-shot keyword spotting problem using temporal and dilated convolutions on prototypical networks.
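The encoder choice named here pairs with the same prototype-based classification sketched earlier for D-ProtoNets. A minimal stand-in encoder built from dilated temporal convolutions might look like the following; the layer sizes are assumptions, not the paper's architecture.
```python
import torch

class DilatedKWSEncoder(torch.nn.Module):
    """Hypothetical sketch: a small stack of dilated temporal convolutions
    mapping (B, n_mels, T) features to a fixed-size embedding suitable for
    prototype-based few-shot classification."""

    def __init__(self, n_mels: int = 40, emb_dim: int = 64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv1d(n_mels, 64, kernel_size=3, dilation=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv1d(64, 64, kernel_size=3, dilation=2, padding=2),
            torch.nn.ReLU(),
            torch.nn.Conv1d(64, emb_dim, kernel_size=3, dilation=4, padding=4),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).mean(dim=-1)   # global average pool over time

emb = DilatedKWSEncoder()(torch.randn(8, 40, 101))   # (8, 64) embeddings
```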
arXiv Detail & Related papers (2020-07-25T20:17:56Z)
- Small-Footprint Open-Vocabulary Keyword Spotting with Quantized LSTM Networks [3.8382752162527933]
In this paper, we focus on an open-vocabulary keyword spotting method, allowing the user to define their own keywords without having to retrain the whole model.
We describe the different design choices leading to a fast and small-footprint system, able to run on tiny devices, for any arbitrary set of user-defined keywords.
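One concrete way to shrink an LSTM acoustic model is post-training dynamic quantization of its weights to int8, which PyTorch supports directly. The sketch below shows that step on a stand-in model; the paper's exact architecture and quantization scheme may differ.
```python
import torch

# Stand-in acoustic model; this only illustrates int8 weight quantization,
# not the paper's specific small-footprint design.
model = torch.nn.LSTM(input_size=40, hidden_size=128, num_layers=2)

quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.LSTM}, dtype=torch.qint8
)

feats = torch.randn(100, 1, 40)   # (T, batch, n_mels) acoustic features
out, _ = quantized(feats)         # forward pass runs with int8 weights
```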
arXiv Detail & Related papers (2020-02-25T13:27:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.