Abstract: Modern classification models tend to struggle when annotated
data is scarce. To overcome this issue, several neural few-shot classification
models have emerged, yielding significant progress over time, both in Computer
Vision and Natural Language Processing. In the latter, such models used to rely
on fixed word embeddings before the advent of transformers. Additionally, some
models used in Computer Vision are yet to be tested in NLP applications. In
this paper, we compare all these models: we first adapt those developed for
image processing to NLP, and then provide them with access to transformers.
We then test these models equipped with the same transformer-based encoder on
the intent detection task, known for having a large number of classes. Our
results reveal that while methods perform almost equally on the ARSC dataset,
this is not the case for the intent detection task, where the most recent and
supposedly strongest competitors perform worse than older and simpler ones,
even though all are given access to transformers. We also show that a simple
baseline is surprisingly strong. All the newly developed models, as well as
the evaluation framework, are made publicly available.