Limiting Tags Fosters Efficiency
- URL: http://arxiv.org/abs/2104.01028v1
- Date: Fri, 2 Apr 2021 12:58:45 GMT
- Title: Limiting Tags Fosters Efficiency
- Authors: Tiago Santos, Keith Burghardt, Kristina Lerman, Denis Helic
- Abstract summary: We use information-theoretic measures to track the descriptive and retrieval efficiency of tags on Stack Overflow.
We observe that tagging efficiency stabilizes over time, while tag content and descriptiveness both increase.
Our work offers insights into policies to improve information organization and retrieval in online communities.
- Score: 2.6143568807090696
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tagging facilitates information retrieval in social media and other online
communities by allowing users to organize and describe online content.
Researchers found that the efficiency of tagging systems steadily decreases
over time, because tags become less precise in identifying specific documents,
i.e., they lose their descriptiveness. However, previous works did not answer
how or even whether community managers can improve the efficiency of tags. In
this work, we use information-theoretic measures to track the descriptive and
retrieval efficiency of tags on Stack Overflow, a question-answering system
that strictly limits the number of tags users can specify per question. We
observe that tagging efficiency stabilizes over time, while tag content and
descriptiveness both increase. To explain this observation, we hypothesize that
limiting the number of tags fosters novelty and diversity in tag usage, two
properties which are both beneficial for tagging efficiency. To provide
qualitative evidence supporting our hypothesis, we present a statistical model
of tagging that demonstrates how novelty and diversity lead to greater tag
efficiency in the long run. Our work offers insights into policies to improve
information organization and retrieval in online communities.
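This listing does not reproduce the paper's exact definitions, so the following is only a minimal sketch of the kind of information-theoretic measures the abstract refers to. It assumes (hypothetically) that tag content can be approximated by the entropy of tag usage H(T), and that how well tags narrow down documents can be approximated by the conditional entropy H(D|T) and the mutual information I(D;T). The function names and the input format, a list of (document, tag) pairs, are illustrative and not taken from the paper.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts)


def tag_measures(assignments):
    """Compute entropy-based measures from (document_id, tag) pairs.

    Assumption: each pair is one tag assignment, and probabilities are
    estimated as relative frequencies over all assignments.
    """
    pair_counts = Counter(assignments)
    tag_counts = Counter()
    doc_counts = Counter()
    for (doc, tag), n in pair_counts.items():
        tag_counts[tag] += n
        doc_counts[doc] += n

    h_tags = entropy(tag_counts.values())    # H(T): spread of tag usage
    h_docs = entropy(doc_counts.values())    # H(D)
    h_joint = entropy(pair_counts.values())  # H(D, T)
    h_docs_given_tags = h_joint - h_tags     # H(D|T): uncertainty left after seeing a tag
    mutual_info = h_docs - h_docs_given_tags # I(D;T): information tags carry about documents
    return {
        "h_tags": h_tags,
        "h_docs_given_tags": h_docs_given_tags,
        "mutual_info": mutual_info,
    }


# Toy example: four questions, two tags, each tag shared by two questions.
toy = [("q1", "python"), ("q2", "python"), ("q3", "java"), ("q4", "java")]
measures = tag_measures(toy)
# h_tags = 1.0 bit, h_docs_given_tags = 1.0 bit, mutual_info = 1.0 bit
```

In this toy setting, a lower H(D|T) means a tag pins down fewer candidate documents, which is one plausible reading of "descriptiveness"; tracking such quantities over time windows would be one way to observe whether they stabilize.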
Related papers
- Modeling Tag Prediction based on Question Tagging Behavior Analysis of
CommunityQA Platform Users [10.816557776555078]
We develop a flexible neural tag prediction architecture, which predicts both popular tags and more granular tags for each question.
Our experimental results demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2023-07-04T01:24:26Z)
- Cross Encoding as Augmentation: Towards Effective Educational Text
Classification [9.786833703453741]
We propose a novel retrieval approach CEAA that provides effective learning in educational text classification.
Our main contributions are as follows: 1) we leverage transfer learning from question-answering datasets, and 2) we propose a simple but effective data augmentation method.
arXiv Detail & Related papers (2023-05-30T12:19:30Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with
Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Improve Text Classification Accuracy with Intent Information [0.38073142980733]
Existing methods do not consider the use of label information, which may weaken the performance of text classification systems in some token-aware scenarios.
We introduce the use of label information as label embeddings for the task of text classification and achieve remarkable performance on a benchmark dataset.
arXiv Detail & Related papers (2022-12-15T08:15:32Z)
- Graph-Based Recommendation System Enhanced with Community Detection [7.436429318051602]
Examining users' tags helps to identify their interests and leads to more accurate recommendations.
Since user-defined tags are chosen freely and without any restrictions, problems arise in determining their exact meaning and the similarity between tags.
This article uses mathematical and statistical methods to determine the lexical similarity and co-occurrence of tags.
arXiv Detail & Related papers (2022-01-10T20:08:40Z)
- Exploiting Context for Robustness to Label Noise in Active Learning [47.341705184013804]
We address the problems of how a system can identify which of the queried labels are wrong and how a multi-class active learning system can be adapted to minimize the negative impact of label noise.
We construct a graphical representation of the unlabeled data to encode these relationships and obtain new beliefs on the graph when noisy labels are available.
This is demonstrated in three different applications: scene classification, activity classification, and document classification.
arXiv Detail & Related papers (2020-10-18T18:59:44Z)
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data.
Meta-learning helps in adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
arXiv Detail & Related papers (2020-10-07T22:29:05Z)
- Automatic Validation of Textual Attribute Values in E-commerce Catalog
by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)
- Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z)
- Adversarial Learning for Personalized Tag Recommendation [61.76193196463919]
We propose an end-to-end deep network which can be trained on large-scale datasets.
A joint training of user-preference and visual encoding allows the network to efficiently integrate the visual preference with tagging behavior.
We demonstrate the effectiveness of the proposed model on two different large-scale and publicly available datasets.
arXiv Detail & Related papers (2020-04-01T20:41:41Z)