Soft Labels for Rapid Satellite Object Detection
- URL: http://arxiv.org/abs/2212.00585v1
- Date: Thu, 1 Dec 2022 15:23:13 GMT
- Title: Soft Labels for Rapid Satellite Object Detection
- Authors: Matthew Ciolino, Grant Rosario, David Noever
- Abstract summary: We propose using satellite object detections as the basis for a new dataset of soft labels.
We show that soft labels can be used to train a model that is almost as accurate as a model trained on the original data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Soft labels in image classification are vector representations of an image's
true classification. In this paper, we investigate soft labels in the context
of satellite object detection. We propose using detections as the basis for a
new dataset of soft labels. Much of the effort in creating a high-quality model
is gathering and annotating the training data. If we could use a model to
generate a dataset for us, we could not only rapidly create datasets, but also
supplement existing open-source datasets. Using a subset of the xView dataset,
we train a YOLOv5 model to detect cars, planes, and ships. We then use that
model to generate soft labels for the second training set, on which we then
train a new model and compare it to the original. We show that soft labels can be used to
train a model that is almost as accurate as a model trained on the original
data.
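The pipeline the abstract describes, training a detector and then letting its detections label a second split, can be sketched roughly as below. This is a minimal illustration under assumed inputs, not the authors' code: the checkpoint path, image folder, and 0.25 confidence threshold are placeholders, and keeping the detection confidence as an extra column is just one plausible way to store the "soft" part of the label.

```python
# Sketch: run a trained YOLOv5 model over a second image split and save its
# detections as YOLO-format labels. Paths and the threshold are assumptions.
from pathlib import Path

import torch

# A "hard" one-hot label for 3 classes looks like [0, 1, 0]; a soft label is
# a score vector such as [0.05, 0.90, 0.05]. Here each predicted box carries
# the detector's confidence instead of a certain 1.
model = torch.hub.load("ultralytics/yolov5", "custom", path="weights/best.pt")
model.conf = 0.25  # drop low-confidence detections

out_dir = Path("soft_labels")
out_dir.mkdir(exist_ok=True)

for img_path in sorted(Path("images").glob("*.jpg")):
    results = model(str(img_path))
    # results.xywhn[0]: rows of [x_center, y_center, width, height,
    # confidence, class], coordinates normalized to [0, 1].
    lines = []
    for x, y, w, h, conf, cls in results.xywhn[0].tolist():
        # Standard YOLO labels omit the confidence; keeping it as a sixth
        # column preserves the soft score for later training.
        lines.append(f"{int(cls)} {x:.6f} {y:.6f} {w:.6f} {h:.6f} {conf:.4f}")
    (out_dir / f"{img_path.stem}.txt").write_text("\n".join(lines))
```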
Related papers
- LABELMAKER: Automatic Semantic Label Generation from RGB-D Trajectories [59.14011485494713]
This work introduces a fully automated 2D/3D labeling framework that can generate labels for RGB-D scans at an equal (or better) level of accuracy.
We demonstrate the effectiveness of our LabelMaker pipeline by generating significantly better labels for the ScanNet datasets and automatically labelling the previously unlabeled ARKitScenes dataset.
arXiv Detail & Related papers (2023-11-20T20:40:24Z)
- A Benchmark Generative Probabilistic Model for Weak Supervised Learning [2.0257616108612373]
Weak Supervised Learning approaches have been developed to alleviate the annotation burden.
We show that probabilistic latent variable models (PLVMs) achieve state-of-the-art performance across four datasets.
arXiv Detail & Related papers (2023-03-31T07:06:24Z)
- Convolutional Neural Networks for the classification of glitches in gravitational-wave data streams [52.77024349608834]
We classify transient noise signals (i.e., glitches) and gravitational waves in data from the Advanced LIGO detectors.
We use models with a supervised learning approach, trained from scratch on the Gravity Spy dataset.
We also explore a self-supervised approach, pre-training models with automatically generated pseudo-labels.
arXiv Detail & Related papers (2023-03-24T11:12:37Z)
- Data Portraits: Recording Foundation Model Training Data [47.03896259762976]
Data Portraits are artifacts that record training data and allow for downstream inspection.
We document a popular language modeling corpus and a recently released code modeling dataset.
Our tool is lightweight and fast, costing only 3% of the dataset size in overhead.
arXiv Detail & Related papers (2023-03-06T04:22:33Z)
- Learned Label Aggregation for Weak Supervision [8.819582879892762]
We propose a data programming approach that aggregates weak supervision signals to generate labeled data easily.
The quality of the generated labels depends on a label aggregation model that aggregates all noisy labels from all labeling functions (LFs) to infer the ground-truth labels.
We show the model can be trained using synthetically generated data and design an effective architecture for the model.
arXiv Detail & Related papers (2022-07-27T14:36:35Z)
- AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning [69.47585818994959]
We evaluate a big data processing pipeline to auto-generate labels for remote sensing data.
We utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas.
arXiv Detail & Related papers (2022-01-31T20:02:22Z)
- Multi-Task Self-Training for Learning General Representations [97.01728635294879]
Multi-task self-training (MuST) harnesses the knowledge in independent specialized teacher models to train a single general student model.
MuST is scalable with unlabeled or partially labeled datasets and outperforms both specialized supervised models and self-supervised models when training on large scale datasets.
arXiv Detail & Related papers (2021-08-25T17:20:50Z)
- SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both the labels and the pseudo labels to generate the final feature embeddings; a generic sketch of this teacher-student recipe appears after this list.
arXiv Detail & Related papers (2020-11-20T08:26:10Z)
- Finding Friends and Flipping Frenemies: Automatic Paraphrase Dataset Augmentation Using Graph Theory [21.06607915149245]
We construct a paraphrase graph from the provided sentence-pair labels, and create an augmented dataset by directly inferring labels from the original sentence pairs using a transitivity property; a minimal sketch of this closure appears after this list.
We evaluate our methods on paraphrase models trained using these datasets starting from a pretrained BERT model, and find that the automatically-enhanced training sets result in more accurate models.
arXiv Detail & Related papers (2020-11-03T17:18:03Z)
- Are Labels Always Necessary for Classifier Accuracy Evaluation? [28.110519483540482]
We aim to estimate the classification accuracy on unlabeled test datasets.
We construct a meta-dataset comprised of datasets generated from the original images.
As the classification accuracy of the model on each sample (dataset) is known from the original dataset labels, our task can be solved via regression.
arXiv Detail & Related papers (2020-07-06T17:45:39Z)
- Cross-dataset Training for Class Increasing Object Detection [52.34737978720484]
We present a conceptually simple, flexible and general framework for cross-dataset training in object detection.
By cross-dataset training, existing datasets can be utilized to detect the merged object classes with a single model.
While using cross-dataset training, we only need to label the new classes on the new dataset.
arXiv Detail & Related papers (2020-01-14T04:40:47Z)
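Several entries above (SLADE, MuST) share the teacher-student recipe that the SLADE summary spells out: train a teacher on the labeled data, pseudo-label an unlabeled pool, then train a student on the union. A generic, hypothetical sketch of that loop follows; it is not SLADE's actual framework, which learns distance-metric embeddings rather than a classifier.

```python
# Generic teacher-student pseudo-labeling loop. The model and loaders are
# placeholders; this is not SLADE's code, just the recipe its summary states.
import torch
import torch.nn as nn


def train(model, loader, epochs=1, lr=1e-3):
    """Plain supervised training with cross-entropy."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()


@torch.no_grad()
def pseudo_label(teacher, unlabeled_batches, threshold=0.9):
    """Keep only the teacher's confident predictions as pseudo labels."""
    teacher.eval()
    kept = []
    for x in unlabeled_batches:
        conf, pred = teacher(x).softmax(dim=1).max(dim=1)
        mask = conf >= threshold
        kept += list(zip(x[mask], pred[mask]))
    return kept

# Usage: train(teacher, labeled_loader), then
# pairs = pseudo_label(teacher, unlabeled_batches), then train the student
# on the labeled data plus `pairs`.
```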
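The paraphrase-augmentation entry hinges on one algorithmic idea: if (a, b) and (b, c) are labeled paraphrases, then (a, c) can be inferred. A minimal sketch with union-find, using made-up sentence IDs rather than the paper's data:

```python
# Infer paraphrase pairs by transitive closure over labeled positives.
# Union-find groups sentences into paraphrase clusters; every pair inside a
# cluster becomes an inferred positive example.
from itertools import combinations


def paraphrase_closure(positive_pairs):
    parent = {}

    def find(s):
        parent.setdefault(s, s)
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    for a, b in positive_pairs:
        parent[find(a)] = find(b)  # union the two clusters

    clusters = {}
    for s in parent:
        clusters.setdefault(find(s), []).append(s)
    return [p for group in clusters.values() for p in combinations(group, 2)]


# ("a", "b") and ("b", "c") imply ("a", "c"):
print(paraphrase_closure([("a", "b"), ("b", "c")]))
```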
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.