Technical Report of 2023 ABO Fine-grained Semantic Segmentation
Competition
- URL: http://arxiv.org/abs/2310.00427v1
- Date: Sat, 30 Sep 2023 16:32:22 GMT
- Title: Technical Report of 2023 ABO Fine-grained Semantic Segmentation
Competition
- Authors: Zeyu Dong
- Abstract summary: We describe the technical details of our submission to the 2023 ABO Fine-grained Semantic Segmentation Competition, by Team "Zeyu_Dong".
The task is to predict the semantic labels for the convex shapes of five categories, which consist of high-quality, standardized 3D models of real products available for purchase online.
This approach helped us rank 3rd in the Dev phase of the 2023 ICCV 3DVeComm Workshop Challenge.
- Score: 0.3626013617212667
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this report, we describe the technical details of our submission to the
2023 ABO Fine-grained Semantic Segmentation Competition, by Team "Zeyu_Dong"
(username: ZeyuDong). The task is to predict the semantic labels for the
convex shapes of five categories, which consist of high-quality, standardized 3D
models of real products available for purchase online. Using DGCNN as the
backbone to classify the different structures of the five classes, we carried out
numerous experiments and found that a learning-rate schedule based on stochastic
gradient descent with warm restarts, together with setting different factor rates
for the various categories, contributed most to the performance of the model.
This approach helped us rank 3rd in the Dev phase of the 2023 ICCV 3DVeComm
Workshop Challenge.
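The abstract credits much of the gain to a learning rate that follows stochastic gradient descent with warm restarts. The report's actual hyperparameters are not given here, so the following is only a minimal sketch of the commonly used cosine-annealing form of that schedule. The function name `sgdr_learning_rate` and the values of `eta_max`, `eta_min`, `t_0`, and `t_mult` are illustrative assumptions, not settings taken from the submission.

```python
import math


def sgdr_learning_rate(epoch, eta_max=0.1, eta_min=0.001, t_0=10, t_mult=2):
    """Cosine-annealed learning rate with warm restarts.

    All default values are hypothetical and chosen only for illustration.
    epoch   : zero-based epoch index
    eta_max : learning rate at the start of every cycle
    eta_min : lowest learning rate within a cycle
    t_0     : length of the first cycle, in epochs
    t_mult  : integer factor by which each cycle is longer than the previous one
    """
    t_i = t_0      # length of the current cycle
    t_cur = epoch  # epochs elapsed since the most recent restart
    # Skip over completed cycles to find the position inside the current one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    # Cosine decay from eta_max (t_cur = 0) towards eta_min (t_cur -> t_i).
    cosine = 0.5 * (1.0 + math.cos(math.pi * t_cur / t_i))
    return eta_min + (eta_max - eta_min) * cosine


# Learning rates for the first 30 epochs: one 10-epoch cycle, then one 20-epoch cycle.
schedule = [sgdr_learning_rate(e) for e in range(30)]
```

With these assumed defaults the rate starts at `eta_max`, decays over the first 10 epochs, and jumps back to `eta_max` at epoch 10, where a cycle twice as long begins. In a training loop, the value for the current epoch would be written into the optimizer before each epoch.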
Related papers
- ZAHA: Introducing the Level of Facade Generalization and the Large-Scale Point Cloud Facade Semantic Segmentation Benchmark Dataset [34.52622121269287]
Facade semantic segmentation is a long-standing challenge in photogrammetry and computer vision.
We introduce Level of Facade Generalization (LoFG), novel hierarchical facade classes based on international urban modeling standards.
We present the largest semantic 3D facade segmentation dataset to date, providing 601 million annotated points at five and 15 classes of LoFG2 and LoFG3, respectively.
arXiv Detail & Related papers (2024-11-07T16:58:18Z)
- Data Efficient Acoustic Scene Classification using Teacher-Informed Confusing Class Instruction [11.15868814062321]
Three systems are introduced to tackle training splits of different sizes.
For small training splits, we explored reducing the complexity of the provided baseline model by reducing the number of base channels.
For the larger training splits, we use FocusNet to provide confusing class information to an ensemble of multiple Patchout faSt Spectrogram Transformer (PaSST) models and baseline models trained on the original sampling rate of 44.1 kHz.
arXiv Detail & Related papers (2024-09-18T13:16:00Z)
- Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN) [0.276240219662896]
Temporal Cubic PatchGAN (TCuP-GAN) is a volume-to-volume translational model that marries the concepts of a generative feature learning framework with Convolutional Long Short-Term Memory Networks (LSTMs).
We demonstrate the capabilities of our TCuP-GAN on the data from four segmentation challenges (Adult Glioma, Meningioma, Pediatric Tumors, and Sub-Saharan Africa).
We demonstrate the successful learning of our framework to predict robust multi-class segmentation masks across all the challenges.
arXiv Detail & Related papers (2023-11-23T18:37:26Z)
- 3DCoMPaT$^{++}$: An improved Large-scale 3D Vision Dataset for Compositional Recognition [53.97029821609132]
3DCoMPaT$^{++}$ is a multimodal 2D/3D dataset with 160 million rendered views of more than 10 million stylized 3D shapes.
We introduce a new task, called Grounded CoMPaT Recognition (GCR), to collectively recognize and ground compositions of materials on parts of 3D objects.
arXiv Detail & Related papers (2023-10-27T22:01:43Z)
- Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023 [51.95161901441527]
In this paper, we propose a novel framework for recognizing both discrete and dimensional emotions.
Deep features extracted from foundation models are used as robust acoustic and visual representations of raw video.
Our final system achieves state-of-the-art performance and ranks third on the leaderboard of the MER-MULTI sub-challenge.
arXiv Detail & Related papers (2023-09-11T03:19:10Z)
- SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model [73.80068155830708]
We present an extensive analysis for continual learning on a pre-trained model (CLPM).
We propose a simple but extremely effective approach named Slow Learner with Classifier Alignment (SLCA).
Across a variety of scenarios, our proposal provides substantial improvements for CLPM.
arXiv Detail & Related papers (2023-03-09T08:57:01Z)
- CV 3315 Is All You Need : Semantic Segmentation Competition [14.818852884385015]
This competition focuses on Urban-Sense segmentation based on the vehicle camera view.
The highly class-imbalanced Urban-Sense image dataset challenges the existing solutions.
Deep convolutional neural network-based semantic segmentation methods have become flexible solutions applicable to real-world applications.
arXiv Detail & Related papers (2022-06-25T06:27:57Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- Language-Grounded Indoor 3D Semantic Segmentation in the Wild [33.40572976383402]
We study a larger vocabulary for 3D semantic segmentation with a new extended benchmark on ScanNet data with 200 class categories.
We propose a language-driven pre-training method to encourage learned 3D features to lie close to their pre-trained text embeddings.
Our approach consistently outperforms state-of-the-art 3D pre-training for 3D semantic segmentation on our proposed benchmark.
arXiv Detail & Related papers (2022-04-16T09:17:40Z)
- HS3: Learning with Proper Task Complexity in Hierarchically Supervised Semantic Segmentation [81.87943324048756]
We propose Hierarchically Supervised Semantic Segmentation (HS3), a training scheme that supervises intermediate layers in a segmentation network to learn meaningful representations by varying task complexity.
Our proposed HS3-Fuse framework further improves segmentation predictions and achieves state-of-the-art results on two large segmentation benchmarks: NYUD-v2 and Cityscapes.
arXiv Detail & Related papers (2021-11-03T16:33:29Z)
- Three Steps to Multimodal Trajectory Prediction: Modality Clustering, Classification and Synthesis [54.249502356251085]
We present a novel insight along with a brand-new prediction framework.
Our proposed method surpasses state-of-the-art works even without introducing social and map information.
arXiv Detail & Related papers (2021-03-14T06:21:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.