Related papers: On the effectiveness of Large Language Models in the mechanical design domain

URL: http://arxiv.org/abs/2505.01559v1
Date: Fri, 02 May 2025 19:59:56 GMT
Title: On the effectiveness of Large Language Models in the mechanical design domain
Authors: Daniele Grandi, Fabian Riquelme
Abstract summary: We leverage the semantic data found in the ABC dataset, specifically the assembly names that designers assigned to the overall assemblies. We developed two unsupervised tasks to evaluate how different model architectures perform on domain-specific data. Our model on the zero-shot classification task outperforms the baselines by a wide margin, and achieves a top-1 classification accuracy of 0.386.
Score: 0.997854155788161
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this work, we seek to understand the performance of large language models in the mechanical engineering domain. We leverage the semantic data found in the ABC dataset, specifically the assembly names that designers assigned to the overall assemblies, and the individual semantic part names that were assigned to each part. After pre-processing the data, we developed two unsupervised tasks to evaluate how different model architectures perform on domain-specific data: a binary sentence-pair classification task and a zero-shot classification task. We achieved an accuracy of 0.62 on the binary sentence-pair classification task with a fine-tuned model that focuses on fighting over-fitting by modifying 1) learning rates, 2) dropout values, and 3) sequence length, and by 4) adding a multi-head attention layer. Our model on the zero-shot classification task outperforms the baselines by a wide margin, and achieves a top-1 classification accuracy of 0.386. The results shed some light on the specific failure modes that arise when learning from language in this domain.
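
Illustration: the abstract reports a top-1 accuracy for zero-shot classification but does not describe the scoring procedure on this page. The sketch below shows one common way such a number could be computed, assuming each item and each candidate label already has an embedding vector and that the predicted label is the one with the highest cosine similarity. The label names, the toy vectors, and the helper functions (cosine, zero_shot_predict, top1_accuracy) are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def zero_shot_predict(item_vec, label_vecs):
    """Return the label whose embedding is most similar to the item embedding."""
    return max(label_vecs, key=lambda label: cosine(item_vec, label_vecs[label]))


def top1_accuracy(predictions, gold):
    """Fraction of items whose single best prediction equals the gold label."""
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and equal in length")
    correct = sum(1 for p, g in zip(predictions, gold) if p == g)
    return correct / len(gold)


# Hypothetical label embeddings (assumption: produced by some language model).
LABEL_VECS = {
    "gearbox": [1.0, 0.0, 0.0],
    "bracket": [0.0, 1.0, 0.0],
    "valve": [0.0, 0.0, 1.0],
}

# Hypothetical (item embedding, gold label) pairs; the third is deliberately misleading.
ITEMS = [
    ([0.9, 0.1, 0.0], "gearbox"),
    ([0.2, 0.7, 0.1], "bracket"),
    ([0.6, 0.3, 0.1], "valve"),
]

predictions = [zero_shot_predict(vec, LABEL_VECS) for vec, _ in ITEMS]
gold = [label for _, label in ITEMS]
accuracy = top1_accuracy(predictions, gold)
print(f"top-1 accuracy: {accuracy:.3f}")
```

In this toy run, two of the three items are assigned their gold label, so the top-1 accuracy is 2/3. A reported value such as 0.386 would be read the same way: the share of items whose highest-scoring label matches the reference label.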

Related papers

Automatic Pruning of Fine-tuning Datasets for Transformer-based Language Models [13.340191056212692]
We propose an automatic dataset pruning method for the training set of fine-tuning tasks. Our method provides multiple subsets for use in dataset pruning. Experiments on 5 downstream tasks and 2 language models show that, on average, fine-tuning on the winning ticket subsets results in a $0.1\%$ increase in the evaluation performance of the model.
arXiv Detail & Related papers (2024-07-11T22:46:18Z)
Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge. We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks. Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals.
arXiv Detail & Related papers (2024-03-24T12:41:58Z)
Breaking Free Transformer Models: Task-specific Context Attribution Promises Improved Generalizability Without Fine-tuning Pre-trained LLMs [1.5138606851862884]
We present a framework that allows for maintaining generalizability and enhances the performance on the downstream task. We show that a linear transformation of the text representation from any transformer model using the task-specific concept operator results in a projection onto the latent concept space. Experimental results on three datasets, namely HateXplain, IMDB reviews, and Social Media Attributions, illustrate that the proposed model attains superior accuracy and generalizability.
arXiv Detail & Related papers (2024-01-30T00:23:29Z)
Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification [58.720142291102135]
This case study investigates the task of job classification in a real-world setting. The goal is to determine whether an English-language job posting is appropriate for a graduate or entry-level position.
arXiv Detail & Related papers (2023-03-13T14:09:53Z)
Learning from Mistakes: Self-Regularizing Hierarchical Representations in Point Cloud Semantic Segmentation [15.353256018248103]
LiDAR semantic segmentation has gained attention to accomplish fine-grained scene understanding. We present a coarse-to-fine setup that LEArns from classification mistaKes (LEAK) derived from a standard model. Our LEAK approach is very general and can be seamlessly applied on top of any segmentation architecture.
arXiv Detail & Related papers (2023-01-26T14:52:30Z)
Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing [72.14557106085284]
Slice detection models (SDM) automatically identify underperforming groups of datapoints. This paper proposes a benchmark named "Discover, Explain, Improve (DEIM)" for classification NLP tasks. Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z)
Zero-Shot Text Classification with Self-Training [8.68603153534916]
We show that fine-tuning the zero-shot classifier on its most confident predictions leads to significant performance gains across a wide range of text classification tasks. Self-training adapts the zero-shot model to the task at hand.
arXiv Detail & Related papers (2022-10-31T17:55:00Z)
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains. We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images. A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented Compositional Semantic Parsing [51.81533991497547]
Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries. We present X2Parser, a transferable Cross-lingual and Cross-domain Parser for TCSP. We propose to predict flattened intents and slots representations separately and cast both prediction tasks into sequence labeling problems.
arXiv Detail & Related papers (2021-06-07T16:40:05Z)
Recognition and Processing of NATOM [0.0]
This paper shows how to process NOTAM (Notice to Airmen) data in civil aviation. The original NOTAM data mixes Chinese and English and is poorly structured. GloVe word vectors are used to represent the data, with a custom mapping vocabulary.
arXiv Detail & Related papers (2021-04-29T10:12:00Z)
Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings. We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data. We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
Fine-Grained Visual Classification with Efficient End-to-end Localization [49.9887676289364]
We present an efficient localization module that can be fused with a classification network in an end-to-end setup. We evaluate the new model on the three benchmark datasets CUB200-2011, Stanford Cars and FGVC-Aircraft.
arXiv Detail & Related papers (2020-05-11T14:07:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.