Multi-modal Machine Learning for Vehicle Rating Predictions Using Image,
Text, and Parametric Data
- URL: http://arxiv.org/abs/2305.15218v2
- Date: Sat, 27 May 2023 10:20:16 GMT
- Authors: Hanqi Su, Binyang Song and Faez Ahmed
- Abstract summary: We propose a multi-modal learning model for accurate vehicle rating predictions.
The model simultaneously learns features from the parametric specifications, text descriptions, and images of vehicles.
We find that the multi-modal model's explanatory power is 4%-12% higher than that of the unimodal models.
- Score: 3.463438487417909
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate vehicle rating prediction can facilitate designing and configuring
good vehicles. This prediction allows vehicle designers and manufacturers to
optimize and improve their designs in a timely manner, enhance their product
performance, and effectively attract consumers. However, most existing
data-driven methods rely on data from a single modality, e.g., text, images, or
parametric data, and therefore explore only part of the available information.
Without a comprehensive analysis across modalities, such methods can reach
inaccurate conclusions and hinder progress in this field. To overcome this
limitation, we
propose a multi-modal learning model for more comprehensive and accurate
vehicle rating predictions. Specifically, the model simultaneously learns
features from the parametric specifications, text descriptions, and images of
vehicles to predict five vehicle rating scores, including the total score,
critics score, performance score, safety score, and interior score. We compare
the multi-modal learning model to the corresponding unimodal models and find
that the multi-modal model's explanatory power is 4%-12% higher than that of
the unimodal models. On this basis, we conduct sensitivity analyses using SHAP
to interpret our model and provide design and optimization directions to
designers and manufacturers. Our study underscores the importance of the
data-driven multi-modal learning approach for vehicle design, evaluation, and
optimization. We have made the code publicly available at
http://decode.mit.edu/projects/vehicleratings/.
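The abstract describes a late-fusion architecture over three modalities; the sketch below illustrates one plausible form of such a model. It is a minimal illustration, not the authors' implementation (their code is at the URL above): the encoder choices, feature dimensions, and hidden sizes are all assumptions.

```python
# Illustrative late-fusion multi-modal regressor for the five rating scores.
# NOT the paper's implementation; encoders and dimensions are assumptions.
import torch
import torch.nn as nn

class MultiModalRatingModel(nn.Module):
    def __init__(self, param_dim=30, text_dim=768, img_dim=512, hidden=256):
        super().__init__()
        # One encoder per modality. Text and image inputs are assumed to be
        # fixed-size feature vectors from pretrained backbones (e.g., a
        # BERT-style text encoder and a CNN image encoder).
        self.param_enc = nn.Sequential(nn.Linear(param_dim, hidden), nn.ReLU())
        self.text_enc = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.img_enc = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU())
        # Fused head regresses the five scores: total, critics, performance,
        # safety, and interior.
        self.head = nn.Sequential(
            nn.Linear(3 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 5)
        )

    def forward(self, params, text_feats, img_feats):
        fused = torch.cat(
            [self.param_enc(params), self.text_enc(text_feats), self.img_enc(img_feats)],
            dim=-1,
        )
        return self.head(fused)

model = MultiModalRatingModel()
scores = model(torch.randn(4, 30), torch.randn(4, 768), torch.randn(4, 512))
print(scores.shape)  # torch.Size([4, 5])
```

For the SHAP sensitivity analysis, a model-agnostic explainer can be wrapped around any such predictor. The snippet below uses shap.KernelExplainer on the parametric inputs with the other modalities held fixed; this is one possible setup, not necessarily the paper's.

```python
# Hypothetical SHAP setup: explain the total score (output column 0) with
# respect to the parametric features, holding text/image features fixed.
import numpy as np
import shap

def predict_total_score(x):
    params = torch.tensor(x, dtype=torch.float32)
    text = torch.zeros(len(x), 768)   # fixed text features (assumption)
    img = torch.zeros(len(x), 512)    # fixed image features (assumption)
    with torch.no_grad():
        return model(params, text, img)[:, 0].numpy()

background = np.random.randn(50, 30)  # placeholder background sample
explainer = shap.KernelExplainer(predict_total_score, background)
shap_values = explainer.shap_values(np.random.randn(5, 30))
```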
Related papers
- MetaFollower: Adaptable Personalized Autonomous Car Following [63.90050686330677]
We propose MetaFollower, an adaptable personalized car-following framework.
We first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from various car-following events.
We additionally combine Long Short-Term Memory (LSTM) and the Intelligent Driver Model (IDM) to reflect temporal heterogeneity with high interpretability.
arXiv Detail & Related papers (2024-06-23T15:30:40Z)
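As a toy illustration of the MAML step in the MetaFollower entry above: the inner loop adapts a car-following model to one driver's events, and the outer loop updates the shared initialization. The model, data, and hyperparameters below are placeholders, not the paper's (requires PyTorch >= 2.0 for torch.func).

```python
# Minimal MAML sketch for car-following adaptation (placeholder model/data,
# not MetaFollower itself). Inner loop: adapt to one driver's support events;
# outer loop: update the shared initialization from query-set losses.
import torch
import torch.nn as nn
from torch.func import functional_call

# Inputs per step: (speed, gap, relative speed) -> predicted acceleration.
model = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 1))
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
inner_lr = 0.01

def task_loss(params, x, y):
    return nn.functional.mse_loss(functional_call(model, params, (x,)), y)

for step in range(100):                                    # meta-training
    meta_opt.zero_grad()
    for _ in range(4):                                     # batch of car-following events (tasks)
        x_s, y_s = torch.randn(16, 3), torch.randn(16, 1)  # placeholder support set
        x_q, y_q = torch.randn(16, 3), torch.randn(16, 1)  # placeholder query set
        params = dict(model.named_parameters())
        grads = torch.autograd.grad(task_loss(params, x_s, y_s),
                                    list(params.values()), create_graph=True)
        adapted = {k: p - inner_lr * g
                   for (k, p), g in zip(params.items(), grads)}
        task_loss(adapted, x_q, y_q).backward()            # accumulate meta-gradient
    meta_opt.step()
```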
- Multi-modal Auto-regressive Modeling via Visual Words [96.25078866446053]
We propose the concept of visual words, which maps visual features to probability distributions over a Large Multi-modal Model's (LMM's) vocabulary.
We further explore the distribution of visual features in the LMM's semantic space and the possibility of using text embeddings to represent visual information.
arXiv Detail & Related papers (2024-03-12T14:58:52Z)
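As a toy version of the visual-words idea in the entry above: each visual feature is scored against the LMM's token-embedding table and softmax-normalized into a distribution over the text vocabulary. The sizes and the dot-product scoring are assumptions.

```python
# Toy "visual words": turn visual features into probability distributions
# over a language model's token vocabulary. Sizes/scoring are assumptions.
import torch
import torch.nn.functional as F

vocab_size, embed_dim, num_patches = 1000, 256, 49
token_embeddings = torch.randn(vocab_size, embed_dim)  # LMM embedding table (placeholder)
visual_feats = torch.randn(num_patches, embed_dim)     # projected patch features (placeholder)

logits = visual_feats @ token_embeddings.T             # (num_patches, vocab_size)
visual_words = F.softmax(logits, dim=-1)               # one distribution per patch
print(visual_words.shape)                              # torch.Size([49, 1000]); rows sum to ~1.0
```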
- Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
arXiv Detail & Related papers (2023-12-07T18:53:27Z)
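The Trajeglish entry casts traffic modeling as next-token prediction. The sketch below shows only the generic pattern (uniformly binning per-step motion into a token vocabulary and training autoregressively), with a placeholder model rather than the paper's tokenizer or architecture.

```python
# Generic "motion as next-token prediction" sketch: discretize per-step motion
# into a small vocabulary, then train a causal model autoregressively.
# The tokenizer and model here are placeholders, not Trajeglish itself.
import torch
import torch.nn as nn

NUM_BINS = 128  # size of the discrete motion vocabulary (assumption)

def tokenize(deltas, low=-5.0, high=5.0):
    """Map continuous per-step displacements to integer tokens by uniform binning."""
    return ((deltas - low) / (high - low) * NUM_BINS).long().clamp(0, NUM_BINS - 1)

class MotionLM(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(NUM_BINS, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # stands in for a causal transformer
        self.out = nn.Linear(hidden, NUM_BINS)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.out(h)

model = MotionLM()
tokens = tokenize(torch.randn(8, 20))     # batch of 8 trajectories, 20 steps each
logits = model(tokens[:, :-1])            # predict each next motion token
loss = nn.functional.cross_entropy(logits.reshape(-1, NUM_BINS),
                                   tokens[:, 1:].reshape(-1))
loss.backward()
```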
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
We validate the approach through comprehensive experiments on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- FollowNet: A Comprehensive Benchmark for Car-Following Behavior Modeling [20.784555362703294]
We establish a public benchmark dataset for car-following behavior modeling.
The benchmark consists of more than 80K car-following events extracted from five public driving datasets.
Results show that the deep deterministic policy gradient (DDPG) based model performs competitively with a lower MSE for spacing.
arXiv Detail & Related papers (2023-05-25T08:59:26Z)
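For the FollowNet entry above, "MSE for spacing" can be read as the mean squared error of the predicted gap to the lead vehicle; a minimal version of such a metric (naming and setup assumed, not the benchmark's code) is:

```python
# Assumed form of a spacing-MSE metric: error in the predicted gap between
# the lead vehicle and the simulated follower, averaged over time steps.
import torch

def spacing_mse(pred_follower_pos, true_follower_pos, lead_pos):
    pred_spacing = lead_pos - pred_follower_pos
    true_spacing = lead_pos - true_follower_pos
    return torch.mean((pred_spacing - true_spacing) ** 2)

print(spacing_mse(torch.tensor([0.0, 1.9]), torch.tensor([0.0, 2.0]),
                  torch.tensor([10.0, 12.0])))  # tensor(0.0050)
```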
- IDM-Follower: A Model-Informed Deep Learning Method for Long-Sequence Car-Following Trajectory Prediction [24.94160059351764]
Most car-following models are generative and consider only the speed, position, and acceleration of the last time step as inputs.
We implement a novel structure with two independent encoders and a self-attention decoder that sequentially predicts the following trajectories.
Numerical experiments with multiple settings on simulation and NGSIM datasets show that the IDM-Follower can improve the prediction performance.
arXiv Detail & Related papers (2022-10-20T02:24:27Z)
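The Intelligent Driver Model referenced in the IDM-Follower entry has a standard closed form; below is a minimal implementation with common textbook parameter values (not the paper's calibration).

```python
# Intelligent Driver Model (IDM): standard closed-form car-following law.
# Parameter values are common textbook defaults, not the paper's calibration.
import math

def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4):
    """v: own speed (m/s); gap: bumper-to-bumper gap (m); dv: v - lead speed (m/s)."""
    s_star = s0 + v * T + v * dv / (2 * math.sqrt(a_max * b))  # desired gap
    return a_max * (1 - (v / v0) ** delta - (s_star / gap) ** 2)

print(idm_acceleration(v=25.0, gap=40.0, dv=0.0))  # ~ -0.46: gentle braking, gap below desired
```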
- On the Choice of Data for Efficient Training and Validation of End-to-End Driving Models [32.381828309166195]
We investigate the influence of several data design choices regarding training and validation of deep driving models trainable in an end-to-end fashion.
We show by correlation analysis which validation design enables the driving performance measured during validation to generalize to unknown test environments.
arXiv Detail & Related papers (2022-06-01T16:25:28Z)
- Transforming Model Prediction for Tracking [109.08417327309937]
Transformers capture global relations with little inductive bias, allowing them to learn the prediction of more powerful target models.
We train the proposed tracker end-to-end and validate its performance by conducting comprehensive experiments on multiple tracking datasets.
Our tracker sets a new state of the art on three benchmarks, achieving an AUC of 68.5% on the challenging LaSOT dataset.
arXiv Detail & Related papers (2022-03-21T17:59:40Z)
- CRAT-Pred: Vehicle Trajectory Prediction with Crystal Graph Convolutional Neural Networks and Multi-Head Self-Attention [10.83642398981694]
CRAT-Pred is a trajectory prediction model that does not rely on map information.
The model achieves state-of-the-art performance with significantly fewer model parameters.
In addition, we show quantitatively that the self-attention mechanism is able to learn social interactions between vehicles, with the weights representing a measurable interaction score.
arXiv Detail & Related papers (2022-02-09T14:36:36Z)
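The CRAT-Pred entry reads self-attention weights as pairwise interaction scores; the snippet below shows the generic mechanism with PyTorch's nn.MultiheadAttention. The graph-convolutional feature extraction upstream is omitted, and all features are placeholders.

```python
# Reading self-attention weights as pairwise vehicle-interaction scores.
# Upstream per-vehicle feature extraction is omitted; features are placeholders.
import torch
import torch.nn as nn

num_vehicles, feat_dim = 6, 64
vehicle_feats = torch.randn(1, num_vehicles, feat_dim)  # (batch, vehicles, features)

mha = nn.MultiheadAttention(embed_dim=feat_dim, num_heads=4, batch_first=True)
out, attn_weights = mha(vehicle_feats, vehicle_feats, vehicle_feats,
                        need_weights=True, average_attn_weights=True)

# attn_weights[0, i, j]: how much vehicle i attends to vehicle j -- an
# interpretable, measurable interaction score between the two vehicles.
print(attn_weights.shape)  # torch.Size([1, 6, 6]); each row sums to 1
```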
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks perform on par with the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
- The Importance of Balanced Data Sets: Analyzing a Vehicle Trajectory Prediction Model based on Neural Networks and Distributed Representations [0.0]
We investigate the composition of training data in vehicle trajectory prediction.
We show that the models employing our semantic vector representation outperform the numerical model when trained on an adequate data set.
arXiv Detail & Related papers (2020-09-30T20:00:11Z)
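One generic way to approximate the balanced training sets argued for in the last entry is to oversample rare classes with a weighted sampler; the maneuver labels below are placeholders, not the paper's data pipeline.

```python
# Generic class-balancing via WeightedRandomSampler: oversample rare maneuver
# classes (e.g., lane changes) so each class is drawn roughly equally often.
# Labels and features are placeholders, not the paper's dataset.
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

features = torch.randn(1000, 16)
labels = torch.randint(0, 3, (1000,))        # e.g., 0=follow, 1=left, 2=right

class_counts = torch.bincount(labels).float()
weights = (1.0 / class_counts)[labels]       # rare classes get larger weight
sampler = WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)

loader = DataLoader(TensorDataset(features, labels), batch_size=32, sampler=sampler)
```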