Personalized Scientific Figure Caption Generation: An Empirical Study on Author-Specific Writing Style Transfer
- URL: http://arxiv.org/abs/2509.25817v1
- Date: Tue, 30 Sep 2025 05:52:15 GMT
- Title: Personalized Scientific Figure Caption Generation: An Empirical Study on Author-Specific Writing Style Transfer
- Authors: Jaeyoung Kim, Jongho Lee, Hongjun Choi, Sion Jang,
- Abstract summary: We study personalized figure caption generation using author profile data from scientific papers. We reveal a fundamental trade-off between matching author style and maintaining caption quality.
- Score: 12.354075334437141
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study personalized figure caption generation using author profile data from scientific papers. Our experiments demonstrate that rich author profile data, combined with relevant metadata, can significantly improve the personalization performance of multimodal large language models. However, we also reveal a fundamental trade-off between matching author style and maintaining caption quality. Our findings offer valuable insights and future directions for developing practical caption automation systems that balance both objectives. This work was conducted as part of the 3rd SciCap challenge.
Related papers
- How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions [29.52344052330828]
We investigate how different synthetic captioning strategies impact the downstream performance of text-to-image models. Our experiments demonstrate that dense, high-quality captions enhance text alignment but may introduce trade-offs in output aesthetics and diversity. Our findings underscore the importance of caption design in achieving optimal model performance.
arXiv Detail & Related papers (2025-06-20T01:52:17Z) - Personalized Graph-Based Retrieval for Large Language Models [51.7278897841697]
We propose a framework that leverages user-centric knowledge graphs to enrich personalization. By directly integrating structured user knowledge into the retrieval process and augmenting prompts with user-relevant context, PGraph enhances contextual understanding and output quality. We also introduce the Personalized Graph-based Benchmark for Text Generation, designed to evaluate personalized text generation tasks in real-world settings where user history is sparse or unavailable.
arXiv Detail & Related papers (2025-01-04T01:46:49Z) - Personalized Representation from Personalized Generation [36.848215621708235]
We formalize the challenge of using personalized synthetic data to learn personalized representations. We show that our method improves personalized representation learning for diverse downstream tasks.
arXiv Detail & Related papers (2024-12-20T18:59:03Z) - Personalized Multimodal Large Language Models: A Survey [127.9521218125761]
Multimodal Large Language Models (MLLMs) have become increasingly important due to their state-of-the-art performance and ability to integrate multiple data modalities. This paper presents a comprehensive survey on personalized multimodal large language models, focusing on their architecture, training methods, and applications.
arXiv Detail & Related papers (2024-12-03T03:59:03Z) - Capturing Style in Author and Document Representation [4.323709559692927]
We propose a new architecture that learns embeddings for both authors and documents with a stylistic constraint. We evaluate our method on three datasets: a literary corpus extracted from the Gutenberg Project, the Blog Authorship and IMDb62.
arXiv Detail & Related papers (2024-07-18T10:01:09Z) - MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding [59.41495657570397]
We present a comprehensive dataset compiled from Nature Communications articles covering 72 scientific fields. We evaluated 19 proprietary and open-source models on two benchmark tasks, figure captioning and multiple-choice, and conducted human expert annotation. Fine-tuning Qwen2-VL-7B with our task-specific data achieved better performance than GPT-4o and even human experts in multiple-choice evaluations.
arXiv Detail & Related papers (2024-07-06T00:40:53Z) - Step-Back Profiling: Distilling User History for Personalized Scientific Writing [50.481041470669766]
Large language models (LLMs) excel at a variety of natural language processing tasks, yet they struggle to generate personalized content for individuals.
We introduce STEP-BACK PROFILING to personalize LLMs by distilling user history into concise profiles.
Our approach outperforms the baselines by up to 3.6 points on the general personalization benchmark.
arXiv Detail & Related papers (2024-06-20T12:58:26Z) - Unsupervised Neural Stylistic Text Generation using Transfer learning and Adapters [66.17039929803933]
We propose a novel transfer learning framework which updates only 0.3% of model parameters to learn style-specific attributes for response generation.
We learn style specific attributes from the PERSONALITY-CAPTIONS dataset.
arXiv Detail & Related papers (2022-10-07T00:09:22Z) - PART: Pre-trained Authorship Representation Transformer [52.623051272843426]
Authors writing documents imprint identifying information within their texts. Previous works use hand-crafted features or classification tasks to train their authorship models. We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z) - Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets [56.018551958004814]
This paper addresses the task of generating fluent descriptions by training on a non-uniform combination of data sources.
Large-scale datasets with noisy image-text pairs provide a sub-optimal source of supervision.
We propose to leverage and separate semantics and descriptive style through the incorporation of a style token and keywords extracted through a retrieval component.
arXiv Detail & Related papers (2021-11-24T19:00:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.