Beyond SEO: A Transformer-Based Approach for Reinventing Web Content Optimisation
- URL: http://arxiv.org/abs/2507.03169v1
- Date: Thu, 03 Jul 2025 20:52:10 GMT
- Title: Beyond SEO: A Transformer-Based Approach for Reinventing Web Content Optimisation
- Authors: Florian Lüttgenau, Imar Colic, Gervasio Ramirez
- Abstract summary: We present a domain-specific fine-tuning approach for Generative Engine Optimization (GEO). Our method fine-tunes a BART-base transformer on synthetically generated training data comprising 1,905 cleaned travel website content pairs. Optimized content demonstrates substantial visibility gains in generative search responses with 15.63% improvement in absolute word count and 30.96% improvement in position-adjusted word count metrics.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rise of generative AI search engines is disrupting traditional SEO, with Gartner predicting a 25% reduction in conventional search usage by 2026. This necessitates new approaches for web content visibility in AI-driven search environments. We present a domain-specific fine-tuning approach for Generative Engine Optimization (GEO) that transforms web content to improve discoverability in large language model outputs. Our method fine-tunes a BART-base transformer on synthetically generated training data comprising 1,905 cleaned travel website content pairs. Each pair consists of raw website text and its GEO-optimized counterpart incorporating credible citations, statistical evidence, and improved linguistic fluency. We evaluate using intrinsic metrics (ROUGE-L, BLEU) and extrinsic visibility assessments through controlled experiments with Llama-3.3-70B. The fine-tuned model achieves significant improvements over baseline BART: ROUGE-L scores of 0.249 (vs. 0.226) and BLEU scores of 0.200 (vs. 0.173). Most importantly, optimized content demonstrates substantial visibility gains in generative search responses, with a 15.63% improvement in absolute word count and a 30.96% improvement in position-adjusted word count metrics. This work provides the first empirical demonstration that targeted transformer fine-tuning can effectively enhance web content visibility in generative search engines with modest computational resources. Our results suggest GEO represents a tractable approach for content optimization in the AI-driven search landscape, offering concrete evidence that small-scale, domain-focused fine-tuning yields meaningful improvements in content discoverability.
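The abstract names two visibility metrics, absolute word count and position-adjusted word count, without defining them. The sketch below shows one common formulation from the GEO literature, and it may differ from what the authors actually computed. It assumes the generative response cites sources inline as [1], [2], ..., that a sentence's words are split evenly among the sources it cites, and that earlier sentences get an exponentially larger weight. The function names, the naive sentence splitter, and the example response are all hypothetical.

```python
import math
import re

# Assumption: sources are cited inline as [1], [2], ... inside each sentence.
CITATION = re.compile(r"\[(\d+)\]")


def split_sentences(response: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]


def word_count(sentence: str) -> int:
    """Count word tokens in a sentence, ignoring citation markers."""
    return len(re.findall(r"\w+", CITATION.sub("", sentence)))


def visibility(response: str, source: int, position_adjusted: bool = False) -> float:
    """Share of the response's words attributed to `source`.

    With position_adjusted=True, the sentence at index `pos` (0-based) is
    weighted by exp(-pos / number_of_sentences), so early mentions count more.
    """
    sentences = split_sentences(response)
    total_words = sum(word_count(s) for s in sentences)
    if total_words == 0:
        return 0.0
    score = 0.0
    for pos, sentence in enumerate(sentences):
        cited = {int(m) for m in CITATION.findall(sentence)}
        if source not in cited:
            continue
        weight = math.exp(-pos / len(sentences)) if position_adjusted else 1.0
        # Split the sentence's words evenly among the sources it cites.
        score += weight * word_count(sentence) / len(cited)
    return score / total_words


def relative_improvement(before: float, after: float) -> float:
    """Percentage change of a visibility score after optimisation."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Hypothetical generative-engine answer citing two sources.
    answer = (
        "Lisbon is hilly [1]. Trams climb the slopes [2]. "
        "Book early in summer [1][2]."
    )
    print(visibility(answer, source=1))
    print(visibility(answer, source=1, position_adjusted=True))
```

Under this formulation, an optimised page would be scored by running the same query before and after rewriting and comparing the two scores for that page, for example with `relative_improvement`. Whether the reported 15.63% and 30.96% figures are relative changes of this kind is not stated in the abstract.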
Related papers
- SAGEO Arena: A Realistic Environment for Evaluating Search-Augmented Generative Engine Optimization [11.467565046589414]
Search-Augmented Generative Engines (SAGE) have emerged as a new paradigm for information access. No evaluation environment currently supports comprehensive investigation of SAGEO. We introduce SAGEO Arena, a realistic and reproducible environment for stage-level SAGEO analysis.
arXiv Detail & Related papers (2026-02-12T17:18:00Z) - E-GEO: A Testbed for Generative Engine Optimization in E-Commerce [9.66101096555058]
Generative engine optimization improves content visibility and relevance for generative engines. Current GEO practices are ad hoc, and their impacts remain poorly understood. E-GEO is the first benchmark built specifically for e-commerce GEO.
arXiv Detail & Related papers (2025-11-25T21:28:40Z) - Caption Injection for Optimization in Generative Search Engine [15.472540238931202]
Generative Search Engines (GSEs) leverage Retrieval-Augmented Generation (RAG) techniques and Large Language Models (LLMs). We propose Caption Injection, the first multimodal G-SEO approach, which extracts captions from images and injects them into textual content. Experimental results show that Caption Injection significantly outperforms text-only G-SEO baselines under the G-Eval metric.
arXiv Detail & Related papers (2025-11-06T05:37:27Z) - Role-Augmented Intent-Driven Generative Search Engine Optimization [9.876307656819039]
We propose a Role-Augmented Intent-Driven Generative Search Engine Optimization (G-SEO) method. Our method models search intent through reflective refinement across diverse informational roles, enabling targeted content enhancement. Experimental results demonstrate that search intent serves as an effective signal for guiding content optimization.
arXiv Detail & Related papers (2025-08-15T02:08:55Z) - KARE-RAG: Knowledge-Aware Refinement and Enhancement for RAG [63.82127103851471]
Retrieval-Augmented Generation (RAG) enables large language models to access broader knowledge sources. We demonstrate that enhancing generative models' capacity to process noisy content is equally critical for robust performance. We present KARE-RAG, which improves knowledge utilization through three key innovations.
arXiv Detail & Related papers (2025-06-03T06:31:17Z) - TWSSenti: A Novel Hybrid Framework for Topic-Wise Sentiment Analysis on Social Media Using Transformer Models [0.0]
This study explores a hybrid framework combining transformer-based models to improve sentiment classification accuracy and robustness. The framework addresses challenges such as noisy data, contextual ambiguity, and generalization across diverse datasets. This research highlights its applicability to real-world tasks such as social media monitoring, customer sentiment analysis, and public opinion tracking.
arXiv Detail & Related papers (2025-04-14T05:44:11Z) - Zero-Indexing Internet Search Augmented Generation for Large Language Models [15.138260067336455]
Retrieval augmented generation has emerged as an effective method to enhance large language model performance. This approach typically relies on an internal retrieval module that uses various indexing mechanisms to manage a static pre-processed corpus. In this paper, we explore an alternative approach that leverages standard search engine APIs to dynamically integrate the latest online information.
arXiv Detail & Related papers (2024-11-29T05:31:04Z) - Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI [0.0]
This paper introduces an advanced intrusion detection system (IDS) called KD-XVAE that uses a Variational Autoencoder (VAE)-based knowledge distillation approach.
Our model significantly reduces complexity, operating with just 1669 parameters and achieving an inference time of 0.3 ms per batch.
arXiv Detail & Related papers (2024-10-11T17:57:16Z) - Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation [82.95830628372845]
This paper introduces a collaborative vision-text optimizing mechanism within the Open-Vocabulary Segmentation (OVS) field. To the best of our knowledge, we are the first to establish the collaborative vision-text optimizing mechanism within the OVS field. In open-vocabulary semantic segmentation, our method outperforms the previous state-of-the-art approaches by +0.5, +2.3, +3.4, +0.4 and +1.1 mIoU, respectively.
arXiv Detail & Related papers (2024-08-01T17:48:08Z) - Tree Search for Language Model Agents [69.43007235771383]
We propose an inference-time search algorithm for LM agents to perform exploration and multi-step planning in interactive web environments.
Our approach is a form of best-first tree search that operates within the actual environment space.
It is the first tree search algorithm for LM agents that shows effectiveness on realistic web tasks.
arXiv Detail & Related papers (2024-07-01T17:07:55Z) - LOLA: LLM-Assisted Online Learning Algorithm for Content Experiments [2.2021543101231167]
Modern media firms require automated and efficient methods to identify content that is most engaging and appealing to users.
We first investigate the ability of three pure-LLM approaches to identify the catchiest headline: prompt-based methods, embedding-based methods, and fine-tuned open-source LLMs.
We then introduce the LLM-Assisted Online Learning Algorithm (LOLA), a novel framework that integrates Large Language Models (LLMs) with adaptive experimentation to optimize content delivery.
arXiv Detail & Related papers (2024-06-03T07:56:58Z) - GEO: Generative Engine Optimization [50.45232692363787]
We formalize the unified framework of generative engines (GEs).
GEs use large language models (LLMs) to gather and summarize information to answer user queries.
Generative Engines typically satisfy queries by synthesizing information from multiple sources and summarizing them.
We introduce Generative Engine Optimization (GEO), the first novel paradigm to aid content creators in improving their content visibility in generative engine responses.
arXiv Detail & Related papers (2023-11-16T10:06:09Z) - Unified Embedding Based Personalized Retrieval in Etsy Search [0.206242362470764]
We propose learning a unified embedding model incorporating graph, transformer and term-based embeddings end to end.
Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate.
arXiv Detail & Related papers (2023-06-07T23:24:50Z) - GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer [76.2625311630021]
Vision transformers (ViTs) have shown very impressive empirical performance in various computer vision tasks.
To mitigate this challenging problem, structured pruning is a promising solution to compress model size and enable practical efficiency.
We propose GOHSP, a unified framework of Graph and Optimization-based Structured Pruning for ViT models.
arXiv Detail & Related papers (2023-01-13T00:40:24Z) - Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.