Tourism Question Answer System in Indian Language using Domain-Adapted Foundation Models
- URL: http://arxiv.org/abs/2511.23235v1
- Date: Fri, 28 Nov 2025 14:44:16 GMT
- Title: Tourism Question Answer System in Indian Language using Domain-Adapted Foundation Models
- Authors: Praveen Gatla, Anushka, Nikita Kanwar, Gouri Sahoo, Rajesh Kumar Mundotiya
- Abstract summary: This article presents the first comprehensive study on designing a baseline extractive question-answering (QA) system for the Hindi tourism domain. It targets ten tourism-centric subdomains: Ganga Aarti, Cruise, Food Court, Public Toilet, Kund, Museum, General, Ashram, Temple, and Travel. We propose a framework leveraging foundation models (BERT and RoBERTa) fine-tuned using Supervised Fine-Tuning (SFT) and Low-Rank Adaptation (LoRA) to optimize parameter efficiency and task performance.
- Score: 0.6524460254566904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This article presents the first comprehensive study on designing a baseline extractive question-answering (QA) system for the Hindi tourism domain, with a specialized focus on Varanasi, a cultural and spiritual hub renowned for its Bhakti-Bhaav (devotional ethos). Targeting ten tourism-centric subdomains (Ganga Aarti, Cruise, Food Court, Public Toilet, Kund, Museum, General, Ashram, Temple, and Travel), the work addresses the absence of language-specific QA resources in Hindi for culturally nuanced applications. A dataset comprising 7,715 Hindi QA pairs pertaining to Varanasi tourism was constructed and subsequently augmented with 27,455 pairs generated via Llama zero-shot prompting. We propose a framework leveraging foundation models (BERT and RoBERTa) fine-tuned using Supervised Fine-Tuning (SFT) and Low-Rank Adaptation (LoRA) to optimize parameter efficiency and task performance. Multiple variants of BERT, including language-specific pre-trained models (e.g., Hindi-BERT), are evaluated to assess their suitability for low-resource, domain-specific QA. The evaluation metrics (F1, BLEU, and ROUGE-L) highlight trade-offs between answer precision and linguistic fluency. Experiments demonstrate that LoRA-based fine-tuning achieves competitive performance (85.3% F1) while reducing trainable parameters by 98% compared to SFT, striking a balance between efficiency and accuracy. Comparative analysis across models reveals that RoBERTa with SFT outperforms BERT variants in capturing contextual nuances, particularly for culturally embedded terms (e.g., Aarti, Kund). This work establishes a foundational baseline for Hindi tourism QA systems, emphasizing the role of LoRA in low-resource settings and underscoring the need for culturally contextualized NLP frameworks in the tourism domain.
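To make the parameter-efficiency claim concrete, the following is a minimal arithmetic sketch of why LoRA trains far fewer parameters than full fine-tuning of the same weight matrix. It is not the authors' configuration: the hidden size (768, typical of BERT-base-sized encoders), the rank (8), and the function names are all assumptions chosen for illustration.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters updated when a d_out x d_in weight matrix is fully fine-tuned."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA trains for the same matrix.

    LoRA freezes the original weight and learns a low-rank update B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


def trainable_reduction(d_in: int, d_out: int, rank: int) -> float:
    """Fraction of trainable parameters saved relative to full fine-tuning."""
    return 1.0 - lora_param_count(d_in, d_out, rank) / full_param_count(d_in, d_out)


if __name__ == "__main__":
    # Hypothetical values, not taken from the paper.
    hidden_size, rank = 768, 8
    print("full:", full_param_count(hidden_size, hidden_size))
    print("lora:", lora_param_count(hidden_size, hidden_size, rank))
    print(f"reduction: {trainable_reduction(hidden_size, hidden_size, rank):.1%}")
```

Under these assumed values the per-matrix saving works out to 47/48, or about 97.9%. This per-matrix figure should not be read as a reproduction of the 98% reported in the abstract, which compares total trainable parameters against SFT and depends on which layers are adapted and at what rank.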
Related papers
- Benchmarking BERT-based Models for Sentence-level Topic Classification in Nepali Language [1.6474262142781433]
This study benchmarks multilingual, Indic, Hindi, and Nepali BERT variants to evaluate their effectiveness in Nepali topic classification. Ten pre-trained models, including mBERT, XLM-R, MuRIL, DevBERT, HindiBERT, IndicBERT, and NepBERTa, were fine-tuned and tested. Indic models, particularly MuRIL-large, achieved the highest F1-score of 90.60%, outperforming multilingual and monolingual models.
arXiv Detail & Related papers (2026-02-27T11:42:38Z)
- Aspect-Based Sentiment Analysis for Future Tourism Experiences: A BERT-MoE Framework for Persian User Reviews [0.0]
This study advances aspect-based sentiment analysis (ABSA) for Persian-language user reviews in the tourism domain. We propose a hybrid BERT-based model with Top-K routing and auxiliary losses to mitigate routing collapse and improve efficiency. The proposed model achieves a weighted F1-score of 90.6% for ABSA, outperforming baseline BERT (89.25%) and a standard hybrid approach (85.7%).
arXiv Detail & Related papers (2026-02-13T10:01:33Z)
- FastPOS: Language-Agnostic Scalable POS Tagging Framework Low-Resource Use Case [0.0]
The framework achieves 96.85 percent and 97 percent token-level accuracy across POS categories in Bangla and Hindi. Its modular and open-source design enables rapid cross-lingual adaptation while reducing model design and tuning overhead.
arXiv Detail & Related papers (2025-11-30T05:48:12Z)
- SEA-BED: Southeast Asia Embedding Benchmark [43.05386334897603]
With nearly 700 million speakers, the Southeast Asia region lacks a region-specific embedding benchmark. We introduce SEA-BED, the first large-scale embedding benchmark with 169 datasets across 9 tasks and 10 languages. We evaluate 17 embedding models across six studies, analyzing task and language challenges, cross-benchmark comparisons, and translation trade-offs.
arXiv Detail & Related papers (2025-08-17T05:10:40Z)
- Advancing Dialectal Arabic to Modern Standard Arabic Machine Translation [22.369277951685234]
This paper presents two core contributions to advancing DA-MSA translation for the Levantine, Egyptian, and Gulf dialects. Few-shot prompting consistently outperformed zero-shot, chain-of-thought, and our proposed Ara-TEaR method. For fine-tuning LLMs, a quantized Gemma2-9B model achieved a chrF++ score of 49.88, outperforming zero-shot GPT-4o (44.58).
arXiv Detail & Related papers (2025-07-27T14:37:53Z)
- PARAM-1 BharatGen 2.9B Model [14.552007884700618]
PARAM-1 is a 2.9B parameter decoder-only, text-only language model trained from scratch with an explicit architectural and linguistic focus on Indian diversity. It is guided by three core principles: equitable representation of Indic languages through a 25% corpus allocation; tokenization fairness via a SentencePiece tokenizer adapted to Indian morphological structures; and culturally aligned evaluation benchmarks across IndicQA, code-mixed reasoning, and socio-linguistic robustness tasks.
arXiv Detail & Related papers (2025-07-16T06:14:33Z)
- Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval [49.1574468325115]
We introduce Amharic-specific dense retrieval models based on pre-trained Amharic BERT and RoBERTa backbones. Our proposed RoBERTa-Base-Amharic-Embed model (110M parameters) achieves a 17.6% relative improvement in MRR@10. More compact variants, such as RoBERTa-Medium-Amharic-Embed (42M), remain competitive while being over 13x smaller.
arXiv Detail & Related papers (2025-05-25T23:06:20Z)
- A New HOPE: Domain-agnostic Automatic Evaluation of Text Chunking [44.47350338664039]
Document chunking fundamentally impacts Retrieval-Augmented Generation (RAG). There is currently no framework to analyze the impact of different chunking methods. We introduce a novel methodology that defines essential characteristics of the chunking process at three levels.
arXiv Detail & Related papers (2025-05-04T16:22:27Z)
- ChinaTravel: An Open-Ended Benchmark for Language Agents in Chinese Travel Planning [38.44879526364259]
We introduce ChinaTravel, the first open-ended benchmark grounded in authentic Chinese travel requirements. We design a compositionally generalizable domain-specific language for scalable evaluation, covering feasibility, constraint satisfaction, and preference comparison. Empirical studies reveal the potential of neuro-symbolic agents in travel planning, achieving a 37.0% constraint satisfaction rate on human queries.
arXiv Detail & Related papers (2024-12-18T10:10:12Z)
- Enhancing Aspect-based Sentiment Analysis in Tourism Using Large Language Models and Positional Information [14.871979025512669]
This paper proposes an aspect-based sentiment analysis model, ACOS_LLM, for Aspect-Category-Opinion-Sentiment Quadruple Extraction (ACOSQE).
The model comprises two key stages: auxiliary knowledge generation and ACOSQE.
Results demonstrate the model's superior performance, with an F1 improvement of 7.49% compared to other models on the tourism dataset.
arXiv Detail & Related papers (2024-09-23T13:19:17Z)
- From Multiple-Choice to Extractive QA: A Case Study for English and Arabic [51.13706104333848]
We explore the feasibility of repurposing an existing multilingual dataset for a new NLP task. We present annotation guidelines and a parallel EQA dataset for English and Modern Standard Arabic. We aim to help others adapt our approach for the remaining 120 BELEBELE language variants, many of which are deemed under-resourced.
arXiv Detail & Related papers (2024-04-26T11:46:05Z)
- AceGPT, Localizing Large Language Models in Arabic [73.39989503874634]
The paper proposes a comprehensive solution that includes pre-training with Arabic texts, Supervised Fine-Tuning (SFT) utilizing native Arabic instructions, and GPT-4 responses in Arabic.
The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities.
arXiv Detail & Related papers (2023-09-21T13:20:13Z)
- Strategies for improving low resource speech to text translation relying on pre-trained ASR models [59.90106959717875]
This paper presents techniques and findings for improving the performance of low-resource speech to text translation (ST).
We conducted experiments on both simulated and real low-resource setups, on the language pairs English - Portuguese and Tamasheq - French, respectively.
arXiv Detail & Related papers (2023-05-31T21:58:07Z)
- Fine-tuning Pretrained Multilingual BERT Model for Indonesian Aspect-based Sentiment Analysis [0.0]
Previous research on Aspect-based Sentiment Analysis (ABSA) for Indonesian reviews in the hotel domain has been conducted using CNN and XGBoost.
In this paper, we incorporate one of the foremost language representation models, BERT, to perform ABSA on an Indonesian reviews dataset.
arXiv Detail & Related papers (2021-03-05T15:05:51Z)
- An Attention Ensemble Approach for Efficient Text Classification of Indian Languages [0.0]
This paper focuses on the coarse-grained technical domain identification of short text documents in Marathi, a Devanagari script-based Indian language.
A hybrid CNN-BiLSTM attention ensemble model is proposed that competently combines the intermediate sentence representations generated by the convolutional neural network and the bidirectional long short-term memory, leading to efficient text classification.
Experimental results show that the proposed model outperforms various baseline machine learning and deep learning models on the given task, giving a best validation accuracy of 89.57% and an F1-score of 0.8875.
arXiv Detail & Related papers (2021-02-20T07:31:38Z)