Quokka: An Open-source Large Language Model ChatBot for Material Science
- URL: http://arxiv.org/abs/2401.01089v1
- Date: Tue, 2 Jan 2024 08:14:48 GMT
- Title: Quokka: An Open-source Large Language Model ChatBot for Material Science
- Authors: Xianjun Yang, Stephen D. Wilson, Linda Petzold
- Abstract summary: This paper presents the development of a specialized chatbot for materials science.
The methodology involves an initial pretraining phase on over one million domain-specific papers.
We make the four trained checkpoints freely available to the research community at https://github.com/Xianjun-Yang/Quokka.
- Score: 14.48214929380849
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents the development of a specialized chatbot for materials
science, leveraging the Llama-2 language model and continuing its pre-training
on the extensive body of materials science research articles in the S2ORC
dataset. The methodology involves an initial pretraining phase on over one
million domain-specific papers, followed by an instruction-tuning process to
refine the chatbot's capabilities. The chatbot is designed to assist
researchers, educators, and students by providing instant, context-aware
responses to queries in the field of materials science. We make the four
trained checkpoints (7B, 13B, with or without chat ability) freely available to
the research community at https://github.com/Xianjun-Yang/Quokka.
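As a quick orientation, the sketch below shows one way the released checkpoints could be loaded for inference with Hugging Face transformers. The model identifier is a placeholder, not an actual checkpoint name; consult the linked repository for the real ones.

```python
# Hedged sketch: loading a Quokka-style checkpoint for inference.
# "path/to/quokka-7b-chat" is a hypothetical identifier, not the released name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/quokka-7b-chat"  # placeholder; see the GitHub repo for actual checkpoints
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What dopants are commonly used to tune the band gap of TiO2?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```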
Related papers
- Comparing the Utility, Preference, and Performance of Course Material Search Functionality and Retrieval-Augmented Generation Large Language Model (RAG-LLM) AI Chatbots in Information-Seeking Tasks [2.377308748205625]
The purpose of this study was to explore the utility of recent large language models (LLMs) as a support mechanism for students.
We conducted a lab-based user study in which participants worked on tasks from a web software development course.
Our findings highlight that both support mechanisms are seen as useful, and that each works well for some tasks but less so for others.
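For context, the following is a minimal sketch of the RAG pattern such chatbots rely on: retrieve the most relevant course-material passages, then condition the LLM answer on them. The TF-IDF retriever, the toy corpus, and the `generate` callable are illustrative stand-ins, not the study's actual system.

```python
# Minimal RAG sketch: TF-IDF retrieval over course material, then LLM answering.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rag_answer(question, corpus, generate, k=3):
    # Fit a vectorizer on the corpus plus the question, rank passages by similarity
    vec = TfidfVectorizer().fit(corpus + [question])
    doc_m, q_v = vec.transform(corpus), vec.transform([question])
    top = cosine_similarity(q_v, doc_m)[0].argsort()[::-1][:k]
    context = "\n".join(corpus[i] for i in top)
    # `generate` is a hypothetical LLM backend supplied by the caller
    return generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```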
arXiv Detail & Related papers (2024-10-17T08:37:25Z)
- Language Models as Science Tutors [79.73256703631492]
We introduce TutorEval and TutorChat to measure real-life usability of LMs as scientific assistants.
We show that fine-tuning base models with existing dialogue datasets leads to poor performance on TutorEval.
We use TutorChat to fine-tune Llemma models with 7B and 34B parameters. These LM tutors specialized in math have a 32K-token context window, and they excel at TutorEval while performing strongly on GSM8K and MATH.
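As a rough illustration of dialogue fine-tuning of this kind, the sketch below flattens tutor-student turns into text and fine-tunes a small causal LM with Hugging Face transformers; the base model, toy data, and hyperparameters are placeholders rather than the paper's Llemma setup.

```python
# Hedged sketch of dialogue fine-tuning on tutoring conversations.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "EleutherAI/pythia-70m"  # small placeholder; the paper fine-tunes Llemma 7B/34B
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

dialogues = [{"text": "Student: What is torque?\nTutor: Torque is the rotational analogue of force."}]
ds = Dataset.from_list(dialogues).map(
    lambda ex: tok(ex["text"], truncation=True, max_length=512))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tutor-ft", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),  # causal LM objective
)
trainer.train()
```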
arXiv Detail & Related papers (2024-02-16T22:24:13Z)
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research [139.69207791947738]
Dolma is a three-trillion-token English corpus built from a diverse mixture of web content, scientific papers, code, public-domain books, social media, and encyclopedic materials.
We document Dolma, including its design principles, details about its construction, and a summary of its contents.
We present analyses and experimental results on intermediate states of Dolma to share what we have learned about important data curation practices.
arXiv Detail & Related papers (2024-01-31T20:29:50Z)
- Deep Learning Based Amharic Chatbot for FAQs in Universities [0.0]
This paper proposes a model that answers frequently asked questions (FAQs) in the Amharic language.
The proposed program employs tokenization, stop word removal, and stemming to analyze and categorize Amharic input sentences.
The model was integrated with Facebook Messenger and deployed on a Heroku server for 24-hour accessibility.
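A minimal sketch of the preprocessing steps named above (tokenization, stop-word removal, stemming). The stop-word list and stemmer below are hypothetical placeholders; real Amharic processing would require language-specific resources.

```python
# Toy preprocessing pipeline: tokenize, drop stop words, stem.
def preprocess(sentence, stop_words, stem):
    tokens = sentence.split()                             # whitespace tokenization
    tokens = [t for t in tokens if t not in stop_words]   # stop-word removal
    return [stem(t) for t in tokens]                      # stemming

# Placeholder resources for illustration only
stop_words = {"is", "the", "a"}          # hypothetical stop-word list
stem = lambda t: t.rstrip("s")           # hypothetical (naive) stemmer
print(preprocess("what is the admission deadline", stop_words, stem))
```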
arXiv Detail & Related papers (2024-01-26T18:37:21Z)
- A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest [62.63606958140248]
Large Language Models (LLMs) often encounter challenges when dealing with intricate and knowledge-demanding queries in specific domains.
This paper introduces a novel approach to enhance LLMs by effectively extracting the relevant knowledge from domain-specific textual sources.
We train a knowledge miner, namely LLMiner, which autonomously extracts Question-Answer pairs from relevant documents.
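The sketch below illustrates the general knowledge-mining idea: prompt an LLM to turn a domain document into question-answer pairs for later chatbot training. The prompt wording, output format, and `generate` backend are assumptions, not LLMiner's actual implementation.

```python
# Schematic QA-pair mining from a domain document via an LLM prompt.
QA_PROMPT = (
    "Read the following passage and write question-answer pairs that it "
    "supports, one per line, formatted as 'Q: ... | A: ...'.\n\nPassage:\n{doc}"
)

def mine_qa_pairs(document: str, generate) -> list[tuple[str, str]]:
    # `generate` is a hypothetical stand-in for any LLM backend
    raw = generate(QA_PROMPT.format(doc=document))
    pairs = []
    for line in raw.splitlines():
        if line.startswith("Q:") and "| A:" in line:
            q, a = line.split("| A:", 1)
            pairs.append((q[2:].strip(), a.strip()))
    return pairs
```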
arXiv Detail & Related papers (2023-11-17T16:09:10Z)
- Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL).
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
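A minimal sketch of prompt-based few-shot dialogue: a handful of example exchanges are placed in the prompt and a frozen LM completes the next turn, with no gradient updates. The examples and the `generate` backend are illustrative placeholders.

```python
# Few-shot prompting for dialogue: no fine-tuning, just in-context examples.
FEW_SHOT_EXAMPLES = [
    ("What is a perovskite?", "A material with the ABX3 crystal structure."),
    ("Name a common battery cathode.", "Lithium cobalt oxide (LiCoO2)."),
]

def few_shot_reply(user_turn: str, generate) -> str:
    prompt = ""
    for q, a in FEW_SHOT_EXAMPLES:          # examples are the only "training" signal
        prompt += f"User: {q}\nAssistant: {a}\n"
    prompt += f"User: {user_turn}\nAssistant:"
    return generate(prompt)                  # `generate` is a hypothetical LM backend
```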
arXiv Detail & Related papers (2021-10-15T14:36:45Z)
- Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention [55.77218465471519]
This paper proposes an innovative framework to train chatbots to possess human-like intentions.
Our framework includes a guiding robot and an interlocutor model that plays the role of a human.
We examined our framework using three experimental setups and evaluated the guiding robot with four different metrics to demonstrate its flexibility and performance advantages.
arXiv Detail & Related papers (2021-03-30T15:24:37Z)
- TruthBot: An Automated Conversational Tool for Intent Learning, Curated Information Presenting, and Fake News Alerting [12.95006904081387]
TruthBot is designed for seeking truth (trustworthy and verified information) on specific topics.
It helps users to obtain information specific to certain topics, fact-check information, and get recent news.
TruthBot was deployed in June 2020 and is currently running.
arXiv Detail & Related papers (2021-01-31T18:23:05Z)
- Developing FB Chatbot Based on Deep Learning Using RASA Framework for University Enquiries [0.0]
This research is a first-stage development based on a fairly sufficient amount of simulated data.
The concept is not new in today's society, which is developing alongside recent technology.
The Facebook platform is used because Facebook users already account for up to 60.8% of the entire population of Indonesia.
arXiv Detail & Related papers (2020-09-25T17:01:19Z)
- Chatbot: A Conversational Agent employed with Named Entity Recognition Model using Artificial Neural Network [0.0]
Natural Language Understanding (NLU) has been impressively improved by deep learning methods.
This research focuses on Named Entity Recognition (NER) models that can be integrated into the NLU service of a chatbot.
The NER model in the proposed architecture is based on artificial neural network which is trained on manually created entities.
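The paper trains a custom ANN-based NER on manually created entities; as a rough stand-in, the sketch below shows how recognized entities could feed a chatbot's NLU service using spaCy's pretrained pipeline. The pipeline name and example utterance are assumptions, not the paper's setup.

```python
# Illustrative NER step for an NLU service, using spaCy as a stand-in
# for the paper's custom ANN model (requires `en_core_web_sm` to be installed).
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_entities(utterance: str) -> dict:
    doc = nlp(utterance)
    return {ent.label_: ent.text for ent in doc.ents}  # label -> surface form

print(extract_entities("Book a lab tour at MIT on Friday with Dr. Smith"))
```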
arXiv Detail & Related papers (2020-06-19T14:47:21Z)
- Conversations with Search Engines: SERP-based Conversational Response Generation [77.1381159789032]
We create a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines.
We also develop a state-of-the-art pipeline for conversations with search engines, Conversations with Search Engines (CaSE), using this dataset.
CaSE enhances the state of the art by introducing a supporting token identification module and a prior-aware pointer generator.
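For intuition, here is a simplified sketch of the pointer-generator mixing that such response generators build on: the next-token distribution blends generating from the vocabulary with copying tokens from the retrieved context. Shapes are illustrative and CaSE's prior-aware weighting is not modeled.

```python
# Simplified pointer-generator mixing step (not CaSE's exact architecture).
import torch

def pointer_generator_step(vocab_logits, copy_attn, src_token_ids, p_gen):
    # vocab_logits:  (batch, vocab_size) decoder scores over the vocabulary
    # copy_attn:     (batch, src_len) attention over source tokens, rows sum to 1
    # src_token_ids: (batch, src_len) vocabulary ids of the source tokens
    # p_gen:         (batch, 1) probability of generating vs. copying
    vocab_dist = torch.softmax(vocab_logits, dim=-1) * p_gen
    copy_dist = torch.zeros_like(vocab_dist)
    copy_dist.scatter_add_(1, src_token_ids, copy_attn * (1 - p_gen))
    return vocab_dist + copy_dist  # final next-token distribution

# Toy usage: each row of the output should sum to ~1.0
B, V, S = 2, 50, 7
out = pointer_generator_step(torch.randn(B, V),
                             torch.softmax(torch.randn(B, S), dim=-1),
                             torch.randint(0, V, (B, S)),
                             torch.sigmoid(torch.randn(B, 1)))
print(out.sum(dim=-1))
```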
arXiv Detail & Related papers (2020-04-29T13:07:53Z)