Quokka: An Open-source Large Language Model ChatBot for Material Science
- URL: http://arxiv.org/abs/2401.01089v1
- Date: Tue, 2 Jan 2024 08:14:48 GMT
- Title: Quokka: An Open-source Large Language Model ChatBot for Material Science
- Authors: Xianjun Yang, Stephen D. Wilson, Linda Petzold
- Abstract summary: This paper presents the development of a specialized chatbot for materials science.
The methodology involves an initial pretraining phase on over one million domain-specific papers.
We make the four trained checkpoints freely available to the research community at https://github.com/Xianjun-Yang/Quokka.
- Score: 14.48214929380849
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents the development of a specialized chatbot for materials
science, leveraging the Llama-2 language model and continuing its pre-training
on the extensive body of materials science research articles in the S2ORC
dataset. The methodology involves an initial pretraining phase on over one
million domain-specific papers, followed by an instruction-tuning process to
refine the chatbot's capabilities. The chatbot is designed to assist
researchers, educators, and students by providing instant, context-aware
responses to queries in the field of materials science. We make the four
trained checkpoints (7B, 13B, with or without chat ability) freely available to
the research community at https://github.com/Xianjun-Yang/Quokka.
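As a quick orientation, the sketch below shows one way the released checkpoints could be loaded for inference with Hugging Face transformers. The model identifier is a placeholder, not an actual checkpoint name; consult the linked repository for the real ones.

```python
# Hedged sketch: loading a Quokka-style checkpoint for inference.
# "path/to/quokka-7b-chat" is a hypothetical identifier, not the released name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/quokka-7b-chat"  # placeholder; see the GitHub repo for actual checkpoints
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What dopants are commonly used to tune the band gap of TiO2?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```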
Related papers
- Comparing the Utility, Preference, and Performance of Course Material Search Functionality and Retrieval-Augmented Generation Large Language Model (RAG-LLM) AI Chatbots in Information-Seeking Tasks [2.377308748205625]
The purpose of this study was to explore the utility of recent large language models (LLMs) as a support mechanism for students.
We conducted a lab-based user study in which participants worked on tasks from a web software development course.
Our findings highlight that both support mechanisms are seen as useful, and that each works well for some tasks but less so for others.
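For context, the following is a minimal sketch of the RAG pattern such chatbots rely on: retrieve the most relevant course-material passages, then condition the LLM answer on them. The TF-IDF retriever, the toy corpus, and the `generate` callable are illustrative stand-ins, not the study's actual system.

```python
# Minimal RAG sketch: TF-IDF retrieval over course material, then LLM answering.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rag_answer(question, corpus, generate, k=3):
    # Fit a vectorizer on the corpus plus the question, rank passages by similarity
    vec = TfidfVectorizer().fit(corpus + [question])
    doc_m, q_v = vec.transform(corpus), vec.transform([question])
    top = cosine_similarity(q_v, doc_m)[0].argsort()[::-1][:k]
    context = "\n".join(corpus[i] for i in top)
    # `generate` is a hypothetical LLM backend supplied by the caller
    return generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```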
arXiv Detail & Related papers (2024-10-17T08:37:25Z)
- Language Models as Science Tutors [79.73256703631492]
We introduce TutorEval and TutorChat to measure real-life usability of LMs as scientific assistants.
We show that fine-tuning base models with existing dialogue datasets leads to poor performance on TutorEval.
We use TutorChat to fine-tune Llemma models with 7B and 34B parameters. These LM tutors specialized in math have a 32K-token context window, and they excel at TutorEval while performing strongly on GSM8K and MATH.
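As a rough illustration of dialogue fine-tuning of this kind, the sketch below flattens tutor-student turns into text and fine-tunes a small causal LM with Hugging Face transformers; the base model, toy data, and hyperparameters are placeholders rather than the paper's Llemma setup.

```python
# Hedged sketch of dialogue fine-tuning on tutoring conversations.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "EleutherAI/pythia-70m"  # small placeholder; the paper fine-tunes Llemma 7B/34B
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

dialogues = [{"text": "Student: What is torque?\nTutor: Torque is the rotational analogue of force."}]
ds = Dataset.from_list(dialogues).map(
    lambda ex: tok(ex["text"], truncation=True, max_length=512))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tutor-ft", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),  # causal LM objective
)
trainer.train()
```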
arXiv Detail & Related papers (2024-02-16T22:24:13Z)
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research [139.69207791947738]
Dolma is a three-trillion-token English corpus built from a diverse mixture of web content, scientific papers, code, public-domain books, social media, and encyclopedic materials.
We document Dolma, including its design principles, details about its construction, and a summary of its contents.
We present analyses and experimental results on intermediate states of Dolma to share what we have learned about important data curation practices.
arXiv Detail & Related papers (2024-01-31T20:29:50Z)
- Deep Learning Based Amharic Chatbot for FAQs in Universities [0.0]
This paper proposes a model that answers frequently asked questions (FAQs) in the Amharic language.
The proposed program employs tokenization, stop word removal, and stemming to analyze and categorize Amharic input sentences.
The model was integrated with Facebook Messenger and deployed on a Heroku server for 24-hour accessibility.
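A minimal sketch of the preprocessing steps named above (tokenization, stop-word removal, stemming). The stop-word list and stemmer below are hypothetical placeholders; real Amharic processing would require language-specific resources.

```python
# Toy preprocessing pipeline: tokenize, drop stop words, stem.
def preprocess(sentence, stop_words, stem):
    tokens = sentence.split()                             # whitespace tokenization
    tokens = [t for t in tokens if t not in stop_words]   # stop-word removal
    return [stem(t) for t in tokens]                      # stemming

# Placeholder resources for illustration only
stop_words = {"is", "the", "a"}          # hypothetical stop-word list
stem = lambda t: t.rstrip("s")           # hypothetical (naive) stemmer
print(preprocess("what is the admission deadline", stop_words, stem))
```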
arXiv Detail & Related papers (2024-01-26T18:37:21Z)
- A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest [62.63606958140248]
Large Language Models (LLMs) often encounter challenges when dealing with intricate and knowledge-demanding queries in specific domains.
This paper introduces a novel approach to enhance LLMs by effectively extracting the relevant knowledge from domain-specific textual sources.
We train a knowledge miner, namely LLMiner, which autonomously extracts Question-Answer pairs from relevant documents.
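The sketch below illustrates the general knowledge-mining idea: prompt an LLM to turn a domain document into question-answer pairs for later chatbot training. The prompt wording, output format, and `generate` backend are assumptions, not LLMiner's actual implementation.

```python
# Schematic QA-pair mining from a domain document via an LLM prompt.
QA_PROMPT = (
    "Read the following passage and write question-answer pairs that it "
    "supports, one per line, formatted as 'Q: ... | A: ...'.\n\nPassage:\n{doc}"
)

def mine_qa_pairs(document: str, generate) -> list[tuple[str, str]]:
    # `generate` is a hypothetical stand-in for any LLM backend
    raw = generate(QA_PROMPT.format(doc=document))
    pairs = []
    for line in raw.splitlines():
        if line.startswith("Q:") and "| A:" in line:
            q, a = line.split("| A:", 1)
            pairs.append((q[2:].strip(), a.strip()))
    return pairs
```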
arXiv Detail & Related papers (2023-11-17T16:09:10Z)
- Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL).
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
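A minimal sketch of prompt-based few-shot dialogue: a handful of example exchanges are placed in the prompt and a frozen LM completes the next turn, with no gradient updates. The examples and the `generate` backend are illustrative placeholders.

```python
# Few-shot prompting for dialogue: no fine-tuning, just in-context examples.
FEW_SHOT_EXAMPLES = [
    ("What is a perovskite?", "A material with the ABX3 crystal structure."),
    ("Name a common battery cathode.", "Lithium cobalt oxide (LiCoO2)."),
]

def few_shot_reply(user_turn: str, generate) -> str:
    prompt = ""
    for q, a in FEW_SHOT_EXAMPLES:          # examples are the only "training" signal
        prompt += f"User: {q}\nAssistant: {a}\n"
    prompt += f"User: {user_turn}\nAssistant:"
    return generate(prompt)                  # `generate` is a hypothetical LM backend
```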
arXiv Detail & Related papers (2021-10-15T14:36:45Z)
- Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention [55.77218465471519]
This paper proposes an innovative framework to train chatbots to possess human-like intentions.
Our framework includes a guiding robot and an interlocutor model that plays the role of a human.
We examined our framework using three experimental setups and evaluated the guiding robot with four different metrics to demonstrate its flexibility and performance advantages.
arXiv Detail & Related papers (2021-03-30T15:24:37Z)
- TruthBot: An Automated Conversational Tool for Intent Learning, Curated Information Presenting, and Fake News Alerting [12.95006904081387]
TruthBot is designed for seeking truth (trustworthy and verified information) on specific topics.
It helps users to obtain information specific to certain topics, fact-check information, and get recent news.
TruthBot was deployed in June 2020 and is currently running.
arXiv Detail & Related papers (2021-01-31T18:23:05Z)
- Developing FB Chatbot Based on Deep Learning Using RASA Framework for University Enquiries [0.0]
This research is a first-stage development based on a fairly sufficient amount of simulated data.
The concept is not new in today's society, which is developing alongside recent technology.
The Facebook platform is used because Facebook users already account for up to 60.8% of the entire population of Indonesia.
arXiv Detail & Related papers (2020-09-25T17:01:19Z)
- Chatbot: A Conversational Agent employed with Named Entity Recognition Model using Artificial Neural Network [0.0]
Natural Language Understanding (NLU) has been impressively improved by deep learning methods.
This research focuses on Named Entity Recognition (NER) models that can be integrated into the NLU service of a chatbot.
The NER model in the proposed architecture is based on artificial neural network which is trained on manually created entities.
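The paper trains a custom ANN-based NER on manually created entities; as a rough stand-in, the sketch below shows how recognized entities could feed a chatbot's NLU service using spaCy's pretrained pipeline. The pipeline name and example utterance are assumptions, not the paper's setup.

```python
# Illustrative NER step for an NLU service, using spaCy as a stand-in
# for the paper's custom ANN model (requires `en_core_web_sm` to be installed).
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_entities(utterance: str) -> dict:
    doc = nlp(utterance)
    return {ent.label_: ent.text for ent in doc.ents}  # label -> surface form

print(extract_entities("Book a lab tour at MIT on Friday with Dr. Smith"))
```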
arXiv Detail & Related papers (2020-06-19T14:47:21Z)
- Conversations with Search Engines: SERP-based Conversational Response Generation [77.1381159789032]
We create a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines.
We also develop a state-of-the-art pipeline for conversations with search engines, Conversations with Search Engines (CaSE), using this dataset.
CaSE enhances the state of the art by introducing a supporting token identification module and a prior-aware pointer generator.
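For intuition, here is a simplified sketch of the pointer-generator mixing that such response generators build on: the next-token distribution blends generating from the vocabulary with copying tokens from the retrieved context. Shapes are illustrative and CaSE's prior-aware weighting is not modeled.

```python
# Simplified pointer-generator mixing step (not CaSE's exact architecture).
import torch

def pointer_generator_step(vocab_logits, copy_attn, src_token_ids, p_gen):
    # vocab_logits:  (batch, vocab_size) decoder scores over the vocabulary
    # copy_attn:     (batch, src_len) attention over source tokens, rows sum to 1
    # src_token_ids: (batch, src_len) vocabulary ids of the source tokens
    # p_gen:         (batch, 1) probability of generating vs. copying
    vocab_dist = torch.softmax(vocab_logits, dim=-1) * p_gen
    copy_dist = torch.zeros_like(vocab_dist)
    copy_dist.scatter_add_(1, src_token_ids, copy_attn * (1 - p_gen))
    return vocab_dist + copy_dist  # final next-token distribution

# Toy usage: each row of the output should sum to ~1.0
B, V, S = 2, 50, 7
out = pointer_generator_step(torch.randn(B, V),
                             torch.softmax(torch.randn(B, S), dim=-1),
                             torch.randint(0, V, (B, S)),
                             torch.sigmoid(torch.randn(B, 1)))
print(out.sum(dim=-1))
```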
arXiv Detail & Related papers (2020-04-29T13:07:53Z)