Demonstrating Linked Battery Data To Accelerate Knowledge Flow in Battery Science
- URL: http://arxiv.org/abs/2410.23303v1
- Date: Wed, 16 Oct 2024 14:12:41 GMT
- Title: Demonstrating Linked Battery Data To Accelerate Knowledge Flow in Battery Science
- Authors: Philipp Dechent, Elias Barbers, Simon Clark, Susanne Lehner, Brady Planden, Masaki Adachi, David A. Howey, Sabine Paarmann
- Abstract summary: Batteries are pivotal for transitioning to a climate-friendly future, leading to a surge in battery research.
Scopus lists 14,388 papers that mention "lithium-ion battery" in 2023 alone, making it infeasible for individuals to keep up.
This paper discusses strategies based on structured, semantic, and linked data to manage this information overload.
- Score: 0.5804487044220691
- Abstract: Batteries are pivotal for transitioning to a climate-friendly future, leading to a surge in battery research. Scopus (Elsevier) lists 14,388 papers that mention "lithium-ion battery" in 2023 alone, making it infeasible for individuals to keep up. This paper discusses strategies based on structured, semantic, and linked data to manage this information overload. Structured data follows a predefined, machine-readable format; semantic data includes metadata for context; linked data references other semantic data, forming a web of interconnected information. We use a battery-related ontology, BattINFO, to standardise terms and enable automated data extraction and analysis. Our methodology integrates full-text search and machine-readable data, enhancing data retrieval and battery testing. We aim to unify commercial cell information and develop tools for the battery community, such as manufacturer-independent cycling procedure descriptions and external memory for Large Language Models. Although only a first step, this approach significantly accelerates battery research and digitalizes battery testing, inviting community participation for continuous improvement. We provide the structured data and the tools to access them as open source.
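The distinction the abstract draws between structured, semantic, and linked data can be illustrated with a minimal Python sketch. This is not the authors' tooling: the `example.org` IRIs, the vocabulary terms, the cell and manufacturer records, and the `resolve` helper are all hypothetical placeholders, and the real BattINFO namespace and terms are deliberately not reproduced here. The records loosely follow JSON-LD conventions (`@context`, `@id`, `@type`).

```python
# Hypothetical vocabulary namespace; NOT the real BattINFO IRI.
VOCAB = "https://example.org/battery-vocab#"

# Structured data: a predefined, machine-readable layout,
# but nothing says what "capacity" means or which unit it uses.
structured = {"cell_id": "cell-001", "capacity": 3.0}

# Semantic data: the same record plus metadata that gives context,
# mapping each field to a term in a shared vocabulary.
semantic = {
    "@context": {"capacity": VOCAB + "NominalCapacity"},
    "@id": "https://example.org/cells/cell-001",
    "@type": VOCAB + "BatteryCell",
    "capacity": {"value": 3.0, "unit": "Ah"},
}

# A second semantic record describing a (made-up) manufacturer.
manufacturer = {
    "@id": "https://example.org/orgs/acme",
    "@type": VOCAB + "Manufacturer",
    "name": "ACME Cells",
}

# Linked data: the cell record references the manufacturer record
# by identifier instead of copying its contents.
linked = dict(semantic, manufacturer={"@id": manufacturer["@id"]})

# A tiny in-memory lookup standing in for a store of linked records.
index = {record["@id"]: record for record in (manufacturer, linked)}


def resolve(record, key, index):
    """Follow the link stored under `key` to the record it points at."""
    return index[record[key]["@id"]]


print(resolve(linked, "manufacturer", index)["name"])  # prints: ACME Cells
```

Storing only the identifier in `linked` means the manufacturer description lives in one place, and any number of cell records can point at it, which is the "web of interconnected information" the abstract describes.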
Related papers
- Text-to-Battery Recipe: A language modeling-based protocol for automatic battery recipe extraction and retrieval [5.3498018871204245]
We propose a language modeling-based protocol, Text-to-Battery Recipe (T2BR), for the automatic extraction of end-to-end battery recipes.
The proposed protocol will significantly accelerate the review of battery material literature and catalyze innovations in battery design and development.
arXiv Detail & Related papers (2024-07-22T08:15:02Z)
- Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
- Cycle Life Prediction for Lithium-ion Batteries: Machine Learning and More [0.0]
Batteries are dynamic systems with complicated nonlinear aging.
This tutorial begins with an overview of first-principles, machine learning, and hybrid battery models.
We highlight the challenges of machine learning models, motivating the incorporation of physics in hybrid modeling approaches.
arXiv Detail & Related papers (2024-04-05T12:05:20Z)
- Text2Data: Low-Resource Data Generation with Textual Control [104.38011760992637]
Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines.
We propose Text2Data, a novel approach that utilizes unlabeled data to understand the underlying data distribution through an unsupervised diffusion model.
It undergoes controllable finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting.
arXiv Detail & Related papers (2024-02-08T03:41:39Z)
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research [139.69207791947738]
Dolma is a three-trillion-token English corpus built from a diverse mixture of web content, scientific papers, code, public-domain books, social media, and encyclopedic materials.
We document Dolma, including its design principles, details about its construction, and a summary of its contents.
We present analyses and experimental results on intermediate states of Dolma to share what we have learned about important data curation practices.
arXiv Detail & Related papers (2024-01-31T20:29:50Z)
- BatteryML: An Open-source platform for Machine Learning on Battery Degradation [15.469939183346467]
We present BatteryML, a one-step, all-encompassing, open-source platform designed to unify data preprocessing, feature extraction, and the implementation of both traditional and state-of-the-art models.
This streamlined approach promises to enhance the practicality and efficiency of research applications.
arXiv Detail & Related papers (2023-10-23T08:51:05Z)
- Active Retrieval Augmented Generation [123.68874416084499]
Augmenting large language models (LMs) by retrieving information from external knowledge resources is one promising solution.
Most existing retrieval augmented LMs employ a retrieve-and-generate setup that only retrieves information once based on the input.
We propose Forward-Looking Active REtrieval augmented generation (FLARE), a generic method which iteratively uses a prediction of the upcoming sentence to anticipate future content.
arXiv Detail & Related papers (2023-05-11T17:13:40Z)
- A Deep Learning Approach Towards Generating High-fidelity Diverse Synthetic Battery Datasets [0.0]
We introduce a few deep learning-based methods to synthesize high-fidelity battery datasets.
These augmented synthetic datasets will help battery researchers build better estimation models.
arXiv Detail & Related papers (2023-04-09T05:41:21Z)
- The Semantic Scholar Open Data Platform [79.4493235243312]
Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.
We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction.
The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.
arXiv Detail & Related papers (2023-01-24T17:13:08Z)
- EVBattery: A Large-Scale Electric Vehicle Dataset for Battery Health and Capacity Estimation [15.169440280225647]
Electric vehicles (EVs) play an important role in reducing carbon emissions.
As EV adoption accelerates, safety issues caused by EV batteries have become an important research topic.
In order to benchmark and develop data-driven methods for this task, we introduce a large and comprehensive dataset of EV batteries.
Our dataset is the first large-scale public dataset of real-world battery data, as existing datasets either include only a few vehicles or are collected in a lab environment.
arXiv Detail & Related papers (2022-01-28T10:06:39Z)
- Data-to-text Generation with Macro Planning [61.265321323312286]
We propose a neural model with a macro planning stage followed by a generation stage reminiscent of traditional methods.
Our approach outperforms competitive baselines in terms of automatic and human evaluation.
arXiv Detail & Related papers (2021-02-04T16:32:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.