Unified machine learning tasks and datasets for enhancing renewable
energy
- URL: http://arxiv.org/abs/2311.06876v1
- Date: Sun, 12 Nov 2023 15:30:44 GMT
- Title: Unified machine learning tasks and datasets for enhancing renewable
energy
- Authors: Arsam Aryandoust, Thomas Rigoni, Francesco di Stefano, Anthony Patt
- Abstract summary: We introduce the ETT-17 (Energy Transition Tasks-17), a collection of 17 datasets related to enhancing renewable energy.
We unify all tasks and datasets, such that they can be solved using a single multi-tasking ML model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-tasking machine learning (ML) models exhibit prediction abilities in
domains with little to no training data available (few-shot and zero-shot
learning). Over-parameterized ML models are further capable of zero-loss
training and near-optimal generalization performance. An open research question
is how these novel paradigms contribute to solving tasks related to enhancing
the renewable energy transition and mitigating climate change. A collection of
unified ML tasks and datasets from this domain can largely facilitate the
development and empirical testing of such models, but is currently missing.
Here, we introduce the ETT-17 (Energy Transition Tasks-17), a collection of 17
datasets from six different application domains related to enhancing renewable
energy, including out-of-distribution validation and testing data. We unify all
tasks and datasets, such that they can be solved using a single multi-tasking
ML model. We further analyse the dimensions of each dataset; investigate what
they require for designing over-parameterized models; introduce a set of
dataset scores that describe important properties of each task and dataset; and
provide performance benchmarks.
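The core idea of the abstract, mapping heterogeneous task-specific datasets onto one shared format so a single multi-tasking model can consume all of them, can be sketched as follows. This is a minimal illustrative sketch, not the actual ETT-17 schema: the names `UnifiedExample`, `unify`, and the two example tasks are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class UnifiedExample:
    """One sample in a shared multi-task format: a task identifier,
    a flat feature vector, and a target vector."""
    task_id: str
    features: List[float]
    target: List[float]

def unify(task_id, raw_rows, feature_keys, target_keys):
    """Map a task-specific tabular dataset onto the unified format.

    Each raw row is a dict; feature_keys/target_keys name the columns
    that become the model's inputs and outputs for this task.
    """
    return [
        UnifiedExample(
            task_id=task_id,
            features=[float(row[k]) for k in feature_keys],
            target=[float(row[k]) for k in target_keys],
        )
        for row in raw_rows
    ]

# Two hypothetical energy-transition tasks with different schemas.
solar_rows = [{"irradiance": 820.0, "temp": 25.1, "output_kw": 3.4}]
load_rows = [{"hour": 14, "humidity": 0.61, "demand_mwh": 112.0}]

pool = (
    unify("solar_forecast", solar_rows, ["irradiance", "temp"], ["output_kw"])
    + unify("load_forecast", load_rows, ["hour", "humidity"], ["demand_mwh"])
)
# A single model can now iterate over `pool`, conditioning on `task_id`.
```

Once every dataset is expressed this way, out-of-distribution validation splits and per-dataset scores of the kind the paper describes can be computed over the same unified representation.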
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning [1.6570772838074355]
Multimodal large language models (MLLMs) exhibit great potential for chart question answering (CQA).
Recent efforts primarily focus on scaling up training datasets through data collection and synthesis.
We propose a visualization-referenced instruction tuning approach to guide the training dataset enhancement and model development.
arXiv Detail & Related papers (2024-07-29T17:04:34Z)
- RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding [4.266920365127677]
Under the new LaGD paradigm, the old datasets are no longer suitable for brand-new tasks.
We designed a high-quality, diversified, and unified multimodal instruction-following dataset for RSI understanding.
The empirical results show that the fine-tuned MLLMs by RS-GPT4V can describe fine-grained information.
arXiv Detail & Related papers (2024-06-18T10:34:28Z) - Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator [63.762209407570715]
Genixer is a comprehensive data generation pipeline consisting of four key steps.
Training LLaVA1.5 on a synthetic VQA-like dataset enhances performance on 10 out of 12 multimodal benchmarks.
MLLMs trained with task-specific datasets can surpass GPT-4V in generating complex instruction tuning data.
arXiv Detail & Related papers (2023-12-11T09:44:41Z) - StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized
Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- InPars: Data Augmentation for Information Retrieval using Large Language Models [5.851846467503597]
In this work, we harness the few-shot capabilities of large pretrained language models as synthetic data generators for information retrieval tasks.
We show that models finetuned solely on our unsupervised dataset outperform strong baselines such as BM25.
Retrievers finetuned on both supervised and our synthetic data achieve better zero-shot transfer than models finetuned only on supervised data.
arXiv Detail & Related papers (2022-02-10T16:52:45Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span-selection task format, used for QA datasets like QAMR and SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.