ChatGPT is not all you need. A State of the Art Review of large
Generative AI models
- URL: http://arxiv.org/abs/2301.04655v1
- Date: Wed, 11 Jan 2023 15:48:36 GMT
- Title: ChatGPT is not all you need. A State of the Art Review of large
Generative AI models
- Authors: Roberto Gozalo-Brizuela, Eduardo C. Garrido-Merchan
- Abstract summary: This work attempts to concisely describe the main models and sectors affected by generative AI and to provide a taxonomy of the main generative models published recently.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: During the last two years, a plethora of large generative models such as
ChatGPT or Stable Diffusion have been published. Concretely, these models are able to
perform tasks such as acting as general question-answering systems or automatically
creating artistic images, capabilities that are revolutionizing several sectors.
Consequently, the implications these generative models have for industry and society
are enormous, as several job positions may be transformed. For example, Generative AI
is capable of effectively and creatively transforming text into images, like the
DALLE-2 model; text into 3D images, like the Dreamfusion model; images into text, like
the Flamingo model; text into video, like the Phenaki model; text into audio, like the
AudioLM model; text into other text, like ChatGPT; text into code, like the Codex
model; text into scientific text, like the Galactica model; or even creating
algorithms, like AlphaTensor. This work is an attempt to concisely describe the main
models and sectors affected by generative AI and to provide a taxonomy of the main
generative models published recently.
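As a quick illustration of the kind of taxonomy the abstract describes, the model-to-modality examples it names can be collected into a small lookup table. The sketch below is only illustrative: the modality labels, and the "problem to algorithm" pairing used for AlphaTensor, are editorial choices made for this example, not the paper's own category names.

```python
# Illustrative sketch of the modality taxonomy described in the abstract above.
# Only the example models named there are listed; the paper's taxonomy is broader,
# and the (input, output) labels are assumptions made for this example.
GENERATIVE_MODEL_EXAMPLES = {
    ("text", "image"): ["DALLE-2", "Stable Diffusion"],
    ("text", "3D"): ["Dreamfusion"],
    ("image", "text"): ["Flamingo"],
    ("text", "video"): ["Phenaki"],
    ("text", "audio"): ["AudioLM"],
    ("text", "text"): ["ChatGPT"],
    ("text", "code"): ["Codex"],
    ("text", "scientific text"): ["Galactica"],
    ("problem", "algorithm"): ["AlphaTensor"],
}


def models_for(input_modality: str, output_modality: str) -> list:
    """Return the example models cited for a given input/output modality pair."""
    return GENERATIVE_MODEL_EXAMPLES.get((input_modality, output_modality), [])


print(models_for("text", "image"))  # ['DALLE-2', 'Stable Diffusion']
```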
Related papers
- Bridging Different Language Models and Generative Vision Models for
Text-to-Image Generation [12.024554708901514]
We propose LaVi-Bridge, a pipeline that enables the integration of diverse pre-trained language models and generative vision models for text-to-image generation.
Our pipeline is compatible with various language models and generative vision models, accommodating different structures.
arXiv Detail & Related papers (2024-03-12T17:50:11Z)
- RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model [93.8067369210696]
Text-to-image generation (TTI) refers to models that process text input and generate high-fidelity images based on the text description.
Diffusion models are one prominent type of generative model, producing images through the systematic introduction of noise over repeated steps (a minimal sketch of this noising process appears after this list).
In the era of large models, scaling up model size and integrating with large language models have further improved the performance of TTI models.
arXiv Detail & Related papers (2023-09-02T03:27:20Z)
- Text-to-image Diffusion Models in Generative AI: A Survey [75.32882187215394]
We present a review of state-of-the-art methods on text-conditioned image synthesis, i.e., text-to-image.
We discuss applications beyond text-to-image generation: text-guided creative generation and text-guided image editing.
arXiv Detail & Related papers (2023-03-14T13:49:54Z)
- eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers [87.52504764677226]
Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis.
We train an ensemble of text-to-image diffusion models specialized for different stages of synthesis.
Our ensemble of diffusion models, called eDiffi, results in improved text alignment while maintaining the same inference cost.
arXiv Detail & Related papers (2022-11-02T17:43:04Z)
- Implementing and Experimenting with Diffusion Models for Text-to-Image Generation [0.0]
Two models, DALL-E 2 and Imagen, have demonstrated that highly photorealistic images could be generated from a simple textual description of an image.
Text-to-image models require exceptionally large amounts of computational resources to train, as well as huge datasets collected from the internet.
This thesis contributes by reviewing the different approaches and techniques used by these models, and then by proposing our own implementation of a text-to-image model.
arXiv Detail & Related papers (2022-09-22T12:03:33Z)
- On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization [89.94078728495423]
We show that recent advances in each modality (CLIP image representations and the scaling of language models) do not consistently improve multimodal self-rationalization on tasks with multimodal inputs.
Our findings call for a backbone modelling approach that can be built on to advance text generation from images and text beyond image captioning.
arXiv Detail & Related papers (2022-05-24T00:52:40Z)
- Twist Decoding: Diverse Generators Guide Each Other [116.20780037268801]
We introduce Twist decoding, a simple and general inference algorithm that generates text while benefiting from diverse models.
Our method does not assume the vocabulary, tokenization or even generation order is shared.
arXiv Detail & Related papers (2022-05-19T01:27:53Z)
- Are You Robert or RoBERTa? Deceiving Online Authorship Attribution Models Using Neural Text Generators [3.9533044769534444]
GPT-2 and XLM language models are used to generate texts using existing posts of online users.
We then examine whether these AI-based text generators are capable of mimicking authorial style to such a degree that they can deceive typical AA models.
Our findings highlight the current capacity of powerful natural language models to generate original online posts capable of mimicking authorial style.
arXiv Detail & Related papers (2022-03-18T09:19:14Z)
- LAFITE: Towards Language-Free Training for Text-to-Image Generation [83.2935513540494]
We propose the first work to train text-to-image generation models without any text data.
Our method leverages the well-aligned multi-modal semantic space of the powerful pre-trained CLIP model.
We obtain state-of-the-art results in the standard text-to-image generation tasks.
arXiv Detail & Related papers (2021-11-27T01:54:45Z)
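Several of the entries above describe diffusion models as generating images through the systematic introduction of noise over repeated steps, with a learned denoiser reversing that process. The following is a minimal, self-contained sketch of the forward (noising) half of the idea, using a DDPM-style linear variance schedule; it is an illustrative toy under assumed default parameters, not the implementation of any paper listed here.

```python
import numpy as np


def forward_noising(x0, num_steps=1000, beta_start=1e-4, beta_end=0.02, seed=0):
    """Toy DDPM-style forward process: progressively corrupt x0 with Gaussian noise.

    x0 is any float array scaled roughly to [-1, 1] (e.g. an image).
    The betas form a simple linear variance schedule; real systems tune this.
    Returns a few snapshots of the increasingly noisy sample.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)  # cumulative signal-retention factors

    snapshots = []
    for t in range(num_steps):
        noise = rng.standard_normal(x0.shape)
        # Closed-form sample of the noised image at step t:
        #   x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise
        x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
        if (t + 1) % (num_steps // 4) == 0:
            snapshots.append(x_t)
    return snapshots


# Example: noise a random 32x32 "image"; the last snapshot is close to pure noise.
image = np.random.default_rng(1).uniform(-1.0, 1.0, size=(32, 32))
print([round(float(np.std(s)), 3) for s in forward_noising(image)])
```

A trained diffusion model learns the reverse mapping from noisy samples back toward images, optionally conditioned on a text embedding; eDiffi's ensemble of expert denoisers corresponds to using different denoising networks for different ranges of the timestep t.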