Artificial intelligence optical hardware empowers high-resolution
hyperspectral video understanding at 1.2 Tb/s
- URL: http://arxiv.org/abs/2312.10639v1
- Date: Sun, 17 Dec 2023 07:51:38 GMT
- Title: Artificial intelligence optical hardware empowers high-resolution
hyperspectral video understanding at 1.2 Tb/s
- Authors: Maksim Makarenko, Qizhou Wang, Arturo Burguete-Lopez, Silvio Giancola,
Bernard Ghanem, Luca Passone, Andrea Fratalocchi
- Abstract summary: This work introduces a hardware-accelerated integrated optoelectronic platform for multidimensional video understanding in real-time.
The technology platform combines artificial intelligence hardware, processing information optically, with state-of-the-art machine vision networks.
Such performance surpasses the speed of the closest technologies with similar spectral resolution by three to four orders of magnitude.
- Score: 53.91923493664551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation models, exemplified by GPT technology, are discovering new
horizons in artificial intelligence by executing tasks beyond their designers'
expectations. While the present generation provides fundamental advances in
understanding language and images, the next frontier is video comprehension.
Progress in this area must overcome the 1 Tb/s data rate demanded to grasp
real-time multidimensional video information. This speed limit lies well beyond
the capabilities of the existing generation of hardware, imposing a roadblock
to further advances. This work introduces a hardware-accelerated integrated
optoelectronic platform for multidimensional video understanding in real-time.
The technology platform combines artificial intelligence hardware, processing
information optically, with state-of-the-art machine vision networks, resulting
in a data processing speed of 1.2 Tb/s with hundreds of frequency bands and
megapixel spatial resolution at video rates. Such performance, validated in the
AI tasks of video semantic segmentation and object understanding in indoor and
aerial applications, surpasses the speed of the closest technologies with
similar spectral resolution by three to four orders of magnitude. This platform
opens up new avenues for research in real-time AI video understanding of
multidimensional visual information, supporting future human-machine
interaction and advances in cognitive processing.
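
As a rough sanity check on the data-rate claim above, the following Python sketch estimates the raw throughput of an uncompressed hyperspectral video stream, and how much it shrinks if the spectral axis is reduced to a few feature channels before digitization. All parameter values (pixel count, band count, bit depth, frame rate, number of feature channels) and the per-pixel spectral-projection model are illustrative assumptions, not values or mechanisms taken from the paper.

```python
import numpy as np

# Back-of-envelope sketch only; parameters are placeholders, not the
# values reported in the paper (which cites megapixel resolution,
# hundreds of bands, and video rates).

def data_rate_bps(pixels, channels, bit_depth, fps):
    """Raw uncompressed throughput in bits per second."""
    return pixels * channels * bit_depth * fps

# Hypothetical hyperspectral video stream.
pixels, bands, bit_depth, fps = 2_000_000, 300, 16, 60

raw = data_rate_bps(pixels, bands, bit_depth, fps)
print(f"raw hyperspectral stream: {raw / 1e12:.2f} Tb/s")      # ~0.58 Tb/s

# If an optical front-end projects the spectral axis down to a handful of
# feature channels before digitization (one reading of the abstract,
# stated here as an assumption), the electronic back-end sees far less data.
feature_channels = 4
reduced = data_rate_bps(pixels, feature_channels, bit_depth, fps)
print(f"after spectral projection: {reduced / 1e9:.1f} Gb/s")  # ~7.7 Gb/s

# A per-pixel spectral projection is just a matrix multiply along the band
# axis; a random matrix stands in for a learned optical response here.
frame = np.random.rand(64, 64, bands).astype(np.float32)       # toy frame
projection = np.random.rand(bands, feature_channels).astype(np.float32)
features = frame @ projection                                  # (64, 64, 4)
print(features.shape)
```

Even with these placeholder numbers, the raw stream sits within an order of magnitude of the ~1 Tb/s barrier quoted in the abstract, which illustrates why offloading the spectral processing to optical hardware is attractive.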
Related papers
- Video as the New Language for Real-World Decision Making [100.68643056416394]
Video data captures important information about the physical world that is difficult to express in language.
Video can serve as a unified interface that can absorb internet knowledge and represent diverse tasks.
We identify major impact opportunities in domains such as robotics, self-driving, and science.
arXiv Detail & Related papers (2024-02-27T02:05:29Z)
- Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements.
The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams.
In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
- A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming [26.082980156232086]
Top-trending AI technologies, namely generative artificial intelligence (Generative AI) and large language models (LLMs), are reshaping the field of video technology.
The paper highlights the innovative use of these technologies in producing highly realistic videos.
In the realm of video streaming, the paper discusses how LLMs contribute to more efficient and user-centric streaming experiences.
arXiv Detail & Related papers (2024-01-30T14:37:10Z)
- Accelerating Neural Networks for Large Language Models and Graph Processing with Silicon Photonics [4.471962177124311]
Large language models (LLMs) and graph processing have emerged as transformative technologies for natural language processing (NLP), computer vision, and graph-structured data applications.
However, the complex structures of these models pose challenges for acceleration on conventional electronic platforms.
We describe novel hardware accelerators based on silicon photonics to accelerate transformer neural networks that are used in LLMs and graph neural networks for graph data processing.
arXiv Detail & Related papers (2024-01-12T20:32:38Z)
- Neural Rendering and Its Hardware Acceleration: A Review [39.6466512858213]
Neural rendering is a new image and video generation method based on deep learning.
In this paper, we review the technical foundations, main challenges, and research progress of neural rendering.
arXiv Detail & Related papers (2024-01-06T07:57:11Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design: the random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design achieves substantial energy-efficiency improvements and training-cost reductions compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Cross-Layer Design for AI Acceleration with Non-Coherent Optical Computing [5.188712126001397]
Non-coherent optical computing represents a promising approach for light-speed acceleration of AI workloads.
We show how cross-layer design can overcome challenges in non-coherent optical computing platforms.
arXiv Detail & Related papers (2023-03-22T21:03:40Z)
- Make-A-Video: Text-to-Video Generation without Text-Video Data [69.20996352229422]
Make-A-Video is an approach for translating the tremendous recent progress in Text-to-Image (T2I) generation to Text-to-Video (T2V).
We design a simple yet effective way to build on T2I models with novel and effective spatial-temporal modules.
In all aspects, spatial and temporal resolution, faithfulness to text, and quality, Make-A-Video sets the new state-of-the-art in text-to-video generation.
arXiv Detail & Related papers (2022-09-29T13:59:46Z)
- Scalable Optical Learning Operator [0.2399911126932526]
The presented framework overcomes the energy scaling problem of existing systems without compromising speed.
We numerically and experimentally showed the ability of the method to execute several different tasks with accuracy comparable to a digital implementation.
Our results indicate that a powerful supercomputer would be required to duplicate the performance of the multimode fiber-based computer.
arXiv Detail & Related papers (2020-12-22T23:06:59Z)
- Photonics for artificial intelligence and neuromorphic computing [52.77024349608834]
Photonic integrated circuits have enabled ultrafast artificial neural networks.
Photonic neuromorphic systems offer sub-nanosecond latencies.
These systems could address the growing demand for machine learning and artificial intelligence.
arXiv Detail & Related papers (2020-10-30T21:41:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.