Sketch2FullStack: Generating Skeleton Code of Full Stack Website and
Application from Sketch using Deep Learning and Computer Vision
- URL: http://arxiv.org/abs/2211.14607v1
- Date: Sat, 26 Nov 2022 16:32:13 GMT
- Title: Sketch2FullStack: Generating Skeleton Code of Full Stack Website and
Application from Sketch using Deep Learning and Computer Vision
- Authors: Somoy Subandhu Barua, Imam Mohammad Zulkarnain, Abhishek Roy, Md.
Golam Rabiul Alam, Md Zia Uddin
- Abstract summary: Designing a large website and then converting it to code requires a team of experienced developers.
Automating this step would save valuable resources and speed up the overall development process.
- Score: 2.422788410602121
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Full-stack web or app development requires a software firm, or more
specifically a team of experienced developers, to devote a large portion of
their time and resources to designing the website and then converting that
design to code. As a result, the efficiency of the development team drops
significantly when it comes to converting UI wireframes and database schemas
into an actual working system. It would save valuable resources and speed up
the overall workflow if clients or developers could automate the process of
converting a pre-made full-stack website design into at least partially working
code. In this paper, we present a novel approach to generating skeleton code
from sketched images using Deep Learning and Computer Vision techniques. The
training dataset consists of hand-drawn sketches of low-fidelity wireframes,
database schemas, and class diagrams. The approach consists of three parts:
first, detecting and extracting front-end (UI) elements from custom-made UI
wireframes; second, creating individual database tables from schema designs;
and lastly, generating class files from class diagrams.
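As a rough illustration of how these three parts could fit together, here is a minimal Python sketch. It assumes a detector has already turned the sketch into structured elements; the UIElement type, label set, and output templates below are hypothetical and are not the authors' implementation.

```python
# Hypothetical skeleton-code generators for the three parts described in the
# abstract. A real system would feed detector output (bounding boxes, labels,
# recognised text) into these functions; everything here is an assumption.

from dataclasses import dataclass


@dataclass
class UIElement:
    kind: str  # e.g. "button", "textbox", "image" (hypothetical label set)
    x: int     # left coordinate of the detected box
    y: int     # top coordinate of the detected box


def wireframe_to_html(elements: list[UIElement]) -> str:
    """Part 1: emit an HTML skeleton from detected UI elements, top-to-bottom."""
    tag_map = {
        "button": "<button>Button</button>",
        "textbox": '<input type="text">',
        "image": '<img src="placeholder.png" alt="">',
    }
    body = "\n".join(
        f"  {tag_map.get(e.kind, '<div></div>')}"
        for e in sorted(elements, key=lambda e: (e.y, e.x))
    )
    return f"<!DOCTYPE html>\n<html>\n<body>\n{body}\n</body>\n</html>"


def schema_to_sql(table: str, columns: dict[str, str]) -> str:
    """Part 2: emit a CREATE TABLE statement from a recognised schema table."""
    cols = ",\n  ".join(f"{name} {sqltype}" for name, sqltype in columns.items())
    return f"CREATE TABLE {table} (\n  {cols}\n);"


def diagram_to_class(name: str, attributes: list[str], methods: list[str]) -> str:
    """Part 3: emit a Python class stub from a recognised class diagram."""
    attrs = "\n".join(f"        self.{a} = None" for a in attributes) or "        pass"
    meths = "\n\n".join(
        f"    def {m}(self):\n        raise NotImplementedError" for m in methods
    )
    return f"class {name}:\n    def __init__(self):\n{attrs}\n\n{meths}"


if __name__ == "__main__":
    print(wireframe_to_html([UIElement("textbox", 10, 20), UIElement("button", 10, 80)]))
    print(schema_to_sql("users", {"id": "INTEGER PRIMARY KEY", "name": "TEXT"}))
    print(diagram_to_class("User", ["id", "name"], ["save", "delete"]))
```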
Related papers
- OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities [0.08192907805418585]
Cross-modal alignment learning integrates information from different modalities like text, image, audio and video to create unified models.
Current techniques rely on large modality-specific encoders, necessitating fine-tuning or training from scratch on vast aligned datasets.
OneEncoder is a lightweight framework that progressively represents and aligns four modalities.
arXiv Detail & Related papers (2024-09-17T10:38:46Z)
- Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach [51.522121376987634]
We propose DCGen, a divide-and-conquer-based approach to automating the translation of webpage designs to UI code.
DCGen starts by dividing screenshots into manageable segments, generating descriptions for each segment, and then reassembling them into complete UI code for the entire screenshot.
We conduct extensive testing with a dataset comprised of real-world websites and various MLLMs and demonstrate that DCGen achieves up to a 14% improvement in visual similarity over competing methods.
arXiv Detail & Related papers (2024-06-24T07:58:36Z)
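For intuition only, here is a toy Python sketch of such a divide-and-conquer loop; the naive horizontal segmentation and the stand-in describe_segment callable (which a real system would replace with a multimodal LLM call on each cropped region) are assumptions, not DCGen's actual method.

```python
# Toy divide-and-conquer screenshot-to-code loop: split, describe each segment,
# then reassemble. All names and the segmentation rule are illustrative only.

from typing import Callable


def horizontal_segments(height: int, n: int) -> list[tuple[int, int]]:
    """Split the screenshot height into n horizontal bands (y_start, y_end)."""
    step = height // n
    return [(i * step, height if i == n - 1 else (i + 1) * step) for i in range(n)]


def generate_ui_code(height: int,
                     describe_segment: Callable[[tuple[int, int]], str],
                     n_segments: int = 4) -> str:
    """Generate code per segment, then reassemble into one page skeleton."""
    pieces = [describe_segment(band) for band in horizontal_segments(height, n_segments)]
    return "<body>\n" + "\n".join(pieces) + "\n</body>"


if __name__ == "__main__":
    def fake_model(band: tuple[int, int]) -> str:
        # Stand-in for a multimodal LLM call on the cropped image region.
        return f"  <div><!-- segment y={band[0]}..{band[1]} --></div>"

    print(generate_ui_code(800, fake_model))
```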
- VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs [29.80918775422563]
We present a novel dataset, termed VISION2UI, extracted from real-world scenarios, augmented with comprehensive layout information.
This dataset is derived through a series of operations, encompassing collecting, cleaning, and filtering of the open-source Common Crawl dataset.
Ultimately, this process yields a dataset comprising 2,000 parallel samples encompassing design visions and UI code.
arXiv Detail & Related papers (2024-04-09T15:05:48Z)
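As a loose illustration of that collect, clean, and filter idea, here is a small Python sketch; the regex-based cleaning, size thresholds, and de-duplication rule are invented for illustration and are not the filters actually used to build VISION2UI.

```python
# Illustrative collect -> clean -> filter -> de-duplicate pipeline over raw HTML
# pages. Thresholds and heuristics are placeholders, not the dataset's real ones.

import re


def clean(html: str) -> str:
    """Rough cleaning step: drop script blocks (illustrative only)."""
    return re.sub(r"<script.*?</script>", "", html, flags=re.S)


def keep(html: str, min_tags: int = 20, max_bytes: int = 200_000) -> bool:
    """Filter out pages that are too small or too large to be useful samples."""
    return html.count("<") >= min_tags and len(html.encode()) <= max_bytes


def build_dataset(raw_pages: list[str]) -> list[str]:
    """Return the cleaned, filtered, de-duplicated pages kept as samples."""
    seen, kept = set(), []
    for page in raw_pages:
        cleaned = clean(page)
        if keep(cleaned) and cleaned not in seen:
            seen.add(cleaned)
            kept.append(cleaned)
    return kept
```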
- Knowledge Graph Generation From Text [18.989264255589806]
We propose a novel end-to-end Knowledge Graph (KG) generation system from textual inputs.
The graph nodes are generated first using a pretrained language model, followed by a simple edge construction head.
We evaluated the model on the recent WebNLG 2020 Challenge dataset, matching state-of-the-art performance on the text-to-RDF generation task.
arXiv Detail & Related papers (2022-11-18T21:27:13Z)
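A toy Python sketch of that two-step idea (generate nodes, then score edges) follows; the capitalised-token "node generator" and distance-based edge scorer merely stand in for the pretrained language model and edge construction head and are not the paper's architecture.

```python
# Two-step text-to-graph toy: propose entity nodes, then score every node pair
# with a simple "edge head". All rules here are illustrative stand-ins.

from itertools import combinations


def propose_nodes(text: str) -> list[str]:
    """Stand-in for a pretrained seq2seq model that generates candidate entities."""
    # Here we just take capitalised tokens; a real system would decode them from an LM.
    return sorted({tok.strip(".,") for tok in text.split() if tok[0].isupper()})


def edge_head(a: str, b: str, text: str) -> float:
    """Toy edge scorer: how close the two mentions are in the source text."""
    dist = abs(text.find(a) - text.find(b))
    return 1.0 / (1.0 + dist)


def build_graph(text: str, threshold: float = 0.02) -> list[tuple[str, str]]:
    """Keep the node pairs whose edge score clears the (arbitrary) threshold."""
    nodes = propose_nodes(text)
    return [(a, b) for a, b in combinations(nodes, 2) if edge_head(a, b, text) >= threshold]


if __name__ == "__main__":
    print(build_graph("Alan Turing worked at Bletchley Park in England."))
```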
- EfficientTrain: Exploring Generalized Curriculum Learning for Training
Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers).
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z)
- Pix2Struct: Screenshot Parsing as Pretraining for Visual Language
Understanding [58.70423899829642]
We present Pix2Struct, a pretrained image-to-text model for purely visual language understanding.
We show that a single pretrained model can achieve state-of-the-art results in six out of nine tasks across four domains.
arXiv Detail & Related papers (2022-10-07T06:42:06Z)
- GROWN+UP: A Graph Representation Of a Webpage Network Utilizing
Pre-training [0.2538209532048866]
We introduce an agnostic deep graph neural network feature extractor that can ingest webpage structures, pre-train self-supervised on massive unlabeled data, and fine-tune to arbitrary tasks on webpages effectually.
We show that our pre-trained model achieves state-of-the-art results using multiple datasets on two very different benchmarks: webpage boilerplate removal and genre classification.
arXiv Detail & Related papers (2022-08-03T13:37:27Z)
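To make the "ingest webpage structures" step concrete, here is a minimal Python sketch that turns an HTML document into node and edge lists a graph neural network could consume; the tag-name-only node features and parent-child edges are assumptions, not GROWN+UP's actual design.

```python
# Build a simple DOM graph (nodes = elements, edges = parent-child links) from
# HTML using only the standard library. Feature choice is illustrative only.

from html.parser import HTMLParser


class DomGraphBuilder(HTMLParser):
    """Collects parent-child edges between element nodes while parsing HTML."""

    VOID = {"img", "br", "hr", "input", "meta", "link"}  # elements with no children

    def __init__(self):
        super().__init__()
        self.nodes: list[str] = []               # node i -> tag name (stand-in feature)
        self.edges: list[tuple[int, int]] = []   # (parent index, child index)
        self._stack: list[int] = []

    def handle_starttag(self, tag, attrs):
        idx = len(self.nodes)
        self.nodes.append(tag)
        if self._stack:
            self.edges.append((self._stack[-1], idx))
        if tag not in self.VOID:
            self._stack.append(idx)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()


if __name__ == "__main__":
    builder = DomGraphBuilder()
    builder.feed("<html><body><div><p>hi</p></div><img></body></html>")
    print(builder.nodes)   # ['html', 'body', 'div', 'p', 'img']
    print(builder.edges)   # [(0, 1), (1, 2), (2, 3), (1, 4)]
```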
- Multi-Stage Progressive Image Restoration [167.6852235432918]
We propose a novel synergistic design that can optimally balance these competing goals.
Our main proposal is a multi-stage architecture that progressively learns restoration functions for the degraded inputs.
The resulting tightly interlinked multi-stage architecture, named MPRNet, delivers strong performance gains on ten datasets.
arXiv Detail & Related papers (2021-02-04T18:57:07Z)
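For orientation, here is a heavily simplified PyTorch sketch of a generic two-stage restoration network in the spirit of that description; the tiny convolutional blocks and two-stage depth are placeholders, not MPRNet's actual architecture.

```python
# Generic two-stage restoration sketch: each stage refines the previous stage's
# estimate of the clean image, and every stage's output can be supervised.

import torch
from torch import nn


class Stage(nn.Module):
    def __init__(self, in_ch: int, ch: int = 16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.body(x)


class MultiStageRestorer(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = Stage(in_ch=3)
        self.stage2 = Stage(in_ch=6)  # degraded input + previous estimate

    def forward(self, degraded):
        est1 = self.stage1(degraded)
        est2 = self.stage2(torch.cat([degraded, est1], dim=1))
        return est1, est2  # supervise every stage during training


if __name__ == "__main__":
    x = torch.randn(1, 3, 64, 64)
    out1, out2 = MultiStageRestorer()(x)
    print(out1.shape, out2.shape)  # both torch.Size([1, 3, 64, 64])
```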
- A Pipeline for Vision-Based On-Orbit Proximity Operations Using Deep
Learning and Synthetic Imagery [0.0]
Two key challenges currently pose a major barrier to the use of deep learning for vision-based on-orbit proximity operations.
A scarcity of labeled training data (images of a target spacecraft) hinders creation of robust deep learning models.
This paper presents an open-source deep learning pipeline, developed specifically for on-orbit visual navigation applications.
arXiv Detail & Related papers (2021-01-14T15:17:54Z)
- Where2Act: From Pixels to Actions for Articulated 3D Objects [54.19638599501286]
We extract highly localized actionable information related to elementary actions such as pushing or pulling for articulated objects with movable parts.
We propose a learning-from-interaction framework with an online data sampling strategy that allows us to train the network in simulation.
Our learned models even transfer to real-world data.
arXiv Detail & Related papers (2021-01-07T18:56:38Z)
- SketchyCOCO: Image Generation from Freehand Scene Sketches [71.85577739612579]
We introduce the first method for automatic image generation from scene-level freehand sketches.
The key contribution is an attribute-vector-bridged Generative Adversarial Network called EdgeGAN.
We have built a large-scale composite dataset called SketchyCOCO to support and evaluate the solution.
arXiv Detail & Related papers (2020-03-05T14:54:10Z)