Smart Specifications to Interface: Using Multimodal AI for Automated Software UI Generation

Authors

  • Syed Mohsin Ali Rizvi
  • Maria Ghayas
  • Abdul Wali
  • Ammad ul Islam
  • Malik Zohaib Hussain
  • Misbah Maqbool
  • Rabia Abbas
  • Bilawal Fiaz

Abstract

Background
Building user interfaces (UIs) by hand is labor-intensive: developers must translate textual instructions and design images into working front-end code. Multimodal large language models (MLLMs), which process text and images together, offer a way to automate this translation.

Objective
This study aimed to build and validate a hybrid AI system that automatically generates UI code from specifications that combine textual instructions and visual design components.

Methodology
We developed a framework that combines a BERT-based text encoder, a ResNet-50 image encoder, and a GPT-2-style decoder. The framework was trained and tested on public datasets such as MultiUI, Web2Code, and VISION2UI. We evaluated its performance using metrics such as token-level F1, SSIM, PSNR, semantic consistency, and human feedback.
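As an illustration of two of the metrics named above, the following is a minimal sketch, not the authors' evaluation code. It assumes token-level F1 is computed over whitespace-separated tokens with multiset overlap, and that PSNR is computed over 8-bit pixel intensities (peak value 255). The function names `token_f1` and `psnr` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def token_f1(predicted: str, reference: str) -> float:
    """Token-level F1 over whitespace tokens (assumed definition, not the paper's)."""
    pred_tokens = predicted.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Two empty strings count as a perfect match; one empty string is a miss.
        return 1.0 if pred_tokens == ref_tokens else 0.0
    # Multiset intersection: each token is matched at most as often as it occurs.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def psnr(rendered, target, peak: float = 255.0) -> float:
    """PSNR in dB between two equally sized flat sequences of pixel intensities."""
    if len(rendered) != len(target) or not rendered:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, target)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    generated = '<div class="card"> <p>Hi</p> </div>'
    expected = '<div class="card"> <p>Hello</p> </div>'
    print(token_f1(generated, expected))   # 3 of 4 tokens match on each side
    print(psnr([10, 20, 30], [12, 18, 30]))
```

Under these assumptions, higher values of either metric indicate closer agreement: F1 between generated and reference code, PSNR between the rendered output and the target design image.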

Results
The model achieved a token-level F1 of 91.6%, an SSIM of 0.937, and a PSNR of 26.8 dB. Of the generated code, 96.2% compiled successfully, and the model scored 88.5% on semantic coherence. Human evaluators gave it an average rating of 4.53 out of 5, and it outperformed leading models such as UICoder and UICopilot on usability and visual correctness. It was also faster, at 1.2 seconds per sample.

Conclusion
The findings indicate that combining step-by-step code construction, multimodal comprehension, and expert datasets is effective for generating UIs. The research marks a milestone toward intelligent, automated UI creation that links design concepts to functional code.

 

Published

2025-08-11

How to Cite

Syed Mohsin Ali Rizvi, Maria Ghayas, Abdul Wali, Ammad ul Islam, Malik Zohaib Hussain, Misbah Maqbool, … Bilawal Fiaz. (2025). Smart Specifications to Interface: Using Multimodal AI for Automated Software UI Generation. Dialogue Social Science Review (DSSR), 3(8), 63–73. Retrieved from https://dialoguessr.com/index.php/2/article/view/840

Section

Applied Sciences