gWorld: Generative Visual Code Mobile World Models

2026.02.06
Web · by 이호민
#LLM · #VLM · #World Models · #GUI Agent · #Code Generation

Key Points

  • The paper proposes gWorld, a novel visual mobile GUI World Model paradigm that predicts future GUI states by generating renderable web code, effectively combining the linguistic precision of text-based models with the visual fidelity of pixel-based approaches.
  • This generative visual code approach virtually eliminates structural errors (<1% Render Fail) and offers a simplified pipeline compared to previous visual models that relied on numerous external components for text rendering.
  • gWorld (8B, 32B) establishes a new Pareto frontier in accuracy versus model size on the MWMBench benchmark, significantly outperforming much larger frontier open-weight models while demonstrating predictable performance gains with increased training data.

The paper introduces a novel paradigm for Mobile Graphical User Interface (GUI) World Models (WMs), addressing the limitations of existing approaches. Traditional text-based WMs sacrifice visual fidelity, while visual WMs struggle with precise text rendering and often rely on complex, multi-model pipelines.

The core innovation is visual world modeling via renderable code generation. Instead of directly generating pixels or relying solely on text, a single Vision-Language Model (VLM) is trained to predict the next GUI state as executable web code: a structured representation that can be rendered to pixels. This approach leverages the VLM's linguistic priors for accurate text rendering and its pre-training on structured web code for high-fidelity visual generation, effectively combining the strengths of both text-based and visual approaches.
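A minimal sketch of this prediction loop, assuming the VLM emits HTML (the paper says only "renderable web code"): `predict_next_state` is a hypothetical stand-in for the trained model, and the tag-balance check is a crude stdlib proxy for whether the output would render at all.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

@dataclass
class Action:
    kind: str          # e.g. "click" or "type"
    x: int = 0
    y: int = 0
    text: str = ""

def predict_next_state(screenshot_path: str, action: Action) -> str:
    """Stand-in for the VLM: given the current screenshot and an action,
    return the next GUI state as renderable web code (HTML here)."""
    # A real system would call the trained model; we return a fixed page.
    return (
        "<html><body>"
        f"<div class='toast'>Performed {action.kind} at ({action.x}, {action.y})</div>"
        "<button id='ok'>OK</button>"
        "</body></html>"
    )

class _TagBalanceChecker(HTMLParser):
    """Minimal structural check: every opened tag must be closed in order."""
    VOID = {"br", "img", "input", "meta", "hr", "link"}

    def __init__(self):
        super().__init__()
        self.stack, self.ok = [], True

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if not self.stack or self.stack.pop() != tag:
            self.ok = False

def render_fails(code: str) -> bool:
    """Crude proxy for the paper's Render Fail metric: True if the
    generated code is structurally broken."""
    checker = _TagBalanceChecker()
    checker.feed(code)
    return not (checker.ok and not checker.stack)

html_out = predict_next_state("screen_t.png", Action("click", 540, 1200))
print(render_fails(html_out))  # False: the code is structurally sound
```

The structural check here is far weaker than actually rendering the page, but it illustrates why code output is easy to validate: well-formedness is machine-checkable before any pixels are drawn.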

The proposed models, gWorld (8B, 32B), are the first open-weight visual mobile GUI WMs built on this paradigm.

Core Methodology - Data Generation Pipeline:
The paper details a three-step data generation framework to synthesize code-based training data:

  1. Repurposing Policy Trajectories: Existing offline policy trajectories, typically stored as current GUI states (S_t) and actions (A_t), are repurposed into world-modeling triplets {S_t, A_t, S_{t+1}}. Here, S_t and S_{t+1} are the visual GUI states (screenshots), and A_t is the action performed (e.g., click coordinates, text input).
  2. Synthetic Cross-modal Re-labeling: This is the critical step for generating the target code. The ground-truth next state S_{t+1} (in its pixel/visual form) is converted into a renderable web-code representation, denoted C_{t+1}. This "cross-modal re-labeling" is performed by a frontier VLM, which analyzes the visual S_{t+1} and translates its content and layout into a structured, renderable code format. The generated code C_{t+1} then serves as the target output during training: gWorld learns to generate it given S_t and A_t.
  3. Reasoning Data with Look-ahead: To enhance the VLM's predictive capabilities, reasoning traces (R_t) are synthesized. With access to the ground-truth target state (either S_{t+1} or its code representation C_{t+1}), the system can generate rationales for *why* a particular next state follows from the current state and action. This look-ahead allows the creation of richer training data that guides the VLM toward the causal relationship between actions and state transitions. The VLM is ultimately trained to generate C_{t+1} (and optionally R_t) conditioned on S_t and A_t.
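The first two steps above can be sketched as follows. `relabel_to_code` is a hypothetical placeholder for the frontier-VLM re-labeling call, and the sample layout is illustrative rather than the paper's actual schema.

```python
def relabel_to_code(state_image: str) -> str:
    """Placeholder for cross-modal re-labeling: a real pipeline would
    prompt a frontier VLM with the screenshot and get back renderable
    web code describing its content and layout."""
    return f"<html><body><!-- layout of {state_image} --></body></html>"

def build_world_model_samples(trajectory):
    """Steps 1 and 2: turn a policy trajectory [(S_0, A_0), (S_1, A_1), ...]
    into (S_t, A_t, C_{t+1}) training samples by pairing consecutive states
    and re-labeling the next state into code."""
    samples = []
    for (s_t, a_t), (s_next, _) in zip(trajectory, trajectory[1:]):
        samples.append({
            "state": s_t,                             # current screenshot
            "action": a_t,                            # e.g. click coordinates
            "target_code": relabel_to_code(s_next),   # C_{t+1}
        })
    return samples

traj = [
    ("s0.png", {"type": "click", "x": 120, "y": 640}),
    ("s1.png", {"type": "type", "text": "hello"}),
    ("s2.png", None),  # terminal state: no action follows
]
samples = build_world_model_samples(traj)
print(len(samples))  # 2 triplets from a 3-state trajectory
```

Step 3 would extend each sample with a synthesized reasoning trace R_t generated while the ground-truth target is still visible to the labeling model.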

Evaluation and Results:
The paper introduces MWMBench (Mobile World Model Bench), a comprehensive benchmark for evaluating mobile GUI world models. It features:

  • Evaluation in the native visual modality.
  • Real-world coordinate-based action spaces.
  • In-distribution (AitW, GUIOdyssey, AndroidControl, AMEX) and out-of-distribution (AndroidWorld, KApps) datasets.

Performance is measured using three key metrics:

  • Instruction Accuracy (IAcc.): The percentage of predictions whose next state matches the ground-truth state.
  • Render Fail: The percentage of generated outputs that cannot be successfully rendered into a visual GUI state.
  • Similarity: A metric quantifying the visual resemblance between the rendered predicted state and the ground-truth state.
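A hedged sketch of how these three metrics might be aggregated over an evaluation run. The record fields below are illustrative, not MWMBench's actual schema; note that visual similarity can only be computed for outputs that rendered.

```python
def summarize_predictions(records):
    """Aggregate three MWMBench-style metrics over evaluation records.
    Each record: {"correct": bool, "rendered": bool, "similarity": float|None}.
    """
    n = len(records)
    iacc = sum(r["correct"] for r in records) / n
    render_fail = sum(not r["rendered"] for r in records) / n
    # Similarity is only defined where the output actually rendered.
    sims = [r["similarity"] for r in records if r["rendered"]]
    similarity = sum(sims) / len(sims)
    return {"IAcc": iacc, "RenderFail": render_fail, "Similarity": similarity}

records = [
    {"correct": True,  "rendered": True,  "similarity": 0.81},
    {"correct": False, "rendered": True,  "similarity": 0.55},
    {"correct": True,  "rendered": False, "similarity": None},
    {"correct": True,  "rendered": True,  "similarity": 0.74},
]
metrics = summarize_predictions(records)
print(metrics)  # IAcc 0.75, RenderFail 0.25, Similarity 0.70
```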

Key Findings:

  • gWorld models establish a new Pareto frontier in accuracy versus model size. The 8B and 32B gWorld models significantly outperform 8 frontier open-weight models, some of which are up to 50.25 times larger (e.g., Llama 4 402B).
  • gWorld 8B achieves an average IAcc. of 74.9%, and gWorld 32B achieves 79.6%, far surpassing models like Qwen3 VL 32B (52.5%) and even GLM-4.6V 106B (67.4%).
  • The code-based approach virtually eliminates structural errors, achieving Render Fail rates of less than 1% (0.6% for gWorld 32B), a drastic improvement over other VLMs (e.g., Qwen3 VL 8B at 40.1%, Llama 4 402B at 9.2%). This indicates the generated code is highly renderable and structurally sound.
  • gWorld models also maintain competitive visual similarity scores (71.4% for gWorld 32B).
  • Scaling training data yields predictable gains following a power law (R² ≥ 0.94), suggesting non-saturating improvements with more data.
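A power-law claim of this kind is typically verified by a least-squares fit in log-log space, where y = a·x^b becomes a straight line. The sketch below uses synthetic numbers, not the paper's actual curves.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x^b by linear least squares in log-log space
    and report the R^2 of the log-space fit."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
         / sum((x - mx) ** 2 for x in lx))
    a = math.exp(my - b * mx)
    pred = [math.log(a) + b * x for x in lx]
    ss_res = sum((y - p) ** 2 for y, p in zip(ly, pred))
    ss_tot = sum((y - my) ** 2 for y in ly)
    return a, b, 1 - ss_res / ss_tot

# Synthetic example: prediction error shrinks as a power of dataset size.
sizes = [1e4, 3e4, 1e5, 3e5, 1e6]
errors = [0.40, 0.31, 0.23, 0.18, 0.13]
a, b, r2 = fit_power_law(sizes, errors)
print(f"exponent b = {b:.2f}, R^2 = {r2:.3f}")
```

A high R² on such a fit, as the paper reports across its data-scaling runs, means the log-log points lie nearly on a line, i.e. doubling the data buys a predictable fractional improvement.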

In essence, gWorld demonstrates that generating renderable code for GUI state prediction offers a superior trade-off, enabling precise visual fidelity, robust structural integrity, and competitive performance with significantly smaller model sizes.