Open-source multimodal GLM from Z.ai unifying vision, text, and tool calling for long-context reasoning, search, coding, and UI-to-code.
Open-source VLM agent that controls computer GUIs by planning and executing mouse and keyboard actions.