Models
Explore AI models available on the 0G network
0GM-1.0-35B-A3B
Chat0GM-1.0-35B-A3B deployment by 0G.AI, optimized for agentic coding and tool use. Prefix caching is supported, and cached token usage is reported in usage.prompt_tokens_details.cached_tokens. Thinking/reasoning is enabled by default. To disable, set chat_template_kwargs: {enable_thinking: false} in the request body.
by 0G Foundation
deepseek-v4-flash
ChatDeepSeek-V4-Flash is an efficient lightweight MoE model (284B total / 13B active parameters) with native 1M-token context support. Optimized for fast, low-latency, low-cost inference; well-balanced general capability tuned for high-throughput everyday chat, content creation, basic RAG, and batch text processing. Features native function calling and prompt caching reported via usage.prompt_tokens_details.cached_tokens.
by 0G Foundation
deepseek-v4-pro
ChatDeepSeek-V4-Pro is DeepSeek's flagship LLM, optimized for agentic coding, multi-step workflows, and complex reasoning. Features native function calling and supports a 1M token context window with up to 384K output tokens. Prompt caching is supported and reported via usage.prompt_tokens_details.cached_tokens.
by 0G Foundation
deepseek/deepseek-chat-v3-0324
ChatDeepSeek-V3.2 is a 671B-parameter mixture-of-experts LLM with hybrid thinking mode, excelling at coding, math, and multi-step reasoning. Supports native function calling and context caching.
by 0G Foundation
glm-5
ChatGLM-5 is a next-generation LLM purpose-built for coding and agent workflows, reaching open-source SOTA on complex systems engineering and long-horizon tasks with real-world programming quality approaching Claude Opus. Built on a new 744B-parameter foundation with asynchronous reinforcement learning and sparse attention, advancing from "writing code" to "engineering software systems". 198K context, 16K max output, 32K max reasoning chain. Served via Alibaba Cloud Model Studio (DashScope, glm-5 tier).
by 0G Foundation
glm-5.1
ChatGLM-5.1 is Zhipu AI's flagship model purpose-built for long-horizon tasks. 744B-parameter foundation supporting 200K context and up to 128K output tokens, with strong logical reasoning, long-text comprehension, and code generation. Tuned to balance performance with inference efficiency for intelligent interaction, enterprise applications, and developer assistance. Served via Alibaba Cloud Model Studio (DashScope, glm-5.1 tier).
by 0G Foundation
openai/whisper-large-v3
SpeechHigh-performance automatic speech recognition (ASR) model, providing multilingual transcription and translation.
by 0G Foundation
qwen/qwen3-vl-30b-a3b-instruct
ChatAlibaba's Qwen3-VL is a multimodal vision-language model supporting text and image inputs with text output. Strong at visual reasoning, OCR, and chart/document understanding. Served via Alibaba Cloud Model Studio (DashScope, qwen3-vl-flash tier).
by 0G Foundation
qwen3.6-plus
ChatAlibaba's Qwen3.6-Plus is a flagship LLM with hybrid linear attention and sparse mixture-of-experts routing, optimized for agentic coding, multi-step workflows, and complex reasoning. Features always-on chain-of-thought reasoning with adaptive depth, native function calling, and supports 119 languages. 1M token context window.
by 0G Foundation
qwen3.7-max
ChatAlibaba's Qwen3.7-Max is a flagship LLM with native function calling and web search tool support. Native 1M-token context window with up to 64K output (64K reasoning-mode output, 256K max reasoning chain). Implicit prompt caching is supported and reported via usage.prompt_tokens_details.cached_tokens.
by 0G Foundation
z-image
ImageAsync text-to-image model optimized for Base64 encoded outputs.
by 0G Foundation
zai-org/GLM-5-FP8
ChatZ.ai's flagship GLM-5 reasoning model with native tool calling. Thinking/reasoning is enabled by default. To disable, set chat_template_kwargs: {enable_thinking: false} in request body.
by 0G Foundation
zai-org/GLM-5.1-FP8
ChatGLM-5.1 reasoning model with FP8 quantization, supports tool calling. Thinking/reasoning is enabled by default. To disable, set chat_template_kwargs: {enable_thinking: false} in request body.
by 0G Foundation