zai-org/GLM-5.1-FP8

Chat

by 0G Foundation

GLM-5.1 reasoning model with FP8 quantization, supports tool calling. Thinking/reasoning is enabled by default. To disable, set chat_template_kwargs: {enable_thinking: false} in request body.

TEE VerifiedTeeMLopenaianthropicTool CallingModalities

Context

131K

131,072 tokens

Max Output

33K

32,768 tokens

Input Price

2.0099 0G

per 1M tokens

Output Price

6.3500 0G

per 1M tokens

Providers

Type

chatbot

Supported Parameters

temperaturetop_ptop_kmax_tokensfrequency_penaltypresence_penaltystoptoolstool_choiceresponse_formatchat_template_kwargs

Providers(1)

TEE (dstack)TeeML

Status HealthyUptime 100%Context 131KMax Out 33KIn2.0099Out6.35000G/MFormats openai, anthropic

Cached: 0.4020 0G/M

View API Reference Try in Playground