by 0G Foundation
GLM-5.1 reasoning model with FP8 quantization, supports tool calling. Thinking/reasoning is enabled by default. To disable, set chat_template_kwargs: {enable_thinking: false} in request body.
131K
131,072 tokens
33K
32,768 tokens
2.0099 0G
per 1M tokens
6.3500 0G
per 1M tokens
1
chatbot