Quick start
ollama run glm-4.7-flash
Available sizes
| Tag | Size | Quantization | Context | Min RAM |
|---|---|---|---|---|
| glm-4.7-flash:latest | 19 GB | q4_k_m | 198K | 23.8 GB |
| glm-4.7-flash:q4_K_M | 19 GB | q4_k_m | 198K | 23.8 GB |
| glm-4.7-flash:q8_0 | 32 GB | q8_0 | 198K | 40 GB |
| glm-4.7-flash:bf16 | 60 GB | bf16 | 198K | 75 GB |
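The "Min RAM" column can be turned into a small selection helper. The sketch below is only an illustration: `pick_tag` is a hypothetical function name, not part of the Ollama CLI, and it assumes the table's Min RAM figures are the only constraint (it ignores GPU memory and context length). It prints the largest listed tag that fits a given amount of RAM in GB and returns a non-zero status when none fits.

```shell
#!/bin/sh
# pick_tag RAM_GB (hypothetical helper, not an Ollama command):
# print the largest glm-4.7-flash tag whose "Min RAM" figure from the
# table above is at most RAM_GB. Returns status 1 if no listed tag fits.
pick_tag() {
  awk -v ram="$1" 'BEGIN {
    if      (ram >= 75)   print "glm-4.7-flash:bf16"
    else if (ram >= 40)   print "glm-4.7-flash:q8_0"
    else if (ram >= 23.8) print "glm-4.7-flash:q4_K_M"
    else                  exit 1   # below the smallest Min RAM in the table
  }'
}
```

For example, `pick_tag 32` prints `glm-4.7-flash:q4_K_M`, which can then be passed to `ollama run`.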
Run with
Claude Code
ollama launch claude --model glm-4.7-flash
Codex
ollama launch codex --model glm-4.7-flash
OpenCode
ollama launch opencode --model glm-4.7-flash
OpenClaw
ollama launch openclaw --model glm-4.7-flash
Strengths & Limitations
Strengths
- Strong performance in 30B class
- Lightweight deployment option
- Balances performance and efficiency
Benchmarks
| Benchmark | Score |
|---|---|
| AIME | 25 |
| GPQA | 75.2 |
| SWE-bench Verified | 59.2 |
Related models
- llama3.1 (Language, 110.5M pulls): Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
- llama3.2 (Language, 58.0M pulls): Meta's Llama 3.2 goes small with 1B and 3B models.
- mistral (Language, 25.6M pulls): The 7B model released by Mistral AI, updated to version 0.3.
- qwen2.5 (Language, 22.0M pulls): Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The model supports up to 128K tokens and has multilingual support.