Quick start
ollama run glm-ocr

Available sizes
| Tag | Size | Quantization | Context | Min RAM |
|---|---|---|---|---|
| glm-ocr:q8_0 | 1.6 GB | q8_0 | 128K | 2 GB |
| glm-ocr:latest | 2.2 GB | q4_k_m | 128K | 2.8 GB |
| glm-ocr:bf16 | 2.2 GB | bf16 | 128K | 2.8 GB |
Run with

Claude Code

ollama launch claude --model glm-ocr

Codex

ollama launch codex --model glm-ocr

OpenCode

ollama launch opencode --model glm-ocr

OpenClaw

ollama launch openclaw --model glm-ocr

Strengths & Limitations
Strengths
- Complex document understanding
- Multimodal OCR
- GLM-V architecture
Related models
- llava (Multimodal, 12.9M pulls): 🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
- minicpm-v (Multimodal, 4.6M pulls): A series of multimodal LLMs (MLLMs) designed for vision-language understanding.
- llava-llama3 (Multimodal, 2.1M pulls): A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.
- qwen3-vl (Multimodal, 1.6M pulls): The most powerful vision-language model in the Qwen model family to date.