Quick start
ollama run deepseek-ocr
Available sizes
| Tag | Size | Quantization | Context | Min RAM |
|---|---|---|---|---|
| deepseek-ocr:latest | 6.7 GB | q4_k_m | 8K | 8.4 GB |
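Beyond the interactive `ollama run` command, the model can also be called programmatically. The following is a minimal sketch, not an official client: it assumes a local Ollama server on its default port (11434) with the model already pulled, and it uses Ollama's `/api/generate` endpoint with a base64-encoded image. The helper names `build_ocr_payload` and `run_ocr` and the default prompt are hypothetical, chosen only for illustration; the model may respond better to other prompt wording.

```python
import base64
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Assumption: an illustrative prompt, not one prescribed by the model card.
DEFAULT_PROMPT = "Extract the text in this image."


def build_ocr_payload(image_bytes: bytes, prompt: str = DEFAULT_PROMPT) -> dict:
    """Build the JSON body for a non-streaming /api/generate request."""
    return {
        "model": "deepseek-ocr",
        "prompt": prompt,
        # Images are sent as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for a single JSON object instead of a stream of chunks.
        "stream": False,
    }


def run_ocr(image_path: str, prompt: str = DEFAULT_PROMPT) -> str:
    """Send one image to the local server and return the generated text."""
    with open(image_path, "rb") as f:
        payload = build_ocr_payload(f.read(), prompt)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, `run_ocr("page.png")` would return the recognized text for a hypothetical image file `page.png`. Keeping the payload builder separate from the network call makes the request shape easy to inspect without a running server.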
Strengths & Limitations
Strengths
- Token-efficient OCR
- Vision-language capabilities
- Performs Optical Character Recognition
Related models
gemma3 (General)
The current, most capable model that runs on a single GPU.
32.1M pulls

llava (Multimodal)
🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
12.9M pulls

minicpm-v (Multimodal)
A series of multimodal LLMs (MLLMs) designed for vision-language understanding.
4.6M pulls

llama3.2-vision (Vision)
Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.
3.8M pulls