214 models indexed from ollama.com
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
ollama run llama3.1
DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models, such as O3 and Gemini 2.5 Pro.
ollama run deepseek-r1
Meta's Llama 3.2 goes small with 1B and 3B models.
ollama run llama3.2
A high-performing open embedding model with a large token context window.
ollama run nomic-embed-text
The current, most capable model that runs on a single GPU.
ollama run gemma3
The 7B model released by Mistral AI, updated to version 0.3.
ollama run mistral
Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The model supports up to 128K tokens and has multilingual support.
ollama run qwen2.5
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.
ollama run qwen3
Google Gemma 2 is a high-performing and efficient model available in three sizes: 2B, 9B, and 27B.
ollama run gemma2
Meta Llama 3: The most capable openly available LLM to date
ollama run llama3
Phi-3 is a family of lightweight 3B (Mini) and 14B (Medium) state-of-the-art open models by Microsoft.
ollama run phi3
🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
ollama run llava
The latest series of code-specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
ollama run qwen2.5-coder
State-of-the-art large embedding model from mixedbread.ai
ollama run mxbai-embed-large
Phi-4 is a 14B parameter, state-of-the-art open model from Microsoft.
ollama run phi4
OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.
ollama run gpt-oss
Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. Updated to version 1.1
ollama run gemma
Qwen 1.5 is a series of large language models by Alibaba Cloud spanning from 0.5B to 110B parameters
ollama run qwen
Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.
ollama run llama2
Qwen2 is a new series of large language models from Alibaba Group
ollama run qwen2
A series of multimodal LLMs (MLLMs) designed for vision-language understanding.
ollama run minicpm-v
A large language model that can use text prompts to generate and discuss code.
ollama run codellama
Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.
ollama run llama3.2-vision
The TinyLlama project is an open endeavor to train a compact 1.1B Llama model on 3 trillion tokens.
ollama run tinyllama
Dolphin 3.0 Llama 3.1 8B 🐬 is the next generation of the Dolphin series of instruct-tuned models designed to be the ultimate general purpose local model, enabling coding, math, agentic, function calling, and general use cases.
ollama run dolphin3
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
ollama run deepseek-v3
OLMo 2 is a new family of 7B and 13B models trained on up to 5T tokens. These models are on par with or better than equivalently sized fully open models, and competitive with open-weight models such as Llama 3.1 on English academic benchmarks.
ollama run olmo2
A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
ollama run mistral-nemo
BGE-M3 is a new model from BAAI distinguished for its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity.
ollama run bge-m3
New state-of-the-art 70B model. Llama 3.3 70B offers performance similar to the Llama 3.1 405B model.
ollama run llama3.3
Alibaba's performant long context models for agentic and coding tasks.
ollama run qwen3-coder
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
ollama run deepseek-coder
SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.
ollama run smollm2
Embedding models trained on very large sentence-level datasets.
ollama run all-minilm
Mistral Small 3 sets a new benchmark in the “small” Large Language Models category below 70B.
ollama run mistral-small
CodeGemma is a collection of powerful, lightweight models that can perform a variety of coding tasks like fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following.
ollama run codegemma
The IBM Granite 1B and 3B models are long-context mixture of experts (MoE) Granite models from IBM designed for low latency usage.
ollama run granite3.1-moe
A family of efficient AI models under 10B parameters performant in science, math, and coding through innovative training techniques.
ollama run falcon3
A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.
ollama run llava-llama3
A suite of text embedding models by Snowflake, optimized for performance.
ollama run snowflake-arctic-embed
StarCoder2 is the next generation of transparently trained open code LLMs that come in three sizes: 3B, 7B and 15B parameters.
ollama run starcoder2
QwQ is the reasoning model of the Qwen series.
ollama run qwq
A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.
ollama run orca-mini
A set of Mixture of Experts (MoE) models with open weights by Mistral AI in 8x7b and 8x22b parameter sizes.
ollama run mixtral
Uncensored Llama 2 model by George Sung and Jarrad Hope.
ollama run llama2-uncensored
LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.
ollama run lfm2
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.
ollama run deepseek-coder-v2
The most powerful vision-language model in the Qwen model family to date.
ollama run qwen3-vl
Cogito v1 Preview is a family of hybrid reasoning models by Deep Cogito that outperform the best available open models of the same size, including counterparts from LLaMA, DeepSeek, and Qwen across most standard benchmarks.
ollama run cogito
Flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL.
ollama run qwen2.5vl
An update to Mistral Small that improves function calling and instruction following and reduces repetition errors.
ollama run mistral-small3.2
Gemma 3n models are designed for efficient execution on everyday devices such as laptops, tablets or phones.
ollama run gemma3n
Meta's latest collection of multimodal models.
ollama run llama4
Phi 4 reasoning and reasoning plus are 14-billion parameter open-weight reasoning models that rival much larger models on complex reasoning tasks.
ollama run phi4-reasoning
2.7B uncensored Dolphin model by Eric Hartford, based on the Phi language model by Microsoft Research.
ollama run dolphin-phi
A fine-tuned version of Deepseek-R1-Distilled-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview with just 1.5B parameters on popular math evaluations.
ollama run deepscaler
Magistral is a small, efficient reasoning model with 24B parameters.
ollama run magistral
Dolphin 2.9 is a new model with 8B and 70B sizes by Eric Hartford based on Llama 3 that has a variety of instruction, conversational, and coding skills.
ollama run dolphin-llama3
Uncensored 8x7b and 8x22b fine-tuned models, based on the Mixtral mixture of experts models, that excel at coding tasks. Created by Eric Hartford.
ollama run dolphin-mixtral
Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.
ollama run phi
🪐 A family of small models with 135M, 360M, and 1.7B parameters, trained on a new high-quality dataset.
ollama run smollm
Building upon the foundational models of the Qwen3 series, Qwen3 Embedding provides a comprehensive range of text embedding models in various sizes
ollama run qwen3-embedding
Phi-4-mini brings significant enhancements in multilingual support, reasoning, and mathematics, and now, the long-awaited function calling feature is finally supported.
ollama run phi4-mini
IBM Granite 2B and 8B models are 128K context length language models that have been fine-tuned for improved reasoning and instruction-following capabilities.
ollama run granite3.3
Codestral is Mistral AI’s first-ever code model designed for code generation tasks.
ollama run codestral
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
ollama run openthinker
A compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.
ollama run granite3.2-vision
Devstral: the best open source model for coding agents
ollama run devstral
The uncensored Dolphin model based on Mistral that excels at coding tasks. Updated to version 2.8.
ollama run dolphin-mistral
Granite 4 models feature improved instruction following (IF) and tool-calling capabilities, making them more effective in enterprise applications.
ollama run granite4
Command R is a Large Language Model optimized for conversational interaction and long context tasks.
ollama run command-r
Qwen3-Coder-Next is a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
ollama run qwen3-coder-next
A family of open foundation models by IBM for Code Intelligence
ollama run granite-code
State-of-the-art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning and agent use cases.
ollama run wizardlm2
DeepCoder is a fully open-source 14B coder model at O3-mini level, with a 1.5B version also available.
ollama run deepcoder
moondream2 is a small vision language model designed to run efficiently on edge devices.
ollama run moondream
Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research
ollama run hermes3
Building upon Mistral Small 3, Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance.
ollama run mistral-small3.1
LFM2.5 is a new family of hybrid models designed for on-device deployment.
ollama run lfm2.5-thinking
Yi 1.5 is a high-performing, bilingual language model.
ollama run yi
Zephyr is a series of fine-tuned versions of the Mistral and Mixtral models that are trained to act as helpful assistants.
ollama run zephyr
Mistral Large 2 is Mistral's new flagship model that is significantly more capable in code generation, mathematics, and reasoning with 128k context window and support for dozens of languages.
ollama run mistral-large
A lightweight AI model with 3.8 billion parameters whose performance overtakes similarly sized and larger models.
ollama run phi3.5
Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.
ollama run wizard-vicuna-uncensored
EmbeddingGemma is a 300M parameter embedding model from Google.
ollama run embeddinggemma
BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.
ollama run bakllava
Sentence-transformers model that can be used for tasks like clustering or semantic search.
ollama run paraphrase-multilingual
StarCoder is a code generation model trained on 80+ programming languages.
ollama run starcoder
General use models based on Llama and Llama 2 from Nous Research.
ollama run nous-hermes
EXAONE Deep exhibits superior capabilities in various reasoning tasks including math and coding benchmarks, ranging from 2.4B to 32B parameters developed and released by LG AI Research.
ollama run exaone-deep
An advanced language model crafted with 2 trillion bilingual tokens.
ollama run deepseek-llm
A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.
ollama run falcon
A strong, economical, and efficient Mixture-of-Experts language model.
ollama run deepseek-v2
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware.
ollama run ministral-3
A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-0106.
ollama run openchat
General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.
ollama run vicuna
A strong multi-lingual general language model with competitive performance to Llama 3.
ollama run glm4
OpenHermes 2.5 is a 7B model fine-tuned by Teknium on Mistral with fully open datasets.
ollama run openhermes
CodeQwen1.5 is a large language model pretrained on a large amount of code data.
ollama run codeqwen
Qwen2 Math is a series of specialized math language models built upon the Qwen2 LLMs, which significantly outperform the mathematical capabilities of open-source models and even closed-source models (e.g., GPT4o).
ollama run qwen2-math
Aya 23, released by Cohere, is a new family of state-of-the-art, multilingual models that support 23 languages.
ollama run aya
Llama 2 based model fine-tuned to improve Chinese dialogue ability.
ollama run llama2-chinese
Stable Code 3B is a coding model with instruct and code completion variants on par with models such as Code Llama 7B that are 2.5x larger.
ollama run stable-code
A fine-tuned model based on Mistral with good coverage of domain and language.
ollama run neural-chat
The powerful family of models by Nous Research that excels at scientific discussion and coding tasks.
ollama run nous-hermes2
OpenCoder is an open and reproducible code LLM family which includes 1.5B and 8B models, supporting chat in English and Chinese languages.
ollama run opencoder
SQLCoder is a code completion model fine-tuned on StarCoder for SQL generation tasks
ollama run sqlcoder
State-of-the-art code generation model
ollama run wizardcoder
Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters.
ollama run yi-coder
Stable LM 2 is a state-of-the-art 1.6B and 12B parameter language model trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch.
ollama run stablelm2
A model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG).
ollama run llama3-chatqa
The IBM Granite 2B and 8B models are designed to support tool-based use cases and retrieval augmented generation (RAG), streamlining code generation, translation and bug fixing.
ollama run granite3-dense
The IBM Granite 2B and 8B models are text-only dense LLMs trained on over 12 trillion tokens of data, demonstrating significant improvements over their predecessors in performance and speed in IBM’s initial testing.
ollama run granite3.1-dense
A 7B and 15B uncensored variant of the Dolphin model family that excels at coding, based on StarCoder2.
ollama run dolphincoder
Model focused on math and logic problems
ollama run wizard-math
A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages.
ollama run translategemma
This model extends Llama-3 8B's context length from 8k to over 1m tokens.
ollama run llama3-gradient
A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.
ollama run samantha-mistral
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
ollama run deepseek-v3.1
Command R+ is a powerful, scalable large language model purpose-built to excel at real-world enterprise use cases.
ollama run command-r-plus
InternLM2.5 is a 7B parameter model tailored for practical scenarios with outstanding reasoning capability.
ollama run internlm2
A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.
ollama run llama3-groq-tool-use
Llama Guard 3 is a series of models fine-tuned for content safety classification of LLM inputs and responses.
ollama run llama-guard3
Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.
ollama run starling-lm
Code generation model based on Code Llama.
ollama run phind-codellama
A compact, yet powerful 10.7B large language model designed for single-turn conversation.
ollama run solar
Conversational model based on Llama 2 that performs competitively on various benchmarks.
ollama run xwinlm
Cohere For AI's language models trained to perform well across 23 different languages.
ollama run aya-expanse
The IBM Granite 1B and 3B models are the first mixture of experts (MoE) Granite models from IBM designed for low latency usage.
ollama run granite3-moe
An extension of Llama 2 that supports a context of up to 128k tokens.
ollama run yarn-llama2
A versatile model for AI software development scenarios, including code completion.
ollama run codegeex4
Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.
ollama run mistral-openorca
An experimental 1.1B parameter model trained on the new Dolphin 2.8 dataset by Eric Hartford and based on TinyLlama.
ollama run tinydolphin
Orca 2 is built by Microsoft Research and is a fine-tuned version of Meta's Llama 2 models. The model is designed to excel particularly in reasoning.
ollama run orca2
Llama 2 based model fine-tuned on an Orca-style dataset. Originally called Free Willy.
ollama run stable-beluga
The first installment in the Qwen3-Next series with strong performance in terms of both parameter efficiency and inference speed.
ollama run qwen3-next
A series of models that convert HTML content to Markdown content, which is useful for content conversion tasks.
ollama run reader-lm
ShieldGemma is a set of instruction-tuned models for evaluating the safety of text prompt input and text output responses against a set of defined safety policies.
ollama run shieldgemma
An expansion of Llama 2 that specializes in integrating both general language understanding and domain-specific knowledge, particularly in programming and mathematics.
ollama run llama-pro
Rnj-1 is a family of 8B parameter open-weight, dense models trained from scratch by Essential AI, optimized for code and STEM with capabilities on par with SOTA open-weight models.
ollama run rnj-1
An extension of Mistral to support context windows of 64K or 128K.
ollama run yarn-mistral
Nexus Raven is a 13B instruction-tuned model for function calling tasks.
ollama run nexusraven
General use model based on Llama 2.
ollama run wizardlm
Open-source medical large language model adapted from Llama 2 to the medical domain.
ollama run meditron
As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
ollama run glm-4.7-flash
A high-performing model trained with a new technique called Reflection-tuning that teaches an LLM to detect mistakes in its reasoning and correct course.
ollama run reflection
A commercial-friendly small language model by NVIDIA optimized for roleplay, RAG QA, and function calling.
ollama run nemotron-mini
Granite-3.2 is a family of long-context AI models from IBM Granite fine-tuned for thinking capabilities.
ollama run granite3.2
Uncensored version of the Wizard LM model
ollama run wizardlm-uncensored
Athene-V2 is a 72B parameter model which excels at code completion, mathematics, and log extraction tasks.
ollama run athene-v2
Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM generated responses to user queries.
ollama run nemotron
EXAONE 3.5 is a collection of instruction-tuned bilingual (English and Korean) generative models ranging from 2.4B to 32B parameters, developed and released by LG AI Research.
ollama run exaone3.5
Snowflake's frontier embedding model. Arctic Embed 2.0 adds multilingual support without sacrificing English performance or scalability.
ollama run snowflake-arctic-embed2
The Nous Hermes 2 model from Nous Research, now trained over Mixtral.
ollama run nous-hermes2-mixtral
A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
ollama run r1-1776
Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.
ollama run medllama2
Great code generation model based on Llama2.
ollama run codeup
Uncensored Llama2 based model with support for a 16K context window.
ollama run everythinglm
MathΣtral: a 7B model designed for math reasoning and scientific discovery by Mistral AI.
ollama run mathstral
Solar Pro Preview: an advanced large language model (LLM) with 22 billion parameters designed to fit into a single GPU
ollama run solar-pro
🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.
ollama run magicoder
Falcon2 is an 11B-parameter causal decoder-only model built by TII and trained over 5T tokens.
ollama run falcon2
A lightweight chat model allowing accurate and responsive output without requiring high-end hardware.
ollama run stablelm-zephyr
MegaDolphin-2.2-120b is a transformation of Dolphin-2.2-70b created by interleaving the model with itself.
ollama run megadolphin
The IBM Granite Embedding 30M and 278M models are text-only dense biencoder embedding models, with 30M available in English only and 278M serving multilingual use cases.
ollama run granite-embedding
7B parameter text-to-SQL model made by MotherDuck and Numbers Station.
ollama run duckdb-nsql
A 3.8B model fine-tuned on a private high-quality synthetic dataset for information extraction, based on Phi-3.
ollama run nuextract
MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.
ollama run mistrallite
A state-of-the-art fact-checking model developed by Bespoke Labs.
ollama run bespoke-minicheck
Tülu 3 is a leading instruction following model family, offering fully open-source data, code, and recipes by the Allen Institute for AI.
ollama run tulu3
A top-performing mixture of experts model, fine-tuned with high-quality data.
ollama run notux
A 7B chat model fine-tuned with high-quality data and based on Zephyr.
ollama run notus
DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
ollama run deepseek-ocr
Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.
ollama run wizard-vicuna
An open weights function calling model based on Llama 3, competitive with GPT-4o function calling capabilities.
ollama run firefunction-v2
A high-performing code instruct model created by merging two existing code models.
ollama run codebooga
Embedding model from BAAI mapping texts to vectors.
ollama run bge-large
A new small LLaVA model fine-tuned from Phi 3 Mini.
ollama run llava-phi3
Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.
ollama run open-orca-platypus2
DBRX is an open, general-purpose LLM created by Databricks.
ollama run dbrx
A language model created by combining two fine-tuned Llama 2 70B models into one.
ollama run goliath
Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models
ollama run nemotron-3-nano
Sailor2 are multilingual language models made for South-East Asia. Available in 1B, 8B, and 20B parameter sizes.
ollama run sailor2
Olmo is a series of Open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
ollama run olmo-3
24B model that excels at using tools to explore codebases, editing multiple files and powering software engineering agents.
ollama run devstral-small-2
A new small reasoning model fine-tuned from the Qwen 2.5 3B Instruct model.
ollama run smallthinker
The smallest model in Cohere's R series delivers top-tier speed, efficiency, and quality to build powerful AI applications on commodity GPUs and edge devices.
ollama run command-r7b
Phi 4 mini reasoning is a lightweight open model that balances efficiency with advanced reasoning ability.
ollama run phi4-mini-reasoning
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
ollama run deepseek-v2.5
The IBM Granite Guardian 3.0 2B and 8B models are designed to detect risks in prompts and/or responses.
ollama run granite3-guardian
111 billion parameter model optimized for demanding enterprises that require fast, secure, and high-quality AI
ollama run command-a
An open large reasoning model for real-world solutions by the Alibaba International Digital Commerce Group (AIDC-AI).
ollama run marco-o1
A robust conversational model designed to be used for both chat and instruct use cases.
ollama run alfred
Olmo is a series of Open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
ollama run olmo-3.1
Kimi K2.5 is an open-source, native multimodal agentic model that seamlessly integrates vision and language understanding with advanced agentic capabilities, instant and thinking modes, as well as conversational and agentic paradigms.
ollama run kimi-k2.5
123B model that excels at using tools to explore codebases, editing multiple files and powering software engineering agents.
ollama run devstral-2
A new state-of-the-art version of the lightweight Command R7B model that excels in advanced Arabic language capabilities for enterprises in the Middle East and Northern Africa.
ollama run command-r7b-arabic
The Cogito v2.1 LLMs are instruction-tuned generative models. All models are released under MIT license for commercial use.
ollama run cogito-2.1
gpt-oss-safeguard-20b and gpt-oss-safeguard-120b are safety reasoning models built upon gpt-oss
ollama run gpt-oss-safeguard
FunctionGemma is a specialized version of Google's Gemma 3 270M model fine-tuned explicitly for function calling.
ollama run functiongemma
Advanced agentic, reasoning and coding capabilities.
ollama run glm-4.6
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.
ollama run gemini-3-flash-preview
MiniMax M2 is a high-efficiency large language model built for coding and agentic workflows.
ollama run minimax-m2
A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
ollama run glm-5
Advancing the Coding Capability
ollama run glm-4.7
MiniMax-M2.5 is a state-of-the-art large language model designed for real-world productivity and coding tasks.
ollama run minimax-m2.5
Qwen 3.5 is a family of open-source multimodal models that delivers exceptional utility and performance.
ollama run qwen3.5
GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.
ollama run glm-ocr
nomic-embed-text-v2-moe is a multilingual MoE text embedding model that excels at multilingual retrieval.
ollama run nomic-embed-text-v2-moe
A state-of-the-art mixture-of-experts (MoE) language model. Kimi K2-Instruct-0905 demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks.
ollama run kimi-k2
DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance.
ollama run deepseek-v3.2
Kimi K2 Thinking, Moonshot AI's best open-source thinking model.
ollama run kimi-k2-thinking
A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads.
ollama run mistral-large-3
Exceptional multilingual capabilities to elevate code engineering
ollama run minimax-m2.1
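
Many entries above come in several parameter sizes, and a specific size is usually selected by appending a tag to the model name (`name:tag`). The sketch below only prints the command that would be run, so it works without Ollama installed. The helper names `model_ref` and `run_cmd` are hypothetical, and the `8b` tag is an assumption; the tags actually published for a model are listed on its page.

```shell
#!/bin/sh
# Hypothetical helper: build a model reference of the form name[:tag].
# The tag is optional; when omitted, the bare model name is used.
model_ref() {
  if [ -n "$2" ]; then
    printf '%s:%s\n' "$1" "$2"
  else
    printf '%s\n' "$1"
  fi
}

# Hypothetical helper: print the command line without executing it.
run_cmd() {
  printf 'ollama run %s\n' "$(model_ref "$1" "$2")"
}

run_cmd llama3.1       # prints: ollama run llama3.1
run_cmd llama3.1 8b    # prints: ollama run llama3.1:8b (tag name assumed)
```

To actually start a model, run the printed line in a terminal where Ollama is installed.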