AI Models Catalog 2026
The 52 AI models and tools engineers actually use in production right now, curated by an industry practitioner with 12+ years in cloud + LLMOps.
Every card links to the official try-it page. No proxies. No reselling. Just the truth about what works.
General LLMs (Chat & Reasoning)
4 models
Claude Opus 4.7
Anthropic
The model GitHub Copilot uses. Best-in-class for nuanced reasoning, long-context analysis, and agentic workflows.
- Industry-leading code generation + tool-use
- Native computer-use + bash/file tools
- Strong refusal behavior — safer for enterprise
GPT-5
OpenAI
OpenAI's flagship. Strongest math/STEM reasoning + widest ecosystem of tools & integrations.
- Best-in-class math + competitive coding
- Native voice + Realtime API
- Function calling + Structured Outputs
Gemini 2.5 Pro
Google
Google's flagship. Very long context (1M tokens): drop in an entire codebase and get answers.
- 1M-token production context window
- Native video understanding
- Free tier via Google AI Studio
Grok 4
xAI
xAI's reasoning-first model with real-time X (Twitter) data access.
- Real-time web + X data
- Strong on STEM benchmarks
- API-first pricing
Coding LLMs
4 models
Claude Sonnet 4.5 (Code)
Anthropic
The default for Cursor, Windsurf, and GitHub Copilot's Agent mode. Reigning champion of agentic coding.
- #1 on SWE-bench Verified (real-world code tasks)
- Excellent tool-use + multi-step debugging
- Available in Cursor, Windsurf, Copilot, Aider
GPT-5 Codex
OpenAI
OpenAI's coding-tuned variant. Powers ChatGPT's Canvas + the Codex agent.
- Top-tier on HumanEval
- Built-in code interpreter
- Long-form refactors
DeepSeek-Coder V3
DeepSeek
Open-weight Chinese coding model with near-frontier benchmark scores, free to self-host.
- Open-weight + huge context
- Strong on multi-file repo tasks
- Free to self-host
Codestral 25.06
Mistral
Mistral's fast, lightweight code completion model — built for IDE autocomplete latency.
- Optimized for FIM (fill-in-middle)
- Sub-100ms first-token latency
- 80+ languages
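A minimal sketch of how a fill-in-middle (FIM) prompt is typically assembled: the editor splits the file at the cursor and asks the model to generate only the missing middle. The sentinel strings below are hypothetical placeholders, not Codestral's actual special tokens; each model defines its own.

```python
# Hypothetical sentinel strings. Real FIM models define their own special tokens.
PREFIX_TAG = "<PRE>"
SUFFIX_TAG = "<SUF>"
MIDDLE_TAG = "<MID>"


def build_fim_prompt(source: str, cursor: int) -> str:
    """Split the source at the cursor so the model generates the missing middle."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor must fall inside the source text")
    prefix, suffix = source[:cursor], source[cursor:]
    # Prefix and suffix are both given as context; generation starts after MIDDLE_TAG.
    return f"{PREFIX_TAG}{prefix}{SUFFIX_TAG}{suffix}{MIDDLE_TAG}"


code = "def add(a, b):\n    return \n"
cursor = code.index("return ") + len("return ")
prompt = build_fim_prompt(code, cursor)
```

The model's completion would then be inserted at the cursor position; the IDE never needs to resend or rewrite the code after the cursor.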
Open-Weight LLMs (Self-Host)
5 models
Llama 4 (70B Instruct)
Meta
Meta's flagship open-weight model. The de-facto baseline for self-hosted LLM stacks.
- Best 70B-class quality (open)
- Available on Bedrock, Together, Groq
- Llama Community License (commercial use allowed, with conditions)
Qwen 2.5 72B
Alibaba
Alibaba's open-weight family with strong multilingual + math performance; most sizes are Apache-2.0, while the 72B uses the Qwen license.
- Apache-2.0 for most sizes (72B: Qwen license)
- Excellent multilingual + math
- Many sizes from 0.5B to 72B
Mistral Large 2
Mistral
Mistral's flagship. Open-weight under research license; great agent + tool-use performance.
- Strong function-calling
- Available on Bedrock + Azure AI
- Top European model
Gemma 3 (27B)
Google
Google's open-weight family. Punchy quality per parameter — perfect for Ollama / single-GPU.
- Excellent quality-per-param
- Sizes from 1B to 27B
- First-class Ollama support
Phi-4
Microsoft
Microsoft's small-but-mighty 14B model — frontier-class reasoning at a fraction of the cost.
- Best-in-class reasoning at 14B
- MIT licensed
- Trivial to self-host
Multimodal & Vision
3 models
Claude 4 Vision
Anthropic
Best general-purpose vision LLM. Reads diagrams, screenshots, and whiteboards with near-human accuracy.
- Excellent OCR + diagram understanding
- Vision-aware tool use
- Screenshot debugging
GPT-5 Vision
OpenAI
OpenAI's vision-enabled flagship. Strong on charts, math problems from photos, and image-to-code.
- Image-to-code (screenshot → JSX)
- Best for chart/graph reading
- Tight Code Interpreter integration
Llama 4 Vision
Meta
Best open-weight vision model. Self-hostable on a single H100 — full data control.
- Open weights + commercial use
- Strong on real-world photo Q&A
- Available on Bedrock
Image Generation
4 models
Flux 1.1 Pro
Black Forest Labs
The current SOTA in text-to-image. Beats Midjourney + DALL-E 3 on prompt adherence + photorealism.
- #1 on Artificial Analysis Image Arena
- Open-weight Schnell variant
- Available on Replicate, fal.ai
Stable Diffusion 3.5
Stability AI
Open-weight champion. Massive ComfyUI + Automatic1111 ecosystem — endless customization.
- Huge LoRA + ControlNet ecosystem
- Self-hostable on consumer GPUs
- Commercial license available
DALL-E 3
OpenAI
OpenAI's image model — strongest at typography + text rendering inside images.
- Best text-in-image rendering
- Native ChatGPT integration
- Strong prompt adherence
Midjourney v7
Midjourney
The artist's favorite. Unmatched aesthetic + style consistency for creative work.
- Best-in-class aesthetic quality
- Style reference + character consistency
- Web app now (no Discord needed)
Speech & Audio
3 models
Whisper Large v3
OpenAI
Open-weight speech-to-text. Industry standard for transcription — 99 languages, free to self-host.
- MIT licensed + self-hostable
- 99 languages including Hindi/Hinglish
- Best free transcription option
ElevenLabs v3
ElevenLabs
The most realistic AI voices on the market. Voice cloning + multilingual TTS at production quality.
- Sub-200ms streaming latency
- 32 languages incl. Hindi
- Voice cloning from a 30-second sample
Deepgram Nova-3
Deepgram
Lowest-latency speech-to-text API. Built for real-time voice agents.
- <300ms end-to-end latency
- Streaming + batch
- Speaker diarization
Embeddings & Retrieval
4 models
text-embedding-3-large
OpenAI
OpenAI's flagship embeddings — the safe default for RAG pipelines.
- Configurable output dims (256–3072)
- Top of MTEB leaderboard (proprietary)
- Cheap at scale ($0.13/M tokens)
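Configurable output dimensions can be illustrated with a small sketch: keep the first `dims` components of a full-size vector and re-normalize to unit length so cosine and dot-product scores stay comparable. This assumes the model was trained so that leading components carry the most information; the function name and the toy vector are illustrative only.

```python
import math


def shorten_embedding(vector, dims):
    """Keep the first `dims` components and rescale the result to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


# Toy 3-dimensional vector standing in for a real 3072-dimensional embedding.
short = shorten_embedding([3.0, 4.0, 12.0], 2)
```

Smaller vectors cut storage and search cost in the vector DB, usually at some loss in retrieval quality, so it is worth measuring recall before settling on a size.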
Cohere Embed v4
Cohere
Best multilingual embeddings on the market. Native support for 100+ languages incl. Hindi.
- 100+ languages
- Multimodal (text + image)
- Free tier on dashboard
BGE-M3
BAAI
Top open-weight embedding model. Dense + sparse + multi-vector in one model. Free.
- MIT licensed
- Hybrid dense+sparse retrieval
- 100+ languages
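Hybrid dense+sparse retrieval can be sketched as a weighted blend of two score sets. This is a generic fusion technique, not BGE-M3's own scoring code; the `alpha` weight and the assumption that both score sets are already scaled to the range 0 to 1 are illustrative choices.

```python
def hybrid_scores(dense, sparse, alpha=0.5):
    """Blend dense and sparse scores per document and return a ranked list.

    dense, sparse: dicts mapping doc_id -> score, assumed scaled to [0, 1].
    alpha: weight on the dense score; (1 - alpha) goes to the sparse score.
    """
    doc_ids = set(dense) | set(sparse)
    fused = {
        doc_id: alpha * dense.get(doc_id, 0.0) + (1 - alpha) * sparse.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


dense_hits = {"a": 0.9, "b": 0.2}
sparse_hits = {"b": 1.0, "c": 0.4}
ranked = hybrid_scores(dense_hits, sparse_hits, alpha=0.5)
```

A document that only one retriever finds still gets ranked, which is the main reason to fuse: dense search catches paraphrases, sparse search catches exact terms such as IDs and error codes.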
nomic-embed-text-v1.5
Nomic
Small, fast, free embeddings — drop into Ollama in one command. Perfect for local RAG.
- Runs on CPU
- Apache-2.0
- Native Ollama support
AWS Bedrock Foundation Models
4 models
Amazon Nova Pro
AWS
Amazon's own flagship FM. Cheap, fast, multimodal — only on Bedrock.
- AWS-native (IAM, VPC, PrivateLink)
- Multimodal incl. video
- Lowest cost per token among frontier models
Claude 4 (on Bedrock)
Anthropic / AWS
Claude with full AWS IAM + VPC + PrivateLink. The enterprise way to use Claude in India.
- Available in Mumbai region
- Provisioned throughput option
- Bedrock Guardrails + KMS
Llama 4 (on Bedrock)
Meta / AWS
Meta's open-weight Llama with managed scaling on Bedrock. No GPU ops needed.
- Fully managed — no GPU ops
- Knowledge Bases integration
- Agents for Bedrock
Mistral Large 2 (Bedrock)
Mistral / AWS
Mistral's flagship with AWS billing + IAM. Good for EU data residency too.
- Strong tool-use
- EU + US regions
- Per-token billing on Bedrock
Agent Frameworks
6 frameworks
LangGraph
LangChain
Stateful agent graphs from the LangChain team. The most-deployed agent framework in production today.
- Graph-based agent state machines
- Native human-in-the-loop
- LangSmith observability built-in
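The idea of a graph-based agent state machine can be shown without any framework. The sketch below is plain Python and does not use the LangGraph API; the node names (`plan`, `act`, `review`) and the stub answer are hypothetical. Each node takes the shared state and returns the updated state plus the name of the next node.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, object]
Node = Callable[[State], Tuple[State, Optional[str]]]


def plan(state: State) -> Tuple[State, Optional[str]]:
    return {**state, "plan": f"look up: {state['question']}"}, "act"


def act(state: State) -> Tuple[State, Optional[str]]:
    # A real node would call a model or tool here; this one returns a stub.
    return {**state, "answer": "stub answer"}, "review"


def review(state: State) -> Tuple[State, Optional[str]]:
    # Loop back to "act" if no answer was produced, otherwise stop.
    next_node = None if state.get("answer") else "act"
    return {**state, "reviewed": True}, next_node


def run_graph(nodes: Dict[str, Node], start: str, state: State, max_steps: int = 10) -> State:
    """Walk the graph until a node returns None as the next step."""
    current: Optional[str] = start
    steps = 0
    while current is not None:
        if steps >= max_steps:
            raise RuntimeError("graph did not terminate")
        state, current = nodes[current](state)
        steps += 1
    return state


final_state = run_graph({"plan": plan, "act": act, "review": review}, "plan", {"question": "q"})
```

Because the full state passes through every edge, a framework built on this shape can checkpoint it between nodes, which is what makes pausing for human approval or resuming a run practical.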
AWS Bedrock Agents
AWS
Fully-managed agents on AWS. Knowledge Bases, action groups, guardrails — no infrastructure to run.
- Zero-infrastructure agents
- Native IAM + KMS + VPC
- Knowledge Bases (RAG) built-in
OpenAI Agents SDK
OpenAI
OpenAI's official agent SDK. Tight integration with GPT-5, function calling, and the Responses API.
- Native handoffs between agents
- Built-in tracing
- Python + TypeScript SDKs
Claude Agent SDK
Anthropic
Anthropic's official SDK for building autonomous Claude-powered agents with computer use + bash.
- Computer use + bash tool out-of-box
- File-system aware
- Powers Claude Code
Pydantic AI
Pydantic
Type-safe agent framework from the Pydantic team. FastAPI for the agent world — clean, opinionated, fast.
- Type-safe tool definitions
- Streaming + structured outputs
- Model-agnostic
Vercel AI SDK
Vercel
The fastest way to add streaming AI to a Next.js / React app. Used by ~40% of new AI startups.
- First-class React Server Components
- Streaming UI helpers
- 20+ provider plugins
Multi-Agent Systems
4 frameworks
CrewAI
CrewAI
Role-based multi-agent orchestration. Define agents like a real team — researcher, writer, reviewer.
- Role + goal + backstory primitives
- Sequential + hierarchical processes
- 30k+ GitHub stars
AutoGen v0.4
Microsoft
Microsoft's research-grade multi-agent framework. Event-driven, async, supports complex agent conversations.
- Async event-driven runtime
- Group chat patterns
- Strong code execution agents
LangGraph Swarm
LangChain
Swarm-style handoff agents on top of LangGraph. Inspired by OpenAI Swarm, production-hardened.
- Dynamic agent handoffs
- Persistent state across agents
- Time-travel debugging
AWS Multi-Agent Orchestrator
AWS
AWS's open-source multi-agent framework. Production-ready intent routing across Bedrock agents.
- Intent classifier routing
- Python + TypeScript
- Bedrock + Anthropic + custom
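Intent routing boils down to: classify the incoming message, then hand it to the matching agent. The sketch below uses simple keyword matching as a stand-in for the classifier; a production router would more likely use an LLM for that step. The agent names and keyword lists are hypothetical, and none of this is the AWS framework's API.

```python
import re

# Hypothetical agents and the keywords that signal each intent.
AGENT_KEYWORDS = {
    "billing": ("invoice", "refund", "charge"),
    "tech_support": ("error", "crash", "login"),
}


def classify_intent(message: str, default: str = "general") -> str:
    """Return the agent whose keywords match the message most often."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    best_agent, best_hits = default, 0
    for agent, keywords in AGENT_KEYWORDS.items():
        hits = sum(1 for keyword in keywords if keyword in words)
        if hits > best_hits:
            best_agent, best_hits = agent, hits
    return best_agent


route = classify_intent("I need a refund for this charge")
```

Messages that match nothing fall through to the default agent, so the router always returns a destination instead of failing on unfamiliar input.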
AgentOps & Observability
5 tools
LangSmith
LangChain
The Datadog for agents. Trace every LLM call, debug step-by-step, evaluate prompts at scale.
- Full trace tree per agent run
- Datasets + evaluators built-in
- Free tier for solo devs
AgentOps
AgentOps.ai
Vendor-neutral agent monitoring. Track cost, latency, errors, and session replays across any framework.
- Framework-agnostic (CrewAI, AutoGen, LangChain, etc.)
- Session replay
- Cost + token tracking
Langfuse
Langfuse
Open-source LLM observability. Self-hostable, OpenTelemetry-native, vendor-agnostic.
- Self-hostable + cloud option
- OpenTelemetry-native
- Prompt management + datasets
Helicone
Helicone
Proxy-based LLM observability. One line of code change, full request logging + caching + cost analytics.
- Drop-in OpenAI proxy
- Built-in caching
- Custom properties for filtering
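Proxy-side caching can be sketched as a wrapper that hashes each request and replays the stored response on a repeat. This is a toy in-memory illustration of the pattern, not Helicone's implementation; `CachingProxy` and the `upstream` callable are hypothetical names.

```python
import hashlib
import json


class CachingProxy:
    """Forward requests to an upstream callable and cache identical ones."""

    def __init__(self, upstream):
        self._upstream = upstream
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def request(self, payload: dict):
        # Canonical JSON so key order in the payload does not change the cache key.
        raw = json.dumps(payload, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        response = self._upstream(payload)
        self._cache[key] = response
        return response


calls = []


def fake_upstream(payload):
    # Stand-in for a real LLM API call.
    calls.append(payload)
    return {"text": "ok"}


proxy = CachingProxy(fake_upstream)
first = proxy.request({"model": "m", "prompt": "hi"})
second = proxy.request({"prompt": "hi", "model": "m"})
```

The hit and miss counters are the kind of raw data a cost dashboard would build on: every hit is a request that was never billed upstream.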
Arize Phoenix
Arize AI
Open-source LLM evals + tracing from the Arize team. Best-in-class evaluation framework.
- Strong RAG evals (faithfulness, relevance)
- OpenTelemetry-based
- Self-hostable
Vector DBs & Memory
6 tools
Pinecone
Pinecone
Managed serverless vector DB. The default for production RAG when you don't want to run infrastructure.
- Serverless + pay-per-query
- Hybrid (dense + sparse) search
- AWS / GCP / Azure
Qdrant
Qdrant
Rust-based open-source vector DB. Fastest single-node performance + Apache-2.0.
- Best raw throughput in benchmarks
- Self-hostable in Docker / K8s
- Rich filtering
Weaviate
Weaviate
Open-source vector DB with native multi-modal support. Strong for image + text RAG.
- Native multimodal (CLIP)
- GraphQL + REST APIs
- BYO-embeddings or auto-vectorize
pgvector
PostgreSQL
Vector search inside Postgres. Already running Postgres? You probably don't need a separate vector DB.
- No new infrastructure
- ACID transactions for vectors
- Available on RDS, Supabase, Neon
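What a vector search computes can be shown with a brute-force sketch: score every stored row by cosine distance to the query and keep the closest `k`. pgvector exposes cosine distance as the `<=>` operator and can speed the search up with an index; the pure-Python version below only illustrates the math, and the document IDs and vectors are made up.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity, so 0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, rows, k=2):
    """Rank stored vectors by distance to the query and return the top-k ids."""
    ranked = sorted(rows, key=lambda doc_id: cosine_distance(query, rows[doc_id]))
    return ranked[:k]


rows = {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0], "doc3": [1.0, 1.0]}
top = nearest([1.0, 0.1], rows, k=2)
```

Keeping this inside Postgres means the same query can also filter on ordinary columns and join against existing tables.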
ChromaDB
Chroma
The easiest vector DB to start with. pip install, run in-process, perfect for prototypes.
- Embeddable + standalone modes
- Apache-2.0
- Best DX for learning
Mem0
Mem0
Memory layer for AI agents. Personalized agent recall across sessions — open-source.
- Long-term agent memory
- Vector + graph hybrid
- OpenAI / Anthropic / Bedrock support
⚖️ All model names and logos belong to their respective owners. Cloudadhar is an independent educational resource and is not affiliated with or endorsed by any model vendor. Links go to each vendor's official surface — we do not proxy or rehost their models.