Curated · Updated May 2026

AI Models Catalog 2026

The 52 AI models, frameworks, and tools engineers actually use in production right now, curated by an industry practitioner with 12+ years in cloud and LLMOps.

Every card links to the official try-it page. No proxies. No reselling. Just the truth about what works.

52 models · 9 brand-new in 2026 · 26 open weights

Jump to: General LLMs · Coding LLMs · Open-Weight LLMs · Multimodal & Vision · Image Generation · Speech & Audio · Embeddings & Retrieval · AWS Bedrock Foundation Models · Agent Frameworks · Multi-Agent Systems · AgentOps & Observability · Vector DBs & Memory

General LLMs (Chat & Reasoning)

4 models

Claude Opus 4.7

Anthropic

Available as a model option in GitHub Copilot. Best-in-class for nuanced reasoning, long-context analysis, and agentic workflows.

Frontier (size not disclosed)

Industry-leading code generation + tool-use
Native computer-use + bash/file tools
Strong refusal behavior — safer for enterprise

Proprietary

GPT-5

OpenAI

OpenAI's flagship. Strongest math/STEM reasoning + widest ecosystem of tools & integrations.

text · image · audio · code

Best-in-class math + competitive coding
Native voice + Realtime API
Function calling + Structured Outputs

Proprietary

Gemini 2.5 Pro

Google

Google's flagship. Very long context (1M tokens): drop an entire codebase, get answers.

text · image · audio · video

1M-token production context window
Native video understanding
Free tier via Google AI Studio

Proprietary

Grok 4

xAI

xAI's reasoning-first model with real-time X (Twitter) data access.

Real-time web + X data
Strong on STEM benchmarks
API-first pricing

Proprietary

Coding LLMs

4 models

✨ NEW · 🔥 POPULAR

Claude Sonnet 4.5 (Code)

Anthropic

The default for Cursor, Windsurf, and GitHub Copilot's Agent mode. Reigning champion of agentic coding.

Topped SWE-bench Verified at launch (real-world code tasks)
Excellent tool-use + multi-step debugging
Available in Cursor, Windsurf, Copilot, Aider

Proprietary

GPT-5 Codex

OpenAI

OpenAI's coding-tuned variant. Powers ChatGPT's Canvas + the Codex agent.

Top-tier on HumanEval
Built-in code interpreter
Long-form refactors

Proprietary

DeepSeek-Coder V3

DeepSeek

Open-weight Chinese coding model that competes with frontier-class models on benchmarks and is free to self-host.

236B MoE (21B active)

Open-weight + huge context
Strong on multi-file repo tasks
Free to self-host

Open Weights

Codestral 25.06

Mistral

Mistral's fast, lightweight code completion model — built for IDE autocomplete latency.

Optimized for FIM (fill-in-middle)
Sub-100ms first-token latency
80+ languages

Mixed
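
Fill-in-the-middle means the model sees the code before and after the cursor and generates only the gap between them. A minimal sketch of how an editor plugin might prepare that input is below. The sentinel strings and the `FimRequest` / `split_at_cursor` / `render_prompt` names are hypothetical, chosen for illustration; they are not Codestral's actual tokens or API.

```python
from dataclasses import dataclass

# Hypothetical sentinel strings for illustration only. Real models define
# their own special tokens, and hosted APIs often take prefix/suffix fields.
PREFIX_TAG = "<PRE>"
SUFFIX_TAG = "<SUF>"
MIDDLE_TAG = "<MID>"


@dataclass
class FimRequest:
    prefix: str  # code before the cursor
    suffix: str  # code after the cursor


def split_at_cursor(buffer: str, cursor: int) -> FimRequest:
    """Split an editor buffer into the two halves a FIM model conditions on."""
    if not 0 <= cursor <= len(buffer):
        raise ValueError("cursor is outside the buffer")
    return FimRequest(prefix=buffer[:cursor], suffix=buffer[cursor:])


def render_prompt(req: FimRequest) -> str:
    """Place the middle marker last so the model generates the missing span."""
    return f"{PREFIX_TAG}{req.prefix}{SUFFIX_TAG}{req.suffix}{MIDDLE_TAG}"


# Cursor sits right after "return " in a half-written function.
buffer = "def add(a, b):\n    return \n\nprint(add(1, 2))\n"
cursor = buffer.index("return ") + len("return ")
request = split_at_cursor(buffer, cursor)
prompt = render_prompt(request)
```

Keeping the suffix in the prompt is what lets a completion model stop at the right place instead of rewriting code that already exists below the cursor.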

Open-Weight LLMs (Self-Host)

5 models

Llama 4 (70B Instruct)

Meta

Meta's flagship open-weight model. The de-facto baseline for self-hosted LLM stacks.

Best 70B-class quality (open)
Available on Bedrock, Together, Groq
Commercial use allowed under the Llama community license (with restrictions)

Llama License

Qwen 2.5 72B

Alibaba

Alibaba's open-weight model with strong multilingual and math performance. Most sizes are Apache-2.0; the 72B ships under the Qwen license.

Apache-2.0 for most sizes (72B uses the Qwen license)
Excellent multilingual + math
Many sizes from 0.5B to 72B

Qwen License

Mistral Large 2

Mistral

Mistral's flagship. Open-weight under research license; great agent + tool-use performance.

Strong function-calling
Available on Bedrock + Azure AI
Top European model

Mixed

Gemma 3 (27B)

Google

Google's open-weight family. Punchy quality per parameter — perfect for Ollama / single-GPU.

Excellent quality-per-param
Sizes from 1B to 27B
First-class Ollama support

Open Weights

Phi-4

Microsoft

Microsoft's small-but-mighty 14B model — frontier-class reasoning at a fraction of the cost.

Best-in-class reasoning at 14B
MIT licensed
Trivial to self-host

MIT

Multimodal & Vision

3 models

Claude 4 Vision

Anthropic

Best general-purpose vision LLM — reads diagrams, screenshots, whiteboards as well as humans.

Excellent OCR + diagram understanding
Vision-aware tool use
Screenshot debugging

Proprietary

GPT-5 Vision

OpenAI

OpenAI's vision-enabled flagship. Strong on charts, math problems from photos, and image-to-code.

Image-to-code (screenshot → JSX)
Best for chart/graph reading
Tight Code Interpreter integration

Proprietary

Llama 4 Vision

Meta

Best open-weight vision model. Self-hostable on a single H100 — full data control.

Open weights + commercial use
Strong on real-world photo Q&A
Available on Bedrock

Llama License

Image Generation

4 models

Flux 1.1 Pro

Black Forest Labs

The current SOTA in text-to-image. Beats Midjourney + DALL-E 3 on prompt adherence + photorealism.

#1 on Artificial Analysis Image Arena
Open-weight Schnell variant
Available on Replicate, fal.ai

Mixed

Stable Diffusion 3.5

Stability AI

Open-weight champion. Massive ComfyUI + Automatic1111 ecosystem — endless customization.

Huge LoRA + ControlNet ecosystem
Self-hostable on consumer GPUs
Commercial license available

Mixed

DALL-E 3

OpenAI

OpenAI's image model — strongest at typography + text rendering inside images.

Best text-in-image rendering
Native ChatGPT integration
Strong prompt adherence

Proprietary

Midjourney v7

Midjourney

The artist's favorite. Unmatched aesthetic + style consistency for creative work.

Best-in-class aesthetic quality
Style reference + character consistency
Web app now (no Discord needed)

Proprietary

Speech & Audio

3 models

Whisper Large v3

OpenAI

Open-weight speech-to-text. Industry standard for transcription — 99 languages, free to self-host.

MIT licensed + self-hostable
99 languages including Hindi/Hinglish
Best free transcription option

MIT

ElevenLabs v3

ElevenLabs

The most realistic AI voices on the market. Voice cloning + multilingual TTS at production quality.

Sub-200ms streaming latency
32 languages incl. Hindi
Voice cloning from 30 sec sample

Proprietary

Deepgram Nova-3

Deepgram

Lowest-latency speech-to-text API. Built for real-time voice agents.

<300ms end-to-end latency
Streaming + batch
Speaker diarization

Proprietary

Embeddings & Retrieval

4 models

text-embedding-3-large

OpenAI

OpenAI's flagship embeddings — the safe default for RAG pipelines.

Configurable output dims (256–3072)
Top of MTEB leaderboard (proprietary)
Cheap at scale ($0.13/M tokens)

Proprietary
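
The configurable output size is normally requested from the API itself. When a stored full-size vector has to be shortened afterwards, the usual approach is to keep the leading components and rescale to unit length. A minimal sketch on a toy 4-dimension vector follows (real vectors go up to 3072 dimensions); `shorten_embedding` is a hypothetical helper, not part of any SDK.

```python
import math


def shorten_embedding(vector: list[float], dims: int) -> list[float]:
    """Keep the first `dims` components, then rescale to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def l2_norm(vector: list[float]) -> float:
    """Euclidean length, used here to confirm the result is unit length."""
    return math.sqrt(sum(x * x for x in vector))


# Toy stand-in for a full-size embedding.
full = [0.6, 0.0, 0.8, 0.0]
short = shorten_embedding(full, 2)
```

Skipping the renormalization step would leave shortened vectors with different lengths, which skews dot-product and cosine comparisons.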

Cohere Embed v4

Cohere

Best multilingual embeddings on the market. Native support for 100+ languages incl. Hindi.

100+ languages
Multimodal (text + image)
Free tier on dashboard

Proprietary

BGE-M3

BAAI

Top open-weight embedding model. Dense + sparse + multi-vector in one model. Free.

MIT licensed
Hybrid dense+sparse retrieval
100+ languages

MIT
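
Hybrid retrieval combines a semantic (dense) score with a lexical (sparse) score for each document. The sketch below shows one common way to fuse them, a weighted sum. The toy vectors, the token weights, and the `alpha` parameter are assumptions for illustration; in practice the embedding model produces the vectors, and the fusion rule is a design choice rather than something the model prescribes.

```python
import math


def dense_score(q: list[float], d: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(q, d))
    return dot / (math.sqrt(sum(x * x for x in q)) * math.sqrt(sum(y * y for y in d)))


def sparse_score(q: dict[str, float], d: dict[str, float]) -> float:
    """Sum of weight products over the tokens both sides share."""
    return sum(weight * d[token] for token, weight in q.items() if token in d)


def hybrid_score(dense: float, sparse: float, alpha: float) -> float:
    """Weighted sum: alpha favors dense, (1 - alpha) favors sparse."""
    return alpha * dense + (1.0 - alpha) * sparse


def rank(query: dict, docs: dict[str, dict], alpha: float) -> list[str]:
    """Return document ids sorted by fused score, best first."""
    scored = {
        doc_id: hybrid_score(
            dense_score(query["dense"], doc["dense"]),
            sparse_score(query["sparse"], doc["sparse"]),
            alpha,
        )
        for doc_id, doc in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Toy data: "a" matches semantically, "b" matches on exact terms.
query = {"dense": [1.0, 0.0], "sparse": {"gpu": 1.0, "memory": 0.5}}
docs = {
    "a": {"dense": [1.0, 0.0], "sparse": {"gpu": 0.2}},
    "b": {"dense": [0.0, 1.0], "sparse": {"gpu": 0.9, "memory": 0.8}},
}
balanced = rank(query, docs, alpha=0.5)
dense_heavy = rank(query, docs, alpha=0.9)
```

In this toy setup the winner flips with `alpha`, which is why the weight is usually tuned on a labeled query set rather than fixed up front.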

nomic-embed-text-v1.5

Nomic

Small, fast, free embeddings — drop into Ollama in one command. Perfect for local RAG.

Runs on CPU
Apache-2.0
Native Ollama support

Apache 2.0

AWS Bedrock Foundation Models

4 models

Amazon Nova Pro

AWS

Amazon's own flagship FM. Cheap, fast, multimodal — only on Bedrock.

AWS-native (IAM, VPC, PrivateLink)
Multimodal incl. video
Lowest $$/token of frontier models

Proprietary

Claude 4 (on Bedrock)

Anthropic / AWS

Claude with full AWS IAM + VPC + PrivateLink. The enterprise way to use Claude in India.

Available in Mumbai region
Provisioned throughput option
Bedrock Guardrails + KMS

Proprietary

Llama 4 (on Bedrock)

Meta / AWS

Meta's open-weight Llama with managed scaling on Bedrock. No GPU ops needed.

Fully managed — no GPU ops
Knowledge Bases integration
Agents for Bedrock

Llama License

Mistral Large 2 (Bedrock)

Mistral / AWS

Mistral's flagship with AWS billing + IAM. Good for EU data residency too.

Strong tool-use
EU + US regions
Per-token billing on Bedrock

Mixed

Agent Frameworks

6 frameworks

LangGraph

LangChain

Stateful agent graphs from the LangChain team. The most-deployed agent framework in production today.

Model-dependent

Graph-based agent state machines
Native human-in-the-loop
LangSmith observability built-in

MIT
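
A graph-based agent state machine boils down to three pieces: nodes that update a shared state, edges that pick the next node, and a loop that runs until an end marker. The sketch below is plain Python and deliberately does not use LangGraph's API. Every name in it (`draft`, `review`, `run`, `END`) is hypothetical, and the approval rule is a stand-in for a real model or human check.

```python
from typing import Callable

State = dict
END = "__end__"  # hypothetical marker for "stop the run"


def draft(state: State) -> State:
    """Produce (or redo) a draft and count the attempt."""
    attempts = state.get("attempts", 0) + 1
    return {**state, "draft": f"Answer to: {state['question']}", "attempts": attempts}


def review(state: State) -> State:
    """Stand-in reviewer: approve only from the second attempt on."""
    return {**state, "approved": state["attempts"] >= 2}


def route_after_review(state: State) -> str:
    """Conditional edge: loop back to drafting until approved."""
    return END if state["approved"] else "draft"


NODES: dict[str, Callable[[State], State]] = {"draft": draft, "review": review}
EDGES: dict[str, Callable[[State], str]] = {
    "draft": lambda state: "review",
    "review": route_after_review,
}


def run(entry: str, state: State, max_steps: int = 10) -> State:
    """Walk the graph from `entry`, guarding against endless loops."""
    current, steps = entry, 0
    while current != END:
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        state = NODES[current](state)
        current = EDGES[current](state)
        steps += 1
    return state


final = run("draft", {"question": "What is RAG?"})
```

A human-in-the-loop step fits the same shape: the review node would pause and wait for input instead of applying a fixed rule.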

AWS Bedrock Agents

AWS

Fully-managed agents on AWS. Knowledge Bases, action groups, guardrails — no infrastructure to run.

Model-dependent

Zero-infrastructure agents
Native IAM + KMS + VPC
Knowledge Bases (RAG) built-in

Proprietary

OpenAI Agents SDK

OpenAI

OpenAI's official agent SDK. Tight integration with GPT-5, function calling, and Responses API.

Model-dependent

Native handoffs between agents
Built-in tracing
Python + TypeScript SDKs

MIT

Claude Agent SDK

Anthropic

Anthropic's official SDK for building autonomous Claude-powered agents with computer use + bash.

Computer use + bash tool out-of-box
File-system aware
Powers Claude Code

MIT

Pydantic AI

Pydantic

Type-safe agent framework from the Pydantic team. FastAPI for the agent world — clean, opinionated, fast.

Model-dependent

Type-safe tool definitions
Streaming + structured outputs
Model-agnostic

MIT

Vercel AI SDK

Vercel

The fastest way to add streaming AI to a Next.js / React app. Used by ~40% of new AI startups.

Model-dependent

First-class React Server Components
Streaming UI helpers
20+ provider plugins

Apache 2.0

Multi-Agent Systems

4 frameworks

CrewAI

CrewAI

Role-based multi-agent orchestration. Define agents like a real team — researcher, writer, reviewer.

Model-dependent

Role + goal + backstory primitives
Sequential + hierarchical processes
30k+ GitHub stars

MIT

AutoGen v0.4

Microsoft

Microsoft's research-grade multi-agent framework. Event-driven, async, supports complex agent conversations.

Model-dependent

Async event-driven runtime
Group chat patterns
Strong code execution agents

MIT

LangGraph Swarm

LangChain

Swarm-style handoff agents on top of LangGraph. Inspired by OpenAI Swarm, production-hardened.

Model-dependent

Dynamic agent handoffs
Persistent state across agents
Time-travel debugging

MIT

AWS Multi-Agent Orchestrator

AWS

AWS's open-source multi-agent framework. Production-ready intent routing across Bedrock agents.

Model-dependent

Intent classifier routing
Python + TypeScript
Bedrock + Anthropic + custom

Apache 2.0
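
Intent routing means classifying each incoming message and handing it to the agent best suited to answer. The sketch below uses keyword overlap as a stand-in classifier to show the control flow. A production router would more likely use an LLM or a trained classifier. The agent names and keyword sets are invented for illustration and are not part of the AWS framework.

```python
import re

# Hypothetical agents and the keywords that signal their intent.
AGENT_KEYWORDS: dict[str, set[str]] = {
    "billing": {"invoice", "refund", "charge", "payment"},
    "tech_support": {"error", "crash", "install", "bug"},
}
DEFAULT_AGENT = "general"


def classify(message: str) -> str:
    """Pick the agent whose keywords overlap most with the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    scores = {name: len(words & keywords) for name, keywords in AGENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # No keyword hit at all means no confident intent: use the fallback.
    return best if scores[best] > 0 else DEFAULT_AGENT


def route(message: str) -> str:
    """Return a short trace of which agent would handle the message."""
    return f"{classify(message)} <- {message}"


trace = route("I need a refund for this charge")
```

The explicit fallback matters in practice: without it, every off-topic message would land on whichever agent happens to be listed first.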

AgentOps & Observability

5 tools

LangSmith

LangChain

The Datadog for agents. Trace every LLM call, debug step-by-step, evaluate prompts at scale.

Full trace tree per agent run
Datasets + evaluators built-in
Free tier for solo devs

Proprietary

AgentOps

AgentOps.ai

Vendor-neutral agent monitoring. Track cost, latency, errors, and session replays across any framework.

Framework-agnostic (CrewAI, AutoGen, LangChain, etc.)
Session replay
Cost + token tracking

Proprietary

Langfuse

Langfuse

Open-source LLM observability. Self-hostable, OpenTelemetry-native, vendor-agnostic.

Self-host / SaaS

Self-hostable + cloud option
OpenTelemetry-native
Prompt management + datasets

MIT

Helicone

Helicone

Proxy-based LLM observability. One line of code change, full request logging + caching + cost analytics.

Drop-in OpenAI proxy
Built-in caching
Custom properties for filtering

Apache 2.0

Arize Phoenix

Arize AI

Source-available LLM evals + tracing from the Arize team. Best-in-class evaluation framework.

Self-host / SaaS

Strong RAG evals (faithfulness, relevance)
OpenTelemetry-based
Self-hostable

Elastic License 2.0

Vector DBs & Memory

6 tools

Pinecone

Pinecone

Managed serverless vector DB. The default for production RAG when you don't want to run infrastructure.

Serverless + pay-per-query
Hybrid (dense + sparse) search
AWS / GCP / Azure

Proprietary

Qdrant

Qdrant

Rust-based open-source vector DB. Fastest single-node performance + Apache-2.0.

Self-host / Cloud

Best raw throughput in benchmarks
Self-hostable in Docker / K8s
Rich filtering

Apache 2.0

Weaviate

Weaviate

Open-source vector DB with native multi-modal support. Strong for image + text RAG.

Self-host / Cloud

Native multimodal (CLIP)
GraphQL + REST APIs
BYO-embeddings or auto-vectorize

BSD-3

pgvector

PostgreSQL

Vector search inside Postgres. Already running Postgres? You probably don't need a separate vector DB.

No new infrastructure
ACID transactions for vectors
Available on RDS, Supabase, Neon

PostgreSQL License
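
Vector search in Postgres comes down to ordering rows by a distance between vectors. pgvector exposes this through SQL operators: `<->` for Euclidean distance, `<#>` for negative inner product, and `<=>` for cosine distance. The sketch below mirrors that math in plain Python so query results can be sanity-checked by hand. The function names and toy vectors are hypothetical.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance, the quantity behind the `<->` operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a: list[float], b: list[float]) -> float:
    """Negated dot product (`<#>`), so that smaller still means closer."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a: list[float], b: list[float]) -> float:
    """One minus cosine similarity, the quantity behind `<=>`."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy stand-in for "ORDER BY embedding <=> query LIMIT 1".
query = [1.0, 0.0]
rows = {"doc_a": [0.9, 0.1], "doc_b": [0.0, 1.0]}
nearest = min(rows, key=lambda name: cosine_distance(query, rows[name]))
```

The inner product is negated because an ascending `ORDER BY` should always return the closest rows first, whichever operator is used.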

ChromaDB

Chroma

The easiest vector DB to start with. pip install, run in-process, perfect for prototypes.

Embedded / Server

Embeddable + standalone modes
Apache-2.0
Best DX for learning

Apache 2.0

Mem0

Mem0

Memory layer for AI agents. Personalized agent recall across sessions — open-source.

Long-term agent memory
Vector + graph hybrid
OpenAI / Anthropic / Bedrock support

Apache 2.0

Want to build with these?

Learn to ship production-grade GenAI with Cloudadhar

Our AWS + Agentic AI batch covers Bedrock, LangChain, vector DBs, RAG, evaluation, guardrails — and the operational reality of running LLMs in production.


⚖️ All model names and logos belong to their respective owners. Cloudadhar is an independent educational resource and is not affiliated with or endorsed by any model vendor. Links go to each vendor's official surface — we do not proxy or rehost their models.