
Pricing, plans, and API prices


AI CLIs can be paid for in two ways: a fixed subscription with usage included up to a limit, or an API key where you pay per token, with the price depending on model choice and tool calls.

| Need | Typical choice |
| --- | --- |
| You want to learn the tool | Start with a free tier or small personal subscription. |
| You use the CLI several times a week | Choose a plan with included CLI usage. |
| You want automation, CI, or scripts | Use an API key with spend limits. |
| You work in a team | Use a team or business plan with central billing, SSO, and admin controls. |
| You work in large repositories | Set budgets, use cheaper models for routine work, and keep context small. |

| Driver | Why it matters | Practical control |
| --- | --- | --- |
| Repository size | The agent may read many files and send more context. | Start in the specific app folder, not the whole workspace. |
| Instruction files | Long AGENTS.md, CLAUDE.md, and GEMINI.md files are often sent as context. | Keep root files short and put details close to the subproject. |
| MCP and tools | Each external tool can add descriptions, output, and extra calls. | Enable MCP servers only when the task needs them. |
| Model choice | Stronger models cost more and often use more reasoning. | Use mini/Haiku/Flash for routine work and stronger models for hard tasks. |
| Output length | Long reports and generated code cost output tokens. | Ask for concrete findings, patches, and short summaries. |
| Automation | Loops, cron, and CI can spend without you watching. | Use separate API keys, budgets, rate limits, and logs. |

Simple API rule:

price = input tokens + cached input tokens + output tokens + tool calls

For CLI subscriptions, the practical rule is:

usage = number of tasks * task size * model choice * extra tools/subagents
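The API rule above can be turned into a small estimator. This is a generic sketch: the token counts and the flat per-tool-call fee are placeholders for illustration, not any provider's actual billing scheme.

```python
def api_cost(input_tokens, cached_tokens, output_tokens, tool_calls,
             rate_in, rate_cached, rate_out, rate_tool=0.0):
    """Estimate API spend. Token rates are per 1M tokens; tool calls
    are billed at a flat per-call fee if the provider charges one."""
    token_cost = (input_tokens * rate_in
                  + cached_tokens * rate_cached
                  + output_tokens * rate_out) / 1e6
    return token_cost + tool_calls * rate_tool

# Hypothetical task: 40k fresh input, 10k cached input, 5k output,
# 3 tool calls, at placeholder rates of $2.50 / $0.25 / $15.00 per 1M.
print(f"${api_cost(40_000, 10_000, 5_000, 3, 2.50, 0.25, 15.00):.4f}")
```

The point of writing it out is that cached input is an order of magnitude cheaper than fresh input, so repeated context dominates the bill far less than output does.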

Codex can use ChatGPT login or an OpenAI API key. ChatGPT plans include Codex access across web, CLI, IDE, and sometimes cloud features. API key usage is best for automation and CI, but cloud features such as GitHub code review and Slack are not included.

| Plan | Price | Codex relevance |
| --- | --- | --- |
| Free | $0/mo. | Quick test tasks and learning. |
| Go | $8/mo. | Lightweight coding tasks. |
| Plus | $20/mo. | Codex in web, CLI, IDE, and iOS, newer models, and credits. |
| Pro | From $100/mo. | 5x or 20x higher limits than Plus and priority processing. |
| Business | Pay as you go | Team workspace, admin controls, larger VMs, and credits. |
| Enterprise/Edu | Contact sales | Enterprise controls, audit logs, retention, and data residency. |
| API key | Token-based | Local Codex tasks in CLI/SDK/IDE, but no cloud code review. |

| Model | Input / 1M tokens | Cached input / 1M tokens | Output / 1M tokens |
| --- | --- | --- | --- |
| GPT-5.4 | $2.50 | $0.25 | $15.00 |
| GPT-5.4 mini | $0.75 | $0.075 | $4.50 |
| GPT-5.4 nano | $0.20 | $0.02 | $1.25 |

OpenAI states that these standard prices apply to context lengths under 270K. Data residency and regional processing can add a surcharge for models released after March 5, 2026.
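To make the rates concrete, here is a sketch pricing the same task on each tier, using the per-million prices listed above. The task sizes are invented for illustration.

```python
# Per-1M-token prices from the table above (standard context lengths).
PRICES = {
    "GPT-5.4":      {"in": 2.50, "cached": 0.25,  "out": 15.00},
    "GPT-5.4 mini": {"in": 0.75, "cached": 0.075, "out": 4.50},
    "GPT-5.4 nano": {"in": 0.20, "cached": 0.02,  "out": 1.25},
}

def task_cost(model, input_tokens, cached_tokens, output_tokens):
    p = PRICES[model]
    return (input_tokens * p["in"] + cached_tokens * p["cached"]
            + output_tokens * p["out"]) / 1e6

# Invented medium task: 80k fresh input, 20k cached, 8k output.
for model in PRICES:
    print(f"{model}: ${task_cost(model, 80_000, 20_000, 8_000):.4f}")
```

The same task runs at roughly a third of the cost on mini and around a twelfth on nano, which is why routing routine work to the smaller tiers matters.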

Codex credits let you continue after included limits are reached. For new and existing Business customers and new Enterprise customers, OpenAI published this token-based credit schedule:

| Model | Input / 1M tokens | Cached input / 1M tokens | Output / 1M tokens |
| --- | --- | --- | --- |
| GPT-5.4 | 62.50 credits | 6.25 credits | 375 credits |
| GPT-5.4-mini | 18.75 credits | 1.875 credits | 113 credits |
| GPT-5.3-Codex | 43.75 credits | 4.375 credits | 350 credits |
| GPT-5.2 | 43.75 credits | 4.375 credits | 350 credits |

Fast mode consumes 2x credits. GPT-5.3-Codex-Spark was a research preview for ChatGPT Pro and was not available in the API at launch.
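Credit burn can be estimated the same way as token cost, with fast mode applied as a 2x multiplier. The rates come from the credit schedule above; the token counts are made up.

```python
# Credits per 1M tokens, from the credit schedule above.
CREDIT_RATES = {
    "GPT-5.4":       {"in": 62.50, "cached": 6.25,  "out": 375},
    "GPT-5.3-Codex": {"in": 43.75, "cached": 4.375, "out": 350},
}

def credits_used(model, input_tokens, cached_tokens, output_tokens,
                 fast=False):
    r = CREDIT_RATES[model]
    total = (input_tokens * r["in"] + cached_tokens * r["cached"]
             + output_tokens * r["out"]) / 1e6
    return total * (2 if fast else 1)  # fast mode bills 2x credits

# Invented task: 100k input, 10k output on GPT-5.4.
print(credits_used("GPT-5.4", 100_000, 0, 10_000))             # 10.0 credits
print(credits_used("GPT-5.4", 100_000, 0, 10_000, fast=True))  # 20.0 credits
```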

OpenAI gives limits as ranges because they depend on task size and model. As a practical rule, large repositories, long prompts, and subagents use the allowance faster.

| Plan | Example local messages per 5 hours |
| --- | --- |
| Plus | GPT-5.4: 20-100, GPT-5.4-mini: 60-350, GPT-5.3-Codex: 30-150 |
| Pro 5x | GPT-5.4: 200-1000, GPT-5.4-mini: 600-3500, GPT-5.3-Codex: 300-1500 |
| Pro 20x | GPT-5.4: 400-2000, GPT-5.4-mini: 1200-7000, GPT-5.3-Codex: 600-3000 |

Cloud tasks and local messages can share the same five-hour window, and weekly limits can also apply.

Claude Code can be used through Claude subscriptions or API token consumption. Claude Code shows token usage with /cost for API users. Pro and Max subscribers have usage included in their plan and should use /stats to understand usage patterns.

| Plan | Price | Claude Code relevance |
| --- | --- | --- |
| Free | $0 | Try Claude in the app. |
| Pro | $17/mo. annual or $20/mo. monthly | More usage, Claude Code, Claude Cowork, and Projects. |
| Max | From $100/mo. | 5x or 20x more usage than Pro and higher output limits. |
| Team standard | $20/seat/mo. annual or $25/mo. | Team features, Claude Code, SSO, and admin. |
| Team premium | $100/seat/mo. annual or $125/mo. | 5x more usage than standard seats. |
| Enterprise | $20/seat + API-rate usage | Enterprise controls, spend controls, audit logs, and retention. |
| API | Token-based | Pay by model, input, output, cache, and tools. |

Anthropic describes typical enterprise Claude Code costs as around $13 per developer per active day and $150-250 per developer per month, with large variation based on model, repository size, and usage pattern.

| Model | Input / 1M tokens | Output / 1M tokens | Cache write / 1M | Cache read / 1M |
| --- | --- | --- | --- | --- |
| Opus 4.6 | $5 | $25 | $6.25 | $0.50 |
| Sonnet 4.6 | $3 | $15 | $3.75 | $0.30 |
| Haiku 4.5 | $1 | $5 | $1.25 | $0.10 |

US-only inference costs 1.1x input and output prices. Claude web search costs $10 per 1,000 searches. Code execution includes 50 free hours daily per organization, then $0.05 per hour per container.
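Prompt caching changes the arithmetic when the same large context is resent every turn. A sketch, using the Sonnet 4.6 rates from the table above and an invented 50k-token context over 10 turns (and assuming the cache stays warm between turns):

```python
def repeated_context_cost(tokens, turns, rate_in, rate_write, rate_read):
    """Compare resending a fixed context every turn against caching it
    once (cache write) and reading it on the remaining turns."""
    uncached = turns * tokens * rate_in / 1e6
    cached = (tokens * rate_write + (turns - 1) * tokens * rate_read) / 1e6
    return uncached, cached

# Sonnet 4.6 rates from the table above; invented 50k-token context,
# resent over 10 turns.
uncached, cached = repeated_context_cost(50_000, 10, 3.00, 3.75, 0.30)
print(f"uncached ${uncached:.4f} vs cached ${cached:.4f}")
```

Even though a cache write costs more than plain input, the cheap cache reads on later turns bring the total well below resending the context every time.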

Gemini CLI, Google AI plans, and Gemini API


Gemini CLI has several auth paths: Google login, Gemini API key, and Vertex AI. The right pricing model depends on the login method.

| Auth or plan | Price or model | CLI quota in Gemini CLI docs |
| --- | --- | --- |
| Google account, Gemini Code Assist individual | Free | 1,000 requests per user per day |
| Google AI Pro | $19.99/mo. | 1,500 requests per user per day |
| Google AI Ultra | $249.99/mo. | 2,000 requests per user per day |
| Gemini API key, unpaid | Free tier | 250 requests per user per day, Flash only |
| Gemini API key, paid | Pay as you go | Varies by pricing tier and token usage |
| Vertex AI | Pay as you go | Varies by Google Cloud quota and model |
| Workspace Code Assist Standard | Seat/license | 1,500 requests per user per day |
| Workspace Code Assist Enterprise | Seat/license | 2,000 requests per user per day |

Google AI Plus exists as a consumer subscription, but the Gemini CLI documentation states that tiers not listed above, including Google AI Plus, are not supported as paid CLI tiers.

| Model and tier | Input / 1M tokens | Output / 1M tokens |
| --- | --- | --- |
| Gemini 3.1 Pro Preview, Standard, under 200k prompt | $2.00 | $12.00 |
| Gemini 3.1 Pro Preview, Standard, over 200k prompt | $4.00 | $18.00 |
| Gemini 3.1 Pro Preview, Batch/Flex, under 200k prompt | $1.00 | $6.00 |
| Gemini 3.1 Flash-Lite Preview, Standard text/image/video | $0.25 | $1.50 |
| Gemini 3.1 Flash-Lite Preview, Batch/Flex text/image/video | $0.125 | $0.75 |
| Gemini 3 Flash Preview, Standard text/image/video | $0.50 | $3.00 |
| Gemini 3 Flash Preview, Batch/Flex text/image/video | $0.25 | $1.50 |

Grounding with Google Search for Gemini 3 had 5,000 prompts per month included on the paid tier, then $14 per 1,000 search queries. Batch API can reduce token prices, but works best for tasks that can wait.
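The tiered Pro prices make prompt length a cost cliff. A sketch of the Standard-tier rates, assuming (as the table suggests) that a prompt over 200k tokens moves the whole request, output included, to the higher rates; check Google's pricing docs for the exact tier semantics:

```python
def gemini_pro_cost(prompt_tokens, output_tokens):
    """Gemini 3.1 Pro Preview, Standard tier. Assumes the whole request
    is billed at the higher rates once the prompt exceeds 200k tokens."""
    if prompt_tokens < 200_000:
        rate_in, rate_out = 2.00, 12.00   # under-200k rates
    else:
        rate_in, rate_out = 4.00, 18.00   # over-200k rates
    return (prompt_tokens * rate_in + output_tokens * rate_out) / 1e6

print(gemini_pro_cost(150_000, 5_000))  # under the threshold
print(gemini_pro_cost(250_000, 5_000))  # over: output is billed higher too
```

Keeping the working context under the threshold, or routing the long-context work through Batch/Flex, avoids paying the doubled input rate.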

| Profile | Starting budget | Why |
| --- | --- | --- |
| New user | Free tier or one personal plan for a month | You learn workflows before wiring in API keys. |
| Solo developer | Codex Plus/Pro, Claude Pro/Max, or Google AI Pro by ecosystem | Fixed monthly price is easier than token accounting. |
| Automation | API key with a low spend limit | Scripts can otherwise spend heavily without you seeing it. |
| Team | Business/Team/Enterprise | Central billing, SSO, audit, retention, and spend controls matter more than the lowest token price. |
| Large repository | Subscription plus API fallback | Use the subscription for manual development and API keys for measured automation. |

The English page shows USD because that is how most official pricing pages publish base prices. Treat these as pre-tax references and check checkout for:

  • VAT or local taxes
  • currency conversion
  • local price changes
  • enterprise discounts
  • promotions and temporary usage boosts

Start with a subscription

For learning and manual use, fixed plans are easier to reason about than API usage.

Set spend limits

API keys should always have budgets, rate limits, and separate keys for test, CI, and production.
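A client-side backstop for scripts can be as simple as the sketch below. It is only a local guard for estimated costs, not a substitute for the provider-side budgets and rate limits recommended here.

```python
class BudgetGuard:
    """Client-side spend backstop for scripts: refuse the next call
    once the budget would be exceeded. Complements, not replaces,
    provider-side hard limits."""

    def __init__(self, limit_usd):
        self.limit = limit_usd
        self.spent = 0.0

    def charge(self, cost_usd):
        if self.spent + cost_usd > self.limit:
            raise RuntimeError(
                f"budget exceeded: ${self.spent + cost_usd:.2f} "
                f"> ${self.limit:.2f}")
        self.spent += cost_usd

guard = BudgetGuard(limit_usd=5.00)
guard.charge(1.25)  # record each call's estimated cost before running it
print(f"${guard.spent:.2f} of ${guard.limit:.2f} used")
```

Using a separate guard (and a separate key) per environment keeps a runaway CI job from draining the budget meant for production.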

Keep context small

Large instruction files, many MCP servers, and broad prompts consume more tokens.

Use a cheaper model first

Use mini, Haiku, or Flash for routine tasks and save the strongest models for hard changes.

Last checked: April 11, 2026.

