Choosing a Model | Chipp Docs

Chipp supports multiple AI models from OpenAI, Anthropic, Google, and more. Each has different strengths, speeds, and costs. This guide helps you choose the right one.

Quick Recommendations

Use Case	Recommended Model
General purpose	GPT-5.4, Claude Sonnet 4.6, or GPT-5.6 Terra
Image analysis	Gemini 3.1 Pro, GPT-5.4, or Claude Sonnet 4.6
Complex reasoning / frontier work	o3, Claude Opus 4.8, GPT-5.5 Pro, or GPT-5.6 Sol
Fast responses	Gemini 3.5 Flash, GPT-5 Mini, or Claude Haiku 4.5
Long documents	Gemini 3.1 Pro or GPT-4.1 (1M token context)
Cost-sensitive / high-volume	Gemini 3.5 Flash, GPT-5 Nano, or GPT-5.6 Luna

Estimate Your Costs

Use our calculator to estimate monthly AI costs based on your expected usage:

Pricing Calculator (interactive widget): choose a model, then enter messages per month, average input tokens, and average output tokens to see the input cost, output cost, and total.

With Chipp Pro ($29/mo): base price $29.00/mo, included usage $10.00.
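The arithmetic behind an estimate like this can be sketched in a few lines of Python. This is a minimal illustration, not Chipp's calculator: the function and class names are hypothetical, the prices are the GPT-5.6 figures quoted later in this guide, and reading each "$X/$Y" pair as input price then output price is an assumption. Plan base price and included usage are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """USD price per one million tokens."""
    input_per_million: float
    output_per_million: float


# Prices quoted in this guide; the input/output order is an assumption.
PRICES = {
    "GPT-5.6 Sol": ModelPrice(5.00, 30.00),
    "GPT-5.6 Terra": ModelPrice(2.00, 12.00),
    "GPT-5.6 Luna": ModelPrice(0.20, 1.20),
}


def estimate_monthly_cost(model, messages, avg_input_tokens, avg_output_tokens):
    """Return input, output, and total usage cost in USD for one month."""
    price = PRICES[model]
    input_cost = messages * avg_input_tokens / 1_000_000 * price.input_per_million
    output_cost = messages * avg_output_tokens / 1_000_000 * price.output_per_million
    return {
        "input": round(input_cost, 2),
        "output": round(output_cost, 2),
        "total": round(input_cost + output_cost, 2),
    }


# 10,000 messages per month, 500 input and 200 output tokens per message.
estimate = estimate_monthly_cost("GPT-5.6 Terra", 10_000, 500, 200)
print(estimate)
```

Output tokens are usually priced several times higher than input tokens, so average reply length tends to matter more than prompt length when comparing models.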

Featured Models

Best for General Purpose

These models excel at a wide range of tasks including writing, coding, analysis, and conversation.

GPT-5.4

Claude Sonnet 4.6

Gemini 3.1 Pro

GPT-5.6 Terra

Best for Speed

When response time matters most, these models deliver near-instant results without sacrificing quality.

Gemini 3.5 Flash

GPT-5 Mini

Claude Haiku 4.5

Best for Reasoning

For complex problem-solving, multi-step analysis, and tasks requiring deep thinking.

Claude Opus 4.6

o3 Pro

GPT-5.6 Sol

Best Value for Long Documents

Process entire codebases, legal contracts, or book-length content with massive context windows.

Gemini 3.1 Pro

GPT-4.1

Model Deep Dives

OpenAI GPT-5.4

GPT-5.4 (OpenAI): General purpose, Coding, Analysis

OpenAI GPT-5.6 Sol

GPT-5.6 Sol (OpenAI): Frontier reasoning, Coding, Agentic work

GPT-5.6 Sol is OpenAI’s newest frontier model (also selectable via the gpt-5.6 alias). Priced at $5/$30 per million tokens with a 1M-token context window, it targets the same complex-reasoning and coding use cases as GPT-5.5 Pro and Claude Opus 4.8. See Model Benchmarks for official specs.

OpenAI GPT-5.6 Terra

GPT-5.6 Terra (OpenAI): Balanced, General purpose, Cost-effective

GPT-5.6 Terra balances intelligence and cost at $2/$12 per million tokens, making it a general-purpose option priced between GPT-5.4 and GPT-5.6 Sol.

OpenAI GPT-5.6 Luna

GPT-5.6 Luna (OpenAI): Ultra-fast, High-volume, Cost-sensitive

GPT-5.6 Luna is OpenAI’s most cost-efficient GPT-5.6 tier at $0.20/$1.20 per million tokens, aimed at high-volume, cost-sensitive workloads.

Claude Sonnet 4.6

Claude Sonnet 4.6 (Anthropic): Writing, Analysis, Balanced

Gemini 3.1 Pro

Gemini 3.1 Pro (Google): Long context, Multimodal, Reasoning

OpenAI o3

o3 (OpenAI): Deep reasoning, Math, Complex problems

Claude Haiku 4.5

Claude Haiku 4.5 (Anthropic): Fast, Affordable, Reliable

Gemini 3.5 Flash

Gemini 3.5 Flash (Google): Fast, Affordable, Good quality

Cost-Effective Options

If you’re optimizing for cost, these models offer excellent value:

Gemini 3.5 Flash
GPT-5 Nano
Claude Haiku 4.5
GPT-5 Mini
GPT-5.6 Luna

Key Considerations

Vision Support

If your app analyzes images, choose a model with native vision support. Models without vision use a fallback that may be less accurate.

Vision-capable models:

All GPT-4.1 and GPT-5.x variants, including the GPT-5.6 family (not o-series reasoning models)
All Claude models
All Gemini models

No vision support:

OpenAI o-series (o1, o3, o4-mini, etc.)
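The two lists above can be expressed as a small lookup. The helper below is a hypothetical sketch that only encodes the rules stated in this section by matching on display names; it is not a Chipp API, and it returns None for any model the section does not cover.

```python
import re
from typing import Optional


def has_native_vision(model: str) -> Optional[bool]:
    """Apply this section's vision rules to a model display name."""
    name = model.strip().lower()
    # OpenAI o-series (o1, o3, o4-mini, ...): no native vision per this guide.
    if re.match(r"o\d", name):
        return False
    # GPT-4.1, GPT-5.x, Claude, and Gemini families: native vision per this guide.
    if name.startswith(("gpt-4.1", "gpt-5", "claude", "gemini")):
        return True
    # Anything else is not covered by the lists above.
    return None


for candidate in ["GPT-5.6 Luna", "Claude Haiku 4.5", "o4-mini", "Mystery Model"]:
    print(candidate, has_native_vision(candidate))
```

Returning None rather than False for unlisted models keeps "no vision" separate from "not covered here", which matters because models without vision fall back to a less accurate path.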

Response Speed

Speed matters for user experience. Faster models keep conversations flowing naturally.

Fastest: GPT-5 Nano, Claude Haiku 4.5, Gemini 3.5 Flash, GPT-5.6 Luna
Medium: GPT-5.4, Claude Sonnet 4.6, GPT-5.6 Terra
Slower: Claude Opus 4.8, o1, o3, GPT-5.6 Sol (reasoning takes time)

Context Window

For processing long documents, choose models with large context windows:

1M+ tokens: GPT-4.1 variants, GPT-5.x variants (including GPT-5.6 Sol/Terra/Luna at 1.05M), Claude Opus 4.8, Claude Sonnet 4.6, all Gemini models
200k tokens: Claude Haiku 4.5, OpenAI o-series

Note: Chipp enforces a 128K runtime ceiling on the entire GPT-5.x family (including GPT-5.6), regardless of the published context window, to stay within the practical limits of OpenAI’s Responses API.
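In practice, the usable context for a GPT-5.x model is the smaller of its published window and the runtime ceiling. The sketch below illustrates that rule only. The function name is hypothetical, "128K" is read as 128,000 tokens (an assumption), and the Gemini figure is an illustrative value for a "1M+" model rather than an official spec.

```python
# The guide's "128K" ceiling, read here as 128,000 tokens (assumption).
GPT5_RUNTIME_CEILING = 128_000


def effective_context_tokens(model: str, published_tokens: int) -> int:
    """Return the usable context window under the ceiling described above."""
    if model.startswith("GPT-5"):
        # The ceiling applies to the whole GPT-5.x family, including GPT-5.6.
        return min(published_tokens, GPT5_RUNTIME_CEILING)
    # Other families keep their published window in this sketch.
    return published_tokens


print(effective_context_tokens("GPT-5.6 Sol", 1_050_000))
print(effective_context_tokens("Gemini 3.1 Pro", 1_000_000))
```

For long-document workloads this means a GPT-5.x model's published 1M+ window is not what your app actually gets, so Gemini or GPT-4.1 may be the better fit.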

Reasoning Quality

For complex tasks requiring multi-step reasoning:

Best reasoning: o3 Pro, Claude Opus 4.8, GPT-5.5 Pro, GPT-5.6 Sol
Very good: GPT-5.4, Claude Sonnet 4.6, o3, o4-mini, GPT-5.6 Terra
Good: GPT-5 Mini, Claude Haiku 4.5, Gemini 3.1 Pro, GPT-5.6 Luna

HIPAA-Compliant Models

Building a healthcare app? Chipp holds BAAs with Google (Gemini models) and Anthropic (Claude models, excluding Claude Fable). Choose one of these BAA-covered models for your app’s chat. Enabling HIPAA Compliance in Launch > Access enforces authentication and session controls, but it does not restrict which chat model you can pick, so confirm that the model you select is covered. See HIPAA-Compliant Models for the full breakdown and important caveats.

Changing Your Model

Go to your app in the Chipp dashboard
Navigate to Build > Configure
Under Model, select your preferred model
Click Save

Changes take effect immediately for new conversations.

Testing Different Models

Not sure which model works best? Try these approaches:

A/B testing: Create two versions of your app with different models and compare user feedback
Specific prompts: Test your most common use cases with different models to see quality differences
Speed vs. quality: Start with a fast model, then upgrade if users need better responses

Using with MCP

You can manage AI models programmatically using the Chipp MCP Server, including AI-powered model recommendations.

Available Tools

Tool	Description
`get_model`	Get current model for an app
`update_model`	Change the AI model for an app
`list_available_models`	List all available AI models
`recommend_model`	Get AI-powered model recommendations

Example Prompts

"What model is my FAQ bot using?"
"Switch my support bot to GPT-5.4"
"Which model would you recommend for my customer service app?"
"List all available models with their pricing"

Workflow: AI-Powered Model Selection

You: Recommend a model for my customer support bot that needs to be fast
and handle simple FAQ questions

AI: [Calls recommend_model with priority="speed"]

Based on your use case, I recommend:

1. **Gemini 3.5 Flash** (Best choice)
   - Very fast responses (~1.5s)
   - Cost: $0.10/M input, $0.39/M output
   - Great for FAQ and simple queries

2. **GPT-5 Nano** (Budget option)
   - Fastest response time
   - Cost: $0.26/M input, $1.63/M output
   - Good for high-volume, simple tasks

Want me to update your app to use Gemini 3.5 Flash?

You: Yes, update it

AI: [Calls update_model]
Updated FAQ Bot to use Gemini 3.5 Flash. Changes take effect
for new conversations.
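Under the hood, an MCP client turns each bracketed tool call in the transcript above into a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request by hand to show its shape. The `tool_call` helper is hypothetical, the `priority` argument is taken from the transcript, and the full argument schema for Chipp's tools is not documented here, so treat the argument names as assumptions. Most MCP clients send these requests for you.

```python
import itertools
import json
from typing import Optional

# JSON-RPC requests need a unique id per request; a simple counter is enough here.
_request_ids = itertools.count(1)


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP tools/call request body for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Tool names come from the Available Tools table; arguments are assumptions.
list_request = tool_call("list_available_models")
recommend_request = tool_call("recommend_model", {"priority": "speed"})

print(json.dumps(recommend_request, indent=2))
```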

Note: MCP server access requires a Builder plan or higher. See the MCP Setup Guide to get started.