Image Recognition

How to get the best results when analyzing images with your Chipp app

Hunter Hodnett, CPTO at Chipp
4 min read

Your Chipp app can analyze images uploaded by users. This guide explains how image recognition works and how to get the best results.

How It Works

When a user uploads an image, Chipp processes it in one of two ways depending on your app's model:

Models with native vision capabilities see images directly, just like a human would. The image is embedded in the conversation and the model can reference it naturally.
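
Chipp handles this for you, but for intuition, here is a minimal sketch of what "embedded in the conversation" looks like at the API level, using the OpenAI Python SDK. The model name, file name, and prompt are illustrative examples, not Chipp's actual implementation:

```python
import base64
from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

client = OpenAI()

# Encode the uploaded file and attach it directly to the user's message.
with open("sign.jpg", "rb") as f:  # example file name
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-5",  # illustrative; any vision-capable model works the same way
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What text appears on this sign?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```

Because the model receives the image itself, it can answer follow-up questions about details the user never described in text.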

Vision-capable models:

| Provider  | Models |
| --------- | ------ |
| OpenAI    | GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano |
| Anthropic | Claude Opus 4.1, Claude Opus 4, Claude Sonnet 4.5, Claude Sonnet 4, Claude 3.7 Sonnet, Claude 3.5 Haiku |
| Google    | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash Lite, Gemini 2.0 Flash, Gemini 2.0 Flash Lite |

Models WITHOUT vision:

  • OpenAI o-series (o1, o1 Pro, o3, o3 Pro, o3 Mini, o4 Mini)

Non-Vision Models (Fallback)

For models without native vision (like the o-series reasoning models), Chipp uses a separate image analysis tool powered by OpenAI's GPT-5 to describe the image, then passes that description to your app's model.

This two-step process can lose visual detail compared to native vision, because your app's model only sees the written description rather than the image itself.
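
The same pattern, sketched in code: a vision model first turns the image into text, and only that text reaches your app's model. This is a simplified illustration using the OpenAI Python SDK with example model names, not Chipp's internal code:

```python
from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

client = OpenAI()

def describe_image(image_data_url: str) -> str:
    """Step 1: a vision-capable model turns the image into a text description."""
    resp = client.chat.completions.create(
        model="gpt-5",  # example vision model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail."},
                {"type": "image_url", "image_url": {"url": image_data_url}},
            ],
        }],
    )
    return resp.choices[0].message.content

def answer_about_image(question: str, image_data_url: str) -> str:
    """Step 2: the non-vision model answers using only the description."""
    description = describe_image(image_data_url)
    resp = client.chat.completions.create(
        model="o3-mini",  # example non-vision reasoning model
        messages=[
            {"role": "system",
             "content": f"The user uploaded an image. Description: {description}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content
```

Anything the description leaves out, such as small text, spatial layout, or subtle colors, is invisible to the answering model, which is why native vision usually gives better results.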

Getting the Best Results

1. Choose the Right Model

For apps that heavily rely on image analysis, select a vision-capable model:

  1. Go to your app in the Chipp dashboard
  2. Navigate to Build > Configure
  3. Under Model, select a vision-capable model like GPT-5 or Claude Sonnet 4

2. Enable Image Recognition

Make sure the capability is turned on:

  1. Go to Build > Actions
  2. Find Image Recognition in the Basic Actions list
  3. Toggle it ON

3. Image Quality Tips

For best recognition accuracy:

  • Resolution: Higher resolution images produce better results
  • Lighting: Well-lit images are easier to analyze
  • Focus: Ensure the subject is in focus and not blurry
  • Format: JPEG, PNG, and WebP are all supported

4. Prompting for Analysis

Guide users to be specific about what they want analyzed:

Good prompts:

  • "What math equations are shown in this image?"
  • "Identify all the plants in this photo"
  • "What text appears on this sign?"

Vague prompts (less effective):

  • "What is this?"
  • "Tell me about the image"

Comparing to ChatGPT

If you notice differences between your Chipp app and ChatGPT's native image analysis, consider:

  1. Model selection: ChatGPT uses GPT-5 by default. Ensure your Chipp app also uses GPT-5 or a comparable vision model like Claude Sonnet 4.

  2. System prompt: Your app's personality and instructions may influence how it interprets images. ChatGPT has different default behaviors.

  3. Context window: ChatGPT maintains conversation context differently. For complex multi-image analysis, results may vary.

Troubleshooting

Images Not Being Analyzed

  • Verify Image Recognition is enabled in your app's Actions
  • Check that the user is uploading supported formats (JPEG, PNG, WebP, GIF)
  • Ensure images come from a supported source (direct uploads work; not all external URLs do)

Poor Accuracy

  • Switch to a vision-capable model (GPT-5, Claude Sonnet 4, etc.)
  • Ask users to provide clearer, higher-resolution images
  • Add specific instructions in your system prompt about how to analyze images

Slow Response Times

Image analysis typically takes 5-15 seconds depending on image complexity. For faster responses:

  • Use smaller image files when possible (see the resizing sketch after this list)
  • Consider GPT-5 Nano or Claude 3.5 Haiku if you can trade some depth of analysis for speed
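
If you control the uploads, for example through an integration that sends images on a user's behalf, downscaling large photos before sending them is an easy win. Here is a minimal sketch using the Pillow imaging library; the 2048 px cap and JPEG quality setting are arbitrary examples, not Chipp requirements:

```python
from io import BytesIO
from PIL import Image  # assumes the Pillow library is installed

def shrink_for_upload(path: str, max_side: int = 2048) -> bytes:
    """Downscale an image so its longest side is at most max_side pixels,
    then re-encode it as JPEG to reduce upload size."""
    img = Image.open(path)
    img.thumbnail((max_side, max_side))  # preserves aspect ratio, never enlarges
    buf = BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=85)
    return buf.getvalue()
```

Smaller files upload and process faster, and a roughly 2000 px longest side typically keeps enough detail for text and object recognition.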

Model Comparison for Vision

| Model | Vision Quality | Speed | Best For |
| ----- | -------------- | ----- | -------- |
| GPT-5 | Excellent | Medium | Complex reasoning about images |
| Claude Sonnet 4.5 | Excellent | Fast | Detailed visual descriptions |
| Gemini 2.5 Pro | Excellent | Medium | Multi-image comparisons, long context |
| GPT-5 Mini | Very Good | Fast | Balanced performance |
| GPT-5 Nano | Good | Very Fast | Quick, simple analysis |
| Claude 3.5 Haiku | Good | Very Fast | Fast responses |
| Gemini 2.5 Flash | Very Good | Fast | Balanced speed and quality |