Creative Genius
Lesson 2 of 2 · 14 min read

Model Routing

Don't use GPT-4o for tasks GPT-4o-mini can do. Don't use o1 for tasks GPT-4o can do.

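Model routing means picking the cheapest model that still meets the quality bar for each request type. A simple two-tier router, which sends routine requests to a small model and everything else to a larger one, can cut costs by 60% or more with no quality loss.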
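A two-tier router can be sketched in a few lines. This is a minimal illustration, not a production design. The model names come from the lesson, but the request-type labels, the `route` and `blended_savings` helpers, and the relative prices in the usage note are assumptions made up for the example. Real pricing and the right split of request types depend on your own traffic and quality checks.

```python
# Two tiers, named after the models mentioned in the lesson.
CHEAP_MODEL = "gpt-4o-mini"
STRONG_MODEL = "gpt-4o"

# Hypothetical request types assumed to be handled well by the cheap tier.
# In practice this set should come from your own quality evaluation.
SIMPLE_REQUEST_TYPES = {"classification", "extraction", "short_summary"}


def route(request_type: str) -> str:
    """Return the cheapest model assumed to meet quality for this request type."""
    if request_type in SIMPLE_REQUEST_TYPES:
        return CHEAP_MODEL
    # Unknown or complex request types fall back to the stronger model.
    return STRONG_MODEL


def blended_savings(cheap_share: float, cheap_price: float, strong_price: float) -> float:
    """Fraction of cost saved versus sending every request to the strong model.

    cheap_share is the fraction of requests routed to the cheap tier (0 to 1).
    Prices are per request in any consistent unit; only their ratio matters.
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    if strong_price <= 0:
        raise ValueError("strong_price must be positive")
    routed_cost = cheap_share * cheap_price + (1.0 - cheap_share) * strong_price
    return 1.0 - routed_cost / strong_price
```

For example, `route("classification")` returns `"gpt-4o-mini"`, while `route("multi_step_reasoning")` falls through to `"gpt-4o"`. With made-up numbers where the cheap model costs one tenth as much and handles 70% of requests, `blended_savings(0.7, 1.0, 10.0)` comes out at about 0.63, which is how a simple split can reach savings in the 60% range.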
