Model routing = picking the cheapest model that meets quality for each request type. A simple two-tier router can cut costs 60%+ with no quality loss.
Lesson 2 of 2 · 14 min read
Model Routing
Don't use GPT-4o for tasks GPT-4o-mini can do. Don't use o1 for tasks GPT-4o can do.