Optimizing Selection
If you're building with AI, the model you select to power your product will have a major impact on both user experience and profitability. Selecting the right model for the right job is an important part of building a product that can scale.
As mentioned in the prior topic, your goal should be to maximize customer value while optimizing cost and consuming the fewest tokens necessary. Model selection is a key lever for optimizing cost.



The Car Analogy
LLM providers typically offer a variety of models tailored to different use cases. Models may differ in speed, cost, and capability. Highly capable reasoning models cost more than entry-level models used for simple tasks like chatbots.
This is similar to the portfolio of models in a car manufacturer's lineup. For example, a single automaker may have wide offerings such as sedans, pickup trucks, minivans, and sports cars.
Each car model has its strengths and weaknesses. An entry-level sedan may be incredibly fuel efficient and come at a low cost. Pickup trucks tend to consume more fuel, but they are great for towing. Minivans are great for large families.
When you purchase a car, you likely base your selection on a number of factors — price, intended use, fuel economy, safety rating. If you chose based solely on the newest, nicest model, you'd likely end up disappointed in your choice in the long run.
The same applies to LLM selection. If you take the time to select a model based on its capabilities, security, price, and intended use, you're more likely to select a model that optimizes for your customers while ensuring your product is financially viable and can scale.
Let's look at a real-world example.
Case Study: Anthropic's Lineup
Anthropic is a leading AI company. Similar to OpenAI and Google, Anthropic offers various model types, each with its own intended uses and benefits.
Let's compare two of Anthropic's models below based on cost, use case, and other technical aspects. Here, we're only comparing Haiku versus Sonnet — this is not an exhaustive list. All of this information can be found on Anthropic's model comparison page.
Claude Haiku
Claude Haiku is designed to be a fast and cost-effective model. It has lower evaluation scores than Claude Sonnet across the board. That said, it is fast and still scores well on general-knowledge tasks.
- Use Cases: Sales and service chatbots, data categorization, education
- Cost per 1M tokens (Mtok): $0.80 Input / $4.00 Output
- Context Window: 200k
- Latency: Very Fast
Claude Sonnet
Sonnet comes in both a 3.5 and a 3.7 variant. Both have the same price per Mtok, but 3.7 is a reasoning model capable of extended thinking. Sonnet outperforms Haiku in virtually every category except speed and price.
- Use Cases: Excels at coding and reasoning applications
- Cost per Mtok: $3.00 Input / $15.00 Output
- Context Window: 200k
- Latency: Fast, though slower than Haiku
Comparison
Both Claude Haiku and Claude Sonnet are sophisticated models capable of driving efficiencies and customer value, and both offer the same context window size. Haiku is much faster but better suited to general, simpler use cases; Sonnet is more expensive but a much more intelligent model for complex work.
Now let's take a look at how model selection impacts product profitability below.

Imagine that you've built a cool new product for lawyers. Your product has access to a large database of prior cases and regulations. Paying customers (lawyers) can type questions and get detailed answers to help them prepare for important trials.
Let's pretend you charge $10 per user per month. Each time a customer uses your product, it consumes 50k input tokens — your product digs through quite a bit of data to generate a response (still well within either model's 200k context window). It then generates an average of 300 output tokens as a response. Your customers use this feature an average of 75 times per month.
Both Haiku and Sonnet would work here! But Sonnet's token prices are 3.75x Haiku's.
Let's pretend your only costs are token expenses (don't bank on that). Each customer generates 75 × 50k = 3.75M input tokens and 75 × 300 = 22.5k output tokens per month. With Haiku, that's roughly $3.09 per customer in token costs, for a gross margin of about 69%; with Sonnet, it's roughly $11.59 per customer, a margin of about -16%. In that case you'd be losing money on every customer!
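To make this arithmetic easy to re-run with your own numbers, here's a minimal sketch of the margin math. The token counts and prices are the assumed figures from this example, not live Anthropic pricing:

```python
# Gross-margin sketch using the example's assumed figures
# (not live pricing): 75 calls/month, 50k input + 300 output
# tokens per call, $10/user/month subscription.

def monthly_token_cost(input_price_per_mtok, output_price_per_mtok,
                       input_tokens_per_call, output_tokens_per_call,
                       calls_per_month):
    """Token cost per customer per month, in dollars."""
    input_mtok = input_tokens_per_call * calls_per_month / 1_000_000
    output_mtok = output_tokens_per_call * calls_per_month / 1_000_000
    return (input_mtok * input_price_per_mtok
            + output_mtok * output_price_per_mtok)

PRICE_PER_USER = 10.00          # monthly subscription revenue
CALLS = 75                      # average uses per month
IN_TOKENS, OUT_TOKENS = 50_000, 300

haiku_cost = monthly_token_cost(0.80, 4.00, IN_TOKENS, OUT_TOKENS, CALLS)
sonnet_cost = monthly_token_cost(3.00, 15.00, IN_TOKENS, OUT_TOKENS, CALLS)

for name, cost in [("Haiku", haiku_cost), ("Sonnet", sonnet_cost)]:
    margin = (PRICE_PER_USER - cost) / PRICE_PER_USER
    print(f"{name}: ${cost:.2f}/user/month, gross margin {margin:.0%}")
```

Running this prints roughly $3.09/user/month (69% margin) for Haiku and $11.59/user/month (-16% margin) for Sonnet — small per-call token counts compound quickly at scale.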
That doesn't mean Haiku is right for your product, though! Your product may require more sophisticated reasoning that would require Sonnet. In this case you'd need to find a way to charge more or optimize your token usage to keep healthy margins while giving customers the value they need.
Wrapping Up
In prior topics we've discussed tokens and model cost drivers. Here you've seen how those drivers play out in a (sort of) real-life scenario.
Models come in various shapes and sizes, each with strengths and weaknesses. As you leverage AI in your product or tech stack, thoughtful model selection will lay the foundation for a scalable product.
Piton Ventures has deep expertise in AI economics and its overlap with product profitability. If you'd like to talk more about your vision for leveraging AI, get in touch with us. We can help you ensure you're making the right decisions upfront to invest responsibly and build a great and lasting product!
Let's build something great together.