The wrong question is “which AI to choose”. The right one is “who controls the inference”. There are three families of options, and they do not involve the same counterparties, the same contractual commitments, or the same eighteen-month regulatory exposure.
The proprietary API of an American player
You call an API. You pay per token. You don’t have to worry about infrastructure, model maintenance or updates.
In practice, you outsource the intelligence layer of your processes to an American player. You have no control over model changes, price changes, availability or usage policies. You have a contractual SLA, but if OpenAI changes its data training policy or its pricing, you either adapt or switch providers.
When it's suitable: rapid prototyping, non-critical use, non-sensitive data, no regulatory constraints on data location, and a preference for opex over capex.
The underestimated risk: supplier dependency. Migrating from one model to another is feasible, but costly. The prompt engineering developed for GPT-4 does not transfer as-is to Claude or Llama. There’s a real migration debt.
A vertical solution purchased from a third-party vendor
You buy a solution from a vendor who has built an AI application for your industry: CRM with integrated AI, legal generation tool, customer support assistant.
In practice, it’s almost always an API wrapper (GPT, Claude or other) with an application layer. The added value is in the interface, the system prompt, and possibly fine-tuning on sector-specific data.
When it’s right: no in-house technical resources, the solution precisely covers your use case, and supplier dependency is acceptable.
What we forget to look at: you have a double dependency, on the solution vendor and on the underlying model. If the vendor disappears, switches models or raises its prices, your ability to migrate is limited by the contracts you have already signed.
The open-weights model on your own infra
You deploy an open-weights model (Llama 3, Mistral, Mixtral, Qwen…) on your infrastructure. You control inference, data and versions.
What it requires: infrastructure capex (GPU/APU) or GPU cloud costs, plus in-house MLOps skills. Performance is inferior to proprietary frontier models on generic tasks, but often comparable on specialized tasks after fine-tuning.
When it's appropriate: sensitive data, regulatory constraints (GDPR, trade secrets, defense secrecy), inference volumes high enough to make the capex pay off, a desire for long-term control.
On cost: a server with 2 H100 GPUs, enough to run inference on a 70-billion-parameter model, costs 50,000 to 70,000 euros to purchase. Amortized over 3 years, that can come out lower than the equivalent API costs if the volume is sufficient.
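The break-even point behind that claim can be sketched in a few lines. This is a minimal illustration, not a pricing model: the purchase price and the 3-year amortization come from the text, while the yearly running cost and the API price per million tokens are hypothetical placeholders to replace with your own figures. It also ignores whether two GPUs can actually serve the resulting volume.

```python
from dataclasses import dataclass


@dataclass
class SelfHostAssumptions:
    server_cost_eur: float                 # from the text: 50,000 to 70,000
    amortization_years: float = 3.0        # from the text
    yearly_running_cost_eur: float = 0.0   # hypothetical: power, hosting, MLOps time


def monthly_self_host_cost(a: SelfHostAssumptions) -> float:
    """Purchase price spread over the amortization period, plus running costs."""
    months = a.amortization_years * 12
    return a.server_cost_eur / months + a.yearly_running_cost_eur / 12


def break_even_tokens_per_month(a: SelfHostAssumptions,
                                api_price_eur_per_million_tokens: float) -> float:
    """Monthly token volume above which self-hosting costs less than the API."""
    if api_price_eur_per_million_tokens <= 0:
        raise ValueError("API price must be positive")
    return monthly_self_host_cost(a) / api_price_eur_per_million_tokens * 1_000_000


# Example with a mid-range server and a purely illustrative API price.
assumptions = SelfHostAssumptions(server_cost_eur=60_000)
monthly = monthly_self_host_cost(assumptions)
threshold = break_even_tokens_per_month(assumptions, api_price_eur_per_million_tokens=5.0)
print(f"Monthly amortized cost: {monthly:,.0f} EUR")
print(f"Break-even volume: {threshold / 1e6:,.0f} million tokens per month")
```

Adding a realistic running cost moves the threshold up quickly, which is why the volume condition matters more than the purchase price alone.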
The missing variable: sovereignty
Most build/buy/API comparisons stop at costs. They forget the question of sovereignty.
If your competitive differentiation is based on your proprietary data, sending this data to a third-party API for fine-tuning (even with a contractual guarantee of non-reuse) creates a risk that you need to assess deliberately. Critical dependence on a US API also creates exposure to regulatory decisions (export controls, sector restrictions), outages and policy changes. This risk is low in probability but potentially high in impact.
GDPR, AI Act obligations, and sector-specific regulations (healthcare, finance) can impose constraints on data localization and processing that render certain API architectures non-compliant.
The right trade-off
The right trade-off is not an answer, it’s a grid: data sensitivity, inference volume, internal resources, dependency tolerance, 36-month regulatory exposure. No two companies score the same way on all five dimensions. What’s constant: going for the best-known API because that’s what your supplier is talking about, without having looked at the open-weights option on your own infra, means you haven’t made the decision. You have let someone else make it for you.
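One way to make the grid concrete is to write it down as a small data structure. The sketch below is only an illustration: the 1-to-5 scale, the `low` and `high` thresholds and the function name are hypothetical, and the rules simply restate the "when suitable" conditions given for each family above. It lists the options worth examining, it does not make the decision.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TradeOffGrid:
    """The five dimensions, each scored from 1 (low) to 5 (high). Scale is an assumption."""
    data_sensitivity: int
    inference_volume: int
    internal_resources: int
    dependency_tolerance: int
    regulatory_exposure: int

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            if not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be between 1 and 5, got {value}")


def candidate_options(g: TradeOffGrid, low: int = 2, high: int = 4) -> list[str]:
    """Return the families whose stated conditions are met (thresholds are hypothetical)."""
    options = []
    # Proprietary API: non-sensitive data, no regulatory constraints.
    if g.data_sensitivity <= low and g.regulatory_exposure <= low:
        options.append("proprietary API")
    # Vertical solution: no in-house resources, dependency is acceptable.
    if g.internal_resources <= low and g.dependency_tolerance >= high:
        options.append("vertical solution")
    # Open weights: sensitive data, regulatory constraints, or high volume.
    if (g.data_sensitivity >= high
            or g.regulatory_exposure >= high
            or g.inference_volume >= high):
        options.append("open-weights model on own infrastructure")
    return options


# Example: sensitive data, strong in-house team, low tolerance for dependency.
grid = TradeOffGrid(data_sensitivity=5, inference_volume=4, internal_resources=4,
                    dependency_tolerance=1, regulatory_exposure=5)
print(candidate_options(grid))
```

An empty result is itself informative: it means the profile sits between the families, and the trade-off has to be argued dimension by dimension.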