Pick the model by the result, not the reputation.
Every agent call is hard to place — capability, cost and latency trade off differently each time. Ainfera scores the candidates against the task and routes to the one most likely to finish it. Here's what goes into that, and the proof it leaves behind.
Four inputs decide whether a call finishes.
These are the signals every candidate is scored on. The inputs are no secret. How we weigh them is the part that improves with traffic, so the weights stay ours.
What the call is
A drafting call and a tool-use call don't want the same model. We read the shape of the request first.
What it costs
Live per-token price for each candidate, against the ceiling you set.
How fast it answers
Measured on rolling production traffic, not vendor-published numbers.
Whether it's healthy now
A provider that's erroring or rate-limiting this minute drops out, and comes back when it recovers.
Intelligence: Artificial Analysis · artificialanalysis.ai
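To make the four inputs concrete, here is a minimal sketch of how a candidate could be scored. It is an illustration, not Ainfera's scorer: the `Candidate` fields, the `WEIGHTS` values and the linear formula are all assumptions, since the real weights are not published.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    name: str
    task_fit: float        # 0..1, how well the model suits this call's shape (assumed scale)
    price_per_mtok: float  # live price, USD per million tokens
    latency_s: float       # latency measured on recent traffic, in seconds
    healthy: bool          # False while the provider is erroring or rate-limiting


# Hypothetical weights, chosen only for this example.
WEIGHTS = {"fit": 0.6, "cost": 0.2, "speed": 0.2}


def score(c: Candidate, cost_ceiling: float, latency_target_s: float) -> Optional[float]:
    """Return a score in 0..1, or None when the candidate is ineligible."""
    # Unhealthy providers and over-ceiling prices drop out before scoring.
    if not c.healthy or c.price_per_mtok > cost_ceiling:
        return None
    cost_score = 1.0 - c.price_per_mtok / cost_ceiling
    speed_score = max(0.0, 1.0 - c.latency_s / latency_target_s)
    return (WEIGHTS["fit"] * c.task_fit
            + WEIGHTS["cost"] * cost_score
            + WEIGHTS["speed"] * speed_score)


def pick(candidates: Sequence[Candidate], cost_ceiling: float,
         latency_target_s: float) -> Optional[Candidate]:
    """Return the highest-scoring eligible candidate, or None if none is eligible."""
    scored = [(score(c, cost_ceiling, latency_target_s), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s is not None]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]
```

Because health is checked on every call, a provider that recovers is scored again on the next request without any extra bookkeeping.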
We route to the model most likely to finish the task.
Not the biggest name, not a model you pinned six months ago and forgot. The pick is made per call and changes as price, speed and health change — so the cheapest model that still clears the bar is the one that runs.
Intelligence + speed: Artificial Analysis · artificialanalysis.ai
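One way to read "the cheapest model that still clears the bar" is as a two-step rule: filter by an estimated likelihood of finishing, then take the lowest price. The sketch below assumes a hypothetical `finish_likelihood` estimate and a hypothetical `bar` threshold; neither name comes from Ainfera's API.

```python
from typing import Optional


def cheapest_that_clears(candidates: list, bar: float) -> Optional[dict]:
    """Return the lowest-priced candidate whose estimate meets the bar, else None."""
    clears = [c for c in candidates if c["finish_likelihood"] >= bar]
    return min(clears, key=lambda c: c["price"], default=None)


# Illustrative numbers only.
candidates = [
    {"name": "big", "finish_likelihood": 0.97, "price": 12.0},
    {"name": "mid", "finish_likelihood": 0.93, "price": 3.0},
    {"name": "small", "finish_likelihood": 0.70, "price": 0.5},
]

first_pick = cheapest_that_clears(candidates, bar=0.9)

# The pick is made per call, so a price change can flip it.
candidates[1]["price"] = 15.0
second_pick = cheapest_that_clears(candidates, bar=0.9)
```

With these numbers the first call goes to `mid`. After its price rises above `big`, the second call goes to `big`, and `small` never runs because it does not clear the bar.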
You set the box. We pick the model inside it.
Routing is yours to bound. Three controls, settable per agent or per task type.
Set the box
Per-call cost ceilings and latency targets, per agent or per task type. If nothing fits, we tell you — we never quietly downgrade.
Force a model
Pin a specific model or provider when you need it, and keep routing everywhere else.
Stay up
On a 429, 5xx, timeout or refusal we retry the next eligible candidate inside your caps — logged and audited like any other call.
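The three controls can be sketched together as one small routing loop. Everything here is hypothetical: `Box`, `Model`, `route` and the exception names are invented for illustration, the candidate list is assumed to arrive already ranked, and a pinned model is assumed to still respect the caps.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


class NoEligibleModel(Exception):
    """Raised instead of quietly downgrading when nothing fits the box."""


class ProviderError(Exception):
    """Stands in for a 429, 5xx, timeout or refusal."""


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_call: float
    latency_s: float


@dataclass(frozen=True)
class Box:
    max_cost_per_call: float
    max_latency_s: float
    pinned: Optional[str] = None  # force one model by name


def eligible(models: Sequence[Model], box: Box) -> List[Model]:
    """Keep the models inside the box, preserving their ranked order."""
    fits = [m for m in models
            if m.cost_per_call <= box.max_cost_per_call
            and m.latency_s <= box.max_latency_s]
    if box.pinned is not None:
        fits = [m for m in fits if m.name == box.pinned]
    if not fits:
        raise NoEligibleModel("no candidate fits the configured box")
    return fits


def route(models: Sequence[Model], box: Box,
          call: Callable[[Model], str],
          log: List[Tuple[str, str]]) -> str:
    """Try each eligible model in order, logging every attempt."""
    last_error: Optional[Exception] = None
    for m in eligible(models, box):
        try:
            result = call(m)
            log.append((m.name, "ok"))
            return result
        except ProviderError as exc:
            # Failover stays inside the caps: only eligible models are retried.
            log.append((m.name, f"failed: {exc}"))
            last_error = exc
    raise NoEligibleModel("every eligible candidate failed") from last_error
```

Raising an explicit error when the box is empty is the design choice that matters: the caller finds out that nothing fit, rather than receiving a response from a model outside the limits they set.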
Every decision is signed, on a public chain.
No black box and no dashboard claim. Every routed call is hashed, Ed25519-signed, and added to an append-only chain. Verify any entry with a single request: no account, no API key.
# the public chain is keyless
curl https://api.ainfera.ai/v1/audit/public
# → each entry: the routed model, provider,
#   sequence, block height and the Ed25519
#   signature. Re-hash it yourself to verify.
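Re-hashing an entry could look like the sketch below. The entry layout is an assumption: the `hash` and `prev_hash` field names, SHA-256 and sorted-key JSON as the canonical form are placeholders, so check the real response format before relying on any of them. Signature checking is left out because the Python standard library has no Ed25519 support; it would need a third-party library and the published public key.

```python
import hashlib
import json
from typing import Sequence


def entry_hash(entry: dict) -> str:
    """Hash an entry's content, excluding its own hash and signature fields."""
    body = {k: v for k, v in entry.items() if k not in ("hash", "signature")}
    # Assumed canonical form: sorted keys, no extra whitespace, UTF-8.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def verify_chain(entries: Sequence[dict]) -> bool:
    """Check each entry's stored hash and its link to the previous entry."""
    prev = None
    for e in entries:
        if e["hash"] != entry_hash(e):
            return False  # content changed after it was hashed
        if prev is not None and e["prev_hash"] != prev["hash"]:
            return False  # link to the previous entry is broken
        prev = e
    return True


def make_entry(sequence: int, model: str, provider: str, prev_hash: str) -> dict:
    """Build a sample entry for local testing (hypothetical fields)."""
    entry = {"sequence": sequence, "model": model,
             "provider": provider, "prev_hash": prev_hash}
    entry["hash"] = entry_hash(entry)
    return entry


first = make_entry(1, "model-a", "provider-x", prev_hash="")
second = make_entry(2, "model-b", "provider-y", prev_hash=first["hash"])
chain_ok = verify_chain([first, second])

# Editing a past entry changes its hash, so verification fails.
tampered = dict(first, model="model-z")
tamper_detected = not verify_chain([tampered, second])
```

The property shown is the one an append-only chain relies on: any edit to a past entry changes its hash and breaks verification.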
Stop picking models. Start finishing tasks.
One endpoint, every provider, every decision on chain.