LLM Benchmark Cost Calculator

Estimate what a benchmark run will cost across a set of models. Pick the models, then set the input and output tokens per call and the number of calls each model gets.
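A minimal sketch of the kind of arithmetic such an estimate involves, assuming each model is billed linearly per input and output token at a price quoted per million tokens. The page does not state its formula or its price table, so the `Price` type, the function names, and the two example models and their prices are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per one million tokens (hypothetical values, not real rates)."""
    input_per_mtok: float
    output_per_mtok: float


def model_cost(price: Price, input_tokens: int, output_tokens: int, calls: int) -> float:
    """Cost in USD for `calls` identical calls to one model."""
    per_call = (
        input_tokens * price.input_per_mtok
        + output_tokens * price.output_per_mtok
    ) / 1_000_000
    return per_call * calls


def benchmark_cost(prices: dict[str, Price], input_tokens: int,
                   output_tokens: int, calls: int) -> float:
    """Total cost in USD when every model gets the same number of calls."""
    return sum(
        model_cost(p, input_tokens, output_tokens, calls) for p in prices.values()
    )


# Placeholder models and prices, made up for illustration only.
example_prices = {
    "example-model-a": Price(input_per_mtok=0.10, output_per_mtok=0.40),
    "example-model-b": Price(input_per_mtok=0.25, output_per_mtok=1.25),
}

total = benchmark_cost(example_prices, input_tokens=1_000, output_tokens=1_000, calls=1)
print(f"${total:.6f}")
```

Under this assumed model, cost scales linearly with each of the three inputs, so doubling the calls per model doubles the total.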

Models

gpt-5-nano, gpt-5.4-nano, groq:llama-3.1-8b, mistral-small-4, grok-4.1-fast, mercury-2, together:gemma-4-31b, gemini-3.1-flash-lite, haiku-4.5, together:qwen3.5-9b

Input tokens per call: 1,000

Output tokens per call: 1,000

Calls per model: 1

Total benchmark cost

$0.0132

10 calls · 10 models

Per-model breakdown

Model	Calls	Input tokens	Output tokens	Cost
gpt-5-nano	1	1,000	1,000	$0.000450
gpt-5.4-nano	1	1,000	1,000	$0.001450
groq:llama-3.1-8b	1	1,000	1,000	$0.000130
mistral-small-4	1	1,000	1,000	$0.000750
grok-4.1-fast	1	1,000	1,000	$0.000700
mercury-2	1	1,000	1,000	$0.001000
together:gemma-4-31b	1	1,000	1,000	$0.000700
gemini-3.1-flash-lite	1	1,000	1,000	$0.001750
haiku-4.5	1	1,000	1,000	$0.006000
together:qwen3.5-9b	1	1,000	1,000	$0.000250
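As a quick consistency check, the per-model costs in the table can be summed to reproduce the displayed total. The sketch below copies the Cost column verbatim and uses `Decimal` to avoid floating-point noise. Rounding to four decimal places is an assumption about how the page formats the headline figure.

```python
from decimal import Decimal

# Cost column from the per-model breakdown (USD, 1 call of 1,000 in / 1,000 out).
per_model_cost = {
    "gpt-5-nano": Decimal("0.000450"),
    "gpt-5.4-nano": Decimal("0.001450"),
    "groq:llama-3.1-8b": Decimal("0.000130"),
    "mistral-small-4": Decimal("0.000750"),
    "grok-4.1-fast": Decimal("0.000700"),
    "mercury-2": Decimal("0.001000"),
    "together:gemma-4-31b": Decimal("0.000700"),
    "gemini-3.1-flash-lite": Decimal("0.001750"),
    "haiku-4.5": Decimal("0.006000"),
    "together:qwen3.5-9b": Decimal("0.000250"),
}

# Exact sum of the ten rows.
total = sum(per_model_cost.values(), Decimal("0"))

# Assumed display precision: four decimal places.
displayed = total.quantize(Decimal("0.0001"))

print(f"exact: ${total}  displayed: ${displayed}")
```

The rows sum to $0.013180, which rounds to the $0.0132 shown as the total benchmark cost.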

Cost vs input tokens

Output held at 1,000 · 1 call/model

Cost vs output tokens

Input held at 1,000 · 1 call/model