Ollama Cloud Engine Comparison
Models Tested: 12
Successful: 9
Failed: 3
Fastest: 0.9s
Slowest: 33.7s
Concurrency: 3 req (Pro)
API Format: Ollama-native
Rate Limit: 3 concurrent
Vision: /api/generate + images[]
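
The vision entry above names the Ollama-native `/api/generate` endpoint with an `images[]` field. A minimal sketch of such a request follows. The host URL, the bearer-token header, and the helper names `build_vision_request` and `ask_about_image` are assumptions for illustration and are not taken from the benchmark itself.

```python
import base64
import json
import urllib.request

# Assumption: hosted endpoint and bearer-token auth; adjust to your own setup.
OLLAMA_HOST = "https://ollama.com"


def build_vision_request(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build an Ollama-native /api/generate body with one attached image.

    Images are sent as base64 strings in the top-level "images" list.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def ask_about_image(api_key: str, model: str, prompt: str,
                    image_bytes: bytes, timeout: float = 60.0) -> str:
    """POST the request and return the generated text (hypothetical helper)."""
    body = json.dumps(build_vision_request(model, prompt, image_bytes))
    req = urllib.request.Request(
        f"{OLLAMA_HOST}/api/generate",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Keeping the payload builder separate from the network call makes the request shape easy to inspect before anything is sent.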
Featherless Engine Comparison
Models Tested: 10
Successful: 4
Failed: 6
Fastest: 14.2s
Slowest: 21.4s
Concurrency: 1 req
API Format: OpenAI-compatible
Rate Limit: 8 model switches/min
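
A client that rotates through several Featherless models can track the 8 switches/min limit locally. The sketch below uses a sliding 60-second window. The class name `ModelSwitchLimiter` is hypothetical, and counting the very first model selection as a switch is an assumption, since the source does not say how the provider counts.

```python
import time
from collections import deque


class ModelSwitchLimiter:
    """Sliding-window tracker for model switches (illustrative only)."""

    def __init__(self, max_switches=8, window_s=60.0, clock=time.monotonic):
        self.max_switches = max_switches
        self.window_s = window_s
        self.clock = clock
        self.current_model = None
        self._switch_times = deque()  # timestamps of recent switches

    def wait_time(self, model: str) -> float:
        """Seconds to wait before `model` may be requested (0.0 if allowed now)."""
        if model == self.current_model:
            return 0.0  # same model: not a switch
        now = self.clock()
        # Forget switches that have left the window.
        while self._switch_times and now - self._switch_times[0] >= self.window_s:
            self._switch_times.popleft()
        if len(self._switch_times) < self.max_switches:
            return 0.0
        # Window is full: wait until the oldest switch expires.
        return self.window_s - (now - self._switch_times[0])

    def record(self, model: str) -> None:
        """Call after a request to `model` has been sent."""
        if model != self.current_model:
            self._switch_times.append(self.clock())
            self.current_model = model
```

Grouping queued prompts by model before sending reduces the number of switches and makes the limit less likely to bite.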
Arli AI Engine Comparison
Models Tested: 5
Successful: 5
Failed: 0
Fastest: 6.7s
Slowest: 25.6s
Concurrency: 6 req
API Format: OpenAI-compatible
Rate Limit: 6 parallel total
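
The three concurrency figures above (3, 1, and 6 requests in flight) can be enforced client-side with one semaphore per engine. This is a sketch under that reading of the limits. `EngineGate` and the engine keys are hypothetical names, and `make_call` stands in for whatever coroutine actually performs the request.

```python
import asyncio

# Concurrency caps as listed in the three engine summaries.
ENGINE_LIMITS = {"ollama": 3, "featherless": 1, "arli": 6}


class EngineGate:
    """Caps in-flight requests per engine (illustrative helper)."""

    def __init__(self, limits):
        self._limits = dict(limits)
        self._sems = {}

    def _sem(self, engine: str) -> asyncio.Semaphore:
        # Create semaphores lazily so they are built inside the running loop.
        if engine not in self._sems:
            self._sems[engine] = asyncio.Semaphore(self._limits[engine])
        return self._sems[engine]

    async def run(self, engine: str, make_call):
        """Await `make_call()` once a slot for `engine` is free."""
        async with self._sem(engine):
            return await make_call()
```

With this in place, a batch can be dispatched with `asyncio.gather` and each engine still sees no more than its allowed number of simultaneous requests.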
Ollama Cloud
Models available: 34 cloud
Best text: gemma3:4b (0.9s)
Best vision: devstral-small-2:24b (0.9s)
Best reasoning: deepseek-v3.2 (5.9s)
Best throughput: gpt-oss:20b (130 t/s)
Capacity: Reliable
VS
Featherless
Models available: 5,700+
Best text: Mistral-7B-Instruct (14.2s)
Best vision: -
Best reasoning: -
Best throughput: Qwen3-8B (12.4 t/s)
Capacity: Often busy
VS
Arli AI
Models available: 117 text + 80 image
Best text: Llama-3.3-70B (6.7s)
Best vision: Qwen3.5-27B (19.5s)
Image gen: FLUX, SDXL, Qwen-Image
Best throughput: Llama-3.3-70B (14.8 t/s)
Capacity: Reliable
Fastest Text
gemma3:4b
0.9s · 56 tokens
Ollama Cloud
Best Quality Text
gpt-oss:120b
2.2s · 246 tokens · 112 t/s
Ollama Cloud
Best Vision / Screenshot QA
devstral-small-2:24b
0.9s · via /api/generate
Ollama Cloud
Best Reasoning
deepseek-v3.2
5.9s · 688B params
Ollama Cloud
Best Throughput (tokens/s)
gpt-oss:20b
130 tokens/s · 310 tokens total
Ollama Cloud
Reliable Fallback Chain
1st: Ollama gemma3:4b (0.9s)
2nd: Arli AI Llama-3.3-70B (6.7s)
3rd: Featherless Mistral-7B (14.2s)
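
The fallback chain above can be expressed as an ordered list that is tried until one engine answers. In this sketch the `call` argument is a hypothetical function supplied by the caller that performs the actual request and raises on failure. The model identifiers are taken from the results table.

```python
from typing import Callable, Sequence, Tuple

# Order follows the chain above: fastest first, slowest last.
FALLBACK_CHAIN = [
    ("ollama", "gemma3:4b"),
    ("arli", "Llama-3.3-70B-Instruct"),
    ("featherless", "mistralai/Mistral-7B-Instruct-v0.3"),
]


def complete_with_fallback(
    prompt: str,
    call: Callable[[str, str, str], str],
    chain: Sequence[Tuple[str, str]] = FALLBACK_CHAIN,
) -> Tuple[str, str, str]:
    """Return (engine, model, text) from the first engine that succeeds."""
    errors = []
    for engine, model in chain:
        try:
            return engine, model, call(engine, model, prompt)
        except Exception as exc:  # any failure moves on to the next engine
            errors.append(f"{engine}/{model}: {exc}")
    raise RuntimeError("all engines failed: " + "; ".join(errors))
```

Returning the engine and model alongside the text lets the caller log which tier actually served the request.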
Image Generation
FLUX.2-klein-4B
16.6s · txt2img
Arli AI
Vision Multi-Model
Mix Ollama + Featherless + Arli
Up to 3 models per request
| # | Model | Engine | Family | Size | Time | Tokens | t/s | Preview |
|---|---|---|---|---|---|---|---|---|
| 1 | gemma3:4b | Ollama Cloud | Gemma 3 | 4B | 0.9s | 56 | 62.2 | Quantum computing utilizes the principles of quantum... |
| 2 | ministral-3:8b | Ollama Cloud | Ministral 3 | 8B | 1.2s | 66 | 55.0 | Quantum computing leverages the principles of **qu... |
| 3 | gemma3:12b | Ollama Cloud | Gemma 3 | 12B | 1.4s | 63 | 45.0 | Quantum computing harnesses the principles of quan... |
| 4 | ministral-3:14b | Ollama Cloud | Ministral 3 | 14B | 1.3s | 99 | 76.2 | Quantum computing leverages the principles of **qu... |
| 5 | gemma3:27b | Ollama Cloud | Gemma 3 | 27B | 1.7s | 82 | 48.2 | Quantum computing harnesses the bizarre principles... |
| 6 | gpt-oss:20b | Ollama Cloud | GPT-OSS | 20B | 1.8s | 234 | 130.0 | Quantum computing uses qubits that can exist in su... |
| 7 | gpt-oss:120b | Ollama Cloud | GPT-OSS | 120B | 2.2s | 246 | 111.8 | Quantum computing leverages qubits, which can exis... |
| 8 | Llama-3.3-70B-Instruct | Arli AI | Llama 3.3 | 70B | 6.7s | 144 | 14.8 | Quantum computing is a revolutionary technology th... |
| 9 | Llama-3.3-70B-ArliAI-RPMax-v3 | Arli AI | Llama 3.3 | 70B | 11.2s | 111 | 5.9 | Quantum computing is a type of computing that uses... |
| 10 | mistralai/Mistral-7B-Instruct-v0.3 | Featherless | Mistral | 7B | 14.2s | 53 | 2.8 | Quantum computing is a revolutionary technology t... |
| 11 | Qwen/Qwen3-8B | Featherless | Qwen 3 | 8B | 16.1s | 218 | 12.4 | Quantum computing leverages qubits, which can exis... |
| 12 | Qwen/Qwen2.5-72B-Instruct | Featherless | Qwen 2.5 | 72B | 18.1s | 112 | 4.0 | Quantum computing harnesses the principles of quan... |
| 13 | Qwen/Qwen2.5-7B-Instruct-1M | Featherless | Qwen 2.5 | 7B | 21.4s | 71 | 2.0 | Quantum computing utilizes qubits that can exist i... |
| 14 | Llama-3.3-70B-Instruct-Abliterated | Arli AI | Llama 3.3 | 70B | 25.6s | 138 | 3.6 | Quantum computing is a revolutionary technology th... |
| 15 | devstral-small-2:24b | Ollama Cloud | Devstral | 24B | 24.6s | 75 | 3.0 | Quantum computing uses quantum bits (qubits) that... |
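
For the Ollama Cloud rows, the t/s column matches tokens divided by wall time (for example, 234 / 1.8 = 130.0). The Arli AI and Featherless rows list lower figures than that quotient, so they were presumably measured on a different basis; the source does not say which. The sketch below recomputes the quotient for a few rows copied from the table and flags those that differ. The function names are hypothetical.

```python
# (model, time_s, tokens, listed_tps) copied from the results table.
ROWS = [
    ("gemma3:4b", 0.9, 56, 62.2),
    ("gpt-oss:20b", 1.8, 234, 130.0),
    ("gpt-oss:120b", 2.2, 246, 111.8),
    ("Llama-3.3-70B-Instruct", 6.7, 144, 14.8),
    ("Qwen/Qwen3-8B", 16.1, 218, 12.4),
]


def recomputed_tps(time_s: float, tokens: int) -> float:
    """Tokens per second from total tokens and total wall time."""
    return round(tokens / time_s, 1)


def mismatches(rows, tolerance: float = 0.15):
    """Models whose listed t/s differs from tokens / time by more than tolerance."""
    return [
        model
        for model, time_s, tokens, listed in rows
        if abs(recomputed_tps(time_s, tokens) - listed) > tolerance
    ]
```

Calling `mismatches(ROWS)` on these five rows flags only the Arli AI and Featherless entries.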
| Model | Engine | Time | Error |
|---|---|---|---|
| qwen3.5:27b | Ollama Cloud | 0.2s | model not found |
| qwen3-vl:8b | Ollama Cloud | 0.2s | cannot unmarshal array into Go struct (wrong format) |
| qwen3-vl:32b | Ollama Cloud | 0.2s | cannot unmarshal array into Go struct (wrong format) |
| Qwen/Qwen2.5-32B-Instruct | Featherless | 15.7s | Qwen/Qwen2.5-32B-Instruct is temporarily at capacity. |
| Qwen/Qwen2.5-7B-Instruct | Featherless | 9.6s | Qwen/Qwen2.5-7B-Instruct is temporarily at capacity. |
| Qwen/Qwen3-32B | Featherless | 11.7s | Qwen/Qwen3-32B is temporarily at capacity. |
| Qwen/QwQ-32B | Featherless | 12s | Qwen/QwQ-32B is temporarily at capacity. |
| mistralai/Mistral-Small-3.1-24B-Instruct-2503 | Featherless | 13.1s | mistralai/Mistral-Small-3.1-24B-Instruct-2503 is temporarily at capacity. |
| mistralai/Magistral-Small-2506 | Featherless | 13.7s | mistralai/Magistral-Small-2506 is temporarily at capacity. |
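
The failures above fall into three groups: capacity errors, which may clear on a later attempt; an unknown model name; and a request-body problem (the "cannot unmarshal" rows, which the table itself labels as wrong format). A small classifier lets a client decide whether a retry is worthwhile. The category labels and the function name are assumptions made for this sketch.

```python
def classify_failure(message: str) -> str:
    """Map an error message from the table to a coarse handling category."""
    text = message.lower()
    if "temporarily at capacity" in text:
        return "retry_later"         # provider is busy; try again or fall back
    if "model not found" in text:
        return "bad_model_name"      # retrying will not help; fix the identifier
    if "cannot unmarshal" in text:
        return "bad_request_format"  # request body shape is wrong; fix the payload
    return "unknown"


def should_retry(message: str) -> bool:
    """Only capacity errors are treated as worth retrying in this sketch."""
    return classify_failure(message) == "retry_later"
```

Only the capacity category is treated as retryable here; the other two point to a client-side fix.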