Complete reference for Featherless AI & Arli AI — text, vision, OCR, image generation, and upscaling
All endpoints require a Bearer token in the Authorization header:
Authorization: Bearer sk-your-api-key-here
All APIs use REST over HTTPS with JSON request/response bodies. Both providers follow OpenAI-compatible formats for text endpoints.
Base URL: https://api.featherless.ai/v1
Chat and text generation with conversation history support.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Model identifier (e.g. Qwen/Qwen2.5-72B-Instruct) |
| messages | array | yes | Array of {role, content} objects. Roles: system, user, assistant |
| temperature | float | no | Sampling temperature (0–2). Lower = more focused. Default: 1.0 |
| top_p | float | no | Nucleus sampling threshold (0–1). Default: 1.0 |
| top_k | int | no | Top-K sampling. Limits to K most probable tokens |
| min_p | float | no | Minimum probability threshold. Drops tokens whose probability, relative to the most likely token, falls below this value |
| max_tokens | int | no | Maximum tokens to generate. Default varies by model |
| stop | string/array | no | Stop sequences — generation halts when encountered |
| presence_penalty | float | no | Penalize new tokens based on presence (0–2) |
| frequency_penalty | float | no | Penalize new tokens based on frequency (0–2) |
| repetition_penalty | float | no | Penalty for repeating tokens (1.0 = off) |
| seed | int | no | Seed for deterministic sampling |
curl https://api.featherless.ai/v1/chat/completions \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen2.5-72B-Instruct",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain quantum computing in one sentence."}
],
"temperature": 0.7,
"max_tokens": 200
}'
Best for: chat, coding assistance, reasoning, OCR (with vision models)
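The curl call above can also be prepared from Python using only the standard library. This is a minimal sketch, not an official client: `build_chat_request` and `send` are hypothetical helper names, and the only facts taken from this reference are the base URL, the header format, and the parameter names.

```python
import json
import urllib.request

FEATHERLESS_BASE_URL = "https://api.featherless.ai/v1"  # base URL from this reference


def build_chat_request(api_key, model, messages, **sampling):
    """Build (but do not send) a POST request for /chat/completions."""
    body = {"model": model, "messages": messages}
    # Include only the optional sampling parameters the caller actually set.
    body.update({k: v for k, v in sampling.items() if v is not None})
    return urllib.request.Request(
        FEATHERLESS_BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `send(build_chat_request(key, model, messages, temperature=0.7, max_tokens=200))` would perform the request. If the response follows the OpenAI-compatible shape, the reply text should be under `choices[0]["message"]["content"]`, but that field path is an assumption and is not stated in this reference.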
Raw text completion (no chat formatting). Uses prompt (string) instead of messages. Same parameters otherwise.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Model identifier |
| prompt | string | yes | Plain text prompt (replaces messages) |
| temperature | float | no | Sampling temperature. Default: 1.0 |
| max_tokens | int | no | Max tokens to generate |
| stop | string/array | no | Stop sequences |
| seed | int | no | Deterministic seed |
curl https://api.featherless.ai/v1/completions \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen2.5-72B-Instruct",
"prompt": "The meaning of life is",
"max_tokens": 50,
"temperature": 0.8
}'
Count tokens for a given input without generating a response.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Model to count tokens for |
| messages | array | no | Chat messages to tokenize |
| prompt | string | no | Plain prompt to tokenize |
curl https://api.featherless.ai/v1/tokenize \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen2.5-72B-Instruct",
"messages": [{"role": "user", "content": "Hello, how are you?"}]
}'
List all available text models.
| Parameter | Type | Description |
|---|---|---|
| available_on_current_plan | int | Filter: 0 = all models, 1 = models available on your plan |
curl https://api.featherless.ai/v1/models \
-H "Authorization: Bearer sk-YOUR_KEY"
curl "https://api.featherless.ai/v1/models?available_on_current_plan=1" \
-H "Authorization: Bearer sk-YOUR_KEY"
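When the model list is requested from code, building the query string with `urlencode` avoids the shell quoting issues that a `?` in a URL can cause. A small sketch; `models_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

FEATHERLESS_BASE_URL = "https://api.featherless.ai/v1"  # base URL from this reference


def models_url(only_current_plan=False):
    """Return the /models URL, optionally filtered to the caller's plan."""
    url = FEATHERLESS_BASE_URL + "/models"
    if only_current_plan:
        # available_on_current_plan=1 restricts the list to models on your plan.
        url += "?" + urlencode({"available_on_current_plan": 1})
    return url
```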
Vision models accept images alongside text in the messages array. Images can be provided via URL or base64.
| Model | Size | Notes |
|---|---|---|
| google/gemma-3-27b-it | 27B | Best OCR accuracy |
| google/gemma-3-12b-it | 12B | Good balance of speed/quality |
| google/gemma-3-4b-it | 4B | Fastest vision model |
| mistralai/Mistral-Small-3.1-24B-Instruct-2503 | 24B | Mistral Small 3.1 vision |
| mistralai/Mistral-Small-3.2-24B-Instruct-2506 | 24B | Mistral Small 3.2 vision |
| mistralai/Magistral-Small-2506-24B-Instruct | 24B | Magistral vision |
Use the content array format in a message. Each content item has a type of text or image_url.
{
"model": "google/gemma-3-27b-it",
"messages": [{
"role": "user",
"content": [
{ "type": "text", "text": "What is in this image?" },
{ "type": "image_url", "image_url": { "url": "https://example.com/photo.jpg" } }
]
}]
}
{
"model": "google/gemma-3-27b-it",
"messages": [{
"role": "user",
"content": [
{ "type": "text", "text": "Extract all text from this image." },
{ "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBORw0KGgo..." } }
]
}]
}
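The base64 form above needs the image wrapped in a data URI. The sketch below shows one way to produce that content item from raw bytes or a local file; the helper names are illustrative, and guessing the MIME type from the file extension is a convenience assumption, not something the API requires.

```python
import base64
import mimetypes


def image_part_from_bytes(data, mime_type="image/png"):
    """Wrap raw image bytes as an image_url content item using a data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def image_part_from_file(path):
    """Read a local file and guess its MIME type from the extension."""
    mime_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as f:
        return image_part_from_bytes(f.read(), mime_type or "application/octet-stream")


def vision_message(prompt, image_part):
    """Build one user message: a text item followed by an image item."""
    return {
        "role": "user",
        "content": [{"type": "text", "text": prompt}, image_part],
    }
```

The returned message can be placed directly in the `messages` array of a chat completions request.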
curl https://api.featherless.ai/v1/chat/completions \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "google/gemma-3-27b-it",
"messages": [{
"role": "user",
"content": [
{ "type": "text", "text": "Extract all text from this image. Output only the text, nothing else." },
{ "type": "image_url", "image_url": { "url": "https://example.com/receipt.jpg" } }
]
}],
"max_tokens": 500
}'
Base URL: https://api.arliai.com/v1
Arli AI text endpoints are OpenAI-compatible and accept additional sampling and guidance parameters. Arli AI offers 117 text models.
Chat and text generation. Same base parameters as Featherless, plus Arli-specific extras:
| Parameter | Type | Description |
|---|---|---|
| model | string | Model identifier |
| messages | array | {role, content} conversation array |
| temperature | float | Sampling temperature (0–2) |
| top_p | float | Nucleus sampling (0–1) |
| top_k | int | Top-K sampling |
| max_tokens | int | Maximum tokens to generate |
| stop | string/array | Stop sequences |
| presence_penalty | float | Presence penalty (0–2) |
| frequency_penalty | float | Frequency penalty (0–2) |
| seed | int | Deterministic seed |
| Parameter | Type | Description |
|---|---|---|
| top_a | float | Top-A sampling — alternative to top_p |
| tfs | float | Tail-free sampling (0–1) |
| typical_p | float | Locally typical sampling threshold |
| min_p | float | Minimum probability threshold |
| repetition_penalty | float | Repetition penalty (1.0 = off) |
| no_repeat_ngram_size | int | Prevent repeating n-grams of this size |
| dry_multiplier | float | DRY ("Don't Repeat Yourself") sampler multiplier |
| dry_base | float | DRY base value |
| mirostat | int | Mirostat mode (0 = off, 1 or 2) |
| guided_json | object | JSON schema to constrain output structure |
| guided_regex | string | Regex pattern to constrain output |
| guided_choice | array | Array of allowed choices for output |
curl https://api.arliai.com/v1/chat/completions \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen2.5-72B-Instruct",
"messages": [
{"role": "system", "content": "You are a coding expert."},
{"role": "user", "content": "Write a Python function to sort a list."}
],
"temperature": 0.6,
"top_p": 0.9,
"max_tokens": 500,
"repetition_penalty": 1.1
}'
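The guidance parameters in the table above (`guided_choice`, `guided_json`) go into the same request body as the sampling parameters. The sketch below only builds request bodies. The schema is a made-up example, the helper names are hypothetical, and this reference does not say how strictly the server enforces the constraint or whether the guidance parameters can be combined.

```python
import json


def guided_choice_payload(model, question, choices):
    """Request body asking the model to answer with one of `choices`."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "guided_choice": list(choices),
    }


def guided_json_payload(model, instruction, schema):
    """Request body asking for output that matches a JSON schema."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": instruction}],
        "guided_json": schema,
        "temperature": 0.1,
    }


# Hypothetical schema, for illustration only.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["vendor", "total"],
}

body = guided_json_payload(
    "Qwen/Qwen2.5-72B-Instruct",
    "Extract the vendor and total from this invoice text.",
    INVOICE_SCHEMA,
)
print(json.dumps(body, indent=2))
```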
Raw text completion. Uses prompt instead of messages. Same extended parameters as chat.
curl https://api.arliai.com/v1/completions \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen2.5-72B-Instruct",
"prompt": "Complete this: The future of AI is",
"max_tokens": 100,
"temperature": 0.8
}'
List all 117 available text models.
curl https://api.arliai.com/v1/models \
-H "Authorization: Bearer sk-YOUR_KEY"
Count tokens for a given input.
curl https://api.arliai.com/v1/tokenize \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen2.5-72B-Instruct",
"messages": [{"role": "user", "content": "Count my tokens"}]
}'
Base URL: https://api.arliai.com/v1
Arli AI provides Stable Diffusion-based image generation with 74 image models, including FLUX, SDXL, SD 1.5, and anime/pixel art variants.
Generate images from text prompts.
| Parameter | Type | Required | Description |
|---|---|---|---|
| sd_model_checkpoint | string | yes | Model name (e.g. FLUX.2-klein-4B) |
| prompt | string | yes | Text description of the desired image |
| negative_prompt | string | no | Things to avoid in the image |
| steps | int | no | Denoising steps. More steps usually improve quality but take longer. Default: 20 |
| sampler_name | string | no | Sampler algorithm (e.g. euler, dpmpp_2m) |
| width | int | no | Image width in pixels |
| height | int | no | Image height in pixels |
| seed | int | no | Random seed for reproducibility. -1 = random |
| cfg_scale | float | no | Classifier-free guidance scale. Higher = closer to prompt. Default: 7 |
| batch_size | int | no | Number of images to generate. Default: 1 |
curl https://api.arliai.com/v1/txt2img \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"sd_model_checkpoint": "FLUX.2-klein-4B",
"prompt": "a serene mountain lake at sunset, photorealistic, 4K",
"negative_prompt": "blurry, low quality, distorted",
"steps": 25,
"sampler_name": "euler",
"width": 1024,
"height": 768,
"seed": -1,
"cfg_scale": 7
}' | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = base64.b64decode(data['images'][0])
with open('output.png', 'wb') as f: f.write(img)
print('Saved output.png')
"
Transform an existing image based on a text prompt.
| Parameter | Type | Required | Description |
|---|---|---|---|
| sd_model_checkpoint | string | yes | Model name |
| prompt | string | yes | Desired transformation |
| init_images | array | yes | Array of base64-encoded input images |
| negative_prompt | string | no | Things to avoid |
| denoising_strength | float | no | How much to alter the image (0–1). Higher = more change. Default: 0.75 |
| mask | string | no | Base64 mask image (white = inpaint, black = preserve) |
| mask_blur | int | no | Blur radius for mask edges |
| steps | int | no | Denoising steps |
| sampler_name | string | no | Sampler algorithm |
| cfg_scale | float | no | CFG scale |
curl https://api.arliai.com/v1/img2img \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"sd_model_checkpoint": "FLUX.2-klein-4B",
"prompt": "turn daytime scene into night with stars",
"init_images": ["'"$(base64 -w0 input.png)"'"],
"denoising_strength": 0.6,
"steps": 25,
"cfg_scale": 7
}'
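For larger images, building the request body in Python avoids passing a long base64 string on the shell command line. The sketch below assembles an img2img body, including the optional `mask`, and decodes the result. It assumes the img2img response carries base64 images under an `images` key, as the txt2img and upscale examples in this reference do; the helper names are illustrative.

```python
import base64


def encode_image(data):
    """Base64-encode raw image bytes as an ASCII string."""
    return base64.b64encode(data).decode("ascii")


def img2img_payload(model, prompt, image_bytes, mask_bytes=None,
                    denoising_strength=0.75, **options):
    """Build a request body for POST /v1/img2img."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    payload = {
        "sd_model_checkpoint": model,
        "prompt": prompt,
        "init_images": [encode_image(image_bytes)],
        "denoising_strength": denoising_strength,
    }
    if mask_bytes is not None:
        # White areas of the mask are inpainted, black areas are preserved.
        payload["mask"] = encode_image(mask_bytes)
    payload.update(options)  # e.g. steps, sampler_name, cfg_scale
    return payload


def decode_images(response):
    """Return raw bytes for every image in the response (assumed 'images' key)."""
    return [base64.b64decode(item) for item in response.get("images", [])]
```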
Upscale an image using ESRGAN models.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | yes | Base64-encoded image to upscale |
| upscaler_1 | string | no | Upscaler model name. Default: ESRGAN_4x |
curl https://api.arliai.com/v1/upscale-img \
-H "Authorization: Bearer sk-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"image": "'"$(base64 -w0 photo.png)"'",
"upscaler_1": "ESRGAN_4x"
}' | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = base64.b64decode(data['images'][0])
with open('upscaled.png', 'wb') as f: f.write(img)
print('Saved upscaled.png (4x)')
"
List all 74 available image generation models.
curl https://api.arliai.com/sdapi/v1/sd-models \
-H "Authorization: Bearer sk-YOUR_KEY"
List available sampler algorithms.
curl https://api.arliai.com/v1/img-samplers \
-H "Authorization: Bearer sk-YOUR_KEY"
List available upscaler models.
curl https://api.arliai.com/v1/upscalers \
-H "Authorization: Bearer sk-YOUR_KEY"
Get current image generation options and defaults.
curl https://api.arliai.com/v1/img-options \
-H "Authorization: Bearer sk-YOUR_KEY"
Check rate limits for image generation.
curl https://api.arliai.com/v1/parallel-requests \
-H "Authorization: Bearer sk-YOUR_KEY"
Featherless vision models accept images via URL or base64 (data URI format).
| Model | Best For | Image Input |
|---|---|---|
| google/gemma-3-27b-it | OCR, general vision | URL or base64 |
| google/gemma-3-12b-it | Balanced speed/quality | URL or base64 |
| google/gemma-3-4b-it | Fast vision analysis | URL or base64 |
| mistralai/Mistral-Small-3.1-24B-Instruct-2503 | Document understanding | URL or base64 |
| mistralai/Mistral-Small-3.2-24B-Instruct-2506 | Document understanding | URL or base64 |
| mistralai/Magistral-Small-2506-24B-Instruct | Multimodal tasks | URL or base64 |
The following models accept images as base64 only; URL input is not supported.
| Model | Best For | Image Input |
|---|---|---|
| Qwen/Qwen3.5-27B-Derestricted | General vision | base64 only |
| Qwen/Qwen3.5-VL | Document/vision | base64 only |
Use google/gemma-3-27b-it on Featherless with a clear OCR prompt:
{
"model": "google/gemma-3-27b-it",
"messages": [{
"role": "user",
"content": [
{ "type": "text", "text": "Extract all text from this image. Preserve formatting and line breaks." },
{ "type": "image_url", "image_url": { "url": "https://example.com/document.png" } }
]
}],
"max_tokens": 2000,
"temperature": 0.1
}
| Task | Provider | Best Model | Why |
|---|---|---|---|
| Chat | Featherless | Qwen/Qwen2.5-72B-Instruct | Largest context, smartest |
| Coding | Featherless | Qwen/Qwen2.5-Coder-32B-Instruct | Dedicated code model |
| Reasoning | Featherless | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Chain of thought |
| Vision / OCR | Featherless | google/gemma-3-27b-it | Best OCR accuracy |
| Image Gen | Arli AI | FLUX.2-klein-4B | Fastest, good quality |
| Anime Art | Arli AI | SDXL-zukiBestAnimeMix | Specialized anime model |
| Photo Realistic | Arli AI | LUMINA-Turbo | High quality photos |
| Upscaling | Arli AI | ESRGAN_4x | Only upscaler available |
| Fast Text | Featherless | Qwen/Qwen2.5-0.5B-Instruct | Under 1s responses |
| Creative Writing | Featherless | vicgalle/Roleplay-Llama-3-8B | Roleplay specialist |
Pick your goal and find the right endpoint and example.
Analyze an image: Featherless, POST /v1/chat/completions with google/gemma-3-27b-it. Send the image URL or base64 in the messages content array.
Generate an image: Arli AI, POST /v1/txt2img with sd_model_checkpoint and prompt. Returns base64 image(s).
Extract text (OCR): Featherless, POST /v1/chat/completions with google/gemma-3-27b-it. Use the prompt "Extract all text from this image." plus the image in the content array. Set temperature: 0.1.
Upscale an image: Arli AI, POST /v1/upscale-img with a base64 image. Use upscaler_1: "ESRGAN_4x". Returns a 4x upscaled base64 image.
Transform an image: Arli AI, POST /v1/img2img with init_images (base64 array) and prompt. Control the amount of change with denoising_strength.
Chat: Featherless, POST /v1/chat/completions with Qwen/Qwen2.5-72B-Instruct. Maintain conversation history in the messages array.
Coding: Featherless, POST /v1/chat/completions with Qwen/Qwen2.5-Coder-32B-Instruct. Optimized for code generation, explanation, and debugging.
Creative writing: Featherless, POST /v1/chat/completions with vicgalle/Roleplay-Llama-3-8B. Specialized for creative and narrative tasks.
| Code | Meaning | Cause |
|---|---|---|
| 400 | Bad Request | Missing required parameters, invalid JSON, or malformed request body |
| 401 | Unauthorized | Missing or invalid Authorization header |
| 403 | Forbidden | API key does not have access to the requested model or endpoint |
| 404 | Not Found | Endpoint or model does not exist |
| 422 | Validation Error | Parameter values out of range or wrong type |
| 429 | Rate Limited | Too many requests. Retry after the time in the Retry-After header |
| 500 | Server Error | Internal server error. Retry with exponential backoff |
| 502/503 | Bad Gateway / Service Unavailable | Model is loading or service is temporarily down |
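The retry guidance in the error table (honor Retry-After on 429, back off exponentially on 500) can be sketched as a small helper. This is one possible policy, not a prescribed one. Treating 502 and 503 as retryable and the 1-second base delay are assumptions, and `send` stands for any caller-supplied function that performs a request and returns `(status, headers, body)`.

```python
import time

RETRYABLE = {429, 500, 502, 503}  # retrying 502/503 is an assumption


def retry_delay(status, headers, attempt, base=1.0):
    """Seconds to wait before the next try (attempt is 0-based), or None to stop."""
    if status not in RETRYABLE:
        return None
    if status == 429:
        try:
            return max(0.0, float(headers.get("Retry-After")))
        except (TypeError, ValueError):
            pass  # header missing or not a number of seconds: fall back to backoff
    return base * (2 ** attempt)  # 1s, 2s, 4s, ...


def call_with_retries(send, max_attempts=4, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or attempts run out.

    Assumes max_attempts >= 1.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        delay = retry_delay(status, headers, attempt)
        if delay is None or attempt == max_attempts - 1:
            return status, body
        sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.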
Rate-limited requests return 429 with a Retry-After header.
Arli AI: 6 concurrent requests. Check GET /v1/parallel-requests for remaining capacity.
Handle 429 by reading the Retry-After header and waiting before retrying.
Retry 500 errors with exponential backoff (wait 1s, 2s, 4s...).
Call GET /v1/models to verify model availability before sending requests.
Call POST /v1/tokenize to count tokens before sending large prompts.
Interactive web UIs for testing all providers without writing code.
| Playground | URL | Description |
|---|---|---|
| Main Hub | test.kim8.s4s.host | Central dashboard with links to all tools |
| Vision / OCR | vision.kim8.s4s.host | Upload images and analyze with vision models |
| Text Models | featherless.kim8.s4s.host/test.php | Text model playground with model selector |
| Per-Model | playground.php?model=X | Direct link to a specific model |
| Image Generation | generate.kim8.s4s.host | Text-to-image, img2img, upscaling |