API Reference

Complete reference for Featherless AI & Arli AI — text, vision, OCR, image generation, and upscaling

Overview
Featherless AI — Text Endpoints
Featherless AI — Vision Models
Arli AI — Text Endpoints
Arli AI — Image Generation
Arli AI — Upscaling
Arli AI — Utility Endpoints
Vision / OCR Guide
Model Recommendations
Quick Start by Use Case
Error Handling
Playground Links

Overview

Featherless AI

Text generation, chat, vision, OCR, tokenization

Base: https://api.featherless.ai/v1

Arli AI — Text

Text generation, chat, 117 models, extended params

Base: https://api.arliai.com/v1

Arli AI — Image

Text-to-image, img2img, upscaling, 74 image models

Base: https://api.arliai.com/v1

Authentication

All endpoints require a Bearer token in the Authorization header:

Authorization: Bearer sk-your-api-key-here

All APIs use REST over HTTPS with JSON request/response bodies. Both providers follow OpenAI-compatible formats for text endpoints.
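As a minimal sketch of the header scheme above, the following Python helper builds an authenticated JSON request using only the standard library. The name `build_request` and the two base-URL constants are illustrative and not part of either API.

```python
import json
import urllib.request

FEATHERLESS_BASE = "https://api.featherless.ai/v1"
ARLI_BASE = "https://api.arliai.com/v1"


def build_request(base_url, path, api_key, body=None):
    """Return a urllib Request carrying the Bearer token (hypothetical helper)."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # A JSON body turns the request into a POST; without one it stays a GET.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(base_url + path, data=data, headers=headers)
```

Passing the returned object to `urllib.request.urlopen` sends the request; nothing is transmitted until then.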

Featherless AI — Text

Base URL: https://api.featherless.ai/v1

POST /v1/chat/completions

Chat and text generation with conversation history support.

Parameters

Parameter	Type	Required	Description
model	string	yes	Model identifier (e.g. `Qwen/Qwen2.5-72B-Instruct`)
messages	array	yes	Array of `{role, content}` objects. Roles: `system`, `user`, `assistant`
temperature	float	no	Sampling temperature (0–2). Lower = more focused. Default: `1.0`
top_p	float	no	Nucleus sampling threshold (0–1). Default: `1.0`
top_k	int	no	Top-K sampling. Limits to K most probable tokens
min_p	float	no	Minimum probability threshold. Removes tokens below this relative prob
max_tokens	int	no	Maximum tokens to generate. Default varies by model
stop	string/array	no	Stop sequences — generation halts when encountered
presence_penalty	float	no	Penalize new tokens based on presence (0–2)
frequency_penalty	float	no	Penalize new tokens based on frequency (0–2)
repetition_penalty	float	no	Penalty for repeating tokens (1.0 = off)
seed	int	no	Seed for deterministic sampling

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1742668800,
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Hello! How can I help you today?" },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 15, "completion_tokens": 8, "total_tokens": 23 }
}
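The fields a client usually reads from this response can be pulled out as in the Python sketch below. `summarize_completion` is a hypothetical helper, and treating a `finish_reason` of `length` as truncation is an assumption borrowed from the OpenAI convention, since only `stop` appears in the sample.

```python
import json

# Sample body copied from the response shown above.
SAMPLE_RESPONSE = json.loads("""
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1742668800,
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 15, "completion_tokens": 8, "total_tokens": 23}
}
""")


def summarize_completion(response):
    """Return the reply text, a truncation flag, and the total token count."""
    choice = response["choices"][0]
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" marks output cut off by max_tokens (OpenAI convention).
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": response["usage"]["total_tokens"],
    }


print(summarize_completion(SAMPLE_RESPONSE))
```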

Examples

curl https://api.featherless.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in one sentence."}
    ],
    "temperature": 0.7,
    "max_tokens": 200
  }'

<?php
$ch = curl_init('https://api.featherless.ai/v1/chat/completions');
curl_setopt_array($ch, [
    CURLOPT_POST => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER => [
        'Authorization: Bearer sk-YOUR_KEY',
        'Content-Type: application/json'
    ],
    CURLOPT_POSTFIELDS => json_encode([
        'model' => 'Qwen/Qwen2.5-72B-Instruct',
        'messages' => [
            ['role' => 'system', 'content' => 'You are a helpful assistant.'],
            ['role' => 'user', 'content' => 'Explain quantum computing in one sentence.']
        ],
        'temperature' => 0.7,
        'max_tokens' => 200
    ])
]);
$response = json_decode(curl_exec($ch), true);
echo $response['choices'][0]['message']['content'];

const response = await fetch('https://api.featherless.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-YOUR_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'Qwen/Qwen2.5-72B-Instruct',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Explain quantum computing in one sentence.' }
    ],
    temperature: 0.7,
    max_tokens: 200
  })
});
const data = await response.json();
console.log(data.choices[0].message.content);

Best for: chat, coding assistance, reasoning, OCR (with vision models)

POST /v1/completions

Raw text completion (no chat formatting). Uses prompt (string) instead of messages. Same parameters otherwise.

Key Difference from Chat

Parameter	Type	Required	Description
model	string	yes	Model identifier
prompt	string	yes	Plain text prompt (replaces `messages`)
temperature	float	no	Sampling temperature. Default: `1.0`
max_tokens	int	no	Max tokens to generate
stop	string/array	no	Stop sequences
seed	int	no	Deterministic seed

Response

{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "choices": [{
    "text": "The answer is 42.",
    "index": 0,
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 8, "completion_tokens": 5, "total_tokens": 13 }
}

Example

curl https://api.featherless.ai/v1/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "prompt": "The meaning of life is",
    "max_tokens": 50,
    "temperature": 0.8
  }'

POST /v1/tokenize

Count tokens for a given input without generating a response.

Request Body

Parameter	Type	Required	Description
model	string	yes	Model to count tokens for
messages	array	no	Chat messages to tokenize
prompt	string	no	Plain prompt to tokenize

Example

curl https://api.featherless.ai/v1/tokenize \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }'

GET /v1/models

List all available text models.

Query Parameters

Parameter	Type	Description
available_on_current_plan	int	Filter: `0` = all models, `1` = models available on your plan

Example

curl https://api.featherless.ai/v1/models \
  -H "Authorization: Bearer sk-YOUR_KEY"

curl "https://api.featherless.ai/v1/models?available_on_current_plan=1" \
  -H "Authorization: Bearer sk-YOUR_KEY"

{
  "object": "list",
  "data": [
    { "id": "Qwen/Qwen2.5-72B-Instruct", "object": "model", "owned_by": "featherless" },
    { "id": "Qwen/Qwen2.5-Coder-32B-Instruct", "object": "model", "owned_by": "featherless" }
  ]
}

Featherless AI — Vision Models

Vision models accept images alongside text in the messages array. Images can be provided via URL or base64.

Supported Vision Models

Model	Size	Notes
google/gemma-3-27b-it	27B	Best OCR accuracy
google/gemma-3-12b-it	12B	Good balance of speed/quality
google/gemma-3-4b-it	4B	Fastest vision model
mistralai/Mistral-Small-3.1-24B-Instruct-2503	24B	Mistral Small 3.1 vision
mistralai/Mistral-Small-3.2-24B-Instruct-2506	24B	Mistral Small 3.2 vision
mistralai/Magistral-Small-2506-24B-Instruct	24B	Magistral vision

Sending an Image

Use the content array format in a message. Each content item has a type of text or image_url.

Via URL

{
  "model": "google/gemma-3-27b-it",
  "messages": [{
    "role": "user",
    "content": [
      { "type": "text", "text": "What is in this image?" },
      { "type": "image_url", "image_url": { "url": "https://example.com/photo.jpg" } }
    ]
  }]
}

Via Base64

{
  "model": "google/gemma-3-27b-it",
  "messages": [{
    "role": "user",
    "content": [
      { "type": "text", "text": "Extract all text from this image." },
      { "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBORw0KGgo..." } }
    ]
  }]
}
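Assembling the base64 variant by hand is error-prone, so a small helper can build the data URI. The Python sketch below is illustrative only: `image_message` is a hypothetical name, and the MIME type is guessed from the file name rather than from the image contents.

```python
import base64
import mimetypes


def image_message(prompt, image_bytes, filename="image.png"):
    """Build a user message with a text part and a base64 data-URI image part."""
    # Guess the MIME type from the file name; fall back to PNG if unknown.
    mime = mimetypes.guess_type(filename)[0] or "image/png"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }
```

Read the file with `open(path, "rb").read()` and place the returned message in the `messages` array of the request body.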

Full cURL Example (OCR)

curl https://api.featherless.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-27b-it",
    "messages": [{
      "role": "user",
      "content": [
        { "type": "text", "text": "Extract all text from this image. Output only the text, nothing else." },
        { "type": "image_url", "image_url": { "url": "https://example.com/receipt.jpg" } }
      ]
    }],
    "max_tokens": 500
  }'

Arli AI — Text

Base URL: https://api.arliai.com/v1

Arli AI text endpoints are OpenAI-compatible with additional sampling and guidance parameters. Offers 117 text models.

POST /v1/chat/completions

Chat and text generation. Same base parameters as Featherless, plus Arli-specific extras:

Standard Parameters

Parameter	Type	Description
model	string	Model identifier
messages	array	`{role, content}` conversation array
temperature	float	Sampling temperature (0–2)
top_p	float	Nucleus sampling (0–1)
top_k	int	Top-K sampling
max_tokens	int	Maximum tokens to generate
stop	string/array	Stop sequences
presence_penalty	float	Presence penalty (0–2)
frequency_penalty	float	Frequency penalty (0–2)
seed	int	Deterministic seed

Arli-Specific Parameters

Parameter	Type	Description
top_a	float	Top-A sampling — alternative to top_p
tfs	float	Tail-free sampling (0–1)
typical_p	float	Locally typical sampling threshold
min_p	float	Minimum probability threshold
repetition_penalty	float	Repetition penalty (1.0 = off)
no_repeat_ngram_size	int	Prevent repeating n-grams of this size
dry_multiplier	float	DRY ("Don't Repeat Yourself") repetition penalty multiplier
dry_base	float	DRY base value
mirostat	int	Mirostat mode (`0` = off, `1` or `2`)
guided_json	object	JSON schema to constrain output structure
guided_regex	string	Regex pattern to constrain output
guided_choice	array	Array of allowed choices for output

Example

curl https://api.arliai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a coding expert."},
      {"role": "user", "content": "Write a Python function to sort a list."}
    ],
    "temperature": 0.6,
    "top_p": 0.9,
    "max_tokens": 500,
    "repetition_penalty": 1.1
  }'
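The guided decoding parameters in the table above are not exercised by the curl example. As a hedged sketch, a request body using `guided_json` might be assembled like this in Python. The schema, the prompt, and the helper name are invented for illustration, and how strictly the server enforces the schema should be confirmed against Arli AI's own documentation.

```python
import json


def guided_json_body(model, prompt, schema, max_tokens=200):
    """Build a chat body whose output is constrained by a JSON schema (sketch)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # guided_json takes the schema as an object, per the parameter table.
        "guided_json": schema,
    }


# Hypothetical schema: a city name plus its population.
CITY_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "population": {"type": "integer"},
    },
    "required": ["city", "population"],
}

body = guided_json_body("Qwen/Qwen2.5-72B-Instruct", "Describe Paris as JSON.", CITY_SCHEMA)
print(json.dumps(body, indent=2))
```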

POST /v1/completions

Raw text completion. Uses prompt instead of messages. Same extended parameters as chat.

curl https://api.arliai.com/v1/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "prompt": "Complete this: The future of AI is",
    "max_tokens": 100,
    "temperature": 0.8
  }'

GET /v1/models

List all 117 available text models.

curl https://api.arliai.com/v1/models \
  -H "Authorization: Bearer sk-YOUR_KEY"

POST /v1/tokenize

Count tokens for a given input.

curl https://api.arliai.com/v1/tokenize \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [{"role": "user", "content": "Count my tokens"}]
  }'

Arli AI — Image Generation

Base URL: https://api.arliai.com/v1

Arli AI provides Stable Diffusion-based image generation with 74 image models, including FLUX, SDXL, SD 1.5, and anime/pixel art variants.

POST /v1/txt2img

Generate images from text prompts.

Parameters

Parameter	Type	Required	Description
sd_model_checkpoint	string	yes	Model name (e.g. `FLUX.2-klein-4B`)
prompt	string	yes	Text description of the desired image
negative_prompt	string	no	Things to avoid in the image
steps	int	no	Denoising steps. Higher = more quality, slower. Default: `20`
sampler_name	string	no	Sampler algorithm (e.g. `euler`, `dpmpp_2m`)
width	int	no	Image width in pixels
height	int	no	Image height in pixels
seed	int	no	Random seed for reproducibility. `-1` = random
cfg_scale	float	no	Classifier-free guidance scale. Higher = closer to prompt. Default: `7`
batch_size	int	no	Number of images to generate. Default: `1`

Response

{
  "images": ["iVBORw0KGgoAAAANSUhEUgAA...base64..."],
  "info": {
    "prompt": "a sunset over mountains",
    "seed": 1234567890,
    "steps": 20,
    "cfg_scale": 7,
    "sampler_name": "euler",
    "width": 512,
    "height": 512
  }
}
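The `images` array may hold more than one entry when `batch_size` is greater than 1. A minimal Python sketch for writing every entry to disk follows. `save_images` is a hypothetical helper, and the `.png` extension is an assumption based on the PNG signature visible in the sample payload.

```python
import base64
import os
import tempfile


def save_images(response, directory, prefix="output"):
    """Decode every base64 entry in response["images"] and write it to a file."""
    paths = []
    for index, encoded in enumerate(response["images"]):
        path = os.path.join(directory, f"{prefix}_{index}.png")
        with open(path, "wb") as handle:
            handle.write(base64.b64decode(encoded))
        paths.append(path)
    return paths


# Demo with fake payloads standing in for real image data.
with tempfile.TemporaryDirectory() as tmp:
    fake = {"images": [base64.b64encode(b"one").decode("ascii"),
                       base64.b64encode(b"two").decode("ascii")]}
    print(save_images(fake, tmp))
```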

Example

curl https://api.arliai.com/v1/txt2img \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sd_model_checkpoint": "FLUX.2-klein-4B",
    "prompt": "a serene mountain lake at sunset, photorealistic, 4K",
    "negative_prompt": "blurry, low quality, distorted",
    "steps": 25,
    "sampler_name": "euler",
    "width": 1024,
    "height": 768,
    "seed": -1,
    "cfg_scale": 7
  }' | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = base64.b64decode(data['images'][0])
with open('output.png', 'wb') as f: f.write(img)
print('Saved output.png')
"

POST /v1/img2img

Transform an existing image based on a text prompt.

Parameters

Parameter	Type	Required	Description
sd_model_checkpoint	string	yes	Model name
prompt	string	yes	Desired transformation
init_images	array	yes	Array of base64-encoded input images
negative_prompt	string	no	Things to avoid
denoising_strength	float	no	How much to alter the image (0–1). Higher = more change. Default: `0.75`
mask	string	no	Base64 mask image (white = inpaint, black = preserve)
mask_blur	int	no	Blur radius for mask edges
steps	int	no	Denoising steps
sampler_name	string	no	Sampler algorithm
cfg_scale	float	no	CFG scale

Example

curl https://api.arliai.com/v1/img2img \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sd_model_checkpoint": "FLUX.2-klein-4B",
    "prompt": "turn daytime scene into night with stars",
    "init_images": ["'"$(base64 -w0 input.png)"'"],
    "denoising_strength": 0.6,
    "steps": 25,
    "cfg_scale": 7
  }'
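The `-w0` flag in the curl example belongs to GNU `base64` and may not be available on every platform. A portable alternative is to build the body in Python, sketched below under the assumption that the fields behave as the parameter table describes. `img2img_body` is a hypothetical helper name.

```python
import base64


def img2img_body(model, prompt, image_bytes, denoising_strength=0.75, mask_bytes=None):
    """Build an img2img request body from raw image bytes (hypothetical helper)."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    body = {
        "sd_model_checkpoint": model,
        "prompt": prompt,
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "denoising_strength": denoising_strength,
    }
    if mask_bytes is not None:
        # White mask pixels are inpainted, black pixels are preserved.
        body["mask"] = base64.b64encode(mask_bytes).decode("ascii")
    return body
```

Serialize the result with `json.dumps` and POST it to `/v1/img2img`.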

Arli AI — Upscaling

POST /v1/upscale-img

Upscale an image using ESRGAN models.

Parameters

Parameter	Type	Required	Description
image	string	yes	Base64-encoded image to upscale
upscaler_1	string	no	Upscaler model name. Default: `ESRGAN_4x`

Response

{
  "images": ["iVBORw0KGgoAAAANSUhEUgAA...base64..."],
  "info": { "upscaler": "ESRGAN_4x", "scale_factor": 4 }
}

Example

curl https://api.arliai.com/v1/upscale-img \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "'"$(base64 -w0 photo.png)"'",
    "upscaler_1": "ESRGAN_4x"
  }' | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = base64.b64decode(data['images'][0])
with open('upscaled.png', 'wb') as f: f.write(img)
print('Saved upscaled.png (4x)')
"

Arli AI — Utility Endpoints

GET /sdapi/v1/sd-models

List all 74 available image generation models.

curl https://api.arliai.com/sdapi/v1/sd-models \
  -H "Authorization: Bearer sk-YOUR_KEY"

[
  { "title": "FLUX.2-klein-4B", "model_name": "FLUX.2-klein-4B" },
  { "title": "SDXL-zukiBestAnimeMix", "model_name": "SDXL-zukiBestAnimeMix" },
  { "title": "LUMINA-Turbo", "model_name": "LUMINA-Turbo" }
]

GET /v1/img-samplers

List available sampler algorithms.

curl https://api.arliai.com/v1/img-samplers \
  -H "Authorization: Bearer sk-YOUR_KEY"

GET /v1/upscalers

List available upscaler models.

curl https://api.arliai.com/v1/upscalers \
  -H "Authorization: Bearer sk-YOUR_KEY"

GET /v1/img-options

Get current image generation options and defaults.

curl https://api.arliai.com/v1/img-options \
  -H "Authorization: Bearer sk-YOUR_KEY"

GET /v1/parallel-requests

Check rate limits for image generation.

curl https://api.arliai.com/v1/parallel-requests \
  -H "Authorization: Bearer sk-YOUR_KEY"

{
  "parallel_requests": 6,
  "remaining": 5,
  "message": "You can run up to 6 parallel requests"
}

Vision / OCR Guide

Featherless AI Vision

Images accepted via URL or base64 (data URI format).

Model	Best For	Image Input
google/gemma-3-27b-it	OCR, general vision	URL or base64
google/gemma-3-12b-it	Balanced speed/quality	URL or base64
google/gemma-3-4b-it	Fast vision analysis	URL or base64
mistralai/Mistral-Small-3.1-24B-Instruct-2503	Document understanding	URL or base64
mistralai/Mistral-Small-3.2-24B-Instruct-2506	Document understanding	URL or base64
mistralai/Magistral-Small-2506-24B-Instruct	Multimodal tasks	URL or base64

Arli AI Vision

Images accepted via base64 ONLY (URL input is not supported).

Model	Best For	Image Input
Qwen/Qwen3.5-27B-Derestricted	General vision	base64 only
Qwen/Qwen3.5-VL	Document/vision	base64 only

OCR Best Practice

Use google/gemma-3-27b-it on Featherless with a clear OCR prompt:

{
  "model": "google/gemma-3-27b-it",
  "messages": [{
    "role": "user",
    "content": [
      { "type": "text", "text": "Extract all text from this image. Preserve formatting and line breaks." },
      { "type": "image_url", "image_url": { "url": "https://example.com/document.png" } }
    ]
  }],
  "max_tokens": 2000,
  "temperature": 0.1
}

Model Recommendations

Task	Provider	Best Model	Why
Chat	Featherless	Qwen/Qwen2.5-72B-Instruct	Largest context, smartest
Coding	Featherless	Qwen/Qwen2.5-Coder-32B-Instruct	Dedicated code model
Reasoning	Featherless	deepseek-ai/DeepSeek-R1-Distill-Qwen-32B	Chain of thought
Vision / OCR	Featherless	google/gemma-3-27b-it	Best OCR accuracy
Image Gen	Arli AI	FLUX.2-klein-4B	Fastest, good quality
Anime Art	Arli AI	SDXL-zukiBestAnimeMix	Specialized anime model
Photo Realistic	Arli AI	LUMINA-Turbo	High quality photos
Upscaling	Arli AI	ESRGAN_4x	Only upscaler available
Fast Text	Featherless	Qwen/Qwen2.5-0.5B-Instruct	Under 1s responses
Creative Writing	Featherless	vicgalle/Roleplay-Llama-3-8B	Roleplay specialist

Quick Start by Use Case

Pick your goal and find the right endpoint and example.

Analyze a screenshot

Featherless — POST /v1/chat/completions with google/gemma-3-27b-it. Send the image as a URL or base64 data URI in the message's content array.

Generate an image

Arli AI — POST /v1/txt2img with sd_model_checkpoint and prompt. Returns base64 image(s).

Extract text from an image (OCR)

Featherless — POST /v1/chat/completions with google/gemma-3-27b-it. Use prompt: "Extract all text from this image." + image in content array. Set temperature: 0.1.

Upscale an image

Arli AI — POST /v1/upscale-img with base64 image. Use upscaler_1: "ESRGAN_4x". Returns 4x upscaled base64 image.

Edit / transform an image

Arli AI — POST /v1/img2img with init_images (base64 array) + prompt. Control change intensity with denoising_strength.

Build a chatbot

Featherless — POST /v1/chat/completions with Qwen/Qwen2.5-72B-Instruct. Maintain conversation history in the messages array.
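Maintaining history means resending every earlier turn with each request. A minimal Python sketch of that bookkeeping follows. The helper names `append_turn` and `next_request` are hypothetical, and no network call is made here.

```python
def append_turn(history, user_text, assistant_text):
    """Return a new history list with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


def next_request(model, history, user_text):
    """Build the body for the next turn: prior history plus the new user message."""
    return {
        "model": model,
        "messages": history + [{"role": "user", "content": user_text}],
    }


history = [{"role": "system", "content": "You are a helpful assistant."}]
history = append_turn(history, "Hi", "Hello! How can I help you today?")
body = next_request("Qwen/Qwen2.5-72B-Instruct", history, "What is a token?")
print([m["role"] for m in body["messages"]])  # ['system', 'user', 'assistant', 'user']
```

Because the full history counts toward the prompt, long conversations grow the token usage of every request.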

Get coding help

Featherless — POST /v1/chat/completions with Qwen/Qwen2.5-Coder-32B-Instruct. Optimized for code generation, explanation, and debugging.

Creative writing / roleplay

Featherless — POST /v1/chat/completions with vicgalle/Roleplay-Llama-3-8B. Specialized for creative and narrative tasks.

Error Handling

Common Error Codes

Code	Meaning	Cause
400	Bad Request	Missing required parameters, invalid JSON, or malformed request body
401	Unauthorized	Missing or invalid `Authorization` header
403	Forbidden	API key does not have access to the requested model or endpoint
404	Not Found	Endpoint or model does not exist
422	Validation Error	Parameter values out of range or wrong type
429	Rate Limited	Too many requests. Retry after the time in the `Retry-After` header
500	Server Error	Internal server error. Retry with exponential backoff
502/503	Bad Gateway / Service Unavailable	Model is loading or service is temporarily down

Error Response Format

{
  "error": {
    "message": "Model 'nonexistent-model' not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
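A client can reduce this structure to a single log line. The Python sketch below is one possible approach: `describe_error` is a hypothetical helper, pairing the sample body with status 404 is an assumption, and the fallback branch exists because gateway errors may not return JSON at all.

```python
import json

# Sample body copied from the error format shown above.
SAMPLE_BODY = """{"error": {"message": "Model 'nonexistent-model' not found", "type": "invalid_request_error", "code": "model_not_found"}}"""


def describe_error(status, body_text):
    """Turn an error response into a one-line message; tolerate non-JSON bodies."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        # Proxies and gateways may answer with HTML or plain text instead of JSON.
        return f"HTTP {status}: unrecognized error body"
    code = error.get("code", "unknown")
    message = error.get("message", "no message")
    return f"HTTP {status} [{code}]: {message}"


print(describe_error(404, SAMPLE_BODY))
```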

Rate Limiting Behavior

Featherless: Subject to per-plan request limits. Returns 429 with Retry-After header.
Arli AI Text: Standard rate limits apply per API key.
Arli AI Image: Maximum 6 concurrent requests. Check GET /v1/parallel-requests for remaining capacity.

Best Practices

Always handle 429 by reading the Retry-After header and waiting before retrying
Use exponential backoff for 500 errors (wait 1s, 2s, 4s...)
Check GET /v1/models to verify model availability before sending requests
Use POST /v1/tokenize to count tokens before sending large prompts
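The first two practices can be combined in one retry loop, sketched here in Python. Several details are assumptions: `send` is a caller-supplied function returning `(status, headers, body)`, `Retry-After` is taken to be a number of seconds, and 502/503 are retried the same way as 500.

```python
import time


def call_with_retries(send, max_attempts=4, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if attempt == max_attempts - 1:
            break  # Out of attempts: return whatever the last call produced.
        if status == 429:
            # Honor the server's Retry-After value; default to 1s if it is absent.
            sleep(float(headers.get("Retry-After", 1)))
        elif status in (500, 502, 503):
            # Exponential backoff by attempt index: 1s, 2s, 4s, ...
            sleep(2 ** attempt)
        else:
            break  # Success or a non-retryable error.
    return status, body


# Demo with canned responses and a recording sleep, so nothing actually waits.
canned = iter([(429, {"Retry-After": "3"}, ""), (500, {}, ""), (200, {}, "ok")])
waits = []
print(call_with_retries(lambda: next(canned), sleep=waits.append), waits)
```

Injecting `sleep` keeps the loop testable; in production the default `time.sleep` applies the real delays.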

Playground Links

Interactive web UIs for testing all providers without writing code.

Playground	URL	Description
Main Hub	test.kim8.s4s.host	Central dashboard with links to all tools
Vision / OCR	vision.kim8.s4s.host	Upload images and analyze with vision models
Text Models	featherless.kim8.s4s.host/test.php	Text model playground with model selector
Per-Model	playground.php?model=X	Direct link to a specific model
Image Generation	generate.kim8.s4s.host	Text-to-image, img2img, upscaling
