API Reference

Complete reference for Featherless AI & Arli AI — text, vision, OCR, image generation, and upscaling

Overview

Authentication

All endpoints require a Bearer token in the Authorization header:

Authorization: Bearer sk-your-api-key-here

All APIs use REST over HTTPS with JSON request/response bodies. Both providers follow OpenAI-compatible formats for text endpoints.
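As a rough illustration of the header and JSON conventions above, the following Python sketch uses only the standard library. The helper names build_request and send are hypothetical and are not part of either provider's tooling; the key shown is a placeholder.

```python
import json
import urllib.request

FEATHERLESS_BASE_URL = "https://api.featherless.ai/v1"


def build_request(base_url, path, api_key, payload=None):
    """Build (but do not send) an authenticated request.

    With a payload the request is a JSON POST; without one it is a GET.
    """
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(base_url.rstrip("/") + path, data=data)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because the base URL already ends in /v1, the path passed to build_request would be "/chat/completions" rather than "/v1/chat/completions".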

Featherless AI — Text

Base URL: https://api.featherless.ai/v1

POST /v1/chat/completions

Chat and text generation with conversation history support.

Parameters

Parameter | Type | Required | Description
model | string | yes | Model identifier (e.g. Qwen/Qwen2.5-72B-Instruct)
messages | array | yes | Array of {role, content} objects. Roles: system, user, assistant
temperature | float | no | Sampling temperature (0–2). Lower = more focused. Default: 1.0
top_p | float | no | Nucleus sampling threshold (0–1). Default: 1.0
top_k | int | no | Top-K sampling. Limits sampling to the K most probable tokens
min_p | float | no | Minimum probability threshold. Removes tokens below this relative probability
max_tokens | int | no | Maximum tokens to generate. Default varies by model
stop | string/array | no | Stop sequences; generation halts when one is encountered
presence_penalty | float | no | Penalize new tokens based on presence (0–2)
frequency_penalty | float | no | Penalize new tokens based on frequency (0–2)
repetition_penalty | float | no | Penalty for repeating tokens (1.0 = off)
seed | int | no | Seed for deterministic sampling
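The ranges in the table can be checked client-side before a request is sent. This is a minimal sketch; build_chat_payload is a hypothetical helper, and it assumes the server would reject out-of-range values anyway, so the check only saves a round trip.

```python
def build_chat_payload(model, messages, **sampling):
    """Return a chat-completions request body after checking documented ranges."""
    documented_ranges = {
        "temperature": (0.0, 2.0),
        "top_p": (0.0, 1.0),
        "presence_penalty": (0.0, 2.0),
        "frequency_penalty": (0.0, 2.0),
    }
    allowed_roles = {"system", "user", "assistant"}

    for message in messages:
        if message.get("role") not in allowed_roles:
            raise ValueError(f"unsupported role: {message.get('role')!r}")

    for name, (low, high) in documented_ranges.items():
        value = sampling.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")

    payload = {"model": model, "messages": list(messages)}
    # Only send optional parameters that were actually set.
    payload.update({k: v for k, v in sampling.items() if v is not None})
    return payload
```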

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1742668800,
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Hello! How can I help you today?" },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 15, "completion_tokens": 8, "total_tokens": 23 }
}
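Reading the reply out of a response of this shape might look like the sketch below. extract_reply is a hypothetical helper, and the sample dictionary simply mirrors the response shown above.

```python
def extract_reply(response):
    """Return (text, finish_reason, total_tokens) from a chat completion response."""
    choice = response["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


sample_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 15, "completion_tokens": 8, "total_tokens": 23},
}

text, finish_reason, total_tokens = extract_reply(sample_response)
```

A finish_reason other than "stop" may indicate that the output was cut short, for example by max_tokens, so it is worth checking before using the text.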

Examples

curl https://api.featherless.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in one sentence."}
    ],
    "temperature": 0.7,
    "max_tokens": 200
  }'

Best for: chat, coding assistance, reasoning, OCR (with vision models)

POST /v1/completions

Raw text completion (no chat formatting). Uses prompt (string) instead of messages. Same parameters otherwise.

Key Difference from Chat

Parameter | Type | Required | Description
model | string | yes | Model identifier
prompt | string | yes | Plain text prompt (replaces messages)
temperature | float | no | Sampling temperature. Default: 1.0
max_tokens | int | no | Max tokens to generate
stop | string/array | no | Stop sequences
seed | int | no | Deterministic seed

Response

{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "choices": [{
    "text": "The answer is 42.",
    "index": 0,
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 8, "completion_tokens": 5, "total_tokens": 13 }
}

Example

curl https://api.featherless.ai/v1/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "prompt": "The meaning of life is",
    "max_tokens": 50,
    "temperature": 0.8
  }'

POST /v1/tokenize

Count tokens for a given input without generating a response.

Request Body

Parameter | Type | Required | Description
model | string | yes | Model to count tokens for
messages | array | no | Chat messages to tokenize
prompt | string | no | Plain prompt to tokenize

Example

curl https://api.featherless.ai/v1/tokenize \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }'

GET /v1/models

List all available text models.

Query Parameters

Parameter | Type | Description
available_on_current_plan | int | Filter: 0 = all models, 1 = models available on your plan

Example

curl https://api.featherless.ai/v1/models \
  -H "Authorization: Bearer sk-YOUR_KEY"

curl "https://api.featherless.ai/v1/models?available_on_current_plan=1" \
  -H "Authorization: Bearer sk-YOUR_KEY"

Response

{
  "object": "list",
  "data": [
    { "id": "Qwen/Qwen2.5-72B-Instruct", "object": "model", "owned_by": "featherless" },
    { "id": "Qwen/Qwen2.5-Coder-32B-Instruct", "object": "model", "owned_by": "featherless" }
  ]
}
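The query string and the list response can also be handled programmatically. The sketch below is illustrative only: models_url and model_ids are hypothetical helpers, and the sample data repeats the two entries from the example response.

```python
from urllib.parse import urlencode


def models_url(base_url, plan_only=False):
    """Build the model-listing URL, optionally filtered to the current plan."""
    url = base_url.rstrip("/") + "/models"
    if plan_only:
        url += "?" + urlencode({"available_on_current_plan": 1})
    return url


def model_ids(response):
    """Return the model identifiers from a model-listing response."""
    return [entry["id"] for entry in response.get("data", [])]


sample_models = {
    "object": "list",
    "data": [
        {"id": "Qwen/Qwen2.5-72B-Instruct", "object": "model", "owned_by": "featherless"},
        {"id": "Qwen/Qwen2.5-Coder-32B-Instruct", "object": "model", "owned_by": "featherless"},
    ],
}
```

Checking that a model id appears in model_ids(...) before sending a request avoids a 404 for a misspelled or unavailable model.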

Featherless AI — Vision Models

Vision models accept images alongside text in the messages array. Images can be provided via URL or base64.

Supported Vision Models

Model | Size | Notes
google/gemma-3-27b-it | 27B | Best OCR accuracy
google/gemma-3-12b-it | 12B | Good balance of speed/quality
google/gemma-3-4b-it | 4B | Fastest vision model
mistralai/Mistral-Small-3.1-24B-Instruct-2503 | 24B | Mistral Small 3.1 vision
mistralai/Mistral-Small-3.2-24B-Instruct-2506 | 24B | Mistral Small 3.2 vision
mistralai/Magistral-Small-2506-24B-Instruct | 24B | Magistral vision

Sending an Image

Use the content array format in a message. Each content item has a type of text or image_url.

Via URL

{
  "model": "google/gemma-3-27b-it",
  "messages": [{
    "role": "user",
    "content": [
      { "type": "text", "text": "What is in this image?" },
      { "type": "image_url", "image_url": { "url": "https://example.com/photo.jpg" } }
    ]
  }]
}

Via Base64

{
  "model": "google/gemma-3-27b-it",
  "messages": [{
    "role": "user",
    "content": [
      { "type": "text", "text": "Extract all text from this image." },
      { "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBORw0KGgo..." } }
    ]
  }]
}
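Building the base64 variant by hand is error-prone, so a small helper can assemble the data URI and the content array. This is a sketch under the assumption that the image is a PNG unless another MIME type is passed; image_to_data_uri and build_vision_message are hypothetical names.

```python
import base64


def image_to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data URI usable in an image_url item."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_vision_message(prompt, image_url):
    """Return a user message with one text item and one image item.

    image_url may be an https URL or a data URI.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
```

For a local file, something like build_vision_message("Extract all text from this image.", image_to_data_uri(open("scan.png", "rb").read())) would produce the message for the request body, where scan.png is a placeholder file name.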

Full cURL Example (OCR)

curl https://api.featherless.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-27b-it",
    "messages": [{
      "role": "user",
      "content": [
        { "type": "text", "text": "Extract all text from this image. Output only the text, nothing else." },
        { "type": "image_url", "image_url": { "url": "https://example.com/receipt.jpg" } }
      ]
    }],
    "max_tokens": 500
  }'

Arli AI — Text

Base URL: https://api.arliai.com/v1

Arli AI text endpoints are OpenAI-compatible, with additional sampling and guidance parameters. The service offers 117 text models.

POST /v1/chat/completions

Chat and text generation. Same base parameters as Featherless, plus Arli-specific extras:

Standard Parameters

Parameter | Type | Description
model | string | Model identifier
messages | array | {role, content} conversation array
temperature | float | Sampling temperature (0–2)
top_p | float | Nucleus sampling (0–1)
top_k | int | Top-K sampling
max_tokens | int | Maximum tokens to generate
stop | string/array | Stop sequences
presence_penalty | float | Presence penalty (0–2)
frequency_penalty | float | Frequency penalty (0–2)
seed | int | Deterministic seed

Arli-Specific Parameters

Parameter | Type | Description
top_a | float | Top-A sampling, an alternative to top_p
tfs | float | Tail-free sampling (0–1)
typical_p | float | Locally typical sampling threshold
min_p | float | Minimum probability threshold
repetition_penalty | float | Repetition penalty (1.0 = off)
no_repeat_ngram_size | int | Prevent repeating n-grams of this size
dry_multiplier | float | DRY (Don't Repeat Yourself) sampler multiplier
dry_base | float | DRY base value
mirostat | int | Mirostat mode (0 = off, 1 or 2)
guided_json | object | JSON schema to constrain output structure
guided_regex | string | Regex pattern to constrain output
guided_choice | array | Array of allowed choices for output
Example

curl https://api.arliai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a coding expert."},
      {"role": "user", "content": "Write a Python function to sort a list."}
    ],
    "temperature": 0.6,
    "top_p": 0.9,
    "max_tokens": 500,
    "repetition_penalty": 1.1
  }'

POST /v1/completions

Raw text completion. Uses prompt instead of messages. Same extended parameters as chat.

curl https://api.arliai.com/v1/completions \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "prompt": "Complete this: The future of AI is",
    "max_tokens": 100,
    "temperature": 0.8
  }'

GET /v1/models

List all 117 available text models.

curl https://api.arliai.com/v1/models \
  -H "Authorization: Bearer sk-YOUR_KEY"

POST /v1/tokenize

Count tokens for a given input.

curl https://api.arliai.com/v1/tokenize \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [{"role": "user", "content": "Count my tokens"}]
  }'

Arli AI — Image Generation

Base URL: https://api.arliai.com/v1

Arli AI provides Stable Diffusion-based image generation with 74 image models, including FLUX, SDXL, SD 1.5, and anime/pixel art variants.

POST /v1/txt2img

Generate images from text prompts.

Parameters

Parameter | Type | Required | Description
sd_model_checkpoint | string | yes | Model name (e.g. FLUX.2-klein-4B)
prompt | string | yes | Text description of the desired image
negative_prompt | string | no | Things to avoid in the image
steps | int | no | Denoising steps. Higher = more quality, slower. Default: 20
sampler_name | string | no | Sampler algorithm (e.g. euler, dpmpp_2m)
width | int | no | Image width in pixels
height | int | no | Image height in pixels
seed | int | no | Random seed for reproducibility. -1 = random
cfg_scale | float | no | Classifier-free guidance scale. Higher = closer to prompt. Default: 7
batch_size | int | no | Number of images to generate. Default: 1

Response

{
  "images": ["iVBORw0KGgoAAAANSUhEUgAA...base64..."],
  "info": {
    "prompt": "a sunset over mountains",
    "seed": 1234567890,
    "steps": 20,
    "cfg_scale": 7,
    "sampler_name": "euler",
    "width": 512,
    "height": 512
  }
}
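When batch_size is greater than 1, the images array holds several entries. The sketch below decodes all of them; decode_images and save_images are hypothetical helpers, and the .png extension is an assumption based on the PNG signature visible in the sample base64 data.

```python
import base64
from pathlib import Path


def decode_images(response):
    """Return the raw bytes of every image in a generation response."""
    return [base64.b64decode(encoded) for encoded in response["images"]]


def save_images(response, directory=".", prefix="output"):
    """Write each decoded image to <directory>/<prefix>_<index>.png."""
    paths = []
    for index, image_bytes in enumerate(decode_images(response)):
        path = Path(directory) / f"{prefix}_{index}.png"
        path.write_bytes(image_bytes)
        paths.append(path)
    return paths
```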

Example

curl https://api.arliai.com/v1/txt2img \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sd_model_checkpoint": "FLUX.2-klein-4B",
    "prompt": "a serene mountain lake at sunset, photorealistic, 4K",
    "negative_prompt": "blurry, low quality, distorted",
    "steps": 25,
    "sampler_name": "euler",
    "width": 1024,
    "height": 768,
    "seed": -1,
    "cfg_scale": 7
  }' | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = base64.b64decode(data['images'][0])
with open('output.png', 'wb') as f: f.write(img)
print('Saved output.png')
"

POST /v1/img2img

Transform an existing image based on a text prompt.

Parameters

Parameter | Type | Required | Description
sd_model_checkpoint | string | yes | Model name
prompt | string | yes | Desired transformation
init_images | array | yes | Array of base64-encoded input images
negative_prompt | string | no | Things to avoid
denoising_strength | float | no | How much to alter the image (0–1). Higher = more change. Default: 0.75
mask | string | no | Base64 mask image (white = inpaint, black = preserve)
mask_blur | int | no | Blur radius for mask edges
steps | int | no | Denoising steps
sampler_name | string | no | Sampler algorithm
cfg_scale | float | no | CFG scale
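A request body for this endpoint could be assembled as in the following sketch, which encodes the input image in Python instead of relying on a shell base64 command. build_img2img_payload is a hypothetical helper; the range check mirrors the 0–1 range given in the table.

```python
import base64


def build_img2img_payload(model, prompt, image_bytes, denoising_strength=0.75, **extra):
    """Return an img2img request body for a single input image."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    payload = {
        "sd_model_checkpoint": model,
        "prompt": prompt,
        # init_images is an array, even for a single image.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "denoising_strength": denoising_strength,
    }
    # Optional parameters such as steps, cfg_scale, or negative_prompt.
    payload.update(extra)
    return payload
```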

Example

curl https://api.arliai.com/v1/img2img \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sd_model_checkpoint": "FLUX.2-klein-4B",
    "prompt": "turn daytime scene into night with stars",
    "init_images": ["'"$(base64 -w0 input.png)"'"],
    "denoising_strength": 0.6,
    "steps": 25,
    "cfg_scale": 7
  }'

Arli AI — Upscaling

POST /v1/upscale-img

Upscale an image using ESRGAN models.

Parameters

Parameter | Type | Required | Description
image | string | yes | Base64-encoded image to upscale
upscaler_1 | string | no | Upscaler model name. Default: ESRGAN_4x

Response

{
  "images": ["iVBORw0KGgoAAAANSUhEUgAA...base64..."],
  "info": { "upscaler": "ESRGAN_4x", "scale_factor": 4 }
}

Example

curl https://api.arliai.com/v1/upscale-img \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "'"$(base64 -w0 photo.png)"'",
    "upscaler_1": "ESRGAN_4x"
  }' | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = base64.b64decode(data['images'][0])
with open('upscaled.png', 'wb') as f: f.write(img)
print('Saved upscaled.png (4x)')
"

Arli AI — Utility Endpoints

GET /sdapi/v1/sd-models

List all 74 available image generation models.

curl https://api.arliai.com/sdapi/v1/sd-models \
  -H "Authorization: Bearer sk-YOUR_KEY"

Response

[
  { "title": "FLUX.2-klein-4B", "model_name": "FLUX.2-klein-4B" },
  { "title": "SDXL-zukiBestAnimeMix", "model_name": "SDXL-zukiBestAnimeMix" },
  { "title": "LUMINA-Turbo", "model_name": "LUMINA-Turbo" }
]

GET /v1/img-samplers

List available sampler algorithms.

curl https://api.arliai.com/v1/img-samplers \
  -H "Authorization: Bearer sk-YOUR_KEY"

GET /v1/upscalers

List available upscaler models.

curl https://api.arliai.com/v1/upscalers \
  -H "Authorization: Bearer sk-YOUR_KEY"

GET /v1/img-options

Get current image generation options and defaults.

curl https://api.arliai.com/v1/img-options \
  -H "Authorization: Bearer sk-YOUR_KEY"

GET /v1/parallel-requests

Check rate limits for image generation.

curl https://api.arliai.com/v1/parallel-requests \
  -H "Authorization: Bearer sk-YOUR_KEY"

Response

{
  "parallel_requests": 6,
  "remaining": 5,
  "message": "You can run up to 6 parallel requests"
}
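A client that submits many image jobs could consult this response before starting another one. The sketch below is a pure function over the response body; has_capacity is a hypothetical name, and treating a missing remaining field as "no capacity" is a conservative assumption.

```python
def has_capacity(status):
    """Return True if the parallel-requests response reports a free slot."""
    return int(status.get("remaining", 0)) > 0


sample_status = {
    "parallel_requests": 6,
    "remaining": 5,
    "message": "You can run up to 6 parallel requests",
}
```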

Vision / OCR Guide

Featherless AI Vision

Images accepted via URL or base64 (data URI format).

Model | Best For | Image Input
google/gemma-3-27b-it | OCR, general vision | URL or base64
google/gemma-3-12b-it | Balanced speed/quality | URL or base64
google/gemma-3-4b-it | Fast vision analysis | URL or base64
mistralai/Mistral-Small-3.1-24B-Instruct-2503 | Document understanding | URL or base64
mistralai/Mistral-Small-3.2-24B-Instruct-2506 | Document understanding | URL or base64
mistralai/Magistral-Small-2506-24B-Instruct | Multimodal tasks | URL or base64

Arli AI Vision

Images accepted via base64 ONLY (URL input is not supported).

Model | Best For | Image Input
Qwen/Qwen3.5-27B-Derestricted | General vision | base64 only
Qwen/Qwen3.5-VL | Document/vision | base64 only

OCR Best Practice

Use google/gemma-3-27b-it on Featherless with a clear OCR prompt:

{
  "model": "google/gemma-3-27b-it",
  "messages": [{
    "role": "user",
    "content": [
      { "type": "text", "text": "Extract all text from this image. Preserve formatting and line breaks." },
      { "type": "image_url", "image_url": { "url": "https://example.com/document.png" } }
    ]
  }],
  "max_tokens": 2000,
  "temperature": 0.1
}

Model Recommendations

Task | Provider | Best Model | Why
Chat | Featherless | Qwen/Qwen2.5-72B-Instruct | Largest context, smartest
Coding | Featherless | Qwen/Qwen2.5-Coder-32B-Instruct | Dedicated code model
Reasoning | Featherless | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Chain of thought
Vision / OCR | Featherless | google/gemma-3-27b-it | Best OCR accuracy
Image Gen | Arli AI | FLUX.2-klein-4B | Fastest, good quality
Anime Art | Arli AI | SDXL-zukiBestAnimeMix | Specialized anime model
Photo Realistic | Arli AI | LUMINA-Turbo | High quality photos
Upscaling | Arli AI | ESRGAN_4x | Only upscaler available
Fast Text | Featherless | Qwen/Qwen2.5-0.5B-Instruct | Under 1s responses
Creative Writing | Featherless | vicgalle/Roleplay-Llama-3-8B | Roleplay specialist

Quick Start by Use Case

Pick your goal and find the right endpoint and example.

Analyze a screenshot

Featherless: POST /v1/chat/completions with google/gemma-3-27b-it. Send the image URL or base64 data in the messages content array.

Generate an image

Arli AI: POST /v1/txt2img with sd_model_checkpoint and prompt. Returns base64 image(s).

Extract text from an image (OCR)

Featherless: POST /v1/chat/completions with google/gemma-3-27b-it. Put the prompt "Extract all text from this image." and the image in the content array, and set temperature to 0.1.

Upscale an image

Arli AI: POST /v1/upscale-img with a base64 image. Use upscaler_1: "ESRGAN_4x". Returns a 4x upscaled base64 image.

Edit / transform an image

Arli AI: POST /v1/img2img with init_images (base64 array) and prompt. Control change intensity with denoising_strength.

Build a chatbot

Featherless: POST /v1/chat/completions with Qwen/Qwen2.5-72B-Instruct. Maintain conversation history in the messages array.
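Maintaining history means resending every earlier turn with each request and appending the assistant's reply once it arrives. A minimal sketch of that bookkeeping follows; the Conversation class is hypothetical and does no networking itself.

```python
class Conversation:
    """Keeps the messages array for a multi-turn chat."""

    def __init__(self, model, system_prompt=None):
        self.model = model
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def request_body(self, user_text, **params):
        """Record the user's turn and return the body for the next request."""
        self.messages.append({"role": "user", "content": user_text})
        return {"model": self.model, "messages": list(self.messages), **params}

    def record_reply(self, response):
        """Store the assistant's reply from a chat completion response."""
        reply = response["choices"][0]["message"]["content"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Long conversations grow the prompt with every turn, so trimming or summarizing old messages may be needed to stay within a model's context limit.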

Get coding help

Featherless: POST /v1/chat/completions with Qwen/Qwen2.5-Coder-32B-Instruct. Optimized for code generation, explanation, and debugging.

Creative writing / roleplay

Featherless: POST /v1/chat/completions with vicgalle/Roleplay-Llama-3-8B. Specialized for creative and narrative tasks.

Error Handling

Common Error Codes

Code | Meaning | Cause
400 | Bad Request | Missing required parameters, invalid JSON, or malformed request body
401 | Unauthorized | Missing or invalid Authorization header
403 | Forbidden | API key does not have access to the requested model or endpoint
404 | Not Found | Endpoint or model does not exist
422 | Validation Error | Parameter values out of range or wrong type
429 | Rate Limited | Too many requests. Retry after the time in the Retry-After header
500 | Server Error | Internal server error. Retry with exponential backoff
502/503 | Service Unavailable | Model is loading or service is temporarily down

Error Response Format

{
  "error": {
    "message": "Model 'nonexistent-model' not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
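Turning an error body of this shape into a readable message might look like the sketch below. describe_error is a hypothetical helper; it falls back to the raw body because gateway errors such as 502 may not return JSON at all, which is an assumption rather than documented behavior.

```python
import json


def describe_error(status_code, body_text):
    """Summarize an error response as a single line of text."""
    try:
        parsed = json.loads(body_text)
    except ValueError:
        parsed = None

    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        # Not the documented format; show whatever the server sent.
        return f"HTTP {status_code}: {body_text.strip() or 'no response body'}"

    message = error.get("message", "unknown error")
    return f"HTTP {status_code}: {message} (type={error.get('type')}, code={error.get('code')})"
```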

Rate Limiting Behavior

  • Featherless: Subject to per-plan request limits. Returns 429 with Retry-After header.
  • Arli AI Text: Standard rate limits apply per API key.
  • Arli AI Image: Maximum 6 concurrent requests. Check GET /v1/parallel-requests for remaining capacity.

Best Practices

  • Always handle 429 by reading the Retry-After header and waiting before retrying
  • Use exponential backoff for 500 errors (wait 1s, 2s, 4s...)
  • Check GET /v1/models to verify model availability before sending requests
  • Use POST /v1/tokenize to count tokens before sending large prompts
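The first two practices can be combined into one retry policy. The following is a sketch, not a prescribed implementation: retry_delay is a hypothetical helper, it only understands Retry-After given in seconds, and applying backoff to 502/503 as well as 500 is an assumption.

```python
def retry_delay(status_code, attempt, retry_after=None, max_delay=60.0):
    """Return seconds to wait before retrying, or None if a retry is pointless.

    attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    backoff = min(float(2 ** attempt), max_delay)  # 1s, 2s, 4s, ...

    if status_code == 429:
        try:
            # Prefer the server's hint when it is a plain number of seconds.
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            return backoff
    if status_code in (500, 502, 503):
        return backoff
    # 400, 401, 403, 404, 422: retrying the same request will not help.
    return None
```

A caller would typically loop a bounded number of times, sleeping for the returned delay with time.sleep and giving up when the function returns None.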

Playground Links

Interactive web UIs for testing all providers without writing code.

Playground | URL | Description
Main Hub | test.kim8.s4s.host | Central dashboard with links to all tools
Vision / OCR | vision.kim8.s4s.host | Upload images and analyze with vision models
Text Models | featherless.kim8.s4s.host/test.php | Text model playground with model selector
Per-Model | playground.php?model=X | Direct link to a specific model
Image Generation | generate.kim8.s4s.host | Text-to-image, img2img, upscaling