Reference Documentation

Complete technical reference for Inferno AI, covering API endpoints, CLI commands, system architecture, and the changelog.

Reference Sections

API Reference

Complete OpenAI-compatible API documentation with request/response examples.

CLI Reference

Comprehensive command-line interface documentation for all 45+ commands.

Architecture

Deep dive into Inferno’s system architecture and design decisions.

Changelog

Version history, release notes, and migration guides.

Quick Reference

Common API Endpoints

# Chat completion
POST /v1/chat/completions
 
# List models
GET /v1/models
 
# Embeddings
POST /v1/embeddings

Common CLI Commands

# Run inference
inferno run --model MODEL --prompt "text"
 
# Start server
inferno serve --port 8080
 
# Manage models
inferno models list
inferno models download MODEL

API Quick Start

Authentication

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
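
For scripted clients, the same authenticated request can be sketched with Python's standard library. This is a minimal illustration rather than an official client: `BASE_URL` and the helper name `build_chat_request` are assumptions made for this example, and the function only builds the request so it can be inspected before anything is sent.

```python
import json
import urllib.request

# Assumption: the server from the curl example, listening on localhost:8080.
BASE_URL = "http://localhost:8080"


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to /v1/chat/completions.

    Hypothetical helper for illustration; it mirrors the two headers
    used in the curl command above.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With a server running, the request could then be sent with `urllib.request.urlopen(build_chat_request(key, payload))`.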

Basic Request

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-2-7b-chat",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
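
The request body above, and the handling of a reply, might look like this in Python. The sketch assumes the response follows the usual OpenAI chat completion shape (`choices[0].message.content`), which the API's stated OpenAI compatibility suggests but which this page does not show. Both function names are hypothetical.

```python
import json


def build_chat_payload(model: str, user_message: str) -> dict:
    """Return the JSON body used in the curl example: one user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }


def extract_reply(response_body: str) -> str:
    """Pull the assistant text out of a chat completion response.

    Assumption: the body has the OpenAI-style shape
    {"choices": [{"message": {"role": ..., "content": ...}}]}.
    """
    data = json.loads(response_body)
    return data["choices"][0]["message"]["content"]
```

Keeping payload construction and reply extraction separate from the HTTP call makes both easy to check without a running server.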

CLI Quick Start

Essential Commands

# Version and info
inferno --version
inferno info
 
# Models
inferno models download llama-2-7b-chat
inferno models list
 
# Run inference
inferno run --model llama-2-7b-chat --prompt "Hello"
 
# Start server
inferno serve
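
When driving the CLI from a script, the commands above can be assembled as argument lists and passed to `subprocess`. The following is a rough sketch, assuming `inferno` is on `PATH` and that `inferno run` writes its generated text to stdout (this page does not confirm the latter). The helper names are hypothetical, and only flags shown on this page are used.

```python
import subprocess
from typing import List, Optional


def run_args(model: str, prompt: str) -> List[str]:
    """Argument list for: inferno run --model MODEL --prompt "text"."""
    return ["inferno", "run", "--model", model, "--prompt", prompt]


def serve_args(port: Optional[int] = None) -> List[str]:
    """Argument list for: inferno serve, optionally with --port."""
    args = ["inferno", "serve"]
    if port is not None:
        args += ["--port", str(port)]
    return args


def run_inference(model: str, prompt: str) -> str:
    """Run `inferno run` and return whatever it writes to stdout.

    Assumption: the generated text goes to stdout; a non-zero exit
    status raises subprocess.CalledProcessError.
    """
    result = subprocess.run(
        run_args(model, prompt), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.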