Reference Documentation

Complete technical reference for Inferno AI, including API endpoints, CLI commands, system architecture, and changelog.

Reference Sections

API Reference

Complete OpenAI-compatible API documentation with request/response examples.

Includes:

Chat Completions
Embeddings
Model Management
Authentication
Error Handling

CLI Reference

Comprehensive command-line interface documentation for all 45+ commands.

Includes:

Model commands
Server commands
Configuration
Utilities
Examples

Architecture

Deep dive into Inferno’s system architecture and design decisions.

Includes:

System Overview
Component Architecture
GPU Integration
Performance Design

Changelog

Version history, release notes, and migration guides.

Includes:

Latest releases
Breaking changes
New features
Bug fixes

Quick Reference

Common API Endpoints

# Chat completion
POST /v1/chat/completions

# List models
GET /v1/models

# Embeddings
POST /v1/embeddings

Common CLI Commands

# Run inference
inferno run --model MODEL --prompt "text"

# Start server
inferno serve --port 8080

# Manage models
inferno models list
inferno models download MODEL

API Quick Start

Authentication

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
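For callers working from Python rather than curl, the same two headers can be assembled with a small helper. This is a minimal sketch, not part of Inferno: the function name `auth_headers` and the environment variable `INFERNO_API_KEY` are hypothetical names chosen for illustration. Only the `Authorization: Bearer ...` and `Content-Type` headers come from the example above.

```python
import os


def auth_headers(api_key=None, env_var="INFERNO_API_KEY"):
    """Return the HTTP headers used by the curl example above.

    INFERNO_API_KEY is a hypothetical variable name for this sketch;
    the documentation does not define one.
    """
    # Prefer an explicitly passed key, then fall back to the environment.
    key = api_key or os.environ.get(env_var)
    if not key:
        raise ValueError(f"no API key given and {env_var} is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source files and shell history, which is why the helper falls back to it when no key is passed.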

Basic Request

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-2-7b-chat",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
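The same request can be sent from Python using only the standard library. The sketch below assumes the server is listening on `localhost:8080`, as in the curl example, and that the response follows the OpenAI chat-completions shape (`choices[0].message.content`). This page implies that shape by describing the API as OpenAI-compatible, but does not spell it out. The names `build_chat_request`, `extract_reply`, and `chat` are illustrative, not part of Inferno.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: same host/port as the curl example


def build_chat_request(model, user_text, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of a parsed response.

    Assumes an OpenAI-style body: {"choices": [{"message": {"content": ...}}]}.
    """
    return response_body["choices"][0]["message"]["content"]


def chat(model, user_text, base_url=BASE_URL):
    """Send the request and return the reply text (requires a running server)."""
    request = build_chat_request(model, user_text, base_url)
    with urllib.request.urlopen(request) as response:
        return extract_reply(json.load(response))
```

With a server running, `chat("llama-2-7b-chat", "Hello!")` would issue the request shown above. Splitting request construction from sending makes the payload easy to inspect without a live server.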

CLI Quick Start

Essential Commands

# Version and info
inferno --version
inferno info

# Models
inferno models download llama-2-7b-chat
inferno models list

# Run inference
inferno run --model llama-2-7b-chat --prompt "Hello"

# Start server
inferno serve
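When the CLI needs to be driven from a script, the `inferno run` invocation above can be wrapped with Python's `subprocess` module. This is a rough sketch under two assumptions that this page does not confirm: that the `inferno` executable is on `PATH`, and that `inferno run` writes the generated text to standard output. The helper names are hypothetical.

```python
import shutil
import subprocess


def inferno_run_argv(model, prompt):
    """Argument list equivalent to: inferno run --model MODEL --prompt "text"."""
    return ["inferno", "run", "--model", model, "--prompt", prompt]


def run_inference(model, prompt):
    """Run the CLI and return whatever it printed to stdout.

    Assumption: the generated text is written to stdout.
    """
    if shutil.which("inferno") is None:
        raise FileNotFoundError("inferno executable not found on PATH")
    result = subprocess.run(
        inferno_run_argv(model, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list, rather than as a single shell string, avoids quoting problems when the prompt contains spaces or quotation marks.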
