CLI Reference

Complete reference for the Inferno command-line interface.

Global Flags

These flags work with all commands:

--help, -h       Show help information
--version, -v    Show version information
--verbose        Enable verbose logging
--quiet          Suppress non-error output

Commands

inferno run

Run inference with a model.

inferno run --model MODEL_NAME --prompt "Your prompt"

Options:

--model MODEL_NAME      Model to use for inference (required)
--prompt TEXT           Prompt text to send to the model (required)
--temperature FLOAT     Sampling temperature
--max-tokens N          Maximum number of tokens to generate
--stream                Stream tokens as they are generated

Examples:

# Basic inference
inferno run --model llama2 --prompt "Explain AI"
 
# With custom parameters
inferno run --model llama2 --prompt "Write a poem" --temperature 1.2 --max-tokens 200
 
# Streaming output
inferno run --model llama2 --prompt "Tell me a story" --stream

inferno serve

Start the API server.

inferno serve [OPTIONS]

Options:

--port PORT        Port to listen on
--host HOST        Host address to bind
--api-key KEY      Require an API key for client requests
--cors ORIGINS     Allowed CORS origins (e.g. "*")

Examples:

# Start server on default port
inferno serve
 
# Custom port and host
inferno serve --port 3000 --host 0.0.0.0
 
# With API key auth
inferno serve --api-key your-secret-key
 
# Enable CORS
inferno serve --cors "*"
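
Once the server is running, clients reach it over HTTP. The request below is only a sketch: the /v1/completions path and the JSON field names are assumptions for illustration, not documented endpoints.

# Hypothetical request against the server started on port 3000 with an API key
curl http://localhost:3000/v1/completions \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2", "prompt": "Explain AI", "max_tokens": 200}'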

inferno models

Manage AI models.

List Models

inferno models list

List all installed models and models available for download.

Download Model

inferno models download MODEL_NAME

Download a model from the registry.

Examples:

# Download a specific model
inferno models download llama2-7b
 
# Download with verbose progress output
inferno models download --verbose llama3-8b

Remove Model

inferno models remove MODEL_NAME

Remove an installed model.

Model Info

inferno models info MODEL_NAME

Show detailed information about a model.
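
The model subcommands compose into simple scripts. The sketch below assumes inferno models list prints one model name per line; that output format is an assumption.

# Download llama2-7b only if it is not already installed
if ! inferno models list | grep -q "^llama2-7b$"; then
  inferno models download llama2-7b
fi
inferno models info llama2-7b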

inferno batch

Run batch inference on multiple inputs.

inferno batch --model MODEL --input FILE --output FILE

Options:

--model MODEL_NAME   Model to use for inference
--input FILE         Input file, one JSON object per line (JSONL)
--output FILE        Output file for results (JSONL)
--parallel N         Number of inputs to process in parallel

Input Format (JSONL):

{"prompt": "First prompt", "id": "1"}
{"prompt": "Second prompt", "id": "2"}

Example:

inferno batch --model llama2 --input prompts.jsonl --output results.jsonl --parallel 4
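
A minimal end-to-end sketch: write the JSONL input with a heredoc, run the batch, and count the results. It assumes results.jsonl mirrors the input with one JSON object per line.

# Create a two-prompt input file
cat > prompts.jsonl <<'EOF'
{"prompt": "First prompt", "id": "1"}
{"prompt": "Second prompt", "id": "2"}
EOF

inferno batch --model llama2 --input prompts.jsonl --output results.jsonl --parallel 4

# One result per input line (assumes JSONL output)
wc -l results.jsonl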

inferno config

Manage configuration.

inferno config [SUBCOMMAND]

Subcommands:

show               Show the current configuration
set KEY VALUE      Set a configuration value
get KEY            Get a configuration value

Examples:

# Show config
inferno config show
 
# Set default model
inferno config set default_model llama2
 
# Get a value
inferno config get api_port

Environment Variables

Configure Inferno using environment variables:

INFERNO_HOME         Base directory for Inferno data and configuration
INFERNO_LOG_LEVEL    Logging level (e.g. debug)

Example:

export INFERNO_HOME=/data/inferno
export INFERNO_LOG_LEVEL=debug
inferno serve

Configuration File

Create ~/.inferno/config.toml for persistent configuration:

[server]
port = 8080
host = "127.0.0.1"
workers = 4
 
[inference]
default_model = "llama2"
temperature = 0.7
max_tokens = 512
 
[gpu]
enabled = true
device = "auto"  # auto, cuda, metal, rocm, intel

Exit Codes