This guide will help you install Inferno and run your first AI inference.
Choose the installation method for your platform:
The easiest way to install on macOS is via Homebrew:
brew install inferno
Alternatively, download the desktop app from the downloads page.
On Linux, download the latest binary from GitHub Releases:
# Download and extract (replace VERSION with latest)
curl -LO https://github.com/ringo380/inferno/releases/download/VERSION/inferno-linux-x64.tar.gz
tar -xzf inferno-linux-x64.tar.gz
# Move to your PATH
sudo mv inferno /usr/local/bin/
Or install via Cargo:
cargo install inferno
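Note that cargo install builds Inferno from source, so it needs a Rust toolchain. If you don't have one yet, the standard rustup installer (for Linux and macOS) sets it up:
# Install a Rust toolchain via rustup, then retry cargo install
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh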
On Windows, download the installer from GitHub Releases or install via Cargo:
cargo install inferno
To run Inferno in Docker, pull the image from GitHub Container Registry:
docker pull ghcr.io/ringo380/inferno:latest
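As a sketch of running the image, assuming the container's entrypoint is the inferno CLI and that /models is a sensible mount point for local model files (both are assumptions here, not documented defaults):
# Run the API server in a container; the port mapping and the /models
# mount path are assumptions -- adjust them to your setup
docker run --rm -p 8080:8080 -v "$HOME/models:/models" ghcr.io/ringo380/inferno:latest serve --port 8080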
Check that Inferno is installed correctly:
inferno --version
Let’s run a simple inference to make sure everything works:
inferno run --model llama2 --prompt "Hello, world!"
This command will download the llama2 model if it isn't already present locally, then run inference on your prompt.
See what models are available:
inferno models list
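Any name from that list can be substituted into the same run command; <model-name> below is a placeholder:
# Run inference with a model from the list (placeholder name)
inferno run --model <model-name> --prompt "What can you do?"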
To use Inferno with OpenAI-compatible clients, start the API server:
inferno serve --port 8080
The server will start at http://localhost:8080 with OpenAI-compatible endpoints.
Once the server is running, test it with curl:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
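Most OpenAI-compatible servers also expose the standard model-listing route; assuming Inferno follows suit (not confirmed here), you can query it the same way:
# List available models over the API (assumes the standard /v1/models endpoint)
curl http://localhost:8080/v1/models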