This guide covers installation of Inferno AI on all supported platforms.
Choose your preferred installation method based on your platform and use case.
The easiest way to install Inferno on macOS is with Homebrew:
# Install Inferno via Homebrew
brew install inferno
# Verify installation
inferno --version
# Check GPU support (Apple Silicon)
inferno info --gpu
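Homebrew also handles updates; assuming the formula keeps the same name, you can upgrade Inferno later with:
# Upgrade to the latest release
brew upgrade inferno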
Alternatively, download the pre-built binary for macOS:
# Download the latest release
curl -LO https://github.com/ringo380/inferno/releases/latest/download/inferno-macos-universal.tar.gz
# Extract the archive
tar xzf inferno-macos-universal.tar.gz
# Move to a directory in your PATH
sudo mv inferno /usr/local/bin/
# Make executable
sudo chmod +x /usr/local/bin/inferno
# Verify installation
inferno --version
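If macOS Gatekeeper blocks the downloaded binary as unverified, you can clear the quarantine attribute. This step is an assumption and may not be needed for signed releases; only do it if you trust the download:
# Remove the quarantine attribute macOS sets on downloaded files
xattr -d com.apple.quarantine /usr/local/bin/inferno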
Metal GPU acceleration is automatically enabled on Apple Silicon Macs (M1/M2/M3/M4):
# Verify Metal support
inferno info --gpu
# Run with GPU acceleration (automatic)
inferno run --model llama-2-7b-chat --prompt "Hello!"
# Download the .deb package
wget https://github.com/ringo380/inferno/releases/latest/download/inferno-linux-amd64.deb
# Install the package
sudo dpkg -i inferno-linux-amd64.deb
# Fix any dependency issues
sudo apt-get install -f
# Verify installation
inferno --version
# Download the binary
wget https://github.com/ringo380/inferno/releases/latest/download/inferno-linux-x86_64.tar.gz
# Extract
tar xzf inferno-linux-x86_64.tar.gz
# Move to system path
sudo mv inferno /usr/local/bin/
# Make executable
sudo chmod +x /usr/local/bin/inferno
# Verify installation
inferno --version
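If you want Inferno to run as a background service on Linux, a minimal systemd unit is one option. The sketch below is an assumption, not an official unit file: the serve --port 8080 command mirrors the Docker examples later in this guide, and the paths assume the install location above.
# /etc/systemd/system/inferno.service (hypothetical unit file)
[Unit]
Description=Inferno AI inference server
After=network.target

[Service]
ExecStart=/usr/local/bin/inferno serve --port 8080
Restart=on-failure
# Consider running as a dedicated, unprivileged user:
# User=inferno

[Install]
WantedBy=multi-user.target
# Enable and start the service
sudo systemctl daemon-reload
sudo systemctl enable --now inferno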
To use NVIDIA GPUs with CUDA:
# Install CUDA Toolkit (version 11.x or 12.x)
# Follow instructions at: https://developer.nvidia.com/cuda-downloads
# Verify CUDA installation
nvidia-smi
# Inferno will automatically detect and use CUDA
inferno info --gpu
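nvidia-smi confirms the driver is working; if you installed the full CUDA Toolkit, you can also check the compiler version (assuming nvcc is on your PATH):
# Check the installed CUDA Toolkit version
nvcc --version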
To use AMD GPUs with ROCm:
# Install ROCm (Ubuntu/Debian)
# Follow instructions at: https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html
# Verify ROCm installation
rocm-smi
# Inferno will automatically detect and use ROCm
inferno info --gpu
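Similarly, rocminfo (installed with ROCm) lists the detected GPU agents if you want more detail than rocm-smi:
# List ROCm GPU agents and their gfx targets
rocminfo | grep -i gfx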
# Install using winget (Windows 10+)
winget install Inferno.InfernoAI
# Verify installation
inferno --version
Alternatively, download the inferno-windows-x64.msi installer from the releases page, run it, and then verify the installation from a new terminal:
inferno --version
# Download the Windows binary
Invoke-WebRequest -Uri "https://github.com/ringo380/inferno/releases/latest/download/inferno-windows-x64.zip" -OutFile "inferno.zip"
# Extract the archive
Expand-Archive -Path "inferno.zip" -DestinationPath "C:\Program Files\Inferno"
# Add to PATH (run as Administrator)
$machinePath = [Environment]::GetEnvironmentVariable("Path", [EnvironmentVariableTarget]::Machine)
[Environment]::SetEnvironmentVariable("Path", "$machinePath;C:\Program Files\Inferno", [EnvironmentVariableTarget]::Machine)
# Also update the current session so the command works immediately
$env:Path += ";C:\Program Files\Inferno"
# Verify installation
inferno --version
# Download and install CUDA Toolkit from NVIDIA
# https://developer.nvidia.com/cuda-downloads
# Verify installation
nvidia-smi
# Inferno will automatically detect CUDA
inferno info --gpu
Docker is the recommended method for consistent deployment across platforms.
# Pull the latest Inferno image
docker pull ringo380/inferno:latest
# Verify installation
docker run --rm ringo380/inferno:latest inferno --version
# Create directories for models and cache
mkdir -p ~/inferno-data/models
mkdir -p ~/inferno-data/cache
# Run with volume mounts
docker run -d \
--name inferno \
-p 8080:8080 \
-v ~/inferno-data/models:/data/models \
-v ~/inferno-data/cache:/data/cache \
ringo380/inferno:latest serve --port 8080
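Once the container is up, you can confirm the server is answering on the mapped port; this uses the same /v1/models route shown in the verification section below:
# Check that the API is reachable from the host
curl http://localhost:8080/v1/models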
NVIDIA GPU:
# Install NVIDIA Container Toolkit
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
# Run with GPU support
docker run -d \
--name inferno \
--gpus all \
-p 8080:8080 \
-v ~/inferno-data/models:/data/models \
ringo380/inferno:latest serve --port 8080
AMD GPU:
# Run with ROCm support
docker run -d \
--name inferno \
--device=/dev/kfd \
--device=/dev/dri \
-p 8080:8080 \
-v ~/inferno-data/models:/data/models \
ringo380/inferno:latest serve --port 8080
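For either GPU variant, you can check that the device is actually visible inside the container (assuming the container name inferno used above):
# Inspect GPU detection from inside the running container
docker exec inferno inferno info --gpu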
Create a docker-compose.yml file:
version: '3.8'

services:
  inferno:
    image: ringo380/inferno:latest
    container_name: inferno
    ports:
      - "8080:8080"
    volumes:
      - ./inferno-data/models:/data/models
      - ./inferno-data/cache:/data/cache
    command: serve --port 8080
    restart: unless-stopped
    # For NVIDIA GPU support:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: all
    #           capabilities: [gpu]
Run with:
docker-compose up -d
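To confirm the Compose service started cleanly:
# Show service status
docker-compose ps
# Follow the server logs
docker-compose logs -f inferno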
For developers or advanced users who want to build from source:
# Install Rust (all platforms)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Source the Rust environment
source $HOME/.cargo/env
# Verify Rust installation
rustc --version
cargo --version
# Clone the repository
git clone https://github.com/ringo380/inferno.git
cd inferno
# Build in release mode
cargo build --release
# The binary will be at: target/release/inferno
# Move to system path
sudo cp target/release/inferno /usr/local/bin/
# Verify installation
inferno --version
# Build with CUDA support only
cargo build --release --features cuda
# Build with ROCm support only
cargo build --release --features rocm
# Build with all GPU backends
cargo build --release --features cuda,rocm,metal
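After a feature build, the same GPU check used elsewhere in this guide shows whether the expected backend was compiled in:
# Run the freshly built binary directly from the target directory
./target/release/inferno info --gpu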
After installation, verify everything is working:
# Check version
inferno --version
# View system information
inferno info
# Check GPU support
inferno info --gpu
# List available commands
inferno --help
Expected output from inferno info:
Inferno v0.7.0
Platform: darwin-arm64
GPU: Metal (Apple M3 Pro)
# Download a small model to test
inferno models download llama-2-7b-chat
# Test with a simple prompt
inferno run --model llama-2-7b-chat --prompt "Hello, world!"
# Start the server
inferno serve --port 8080
# Test the API
curl http://localhost:8080/v1/models
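The /v1/models route suggests an OpenAI-compatible API; if that convention holds (an assumption here, not confirmed by this guide), a chat request might look like:
# Hypothetical chat completion request (OpenAI-style endpoint assumed)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-2-7b-chat",
    "messages": [{"role": "user", "content": "Hello, world!"}]
  }'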
If you get “command not found” after installation:
# Check if binary is in PATH
which inferno
# If not found, add to PATH (Linux/macOS)
export PATH="$PATH:/usr/local/bin"
# Make permanent by adding to ~/.bashrc or ~/.zshrc
echo 'export PATH="$PATH:/usr/local/bin"' >> ~/.bashrc
source ~/.bashrc
# Make sure the binary is executable
sudo chmod +x /usr/local/bin/inferno
If GPU acceleration is not detected:
# Check GPU information
inferno info --gpu
# For NVIDIA, verify CUDA installation
nvidia-smi
# For AMD, verify ROCm installation
rocm-smi
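If nvidia-smi or rocm-smi is itself missing or reports no devices, confirm the GPU is visible to the operating system before reinstalling drivers:
# List GPUs known to the OS (Linux)
lspci | grep -iE 'vga|3d|display'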
For more troubleshooting help, see the Troubleshooting Guide.
Ready to run your first inference? Head to the Quick Start Guide →