Common issues and their solutions.
Problem: inferno: command not found after installation.
Solutions:
macOS/Linux: Add Inferno to your PATH
export PATH="/usr/local/bin:$PATH"
Add to ~/.bashrc, ~/.zshrc, or equivalent.
Windows: Restart your terminal, or add the installation directory to PATH in System Environment Variables.
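Appending the export line more than once leaves duplicate entries in PATH. As a minimal sketch for macOS/Linux, the helpers below add a directory only when it is missing and then report whether the shell can resolve the binary. The function names `path_contains` and `path_prepend` are illustrative, not part of Inferno; `/usr/local/bin` is the location used in this guide.

```shell
#!/bin/sh
# Return 0 if directory $1 is already an entry in PATH, 1 otherwise.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prepend directory $1 to PATH only if it is not already present.
path_prepend() {
    if ! path_contains "$1"; then
        PATH="$1:$PATH"
        export PATH
    fi
}

# /usr/local/bin is the install location assumed in this guide.
path_prepend /usr/local/bin

# Report whether the shell can now resolve the binary.
if command -v inferno >/dev/null 2>&1; then
    echo "inferno found at: $(command -v inferno)"
else
    echo "inferno is still not on PATH; check where it was installed"
fi
```

Placing the two functions and the `path_prepend` call in a shell startup file has the same effect as the plain export, without growing PATH on every new shell.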
Problem: Permission errors when running Inferno.
Solutions:
# Make binary executable (Linux/macOS)
chmod +x /usr/local/bin/inferno
 
# Run with sudo if needed for installation
sudo mv inferno /usr/local/bin/
Problem: Error: Model 'xyz' not found
Solutions:
List available models:
inferno models list
Download the model:
inferno models download model-name
Check model path in config:
inferno config show
Problem: Model download interrupted or fails.
Solutions:
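Interrupted downloads are often a symptom of a full disk. The sketch below checks free space in the target directory before a download is started. The helper names and the 10 GB threshold are hypothetical (this guide does not state model sizes), and it assumes `INFERNO_HOME`, when set, points at the models directory.

```shell
#!/bin/sh
# Print free space, in whole gigabytes, on the filesystem holding directory $1.
free_gb() {
    # -P forces the portable one-line-per-filesystem format; -k reports 1 KiB blocks.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Return 0 if directory $1 has at least $2 GB free, non-zero otherwise.
has_space() {
    [ "$(free_gb "$1")" -ge "$2" ]
}

# Hypothetical values: adjust the directory and size to the model being downloaded.
target_dir="${INFERNO_HOME:-.}"
required_gb=10

if has_space "$target_dir" "$required_gb"; then
    echo "OK: at least ${required_gb} GB free in $target_dir"
else
    echo "Low disk space in $target_dir: $(free_gb "$target_dir") GB free"
fi
```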
Check available disk space:
df -h
Use a different models directory:
export INFERNO_HOME=/path/to/models
Problem: Inferno not using GPU acceleration.
Solutions:
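To decide which of the vendor-specific checks in this section applies, it can help to probe for the vendor tools first. This is a minimal sketch that only looks for `nvidia-smi` and `rocm-smi` on PATH; the function name `detect_gpu_tool` is illustrative, and the probe says nothing about Apple Silicon or about whether the driver itself is healthy.

```shell
#!/bin/sh
# Print which GPU vendor tool is visible on PATH: "nvidia", "amd", or "none".
detect_gpu_tool() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia"
    elif command -v rocm-smi >/dev/null 2>&1; then
        echo "amd"
    else
        echo "none"
    fi
}

case "$(detect_gpu_tool)" in
    nvidia) echo "nvidia-smi found: follow the NVIDIA (CUDA) steps" ;;
    amd)    echo "rocm-smi found: follow the AMD (ROCm) steps" ;;
    none)   echo "No NVIDIA or AMD tool on PATH: install the toolkit or use --device cpu" ;;
esac
```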
NVIDIA (CUDA):
# Check CUDA installation
nvidia-smi
 
# Install CUDA toolkit if missing
# https://developer.nvidia.com/cuda-downloads
AMD (ROCm):
# Check ROCm installation
rocm-smi
 
# Install ROCm if missing
# https://rocm.docs.amd.com/
Apple Silicon:
Force CPU mode if needed:
inferno run --device cpu --model llama2 --prompt "test"
Problem: Error: Out of GPU memory
Solutions:
Reduce max_tokens:
inferno run --model llama2 --prompt "test" --max-tokens 100
Fall back to CPU:
inferno run --device cpu --model llama2 --prompt "test"
Problem: Error: Address already in use
Solutions:
Use a different port:
inferno serve --port 3000
Find and kill the process using the port:
# Linux/macOS
lsof -i :8080
kill -9 <PID>
 
# Windows
netstat -ano | findstr :8080
taskkill /PID <PID> /F
Problem: Cannot connect to API server.
Solutions:
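A server that was started a moment ago may not be accepting connections yet, so a single failed request is not conclusive. As a minimal sketch, the health check used in this section can be wrapped in a retry loop. The function name `wait_for_health`, the attempt count, and the delay are illustrative; the URL is the one from this guide.

```shell
#!/bin/sh
# Poll URL $1 up to $2 times, sleeping $3 seconds between attempts.
# Returns 0 as soon as the endpoint answers with a success status, 1 otherwise.
wait_for_health() {
    url=$1
    attempts=$2
    delay=$3
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: treat HTTP errors as failure; -s: quiet; --max-time: do not hang.
        if curl -fs --max-time 2 "$url" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only if another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage:
#   wait_for_health http://localhost:8080/health 10 1 && echo "server is up"
```

If the loop never succeeds, the server is most likely not running or is bound to a different host or port.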
Check that the server is running:
curl http://localhost:8080/health
Bind to 0.0.0.0 to accept connections from other machines:
inferno serve --host 0.0.0.0 --port 8080
Problem: Inference taking too long.
Solutions:
# CPU and memory usage
top
htop
 
# GPU usage (NVIDIA)
nvidia-smi
Problem: Inferno using too much RAM.
Solutions:
Limit the number of workers:
inferno serve --workers 2
Problem: Docker container exits immediately.
Solutions:
Check logs:
docker logs <container-id>
Run interactively:
docker run -it ghcr.io/ringo380/inferno:latest /bin/bash
Verify volume mounts:
docker run -v ~/models:/models ghcr.io/ringo380/inferno:latest
Problem: GPU not detected in Docker container.
Solutions:
Use the --gpus all flag:
docker run --gpus all ghcr.io/ringo380/inferno:latest
Get more detailed output:
inferno --verbose run --model llama2 --prompt "test"
 
# Or set log level
export INFERNO_LOG_LEVEL=debug
inferno serve
View current configuration:
inferno config show
Run diagnostics:
inferno --version
inferno models list
inferno config show
If you’re still experiencing issues, include the following when asking for help:
Your Inferno version (inferno --version)
Verbose output (--verbose)

| Error | Cause | Solution | 
|---|---|---|
| Model not found | Model not downloaded | Run inferno models download MODEL | 
| CUDA error | GPU driver issue | Update NVIDIA drivers | 
| Connection refused | Server not running | Start server with inferno serve | 
| Permission denied | File permissions | Run with sudo or fix permissions | 
| Out of memory | Model too large | Use smaller model or add RAM |
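
The table above can also be used as a lookup when scanning log output. The function below is a sketch of that idea: it matches a message against the table's Error column, case-insensitively, and prints the suggested fix. The function name `suggest_fix` and the loose substring patterns are assumptions; real Inferno messages may be worded differently.

```shell
#!/bin/sh
# Map an error message to the fix suggested in the table above.
# Matching is a case-insensitive substring check; returns 1 if nothing matches.
suggest_fix() {
    msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$msg" in
        *"model"*"not found"*)  echo "Run: inferno models download MODEL" ;;
        *"cuda error"*)         echo "Update NVIDIA drivers" ;;
        *"connection refused"*) echo "Start the server with: inferno serve" ;;
        *"permission denied"*)  echo "Run with sudo or fix file permissions" ;;
        *"out of"*"memory"*)    echo "Use a smaller model or add RAM" ;;
        *)
            echo "No match; rerun with --verbose for details"
            return 1
            ;;
    esac
}

# Example: a typical client-side failure when the server is not running.
suggest_fix "curl: (7) Failed to connect: Connection refused"
```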