Common issues and their solutions.
Problem: inferno: command not found after installation.
Solutions:
macOS/Linux: Add Inferno's install directory to your PATH:

```shell
export PATH="/usr/local/bin:$PATH"
```

Add the line to `~/.bashrc`, `~/.zshrc`, or your shell's equivalent so it persists across sessions.
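The export line can be made idempotent before pasting it into your rc file, so repeated shell startups do not keep growing PATH. A small sketch:

```shell
# Prepend /usr/local/bin to PATH only when it is not already present.
case ":$PATH:" in
  *:/usr/local/bin:*) ;;                        # already on PATH, do nothing
  *) export PATH="/usr/local/bin:$PATH" ;;
esac
echo "$PATH"
```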
Windows: Restart your terminal or add installation directory to PATH in System Environment Variables.
Problem: Permission errors when running Inferno.
Solutions:
```shell
# Make the binary executable (Linux/macOS)
chmod +x /usr/local/bin/inferno

# Run with sudo if needed for installation
sudo mv inferno /usr/local/bin/
```
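The check-then-fix pattern can be scripted. A sketch, demonstrated on a scratch file as a stand-in for `/usr/local/bin/inferno` (which may need sudo to modify):

```shell
# Verify the execute bit and repair it only when missing.
bin=$(mktemp)
chmod 644 "$bin"        # simulate a binary installed without the execute bit
if [ ! -x "$bin" ]; then
  chmod +x "$bin"       # the real fix would be: chmod +x /usr/local/bin/inferno
fi
ls -l "$bin"
```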
Problem: Error: Model 'xyz' not found
Solutions:
List available models:

```shell
inferno models list
```

Download the model:

```shell
inferno models download model-name
```

Check the model path in your config:

```shell
inferno config show
```
Problem: Model download interrupted or fails.
Solutions:
Check available disk space:

```shell
df -h
```

If the default location is full, point Inferno at a different disk, then retry the download:

```shell
export INFERNO_HOME=/path/to/models
```
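A pre-flight check before retrying a download can catch the disk-space case early. A sketch; the `$HOME` fallback and the 10 GB threshold are assumptions, adjust them to your setup:

```shell
# Is there enough free space where models are stored?
model_dir="${INFERNO_HOME:-$HOME}"
avail_kb=$(df -Pk "$model_dir" | awk 'NR==2 {print $4}')
need_kb=$((10 * 1024 * 1024))   # ~10 GB, pick a figure near your model size
if [ "$avail_kb" -lt "$need_kb" ]; then
  echo "only $((avail_kb / 1024)) MB free in $model_dir; free space or move INFERNO_HOME"
else
  echo "$((avail_kb / 1024)) MB free in $model_dir; enough to retry"
fi
```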
Problem: Inferno not using GPU acceleration.
Solutions:
NVIDIA (CUDA):

```shell
# Check CUDA installation
nvidia-smi
```

Install the CUDA toolkit if it is missing: https://developer.nvidia.com/cuda-downloads

AMD (ROCm):

```shell
# Check ROCm installation
rocm-smi
```

Install ROCm if it is missing: https://rocm.docs.amd.com/

Apple Silicon: Force CPU mode if needed:

```shell
inferno run --device cpu --model llama2 --prompt "test"
```
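The per-vendor checks above can be rolled into one probe. A sketch, purely illustrative (it only inspects what the shell can see; it is not how Inferno itself detects hardware):

```shell
# Suggest a --device value based on which GPU tooling is visible.
if command -v nvidia-smi >/dev/null 2>&1; then
  backend="cuda"
elif command -v rocm-smi >/dev/null 2>&1; then
  backend="rocm"
elif [ "$(uname -s)" = "Darwin" ] && [ "$(uname -m)" = "arm64" ]; then
  backend="metal"
else
  backend="cpu"           # nothing found; fall back to --device cpu
fi
echo "suggested device: $backend"
```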
Problem: Error: Out of GPU memory
Solutions:
Reduce max_tokens to shrink the generation footprint:

```shell
inferno run --model llama2 --prompt "test" --max-tokens 100
```

Or fall back to CPU inference:

```shell
inferno run --device cpu --model llama2 --prompt "test"
```
Problem: Error: Address already in use
Solutions:
Use a different port:

```shell
inferno serve --port 3000
```

Or find and kill the process using the port:

```shell
# Linux/macOS
lsof -i :8080
kill -9 <PID>

# Windows
netstat -ano | findstr :8080
taskkill /PID <PID> /F
```
Problem: Cannot connect to API server.
Solutions:
Verify the server is running and responding:

```shell
curl http://localhost:8080/health
```

To accept connections from other machines, bind to 0.0.0.0:

```shell
inferno serve --host 0.0.0.0 --port 8080
```
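Connection errors right after startup are often just a race: the client asks before the server is listening. A sketch that polls the health endpoint instead; the URL and 30-second budget are illustrative:

```shell
# Poll a health URL until it answers, up to a fixed number of one-second tries.
wait_for_health() {
  url=$1; tries=$2; i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "up"; return 0
    fi
    i=$((i + 1)); sleep 1
  done
  echo "down"; return 1
}
# Usage: wait_for_health "http://localhost:8080/health" 30
```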
Problem: Inference taking too long.
Solutions:
Check resource usage to find the bottleneck:

```shell
# CPU and memory usage
top     # or htop for a friendlier view

# GPU usage (NVIDIA)
nvidia-smi
```
Problem: Inferno using too much RAM.
Solutions:
Reduce the number of worker processes:

```shell
inferno serve --workers 2
```
Problem: Docker container exits immediately.
Solutions:
Check the logs:

```shell
docker logs <container-id>
```

Run it interactively:

```shell
docker run -it ghcr.io/ringo380/inferno:latest /bin/bash
```

Verify volume mounts:

```shell
docker run -v ~/models:/models ghcr.io/ringo380/inferno:latest
```
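If the container needs the same mounts and ports every run, a Compose file keeps them declarative and easy to inspect. A minimal sketch, assuming the API listens on 8080 as in the serve examples:

```yaml
# docker-compose.yml — illustrative sketch, not an official config
services:
  inferno:
    image: ghcr.io/ringo380/inferno:latest
    ports:
      - "8080:8080"       # assumed API port, matching the serve examples
    volumes:
      - ~/models:/models  # same mount as the docker run example above
```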
Problem: GPU not detected in Docker container.
Solutions:
Run the container with the --gpus all flag (this requires the NVIDIA Container Toolkit on the host):

```shell
docker run --gpus all ghcr.io/ringo380/inferno:latest
```
Debugging:

Get more detailed output:

```shell
inferno --verbose run --model llama2 --prompt "test"
```

Or set the log level:

```shell
export INFERNO_LOG_LEVEL=debug
inferno serve
```
View the current configuration:

```shell
inferno config show
```

Run basic diagnostics:

```shell
inferno --version
inferno models list
inferno config show
```
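The diagnostics above can be bundled into a single file to attach to a bug report. A sketch; the report layout is illustrative, and `|| true` keeps the script going even when inferno itself is the thing that is broken:

```shell
# Collect system info and inferno diagnostics into one report file.
report=$(mktemp)
{
  echo "== system ==";  uname -a
  echo "== version =="; inferno --version 2>&1 || true
  echo "== models ==";  inferno models list 2>&1 || true
  echo "== config ==";  inferno config show 2>&1 || true
} > "$report"
echo "diagnostics written to $report"
```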
If you're still experiencing issues, include the output of `inferno --version` and rerun the failing command with `--verbose` when reporting the problem.

Common errors at a glance:

| Error | Cause | Solution |
|---|---|---|
| Model not found | Model not downloaded | Run `inferno models download MODEL` |
| CUDA error | GPU driver issue | Update NVIDIA drivers |
| Connection refused | Server not running | Start the server with `inferno serve` |
| Permission denied | File permissions | Run with sudo or fix permissions |
| Out of memory | Model too large | Use a smaller model or add RAM |