Common issues and their solutions.
Problem: inferno: command not found after installation.
Solutions:
macOS/Linux: Add Inferno's install directory to your PATH:

```shell
export PATH="/usr/local/bin:$PATH"
```

Add the line to `~/.bashrc`, `~/.zshrc`, or your shell's equivalent so it persists across sessions.
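The export line can be made idempotent before pasting it into your rc file, so repeated shell startups do not keep growing PATH. A small sketch:

```shell
# Prepend /usr/local/bin to PATH only when it is not already present.
case ":$PATH:" in
  *:/usr/local/bin:*) ;;                        # already on PATH, do nothing
  *) export PATH="/usr/local/bin:$PATH" ;;
esac
echo "$PATH"
```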
Windows: Restart your terminal or add installation directory to PATH in System Environment Variables.
Problem: Permission errors when running Inferno.
Solutions:
```shell
# Make the binary executable (Linux/macOS)
chmod +x /usr/local/bin/inferno

# Run with sudo if needed for installation
sudo mv inferno /usr/local/bin/
```
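The check-then-fix pattern can be scripted. A sketch, demonstrated on a scratch file as a stand-in for `/usr/local/bin/inferno` (which may need sudo to modify):

```shell
# Verify the execute bit and repair it only when missing.
bin=$(mktemp)
chmod 644 "$bin"        # simulate a binary installed without the execute bit
if [ ! -x "$bin" ]; then
  chmod +x "$bin"       # the real fix would be: chmod +x /usr/local/bin/inferno
fi
ls -l "$bin"
```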
Problem: Error: Model 'xyz' not found
Solutions:
List available models:

```shell
inferno models list
```

Download the model:

```shell
inferno models download model-name
```

Check the model path in your config:

```shell
inferno config show
```
Problem: Model download interrupted or fails.
Solutions:
Check available disk space:

```shell
df -h
```

If the default location is full, point Inferno at a different disk, then retry the download:

```shell
export INFERNO_HOME=/path/to/models
```
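A pre-flight check before retrying a download can catch the disk-space case early. A sketch; the `$HOME` fallback and the 10 GB threshold are assumptions, adjust them to your setup:

```shell
# Is there enough free space where models are stored?
model_dir="${INFERNO_HOME:-$HOME}"
avail_kb=$(df -Pk "$model_dir" | awk 'NR==2 {print $4}')
need_kb=$((10 * 1024 * 1024))   # ~10 GB, pick a figure near your model size
if [ "$avail_kb" -lt "$need_kb" ]; then
  echo "only $((avail_kb / 1024)) MB free in $model_dir; free space or move INFERNO_HOME"
else
  echo "$((avail_kb / 1024)) MB free in $model_dir; enough to retry"
fi
```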
Problem: Inferno not using GPU acceleration.
Solutions:
NVIDIA (CUDA):

```shell
# Check CUDA installation
nvidia-smi
```

Install the CUDA toolkit if it is missing: https://developer.nvidia.com/cuda-downloads

AMD (ROCm):

```shell
# Check ROCm installation
rocm-smi
```

Install ROCm if it is missing: https://rocm.docs.amd.com/

Apple Silicon: Force CPU mode if needed:

```shell
inferno run --device cpu --model llama2 --prompt "test"
```
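The per-vendor checks above can be rolled into one probe. A sketch, purely illustrative (it only inspects what the shell can see; it is not how Inferno itself detects hardware):

```shell
# Suggest a --device value based on which GPU tooling is visible.
if command -v nvidia-smi >/dev/null 2>&1; then
  backend="cuda"
elif command -v rocm-smi >/dev/null 2>&1; then
  backend="rocm"
elif [ "$(uname -s)" = "Darwin" ] && [ "$(uname -m)" = "arm64" ]; then
  backend="metal"
else
  backend="cpu"           # nothing found; fall back to --device cpu
fi
echo "suggested device: $backend"
```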
Problem: Error: Out of GPU memory
Solutions:
Reduce max_tokens to shrink the generation footprint:

```shell
inferno run --model llama2 --prompt "test" --max-tokens 100
```

Or fall back to CPU inference:

```shell
inferno run --device cpu --model llama2 --prompt "test"
```
Problem: Error: Address already in use
Solutions:
Use a different port:

```shell
inferno serve --port 3000
```

Or find and kill the process using the port:

```shell
# Linux/macOS
lsof -i :8080
kill -9 <PID>

# Windows
netstat -ano | findstr :8080
taskkill /PID <PID> /F
```
Problem: Cannot connect to API server.
Solutions:
Verify the server is running and responding:

```shell
curl http://localhost:8080/health
```

To accept connections from other machines, bind to 0.0.0.0:

```shell
inferno serve --host 0.0.0.0 --port 8080
```
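Connection errors right after startup are often just a race: the client asks before the server is listening. A sketch that polls the health endpoint instead; the URL and 30-second budget are illustrative:

```shell
# Poll a health URL until it answers, up to a fixed number of one-second tries.
wait_for_health() {
  url=$1; tries=$2; i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "up"; return 0
    fi
    i=$((i + 1)); sleep 1
  done
  echo "down"; return 1
}
# Usage: wait_for_health "http://localhost:8080/health" 30
```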
Problem: Inference taking too long.
Solutions:
Check resource usage to find the bottleneck:

```shell
# CPU and memory usage
top     # or htop for a friendlier view

# GPU usage (NVIDIA)
nvidia-smi
```
Problem: Inferno using too much RAM.
Solutions:
Reduce the number of worker processes:

```shell
inferno serve --workers 2
```
Problem: Docker container exits immediately.
Solutions:
Check the logs:

```shell
docker logs <container-id>
```

Run it interactively:

```shell
docker run -it ghcr.io/ringo380/inferno:latest /bin/bash
```

Verify volume mounts:

```shell
docker run -v ~/models:/models ghcr.io/ringo380/inferno:latest
```
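If the container needs the same mounts and ports every run, a Compose file keeps them declarative and easy to inspect. A minimal sketch, assuming the API listens on 8080 as in the serve examples:

```yaml
# docker-compose.yml — illustrative sketch, not an official config
services:
  inferno:
    image: ghcr.io/ringo380/inferno:latest
    ports:
      - "8080:8080"       # assumed API port, matching the serve examples
    volumes:
      - ~/models:/models  # same mount as the docker run example above
```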
Problem: GPU not detected in Docker container.
Solutions:
Run the container with the --gpus all flag (this requires the NVIDIA Container Toolkit on the host):

```shell
docker run --gpus all ghcr.io/ringo380/inferno:latest
```
Debugging:

Get more detailed output:

```shell
inferno --verbose run --model llama2 --prompt "test"
```

Or set the log level:

```shell
export INFERNO_LOG_LEVEL=debug
inferno serve
```
View the current configuration:

```shell
inferno config show
```

Run basic diagnostics:

```shell
inferno --version
inferno models list
inferno config show
```
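The diagnostics above can be bundled into a single file to attach to a bug report. A sketch; the report layout is illustrative, and `|| true` keeps the script going even when inferno itself is the thing that is broken:

```shell
# Collect system info and inferno diagnostics into one report file.
report=$(mktemp)
{
  echo "== system ==";  uname -a
  echo "== version =="; inferno --version 2>&1 || true
  echo "== models ==";  inferno models list 2>&1 || true
  echo "== config ==";  inferno config show 2>&1 || true
} > "$report"
echo "diagnostics written to $report"
```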
If you're still experiencing issues, include the output of `inferno --version` and rerun the failing command with `--verbose` when reporting the problem.

Common errors at a glance:

| Error | Cause | Solution |
|---|---|---|
| Model not found | Model not downloaded | Run `inferno models download MODEL` |
| CUDA error | GPU driver issue | Update NVIDIA drivers |
| Connection refused | Server not running | Start the server with `inferno serve` |
| Permission denied | File permissions | Run with sudo or fix permissions |
| Out of memory | Model too large | Use a smaller model or add RAM |