The code examples below show how to use the Inferno API.
Simple Inference Request
This example shows how to make a basic inference request to the Inferno API.
Endpoint: POST /inference
// Configure the API
const API_KEY = 'your_api_key';
const BASE_URL = 'http://localhost:8080';

// Make an inference request
async function runInference() {
  const response = await fetch(`${BASE_URL}/inference`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'llama-2-7b',
      prompt: 'What is machine learning?',
      max_tokens: 100,
      temperature: 0.7
    })
  });

  const result = await response.json();
  console.log(result.choices[0].text);
}

runInference();
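The same request can also be made from Python. The sketch below uses the third-party requests library (not part of the original example) and assumes the /inference endpoint, request body, and choices[0].text response shape shown above.

# Same request from Python. Assumes the /inference endpoint and
# response shape shown in the JavaScript example above.
import requests

API_KEY = "your_api_key"
BASE_URL = "http://localhost:8080"

response = requests.post(
    f"{BASE_URL}/inference",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "llama-2-7b",
        "prompt": "What is machine learning?",
        "max_tokens": 100,
        "temperature": 0.7,
    },
)
response.raise_for_status()
result = response.json()
print(result["choices"][0]["text"])

OpenAI-Compatible Clients
Inferno also exposes an OpenAI-compatible API under /v1, so the official OpenAI client libraries can be pointed at a local instance.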
from openai import OpenAI

# Point the OpenAI client to your Inferno instance
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your_api_key"  # or "not-needed" if auth disabled
)

# Use it exactly like the OpenAI API
response = client.chat.completions.create(
    model="llama-2-7b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    temperature=0.7,
    max_tokens=100
)

print(response.choices[0].message.content)
import OpenAI from 'openai';

// Point the OpenAI client to your Inferno instance
const client = new OpenAI({
  baseURL: 'http://localhost:8080/v1',
  apiKey: 'your_api_key'  // or 'not-needed' if auth disabled
});

// Use it exactly like the OpenAI API
const response = await client.chat.completions.create({
  model: 'llama-2-7b',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' }
  ],
  temperature: 0.7,
  max_tokens: 100
});

console.log(response.choices[0].message.content);
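Streaming responses work through the same client interface. The following is a minimal sketch, assuming Inferno's OpenAI-compatible endpoint honors the standard stream parameter (not shown in the examples above).

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="your_api_key")

# Request a streamed response; tokens are printed as they arrive.
# Assumes the server supports the OpenAI-style stream=True option.
stream = client.chat.completions.create(
    model="llama-2-7b",
    messages=[{"role": "user", "content": "Explain machine learning in one paragraph."}],
    max_tokens=100,
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()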