Deploy an AI Agent or LLM App
Varity automatically detects AI agent apps and configures the right hosting for them. You write the code, run one command, and your agent is live.
Prerequisites
- Python 3.8+ or Node.js 20+
- varitykit CLI installed:

      pipx install varitykit

  (Recommended over pip install for isolated installs. Both work.)
- A Varity account with a deploy key. Run varitykit login if you have not logged in yet.
Option 1: Python AI Agent (FastAPI)
This example builds a simple agent endpoint that accepts a prompt and returns a response using the OpenAI API.
- Create your agent

  main.py

      from fastapi import FastAPI
      from pydantic import BaseModel
      import openai
      import os

      app = FastAPI()
      # Raises KeyError at startup if OPENAI_API_KEY is not set.
      client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])

      class PromptRequest(BaseModel):
          prompt: str

      # Plain def: FastAPI runs it in a thread pool, so the blocking
      # OpenAI call does not stall the event loop.
      @app.post("/run")
      def run_agent(request: PromptRequest):
          response = client.chat.completions.create(
              model="gpt-4o-mini",
              messages=[{"role": "user", "content": request.prompt}]
          )
          return {"result": response.choices[0].message.content}

      @app.get("/health")
      async def health():
          return {"status": "ok"}

- Add your dependencies

  requirements.txt

      fastapi
      uvicorn
      openai

- Add your API key

  Create a .env file in your project root:

  .env

      OPENAI_API_KEY=sk-...

  Varity reads this file at deploy time and injects the variable into your app's runtime environment.

- Deploy

      varitykit app deploy

  Varity detects FastAPI, configures dynamic compute hosting, and deploys your agent.

- Your agent is live

      Detected: FastAPI (Python)
      Hosting: dynamic compute
      Deploying...
      Your agent is live at: https://varity.app/your-agent/

  Send a request to test it:

      curl -X POST https://varity.app/your-agent/run \
        -H "Content-Type: application/json" \
        -d '{"prompt": "Summarize the Varity docs in one sentence."}'
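The same test request can be sent from Python. The sketch below uses only the standard library and assumes the /run endpoint and JSON shape from the example agent above; build_run_request and run_agent are hypothetical helper names, not part of varitykit.

```python
import json
import urllib.request


def build_run_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the agent's /run endpoint without sending it."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_agent(base_url: str, prompt: str, timeout: float = 30.0) -> str:
    """Send the prompt and return the "result" field of the JSON response."""
    request = build_run_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["result"]
```

Calling run_agent("https://varity.app/your-agent/", "Summarize the Varity docs in one sentence.") should behave like the curl command, with your own deployment URL substituted.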
Option 2: Node.js AI Agent (Express)
This example builds an Express API that wraps the OpenAI API.
- Create your agent

  index.js

      const express = require('express');
      const OpenAI = require('openai');

      const app = express();
      app.use(express.json());

      const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

      app.post('/run', async (req, res) => {
        const { prompt } = req.body;
        const response = await client.chat.completions.create({
          model: 'gpt-4o-mini',
          messages: [{ role: 'user', content: prompt }]
        });
        res.json({ result: response.choices[0].message.content });
      });

      app.get('/health', (_req, res) => res.json({ status: 'ok' }));

      const port = process.env.PORT || 3000;
      app.listen(port, '0.0.0.0', () => {
        console.log(`Agent running on port ${port}`);
      });

- Add your dependencies

  package.json

      {
        "name": "my-agent",
        "version": "1.0.0",
        "scripts": {
          "start": "node index.js"
        },
        "dependencies": {
          "express": "^4.20.0",
          "openai": "^4.0.0"
        }
      }

- Add your API key

  .env

      OPENAI_API_KEY=sk-...

- Deploy

      varitykit app deploy

  Varity detects Express, configures dynamic compute hosting, and deploys your agent.

- Your agent is live

      Detected: Express (Node.js)
      Hosting: dynamic compute
      Deploying...
      Your agent is live at: https://varity.app/your-agent/
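Both options depend on OPENAI_API_KEY being present in the runtime environment, so a missing or empty value only shows up once the app starts. For the Python agent, a fail-fast check might look like the sketch below. require_env is a hypothetical helper introduced here for illustration; it is not part of varitykit or the OpenAI SDK.

```python
import os


def require_env(name: str, env=None) -> str:
    """Return a non-empty environment variable or raise a descriptive error."""
    # `env` defaults to the real process environment; passing a dict makes
    # the function easy to exercise in tests.
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

In main.py this would replace the direct lookup, for example client = openai.OpenAI(api_key=require_env("OPENAI_API_KEY")), so that a misconfigured deploy fails with a message naming the variable.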
Using a Local Language Model
If your agent runs a language model locally instead of calling an external API, add the ollama package to your dependencies. Varity will detect it and automatically provision a model server alongside your app.

Python

  requirements.txt

      fastapi
      uvicorn
      ollama

  main.py

      import os
      import ollama
      from fastapi import FastAPI
      from pydantic import BaseModel

      app = FastAPI()
      ollama_url = os.environ.get("OLLAMA_URL", "http://localhost:11434")
      # ollama.chat() takes no host argument; the server address is set on a Client.
      client = ollama.Client(host=ollama_url)

      class PromptRequest(BaseModel):
          prompt: str

      # Plain def: FastAPI runs it in a thread pool, so the blocking
      # model call does not stall the event loop.
      @app.post("/run")
      def run_agent(request: PromptRequest):
          response = client.chat(
              model="llama3",
              messages=[{"role": "user", "content": request.prompt}],
          )
          return {"result": response["message"]["content"]}

Node.js

  package.json

      {
        "dependencies": {
          "express": "^4.20.0",
          "@langchain/ollama": "^0.1.0"
        }
      }

  index.js

      const express = require('express');
      const { Ollama } = require('@langchain/ollama');

      const app = express();
      app.use(express.json());

      const llm = new Ollama({
        baseUrl: process.env.OLLAMA_URL || 'http://localhost:11434',
        model: 'llama3'
      });

      app.post('/run', async (req, res) => {
        const result = await llm.invoke(req.body.prompt);
        res.json({ result });
      });

When you deploy, Varity:
- Detects ollama or @langchain/ollama in your dependencies
- Provisions a model server alongside your app
- Injects OLLAMA_URL into your runtime environment automatically
Your app code reads process.env.OLLAMA_URL (Node.js) or os.environ["OLLAMA_URL"] (Python) to connect.
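When the same code runs both locally and on Varity, it can help to resolve the model server address in one place. The sketch below assumes the local default of http://localhost:11434 used in the examples above; resolve_ollama_url is a hypothetical helper, and the URL validation is an added convenience rather than anything Varity requires.

```python
import os
from urllib.parse import urlparse

# Local fallback used by the examples on this page.
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def resolve_ollama_url(env=None) -> str:
    """Return OLLAMA_URL if set, otherwise the local default."""
    env = os.environ if env is None else env
    url = env.get("OLLAMA_URL", "").strip() or DEFAULT_OLLAMA_URL
    parsed = urlparse(url)
    # Reject values such as "model:11434" that lack an http(s) scheme.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"OLLAMA_URL is not a valid http(s) URL: {url!r}")
    return url.rstrip("/")
```

The returned value can be passed to ollama.Client(host=...) in the Python agent.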
Check Deployment Status
After deploying, check on your agent:

    # See if it is running
    varitykit app status

Or ask your AI coding tool (if you have the Varity MCP installed):
“Check the status of my Varity deployment”
“Show me the logs for my last deployment”
Troubleshooting
“ProjectDetectionError: Framework not supported”
Check that your project has a requirements.txt (Python) or package.json (Node.js) with a supported framework listed. See Supported Frameworks.
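For a Python project, this check can be done locally before deploying. The sketch below is a simplified requirements.txt reader that only extracts package names; it is not Varity's detection logic, and the default framework tuple contains only FastAPI because that is the Python framework shown on this page. Both function names are hypothetical.

```python
import re


def listed_packages(requirements_text: str) -> set:
    """Extract lowercase package names from requirements.txt content."""
    names = set()
    for line in requirements_text.splitlines():
        # Drop inline comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blanks and pip options such as "-r other.txt" or "-e .".
        if not line or line.startswith("-"):
            continue
        # The name ends at the first extras bracket or version specifier.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


def has_framework(requirements_text: str, frameworks=("fastapi",)) -> bool:
    """Return True if any of the given frameworks is listed."""
    return bool(listed_packages(requirements_text) & set(frameworks))
```

If has_framework returns False for your requirements.txt, add the framework to the file and compare against the Supported Frameworks list.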
Agent returns 502 after deploy
Your app may be taking a few extra seconds to start. Wait 30 seconds and try again. If it persists, check your logs by asking your AI coding tool:
“Show me the logs for my last deployment”
A common cause is the app binding to localhost instead of 0.0.0.0. Make sure your server listens on 0.0.0.0:
Python

    # uvicorn listens on 0.0.0.0 by default when run via Procfile or start command

Node.js

    app.listen(port, '0.0.0.0', () => { ... });

Environment variables not reaching the app
Make sure your .env file is in the project root (same directory you run varitykit app deploy from). Variables prefixed with VERCEL_, RAILWAY_, or NETLIFY_ are filtered out automatically.
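To spot variables that will be filtered before you deploy, you can scan the .env file for those prefixes. The sketch below uses a deliberately simple KEY=VALUE parser, which is an assumption: Varity's actual .env parsing rules are not described here. The function names are hypothetical.

```python
# Prefixes this page says are filtered out at deploy time.
FILTERED_PREFIXES = ("VERCEL_", "RAILWAY_", "NETLIFY_")


def parse_env_names(text: str) -> list:
    """Return variable names from simple KEY=VALUE lines, in file order."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Tolerate the optional shell-style "export " prefix.
        if line.startswith("export "):
            line = line[len("export "):]
        names.append(line.split("=", 1)[0].strip())
    return names


def filtered_names(text: str) -> list:
    """Return the names that start with one of the filtered prefixes."""
    return [n for n in parse_env_names(text) if n.startswith(FILTERED_PREFIXES)]
```

Running filtered_names(open(".env").read()) from the project root lists any variables that would not reach the app; renaming them avoids the filter.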