
Deploy an AI Agent or LLM App

Varity Team Core Contributors Updated May 2026

Varity automatically detects AI agent apps and configures the right hosting for them. You write the code, run one command, and your agent is live.

  • Python 3.8+ or Node.js 20+

  • varitykit CLI installed:

    Terminal window
    pipx install varitykit

    (Recommended over pip install for isolated installs. Both work.)

  • A Varity account with a deploy key. Run varitykit login if you have not logged in yet.

This Python example builds a simple FastAPI agent endpoint that accepts a prompt and returns a response using the OpenAI API.

  1. Create your agent

    main.py
    from fastapi import FastAPI
    from pydantic import BaseModel
    import openai
    import os

    app = FastAPI()
    # Raises KeyError at startup if OPENAI_API_KEY is not set.
    client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])


    class PromptRequest(BaseModel):
        prompt: str


    @app.post("/run")
    def run_agent(request: PromptRequest):
        # Plain def: FastAPI runs it in a worker thread, so the blocking
        # OpenAI call does not stall the event loop.
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": request.prompt}],
        )
        return {"result": response.choices[0].message.content}


    @app.get("/health")
    async def health():
        return {"status": "ok"}
  2. Add your dependencies

    requirements.txt
    fastapi
    uvicorn
    openai
  3. Add your API key

    Create a .env file in your project root:

    .env
    OPENAI_API_KEY=sk-...

    Varity reads this file at deploy time and injects the variable into your app’s runtime environment.

  4. Deploy

    Terminal window
    varitykit app deploy

    Varity detects FastAPI, configures dynamic compute hosting, and deploys your agent.

  5. Your agent is live

    Detected: FastAPI (Python)
    Hosting: dynamic compute
    Deploying...
    Your agent is live at: https://varity.app/your-agent/

    Send a request to test it:

    Terminal window
    curl -X POST https://varity.app/your-agent/run \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Summarize the Varity docs in one sentence."}'
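The same request can also be sent from Python. The sketch below uses only the standard library. The URL is the placeholder from the sample deploy output, and `build_run_request` and `call_agent` are illustrative helper names, not part of any Varity SDK.

```python
import json
import urllib.request

# Placeholder URL copied from the sample deploy output; replace with your own.
AGENT_URL = "https://varity.app/your-agent"


def build_run_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build the POST /run request that the curl command sends."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_agent(base_url: str, prompt: str, timeout: float = 30.0) -> str:
    """Send the prompt and return the "result" field of the JSON response."""
    request = build_run_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["result"]
```

Calling `call_agent(AGENT_URL, "Summarize the Varity docs in one sentence.")` should return the text that the `/run` endpoint places in its `result` field, assuming the agent is deployed as in the steps above.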

This Node.js example builds an Express API that wraps the OpenAI API.

  1. Create your agent

    index.js
    const express = require('express');
    const OpenAI = require('openai');

    const app = express();
    app.use(express.json());

    const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

    app.post('/run', async (req, res) => {
      try {
        const { prompt } = req.body;
        const response = await client.chat.completions.create({
          model: 'gpt-4o-mini',
          messages: [{ role: 'user', content: prompt }]
        });
        res.json({ result: response.choices[0].message.content });
      } catch (err) {
        // Express 4 does not catch rejected promises in async handlers.
        console.error(err);
        res.status(500).json({ error: 'Agent request failed' });
      }
    });

    app.get('/health', (_req, res) => res.json({ status: 'ok' }));

    // Bind to 0.0.0.0 so the platform can reach the server.
    const port = process.env.PORT || 3000;
    app.listen(port, '0.0.0.0', () => {
      console.log(`Agent running on port ${port}`);
    });
  2. Add your dependencies

    package.json
    {
      "name": "my-agent",
      "version": "1.0.0",
      "scripts": {
        "start": "node index.js"
      },
      "dependencies": {
        "express": "^4.20.0",
        "openai": "^4.0.0"
      }
    }
  3. Add your API key

    .env
    OPENAI_API_KEY=sk-...
  4. Deploy

    Terminal window
    varitykit app deploy

    Varity detects Express, configures dynamic compute hosting, and deploys your agent.

  5. Your agent is live

    Detected: Express (Node.js)
    Hosting: dynamic compute
    Deploying...
    Your agent is live at: https://varity.app/your-agent/

If your agent runs a language model locally instead of calling an external API, add the ollama package to your dependencies. Varity will detect it and automatically provision a model server alongside your app.

requirements.txt
fastapi
uvicorn
ollama
main.py
import os
import ollama
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
# Varity injects OLLAMA_URL at deploy time; fall back to the local default.
ollama_url = os.environ.get("OLLAMA_URL", "http://localhost:11434")
# The host is set on the client; ollama.chat() does not take a host argument.
client = ollama.Client(host=ollama_url)


class PromptRequest(BaseModel):
    prompt: str


@app.post("/run")
def run_agent(request: PromptRequest):
    response = client.chat(
        model="llama3",
        messages=[{"role": "user", "content": request.prompt}],
    )
    return {"result": response["message"]["content"]}

When you deploy, Varity:

  1. Detects ollama or @langchain/ollama in your dependencies
  2. Provisions a model server alongside your app
  3. Injects OLLAMA_URL into your runtime environment automatically

Your app code reads process.env.OLLAMA_URL (Node.js) or os.environ["OLLAMA_URL"] (Python) to connect.
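Reading the variable can be wrapped in a small helper that also catches a malformed value at startup. This is a minimal sketch: `resolve_ollama_url` is a hypothetical name, and the fallback is Ollama's usual local address, which is only useful when running on your own machine.

```python
from typing import Mapping
from urllib.parse import urlparse

DEFAULT_OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def resolve_ollama_url(env: Mapping[str, str]) -> str:
    """Return the model server URL, preferring the injected OLLAMA_URL."""
    url = env.get("OLLAMA_URL", "").strip() or DEFAULT_OLLAMA_URL
    parsed = urlparse(url)
    # Reject values such as "model:11434" that lack an http(s) scheme.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"OLLAMA_URL is not a valid http(s) URL: {url!r}")
    return url.rstrip("/")
```

In an app, pass `os.environ` as the argument and hand the result to the Ollama client as its host.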

After deploying, check on your agent:

Terminal window
# See if it is running
varitykit app status

Or ask your AI coding tool (if you have the Varity MCP installed):

“Check the status of my Varity deployment”

“Show me the logs for my last deployment”

“ProjectDetectionError: Framework not supported”

Check that your project has a requirements.txt (Python) or package.json (Node.js) with a supported framework listed. See Supported Frameworks.
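A quick local check before deploying can confirm that a framework appears in your dependency file. The sketch below is not Varity's detection logic. It only looks for the two frameworks shown on this page (FastAPI and Express), and the helper names are illustrative. Consult the Supported Frameworks list for the full set.

```python
import json
import re
from typing import List

# Assumption: only the two frameworks used on this page are checked here.
KNOWN_PYTHON = {"fastapi"}
KNOWN_NODE = {"express"}


def python_frameworks(requirements_text: str) -> List[str]:
    """Return known frameworks named in the contents of requirements.txt."""
    found = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        # The package name ends at the first version specifier or extras marker.
        name = re.split(r"[\s<>=!~\[;]", line, maxsplit=1)[0].lower()
        if name in KNOWN_PYTHON:
            found.append(name)
    return found


def node_frameworks(package_json_text: str) -> List[str]:
    """Return known frameworks listed under "dependencies" in package.json."""
    deps = json.loads(package_json_text).get("dependencies", {})
    return sorted(name for name in deps if name in KNOWN_NODE)
```

An empty result from both helpers suggests the dependency file is missing the framework entry.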

Agent returns 502 after deploy

Your app may be taking a few extra seconds to start. Wait 30 seconds and try again. If it persists, check your logs by asking your AI coding tool:

“Show me the logs for my last deployment”
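Instead of retrying by hand, you can poll the `/health` endpoint from the examples above until it responds. This is a minimal sketch using the standard library. The defaults (6 attempts, 5 seconds apart) are an assumption chosen to cover roughly the 30-second wait, and the function names are illustrative.

```python
import time
import urllib.request
from typing import Callable


def check_health(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors such as 502, connection failures, and timeouts.
        return False


def wait_until_healthy(
    url: str,
    attempts: int = 6,
    delay_seconds: float = 5.0,
    probe: Callable[[str], bool] = check_health,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe the URL up to `attempts` times, pausing between failed tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

For example, `wait_until_healthy("https://varity.app/your-agent/health")` uses the placeholder URL from the sample deploy output. A `False` result after the final attempt is a signal to look at the logs.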

A common cause is the app binding to localhost instead of 0.0.0.0. Make sure your server listens on 0.0.0.0:

Python
# uvicorn's own default host is 127.0.0.1, so pass --host 0.0.0.0 in your Procfile or start command
Node.js
app.listen(port, '0.0.0.0', () => { ... });
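If you launch uvicorn from Python yourself, the bind address can be set explicitly. This is a minimal sketch that assumes the port arrives in a `PORT` environment variable (as the Node.js example reads it) and that the app is `app` in `main.py`. The fallback port 8000 and the `bind_settings` helper are illustrative.

```python
import os
from typing import Mapping, Tuple


def bind_settings(env: Mapping[str, str]) -> Tuple[str, int]:
    """Return (host, port): all interfaces, and PORT if set, else 8000."""
    return "0.0.0.0", int(env.get("PORT", "8000"))


def main() -> None:
    # Imported here so the helper above stays usable without uvicorn installed.
    import uvicorn

    host, port = bind_settings(os.environ)
    # "main:app" refers to the FastAPI instance defined in main.py.
    uvicorn.run("main:app", host=host, port=port)
```

Call `main()` under an `if __name__ == "__main__":` guard in a separate launcher file and point your start command at that file.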

Environment variables not reaching the app

Make sure your .env file is in the project root (same directory you run varitykit app deploy from). Variables prefixed with VERCEL_, RAILWAY_, or NETLIFY_ are filtered out automatically.
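To see which variables in a `.env` file fall under the filtered prefixes, a small local check like the following can help. It is a sketch that handles simple `KEY=value` lines only, it is not how Varity parses the file, and the function names are illustrative.

```python
from typing import Iterable, List

# Prefixes that the text above says are filtered out automatically.
FILTERED_PREFIXES = ("VERCEL_", "RAILWAY_", "NETLIFY_")


def env_names(lines: Iterable[str]) -> List[str]:
    """Extract variable names from .env-style lines (KEY=value)."""
    names = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        names.append(name)
    return names


def filtered_names(lines: Iterable[str]) -> List[str]:
    """Return the names that start with one of the filtered prefixes."""
    return [n for n in env_names(lines) if n.startswith(FILTERED_PREFIXES)]
```

Open the `.env` file and pass the file object to `filtered_names`. Any name it returns would need a different prefix to reach the app.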