Handle long-running requests - GMI Cloud Documentation

AI agent tasks, including multi-step reasoning, document analysis, and model chains, can take anywhere from 30 seconds to several minutes. HTTP gateways on every cloud platform, including GMI, close connections that stay open too long and return a 504 Gateway Timeout to the caller. The fix is to decouple accepting the request from returning the result.

A 504 from a slow task and a connection failure are two different problems. If your endpoint is unreachable, first check that ingress is enabled and networking is configured for your deployment. The async pattern below helps only when the request reaches your agent but the work takes too long to finish inside the gateway window.
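To make the distinction between the two failure modes concrete, here is a minimal client-side sketch. The function name `classify_failure` and its return labels are hypothetical and not part of any GMI API. It assumes that a call which raised a network-level exception never reached the agent, while a call that returned HTTP 504 did reach the gateway but ran too long.

```python
from typing import Optional


def classify_failure(
    status_code: Optional[int] = None,
    error: Optional[BaseException] = None,
) -> str:
    """Return a rough hint about which kind of failure a call ran into.

    Hypothetical helper for illustration only.
    """
    if error is not None:
        # No HTTP response came back at all. ConnectionError, TimeoutError and
        # urllib.error.URLError all derive from OSError, as do the exceptions
        # raised by the requests library.
        if isinstance(error, OSError):
            return "unreachable"  # check ingress and networking first
        return "unknown"
    if status_code == 504:
        return "gateway_timeout"  # the async job pattern applies here
    if status_code is not None and 200 <= status_code < 300:
        return "ok"
    return "other"
```

Only the `gateway_timeout` case is addressed by the async pattern. An `unreachable` result points back to the deployment's ingress and networking configuration.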

The async job pattern

Instead of holding the connection open, your agent should:

1. Accept the request and immediately return a job_id.
2. Run the task in the background.
3. Let the caller poll a status endpoint until the result is ready.

POST /run  ->  202 { "job_id": "abc-123" }

GET /jobs/abc-123  ->  200 { "status": "running" }
GET /jobs/abc-123  ->  200 { "status": "completed", "result": { ... } }

Implementation

Python (FastAPI)

import uuid
from fastapi import FastAPI, BackgroundTasks
from fastapi.responses import JSONResponse

app = FastAPI()

# In memory store. Replace with Redis or a database in production.
jobs: dict = {}

async def run_task(job_id: str, payload: dict):
    jobs[job_id] = {"status": "running"}
    try:
        # your_agent is a placeholder for your own agent object.
        result = await your_agent.run(payload)
        jobs[job_id] = {"status": "completed", "result": result}
    except Exception as e:
        jobs[job_id] = {"status": "failed", "error": str(e)}

@app.post("/run", status_code=202)
async def start_job(payload: dict, background_tasks: BackgroundTasks):
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending"}
    background_tasks.add_task(run_task, job_id, payload)
    return {"job_id": job_id}

@app.get("/jobs/{job_id}")
async def get_job(job_id: str):
    job = jobs.get(job_id)
    if not job:
        return JSONResponse(status_code=404, content={"error": "Job not found"})
    return job

Node.js (Express)

import express from "express";
import { randomUUID } from "crypto";

const app = express();
app.use(express.json());

// In memory store. Replace with Redis or a database in production.
const jobs = new Map();

app.post("/run", (req, res) => {
  const jobId = randomUUID();
  jobs.set(jobId, { status: "pending" });
  runTask(jobId, req.body); // fire and forget
  res.status(202).json({ job_id: jobId });
});

app.get("/jobs/:jobId", (req, res) => {
  const job = jobs.get(req.params.jobId);
  if (!job) return res.status(404).json({ error: "Job not found" });
  res.json(job);
});

async function runTask(jobId, payload) {
  jobs.set(jobId, { status: "running" });
  try {
    // yourAgent is a placeholder for your own agent object.
    const result = await yourAgent.run(payload);
    jobs.set(jobId, { status: "completed", result });
  } catch (err) {
    jobs.set(jobId, { status: "failed", error: err.message });
  }
}

Calling the endpoint

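import time
import requests

BASE_URL = "https://your-agent-endpoint.gmicloud.ai"

# Submit the job. The server replies with 202 and a job_id right away.
response = requests.post(f"{BASE_URL}/run", json={"input": "..."}, timeout=30)
response.raise_for_status()
job_id = response.json()["job_id"]

# Poll until the job reaches a terminal state.
while True:
    status_response = requests.get(f"{BASE_URL}/jobs/{job_id}", timeout=30)
    status_response.raise_for_status()  # surfaces a 404 for an unknown job_id
    result = status_response.json()
    if result["status"] == "completed":
        print(result["result"])
        break
    elif result["status"] == "failed":
        print(result["error"])
        break
    time.sleep(3)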
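
A polling loop with no upper bound will wait forever if a job never reaches a terminal state, for example when in-memory state is lost. The sketch below is one possible variation, not part of the GMI documentation: `poll_until_done` and its parameters are hypothetical names, and the backoff values are arbitrary. It takes any callable that returns the job dictionary, so the HTTP call stays outside the helper.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout_s: float = 600.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job is completed or failed, or raise TimeoutError.

    The delay doubles after each attempt, capped at max_delay_s.
    """
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):
            return job
        # Give up if the next wait would run past the overall deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError("job did not finish before the deadline")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)
```

With the client above, it could be called as `poll_until_done(lambda: requests.get(f"{BASE_URL}/jobs/{job_id}", timeout=30).json())`.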

Persisting job state

GMI containers are stateless. If a container restarts, any in-memory job state is lost. For production, write the job state to an external store such as Redis or a database, and inject the connection credentials as Secrets in Step 4 of Register an agent.
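
As a rough illustration of moving job state out of process memory, the sketch below wraps a database behind the same get/set shape as the in-memory store. `SqlJobStore` is a hypothetical name, and SQLite from the Python standard library is used only so the example runs without extra services. A SQLite file inside the container would be lost on restart just like the dictionary, so a real deployment would point the same interface at a networked database or Redis, with credentials supplied through Secrets.

```python
import json
import sqlite3
from typing import Optional


class SqlJobStore:
    """Hypothetical job store; SQLite stands in for an external database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs "
            "(job_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        self.conn.commit()

    def set(self, job_id: str, state: dict) -> None:
        # Store the whole job state as JSON, replacing any earlier value.
        self.conn.execute(
            "INSERT OR REPLACE INTO jobs (job_id, state) VALUES (?, ?)",
            (job_id, json.dumps(state)),
        )
        self.conn.commit()

    def get(self, job_id: str) -> Optional[dict]:
        row = self.conn.execute(
            "SELECT state FROM jobs WHERE job_id = ?", (job_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


# In-memory database for demonstration only.
store = SqlJobStore(sqlite3.connect(":memory:"))
store.set("abc-123", {"status": "pending"})
store.set("abc-123", {"status": "completed", "result": {"answer": 1}})
```

In the FastAPI example, `jobs[job_id] = ...` would become `store.set(job_id, ...)` and `jobs.get(job_id)` would become `store.get(job_id)`.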
