Registering an agent packages your code with infrastructure, environment config, and a live public endpoint, all in one four-step wizard. To get started, click Register & List in the left-side menu. Test the live endpoint before sending it to the Marketplace.

Pick a deployment path

GMI GPU Clusters

GMI hosts your container, scales it, and exposes a public URL. Eligible for the Verified badge when paired with MaaS.

Self-hosted + MaaS

You host the agent yourself and call GMI Models-as-a-Service for inference. Lists with the Powered by GMI MaaS badge.

Wizard steps

  1. Basic Info
  2. Infrastructure
  3. Env Variables
  4. Review & Deploy

Step 1: Basic Info

Identity for the listing. This is what users see on the catalog card and detail page.

[Screenshot: Register an agent, Basic Info]

If you’re going with the self-hosted + MaaS path, the Basic Info step looks slightly different:

[Screenshot: Register an agent, Basic Info (MaaS path)]

Basic Info

  • Add the internal project name.

Step 2: Infrastructure

Configure compute resources. GMI GPU Clusters provisions containers on demand.

Docker image source

  • Registry URL. Pull from Docker Hub, GHCR, or any public/private registry.
  • Upload Image. Push a local image directly to GMI’s registry. Useful for one-off builds.

Registry URL

  • Format: registry.hub.docker.com/your-org/your-agent:latest
  • Private registries: add credentials in Step 3 as secrets.
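
Before pasting a Registry URL into the wizard, it can help to check that it splits cleanly into registry, repository, and tag. The sketch below is a simplified parser written for illustration only: it assumes the registry/repository:tag shape shown above, does not handle digests, and is not how GMI validates the field.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'registry/org/name:tag' into parts (simplified; no digest support)."""
    registry, slash, remainder = ref.partition("/")
    if not slash or not registry or not remainder:
        raise ValueError(f"expected registry/repository[:tag], got {ref!r}")
    # Only a colon after the last slash separates the tag; a colon in the
    # registry part (for example a port number) was already split off above.
    repository, colon, tag = remainder.rpartition(":")
    if not colon:
        repository, tag = remainder, "latest"  # same default Docker applies
    if not repository or not tag or "/" in tag:
        raise ValueError(f"malformed image reference: {ref!r}")
    return ImageRef(registry, repository, tag)


print(parse_image_ref("registry.hub.docker.com/your-org/your-agent:latest"))
```

An untagged reference falls back to latest here, which mirrors Docker's default; pinning an explicit version tag makes re-deploys easier to reason about.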

Compute tier

  • Performance ($1.20/hr). 32 vCPU, 128 GB RAM, 25 Gbps. High-concurrency or heavy orchestration.
  • Standard ($0.60/hr), recommended. 16 vCPU, 64 GB RAM, 10 Gbps. Most agent workloads: API orchestration, RAG, tool use.
  • Economy ($0.15/hr). 4 vCPU, 16 GB RAM, 1 Gbps. Lightweight or dev/test deployments.
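
To compare tiers on cost, the hourly rates above can be turned into a rough monthly figure. This is a back-of-the-envelope sketch: the rates come from the list above, while the 30-day month and the assumption of a single always-on instance are illustrative choices, not GMI billing rules.

```python
# Hourly rates copied from the tier list above (USD per instance-hour).
HOURLY_RATE_USD = {"performance": 1.20, "standard": 0.60, "economy": 0.15}

HOURS_PER_30_DAY_MONTH = 24 * 30  # 720 hours; an assumption for rough budgeting


def monthly_cost(tier: str, instances: int = 1,
                 hours: float = HOURS_PER_30_DAY_MONTH) -> float:
    """Estimated cost in USD for `instances` instances running `hours` each."""
    if instances < 0 or hours < 0:
        raise ValueError("instances and hours must be non-negative")
    return round(HOURLY_RATE_USD[tier] * instances * hours, 2)


for tier in HOURLY_RATE_USD:
    print(f"{tier:<12} ${monthly_cost(tier):,.2f} per always-on instance per 30 days")
```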

Region

  • Choose from US West, US East, Asia (Singapore), or Europe (Germany).
  • Pick the region closest to your users to minimize latency.
  • Multi-region rollouts require a separate deploy per region.

Scaling

  • Min instances. Set to 0 for serverless behavior (the first request incurs a cold start). Any value above 0 keeps that many instances running, and billed, at all times.
  • Max instances. Upper bound for autoscaling under load.
  • Billing is per instance·hour.
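
The min and max settings bound what a deployment can cost over a period. The sketch below computes that range, assuming billing is simply rate × instances × hours and that instances above the minimum are billed only while they run; both are simplifications of the rules above, and the function name is hypothetical.

```python
def monthly_cost_bounds(rate_per_hour: float, min_instances: int,
                        max_instances: int, hours: float = 720.0) -> tuple[float, float]:
    """Return (floor, ceiling) in USD for one billing period.

    floor:   only the always-on minimum instances run for the whole period.
    ceiling: autoscaling stays pinned at max_instances for the whole period.
    """
    if not 0 <= min_instances <= max_instances:
        raise ValueError("need 0 <= min_instances <= max_instances")
    floor = rate_per_hour * min_instances * hours
    ceiling = rate_per_hour * max_instances * hours
    return round(floor, 2), round(ceiling, 2)


# Standard tier ($0.60/hr) with 1 to 4 instances over a 30-day month.
print(monthly_cost_bounds(0.60, min_instances=1, max_instances=4))
# Serverless setup: min 0 means no always-on cost, at the price of cold starts.
print(monthly_cost_bounds(0.60, min_instances=0, max_instances=2))
```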

MaaS integration

  • Toggle on to give your agent access to GMI’s 200+ frontier models.
  • GMI injects a MaaS API key into your container at startup, so there is no key management on your end.
  • Select every model your agent may call. Selection is editable later.
  • Required for the Verified badge.
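
Because the key is injected at startup, agent code only needs to read it from the environment and fail fast if it is missing. This page does not name the variable that carries the key, so GMI_MAAS_API_KEY below is a hypothetical placeholder; check your deployment for the actual name.

```python
import os
from typing import Mapping

# Hypothetical variable name: the real name is not documented on this page.
MAAS_KEY_ENV_VAR = "GMI_MAAS_API_KEY"


def load_maas_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the injected MaaS key, or raise if the container lacks one."""
    key = env.get(MAAS_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{MAAS_KEY_ENV_VAR} is not set; is MaaS integration toggled on?"
        )
    return key


def masked(key: str) -> str:
    """Show only the last four characters, for safe startup logging."""
    return "*" * max(len(key) - 4, 0) + key[-4:]


# Example with an explicit mapping instead of the real environment.
print(masked(load_maas_key({MAAS_KEY_ENV_VAR: "example-key-1234"})))
```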

Step 3: Env Variables

Runtime configuration injected into your container at startup.

Plain values

  • Non-sensitive config: feature flags, base URLs, log levels.
  • Visible in the dashboard and editable anytime.
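
Inside the container, plain values arrive as ordinary environment variables, which are always strings. A minimal sketch of reading them into typed settings follows; the variable names (LOG_LEVEL, UPSTREAM_BASE_URL, ENABLE_TOOLS) and defaults are hypothetical examples, not names GMI defines.

```python
import os
from dataclasses import dataclass
from typing import Mapping

TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    log_level: str
    upstream_base_url: str
    enable_tools: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Parse string environment values into typed settings with defaults."""
    return Settings(
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        upstream_base_url=env.get("UPSTREAM_BASE_URL", "http://localhost:8080").rstrip("/"),
        # Feature flags are strings, so compare against accepted spellings.
        enable_tools=env.get("ENABLE_TOOLS", "false").strip().lower() in TRUE_VALUES,
    )


print(load_settings({"LOG_LEVEL": "debug", "ENABLE_TOOLS": "yes"}))
```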

Secrets

  • API keys, third-party credentials, and any sensitive value.
  • Encrypted at rest and redacted from all logs.
  • Write-only: values can be replaced but never read back from the UI.

Per-region overrides

  • Override values per region for staged rollouts or region-specific endpoints.
  • Only available when the agent is deployed to more than one region.
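
Conceptually, a per-region override replaces the base value for that one region and leaves every other variable untouched. The sketch below models that merge to show what a container in each region would see; the region keys and variable names are hypothetical, and the platform performs the real resolution for you.

```python
from typing import Mapping


def resolve_env(base: Mapping[str, str],
                overrides: Mapping[str, Mapping[str, str]],
                region: str) -> dict[str, str]:
    """Return base values with the given region's overrides applied on top."""
    resolved = dict(base)
    resolved.update(overrides.get(region, {}))
    return resolved


base = {"LOG_LEVEL": "info", "VECTOR_DB_URL": "https://db.us.example.com"}
overrides = {
    # Region keys are illustrative, not GMI's identifiers.
    "europe-germany": {"VECTOR_DB_URL": "https://db.eu.example.com"},
}

print(resolve_env(base, overrides, "europe-germany"))
print(resolve_env(base, overrides, "us-west"))  # no override: base values only
```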

Step 4: Review & Deploy

Confirm settings, deploy, and verify before submitting the listing.

Review screen

  • Every setting from the previous steps is summarized on one page.
  • Click any section to jump back and edit.

Deploy

  • Click Deploy. GPU Clusters pulls the image, builds the container, and runs health checks.
  • Once healthy, GPU Clusters assigns a public URL and starts billing per the scaling rules.

Test the endpoint

[Screenshot: Test the deployed endpoint]

  • Hit the URL with a sample request. Confirm latency, output, and error handling.
  • Iterate by re-deploying. URLs stay stable across deploys.
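
A small script makes the checks above repeatable across re-deploys. The sketch below sends one JSON POST and reports status, latency, and body using only the Python standard library. The request shape (a JSON POST) is an assumption about your agent, and the URL in the usage comment is a placeholder.

```python
import json
import time
import urllib.error
import urllib.request


def probe(url: str, payload: dict, timeout: float = 10.0) -> dict:
    """POST a JSON payload and return status code, latency in ms, and body text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    started = time.perf_counter()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status and body worth inspecting.
        status = err.code
        body = err.read().decode("utf-8", "replace")
    latency_ms = (time.perf_counter() - started) * 1000
    return {"status": status, "latency_ms": latency_ms, "body": body}


# Usage (placeholder URL and payload; substitute your deployment's values):
# result = probe("https://your-agent.example.com/", {"input": "hello"})
# print(result["status"], f"{result['latency_ms']:.0f} ms", result["body"][:200])
```

Connection failures and timeouts are deliberately left to raise, so an unreachable endpoint is not mistaken for a slow one. Sending a deliberately malformed payload as a second probe is a quick way to confirm error handling.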

Next

  • When the endpoint is ready, continue to List an agent to submit it for review.