Run GPU workloads on GMI Cloud. Pick the format that fits your job: a managed Kubernetes cluster for training, a container instance for short jobs and notebooks, or a dedicated bare-metal server for full hardware control.

What you can do here
Launch a managed GPU cluster
Production-ready Kubernetes clusters with H200 or B200 nodes, provisioned and operated by GMI.
Run a container workload
Spin up single containers from a template (vLLM, SGLang, JupyterLab, custom images) on demand.
Request a bare-metal server
Dedicated hosts when you need full OS access, custom drivers, or persistent local NVMe.
Attach networking
Configure firewalls and Elastic IPs for any compute resource you provision.

How requests work
- Browse the cluster catalog in the console. Each card lists the SKU, region, and full hardware spec.
- Click Request Cluster on the card you want. The form is pre-filled with that SKU and region.
- GMI support reviews the request and provisions the resources.
- Once ready, the cluster appears under Managed GPU Clusters and you can start using it.
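
The request flow above runs through the console, and this page describes no API for it. Purely as a mental model, the stages can be sketched as a small state machine. Everything in this sketch is hypothetical: `RequestState`, `ClusterRequest`, the stage names, and the placeholder SKU and region values are illustrative and do not correspond to any GMI Cloud SDK or endpoint.

```python
from dataclasses import dataclass
from enum import Enum


class RequestState(Enum):
    """Hypothetical lifecycle stages, named after the steps listed above."""
    SUBMITTED = "submitted"  # form sent from a catalog card
    IN_REVIEW = "in_review"  # GMI support is reviewing the request
    READY = "ready"          # cluster is listed under Managed GPU Clusters


# Forward-only transitions; an assumption made for illustration.
_NEXT = {
    RequestState.SUBMITTED: RequestState.IN_REVIEW,
    RequestState.IN_REVIEW: RequestState.READY,
}


@dataclass
class ClusterRequest:
    sku: str     # pre-filled from the catalog card
    region: str  # pre-filled from the catalog card
    state: RequestState = RequestState.SUBMITTED

    def advance(self) -> RequestState:
        """Move to the next stage; raise if the request is already READY."""
        if self.state not in _NEXT:
            raise ValueError(f"request is already {self.state.value}")
        self.state = _NEXT[self.state]
        return self.state

    @property
    def usable(self) -> bool:
        """True only once the cluster has been provisioned."""
        return self.state is RequestState.READY


if __name__ == "__main__":
    # Placeholder values, not real catalog entries.
    req = ClusterRequest(sku="example-sku", region="example-region")
    req.advance()      # support picks up the request
    req.advance()      # resources are provisioned
    print(req.usable)  # True
```

The point of the model is that a cluster is usable only in the final stage: submitting the form does not provision anything by itself.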