API Usage
You can access GLM-4.5-FP8 through the same RESTful Chat Completions API used by other GLM models.The following examples demonstrate text generation and function-calling usage.
GLM-4.5-FP8 is a high-efficiency variant of the GLM-4.5 large language model, designed for ultra-fast inference and reduced memory consumption through FP8 quantization.