
Overview

GMI Cloud is a cloud-based GPU infrastructure platform built for high-performance AI inference. The GMI Cloud plugin for Dify lets you seamlessly integrate GMI Cloud's capabilities into your Dify workflows. Key features of the plugin include:
  • OpenAI-Compatible API: Use standard OpenAI client libraries and tools for seamless integration.
  • Multiple Model Families: Access a wide range of models including DeepSeek, Llama, Qwen, OpenAI OSS, and GLM models.
  • High Performance: Optimized for fast inference and low latency, ideal for research tasks requiring heavy compute power.
  • Streaming Support: Real-time streaming for chat completions.
  • Tool Calling: Support for function calling and integrating external tools into your workflow.
  • Custom Model Support: Easily deploy and use your own fine-tuned models.
  • Flexible Endpoints: Configure custom API endpoints for enterprise-level deployments.
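Because the API is OpenAI-compatible, you can also reach the same endpoints directly from your own code, outside of Dify. Below is a minimal sketch that builds a chat-completion request against the default endpoint; the function name, the `GMI_API_KEY` environment variable, and the `GMI_BASE_URL` override are illustrative conventions used here, not names mandated by the plugin:

```python
import json
import os

# Default GMI Cloud inference endpoint; enterprise deployments can
# override it via an environment variable (an assumption of this sketch).
GMI_BASE_URL = os.environ.get("GMI_BASE_URL", "https://api.gmi-serving.com/v1")


def build_chat_request(model: str, prompt: str, stream: bool = False):
    """Build an OpenAI-compatible chat-completion request for GMI Cloud.

    Returns the URL, headers, and JSON body. Sending the request is left
    to any HTTP client, or to the standard `openai` SDK pointed at the
    same base URL.
    """
    url = f"{GMI_BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {os.environ.get('GMI_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,  # set True for real-time token streaming
    }
    return url, headers, body


if __name__ == "__main__":
    url, headers, body = build_chat_request("zai-org/GLM-4.6", "Hello!")
    print(url)
    print(json.dumps(body, indent=2))
```

The same payload shape works with any of the preset models listed below; only the `model` string changes.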
Once the plugin is configured, you can access and use a range of preset models that come with the plugin. Right now, these include:
  • DeepSeek:
    • deepseek-ai/DeepSeek-V3-0324
    • deepseek-ai/DeepSeek-V3.1
  • OpenAI OSS:
    • openai/gpt-oss-120b
  • Meta Llama:
    • meta-llama/Llama-4-Scout-17B-16E-Instruct
  • Qwen:
    • Qwen/Qwen3-32B-FP8
    • Qwen/Qwen3-Next-80B-A3B-Instruct
    • Qwen/Qwen3-235B-A22B-Thinking-2507-FP8
    • Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8
  • Zhipu (ZAI):
    • zai-org/GLM-4.6
These models provide a variety of capabilities that you can leverage for tasks such as natural language processing, text generation, code generation, and more. You can always find the latest documentation about the plugin at https://marketplace.dify.ai/plugins/langgenius/gmicloud.

Step-by-Step Guide

Step 1: Getting Your API Key from GMI Cloud

If you don't have an API key yet, start by creating one in the GMI Cloud console:
  1. Sign in to your GMI Cloud console and go to API Key Management.
  2. Click Create API Key, give it a name that's easy to remember, and set the Scope to "Inference".
  3. Save your API key somewhere secure, as you won't be able to view it again once you close the popup.
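One common way to keep the key out of scripts and notebooks is to store it in an environment variable. This is just a convention, not something the plugin requires (Dify stores the key for you once configured); the variable name `GMI_API_KEY` is our own choice here:

```shell
# Store the key in an environment variable for the current shell session.
# Replace the placeholder with the key you copied from the console.
export GMI_API_KEY="your-key-here"

# Confirm it is set without printing the full secret.
echo "GMI_API_KEY is ${#GMI_API_KEY} characters long"
```

Add the `export` line to your shell profile if you want it available in every session.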

Step 2: Installing the GMI Plugin in Dify

Now let's head over to Dify. Open the Dify plugin marketplace (Plugins - Dify), then search for the GMI Cloud plugin and install it.

Step 3: Configuring the GMI Plugin in Dify

Now, let’s configure the plugin in Dify:
  1. Open Dify and go to Settings > Model Provider.
  2. Locate GMI Cloud in the list of available providers, and click Setup.
  3. Enter your API key in the API Key field. This is the only required field.
  4. (Optional) If your organization uses a custom endpoint, enter the API Endpoint URL. Otherwise, the plugin defaults to: https://api.gmi-serving.com/v1.
  5. Click Save to activate the plugin.
Dify will validate your credentials by calling the /v1/models endpoint to ensure everything is set up correctly. You should see a green light if everything is good to go. Now we are ready to build our workflow!
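If validation fails, you can reproduce the same check from the command line. The sketch below assumes the default endpoint, a key in `GMI_API_KEY`, and an optional `GMI_API_ENDPOINT` override (both variable names are our own convention); a valid key should return a JSON list of models:

```shell
# The plugin verifies credentials by listing available models.
# Reproduce that check manually; the `|| true` keeps a network or
# auth failure from aborting a script that sets -e.
BASE_URL="${GMI_API_ENDPOINT:-https://api.gmi-serving.com/v1}"
MODELS_URL="${BASE_URL}/models"

curl -s --max-time 10 \
  -H "Authorization: Bearer ${GMI_API_KEY}" \
  "${MODELS_URL}" || true
```

An authentication error here (rather than a model list) usually means the key was pasted incorrectly or was created with the wrong scope.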

Step 4: Build a Deep Research Workflow in Dify

Go to the front page and click Create from Template. This time we will use the DeepResearch template provided officially by Dify.
  1. At the plugin installation screen, make sure to check the two tools: Tavily and JSON Process. The other two model provider plugins are not needed, since we will be using GMI Cloud's model endpoints.
  2. Don't be intimidated by the seemingly complicated graph. The only two nodes we need to care about are the LLM node and the Reasoning Model node, which we will point at GMI Cloud's model endpoints.
  3. For the LLM node, replace gpt-4o with GLM-4.6, a highly capable model for a wide variety of general tasks. (Learn more at zai-org/GLM-4.6 · Hugging Face.)
  4. For the Reasoning Model node, use Qwen3-235B-A22B-Thinking-2507-FP8, which shows strong performance across a variety of reasoning benchmarks. (Learn more at Qwen/Qwen3-235B-A22B-Thinking-2507-FP8 · Hugging Face.)
And that's it! Click the Publish button at the top right, and we are ready to run our workflow!

Step 5: Try It!

Now let's open our workflow app. There's an optional Depth parameter we can set; this is what makes the workflow "Deep Research": based on the specified depth, multiple rounds of iterative searches are conducted. Let's set it to 2, for example. Here's a sample prompt:
Which industries are showing the strongest early signals of disruption from generative AI?
Because deep research can take several rounds of reasoning, the full answer may take a minute or two to appear. Feel free to grab a cup of coffee and check back. Looking at the final answer, we see a well-written analysis report with proper sources cited.
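The Depth mechanic described above can be sketched conceptually. This is a hypothetical illustration of a depth-controlled research loop, not Dify's actual implementation; the `search`, `refine_query`, and `summarize` helpers are stand-ins for the Tavily, reasoning-model, and final LLM nodes in the template:

```python
def search(query):
    # Placeholder for a real search tool (e.g. the Tavily node in Dify).
    return [f"result for: {query}"]


def refine_query(question, findings):
    # Placeholder: a reasoning model would narrow the query here.
    return f"{question} (round {len(findings) + 1})"


def summarize(question, findings):
    # Placeholder: the final LLM node writes the report from findings.
    return f"{len(findings)} sources gathered for: {question}"


def deep_research(question: str, depth: int = 2) -> str:
    """Depth-controlled iterative research loop (conceptual sketch).

    Each round searches for sources, then refines the query for the
    next round; higher depth means more rounds and a longer runtime.
    """
    findings, query = [], question
    for _ in range(depth):
        findings.extend(search(query))
        query = refine_query(question, findings)
    return summarize(question, findings)
```

Doubling the depth roughly doubles the number of search-and-reason rounds, which is why deeper runs take noticeably longer.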

Conclusion

Building deep research workflows with the GMI Cloud plugin for Dify is an excellent way to leverage GMI Cloud's cutting-edge AI models and cloud infrastructure. Whether you're conducting market research, model evaluations, or literature reviews, GMI Cloud provides the reliability and performance you need to turn research into actionable insights. Ready to get started? Install the GMI Cloud plugin, configure it with your API key, and start building your own deep research workflows today! If you run into any questions, don't hesitate to contact us at support@gmicloud.ai!