Step by Step: A Complete Hands-On Walkthrough of Dify Workflows, with Worked YML Configuration Examples

Latest recommended article published on 2025-07-29 22:22:34

逻极

License: CC 4.0 BY-SA

Categories: dify, AI, open-source models. Tags: dify, AI, artificial intelligence, AI programming, workflow, agent, hands-on

Article link: https://blog.csdn.net/haolove527/article/details/149723451

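A Hands-On Guide to Dify Workflows

Introduction

Dify is an open-source low-code/no-code platform for building AI applications. Its Workflow feature lets you connect nodes visually to assemble multi-step logic quickly, which suits chatbots, question answering systems, data analysis, and task automation. This guide walks through a practical, reproducible example: using the Deep Research template to build an automated research assistant that takes a research topic (for example, "AI in healthcare"), searches iteratively, and consolidates what it finds into a report with citations. It also provides the YAML files needed to deploy Dify locally (docker-compose.yaml and docker-compose.middleware.yaml) and a workflow configuration file (saved later as research_bot.yaml), so the workflow can run on your own machine. The steps are laid out in order and are intended for both beginners and more experienced users.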

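Goals

Using Dify's Deep Research workflow template, build an automated research assistant that demonstrates the following:

- Intent recognition: capture the research topic and iteration depth supplied by the user.
- Iterative search: retrieve relevant information with a search tool such as Exa Answer or Serper.
- Information synthesis: have a large language model (LLM) analyze the results and produce a structured report.
- Local deployment: deploy Dify with Docker Compose so the workflow runs locally.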

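Preparation

Before starting, make sure the following are in place:

Dify environment:
- Cloud: register a Dify account (Dify website); no local installation is needed.
- Local deployment (recommended for this walkthrough): see the local deployment section of the Dify documentation; at least 2 vCPUs and 8 GB of RAM are required.
Hardware requirements: 2 vCPUs, 8 GB RAM, 20 GB of storage.
Software requirements:
- Docker (Docker Engine on Linux, Docker Desktop on Windows/macOS; on Windows, enabling WSL 2 is recommended).
- Git (to clone the Dify repository).
LLM access: obtain an API key for OpenAI or Claude, or run a local model with Ollama, which needs no API key. Command to start Ollama (the named volume keeps downloaded models when the container is recreated):
```
docker run -d --name ollama -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
```
Tool configuration: register for a search tool and obtain its API key, for example:
- Exa Answer (exa.ai)
- Serper (serper.dev)
Background knowledge: be familiar with Dify's core building blocks, such as nodes (Start, LLM, Tools, Answer, and so on), variables, and the RAG pipeline.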

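Walkthrough: Building the Deep Research Workflow

The steps below are based on Dify's Deep Research template and cover creating, configuring, testing, and publishing the workflow.

1. Deploy Dify locally

To make sure the workflow runs locally, deploy Dify on your machine first.

1.1 Clone the Dify repository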

git clone https://github.com/langgenius/dify.git
cd dify/docker

1.2 Configure docker-compose.middleware.yaml

This file defines the middleware services (PostgreSQL, Redis, Weaviate) that provide the database, cache, and vector search used by workflows.

version: '3.8'
services:
  db:
    image: postgres:16.4
    container_name: dify-postgres
    restart: always
    environment:
      - POSTGRES_USER=dify
      - POSTGRES_PASSWORD=difyai123456
      - POSTGRES_DB=dify
    volumes:
      - ./pgdata:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U dify"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7
    container_name: dify-redis
    restart: always
    volumes:
      - ./redis:/data
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 5s
      retries: 5

  weaviate:
    image: semitechnologies/weaviate:1.26.1
    container_name: dify-weaviate
    restart: always
    environment:
      - QUERY_DEFAULTS_LIMIT=25
      - AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true
      - PERSISTENCE_DATA_PATH=/var/lib/weaviate
      - DEFAULT_VECTORIZER_MODULE=none
      - ENABLE_MODULES=text2vec-transformers
      - TRANSFORMERS_INFERENCE_API=http://t2v-transformers:8080
    volumes:
      - ./weaviate:/var/lib/weaviate
    ports:
      - "8080:8080"
      - "50051:50051"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/v1/.well-known/ready"]
      interval: 10s
      timeout: 5s
      retries: 5

  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    container_name: dify-t2v-transformers
    environment:
      - ENABLE_CUDA=0

1.3 Configure docker-compose.yaml

This file defines Dify's core services: the front end (web), the back end (api), and the background worker.

version: '3.8'
services:
  api:
    image: langgenius/dify-api:0.10.0
    container_name: dify-api
    restart: always
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
      weaviate:
        condition: service_healthy
    environment:
      - SECRET_KEY=${SECRET_KEY:-your-secret-key}
      - DB_HOST=db
      - DB_PORT=5432
      - DB_USER=dify
      - DB_PASSWORD=difyai123456
      - DB_DATABASE=dify
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - WEAVIATE_HOST=weaviate
      - WEAVIATE_PORT=8080
      - DEPLOY_ENV=PRODUCTION
      - EDITION=SELF_HOSTED
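    volumes:
      # Mount only the storage directory; mounting ./api over /app/api would hide the application code in the image.
      - ./volumes/app/storage:/app/api/storage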
    ports:
      - "5001:5001"

  web:
    image: langgenius/dify-web:0.10.0
    container_name: dify-web
    restart: always
    depends_on:
      - api
    environment:
      - API_PREFIX=http://api:5001
      - DEPLOY_ENV=PRODUCTION
      - EDITION=SELF_HOSTED
    ports:
      - "3000:3000"

  worker:
    image: langgenius/dify-api:0.10.0
    container_name: dify-worker
    restart: always
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
      weaviate:
        condition: service_healthy
    environment:
      - SECRET_KEY=${SECRET_KEY:-your-secret-key}
      - DB_HOST=db
      - DB_PORT=5432
      - DB_USER=dify
      - DB_PASSWORD=difyai123456
      - DB_DATABASE=dify
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - WEAVIATE_HOST=weaviate
      - WEAVIATE_PORT=8080
      - DEPLOY_ENV=PRODUCTION
      - EDITION=SELF_HOSTED
    command: ["celery", "-A", "app.celery", "worker", "-P", "gevent", "-c", "1", "--loglevel", "INFO", "-Q", "dataset,generation,mail,ops_trace"]
    volumes:
      # Share the same storage directory as the api service.
      - ./volumes/app/storage:/app/api/storage

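1.4 Configure environment variables

Copy the environment variable templates:

cp .env.example .env
cp middleware.env.example middleware.env

Edit .env and set the key variables. Compose does not run .env through a shell, so a value such as $(openssl rand -base64 42) would be stored literally; run openssl rand -base64 42 in a terminal and paste its output as the value of SECRET_KEY. If Ollama runs on the Docker host, containers usually cannot reach it through localhost; on Docker Desktop the usual address is host.docker.internal (on Linux, use the host's IP address instead).

DOCKER_COMPOSE_PORT=80
STORAGE_TYPE=local
OLLAMA_BASE_URL=http://host.docker.internal:11434
SECRET_KEY=<output of: openssl rand -base64 42>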
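
If OpenSSL is not at hand, the key can also be produced with Python. The following is a minimal sketch that uses only the standard library; the function name generate_secret_key is illustrative and not part of Dify. It base64-encodes 42 random bytes, which is the same kind of value that openssl rand -base64 42 prints.

```python
import base64
import secrets


def generate_secret_key(num_bytes: int = 42) -> str:
    """Return num_bytes of OS randomness, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Paste the printed line into .env, replacing the existing SECRET_KEY entry.
    print(f"SECRET_KEY={generate_secret_key()}")
```

Run it once and copy the printed line; generating a new key later invalidates anything signed with the old one, so treat it like any other long-lived secret.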

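1.5 Start Dify

Start the middleware services:

docker compose -f docker-compose.middleware.yaml up -d

Start the Dify services. The api and worker services declare depends_on for db, redis, and weaviate, which are defined in the middleware file, so pass both files to let Compose resolve them:
```
docker compose -f docker-compose.middleware.yaml -f docker-compose.yaml up -d
```
Open the setup page: in a browser, go to http://localhost/install and create the administrator account. (With the compose file from section 1.3, which has no reverse proxy, the web service is published on port 3000, so the address is http://localhost:3000/install.)
Open Dify: once setup is complete, go to http://localhost.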
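
If the setup page does not load, a quick first check is whether the published ports accept connections at all. The sketch below is a plain TCP probe written with the standard library; the port numbers are the host ports published in the compose files of sections 1.2 and 1.3, while the function name and the SERVICES mapping are illustrative. An open port only shows that something is listening, not that the service is healthy.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Host ports published by the compose files in this guide.
SERVICES = {
    "postgres": 5432,
    "redis": 6379,
    "weaviate": 8080,
    "dify-api": 5001,
    "dify-web": 3000,
}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if port_open("127.0.0.1", port) else "DOWN"
        print(f"{name:10s} :{port:<5d} {state}")
```

For a service reported as DOWN, docker compose ps and docker compose logs for that service are the natural next step.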

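2. Create the Deep Research workflow

Log in to Dify:
- Go to http://localhost and log in with the administrator account.
- Click "Create Application".
Choose a template:
- On the "Explore" page, search for the "DeepResearch" template.
- Click "Create", name the application (for example, "ResearchBot"), and open the workflow editor.
- If the template is not available, start from a blank Workflow and add the nodes manually.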

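3. Configure the Start node

Define the input parameters:
- research_topic: text (for example, "AI in healthcare").
- max_loop: number (default 5).

Example configuration:

research_topic: AI in healthcare
max_loop: 5

Connection: connect the output to the Iteration node.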

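4. Add an Iteration node

Place the node: drag an Iteration node onto the canvas and connect it to the research_topic output of the Start node.
Configure the input:
- Input: search_keywords (array; the initial values are derived by splitting research_topic into subtopics, for example ["AI diagnostics", "AI patient care"]).
Output: iteration_results (the search results from each loop).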

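5. Configure the search tool (Tools node)

Add the node: drag a Tools node onto the canvas and connect it to the search_keywords output of the Iteration node.
Choose a tool:
- Select "Exa Answer" or "Serper".
- Enter the API key (register at exa.ai or serper.dev to get one).
Configure the input:
- Query: {{iteration.search_keywords}}.
Output variable: search_results (a list of search results, each with a title, URL, and snippet).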
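
Searches for related keywords often return the same page more than once, which would duplicate entries in the final reference list. The sketch below shows one way to drop repeats by URL. It assumes each result is a dict with the title, url, and snippet fields used by the report template later in this guide; the function name is illustrative, and to use logic like this inside Dify you would need to adapt it to the input and output conventions of a Code node.

```python
def dedupe_by_url(results: list[dict]) -> list[dict]:
    """Keep the first result for each URL, preserving the original order.

    Results without a URL are dropped, since they cannot be cited.
    """
    seen: set[str] = set()
    unique: list[dict] = []
    for item in results:
        # Treat "https://site/page" and "https://site/page/" as the same page.
        url = item.get("url", "").strip().rstrip("/")
        if not url or url in seen:
            continue
        seen.add(url)
        unique.append(item)
    return unique
```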

6. Add an LLM node

Choose a model:
- Select gpt-4o, Claude, or Ollama (URL: http://localhost:11434).

Set the prompt:

Based on the search results: {{search_results}}, identify knowledge gaps and suggest the next search topic. Output in JSON format:
{
  "nextSearchTopic": "string",
  "shouldContinue": boolean
}

Output variables:
- nextSearchTopic: for example, "AI patient outcome prediction".
- shouldContinue: true/false.
Connection: the input is search_results; connect the output to the If/Else node.
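
Models do not always return bare JSON; the object is often wrapped in explanatory text or a code fence. The following sketch shows one defensive way to extract the two fields before they reach the If/Else node. The function name and the fallback behavior (stop the loop when the reply cannot be parsed) are assumptions made for illustration, not something the template prescribes.

```python
import json
import re


def parse_llm_decision(reply: str) -> dict:
    """Extract nextSearchTopic and shouldContinue from a model reply.

    Falls back to stopping the loop when no valid JSON object is found.
    """
    fallback = {"nextSearchTopic": "", "shouldContinue": False}
    # Grab everything from the first "{" to the last "}".
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        return fallback
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return fallback
    return {
        "nextSearchTopic": str(data.get("nextSearchTopic", "")),
        # Only a real JSON true continues the loop; the string "true" does not.
        "shouldContinue": data.get("shouldContinue") is True,
    }
```

Defaulting to "stop" keeps a malformed reply from driving the loop through all remaining iterations.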

7. Add an If/Else node

Configure the condition:
- Condition: {{llm.shouldContinue}} == true
Branches:
- True branch: connect back to the Iteration node.
- False branch: connect to the Answer node.
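
The control flow formed by the Iteration, Tools, LLM, and If/Else nodes can be sketched in plain Python. This is a simulation for reasoning about the loop, not Dify code: search and decide are hypothetical callables standing in for the Tools node and the LLM node, and run_research_loop is an illustrative name.

```python
from typing import Callable


def run_research_loop(
    topic: str,
    max_loop: int,
    search: Callable[[str], list[dict]],
    decide: Callable[[list[dict]], dict],
) -> list[dict]:
    """Search, ask the model whether to go on, and stop at max_loop at the latest."""
    findings: list[dict] = []
    query = topic
    for _ in range(max_loop):
        results = search(query)
        findings.extend(results)
        decision = decide(results)
        if not decision.get("shouldContinue") or not decision.get("nextSearchTopic"):
            break  # False branch: hand the findings to the Answer node
        query = decision["nextSearchTopic"]  # True branch: run another iteration
    return findings


if __name__ == "__main__":
    # Stub functions stand in for the search tool and the model.
    topics = iter(["AI diagnostics", "AI patient care"])

    def fake_search(query: str) -> list[dict]:
        return [{"title": query, "url": "https://example.com", "snippet": f"notes on {query}"}]

    def fake_decide(results: list[dict]) -> dict:
        nxt = next(topics, "")
        return {"nextSearchTopic": nxt, "shouldContinue": bool(nxt)}

    for item in run_research_loop("AI in healthcare", 5, fake_search, fake_decide):
        print(item["title"])
```

The point of the sketch is that there are two independent exits: the model saying it is done, and the max_loop cap. If a workflow never terminates, one of those two is not wired up.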

8. Configure the Answer node

Configure the output:

Use a Jinja2 template:

# Research Report on {{start.research_topic}}
## Findings
{% for result in iteration.iteration_results %}
- {{result.snippet}}
{% endfor %}
## References
{% for result in iteration.iteration_results %}
- {{result.title}}: {{result.url}}
{% endfor %}

Connection: the input is iteration.iteration_results.
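
To check the report layout outside Dify, the same structure can be produced with ordinary string formatting. The sketch below mirrors the template's two sections; it assumes each result is a dict with snippet, title, and url keys, as the template does, and render_report is an illustrative name.

```python
def render_report(topic: str, results: list[dict]) -> str:
    """Build the Markdown report: a findings list followed by a reference list."""
    lines = [f"# Research Report on {topic}", "## Findings"]
    lines += [f"- {r['snippet']}" for r in results]
    lines.append("## References")
    lines += [f"- {r['title']}: {r['url']}" for r in results]
    return "\n".join(lines)
```

Feeding it two or three hand-written results is a cheap way to confirm that the field names coming out of the search tool match what the template expects.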

9. Configure the workflow YAML file

The YAML below defines the nodes and logic of the Deep Research workflow.

version: '1.0'
nodes:
  - id: start
    type: Start
    inputs:
      research_topic:
        type: string
        value: ''
      max_loop:
        type: number
        value: 5
    outputs:
      research_topic: string
      max_loop: number
  - id: iteration
    type: Iteration
    inputs:
      search_keywords: '{{start.research_topic}}'
      max_iterations: '{{start.max_loop}}'
    outputs:
      iteration_results: array
  - id: search_tool
    type: Tool
    tool: ExaAnswer
    inputs:
      query: '{{iteration.search_keywords}}'
      api_key: '${EXA_API_KEY}'
    outputs:
      search_results: array
  - id: llm
    type: LLM
    model: gpt-4o
    inputs:
      prompt: |
        Based on the search results: {{search_results}}, identify knowledge gaps and suggest the next search topic. Output in JSON format:
        {
          "nextSearchTopic": "string",
          "shouldContinue": boolean
        }
      search_results: '{{search_tool.search_results}}'
    outputs:
      nextSearchTopic: string
      shouldContinue: boolean
  - id: if_else
    type: IfElse
    condition: '{{llm.shouldContinue}} == true'
    true_branch: iteration
    false_branch: answer
  - id: answer
    type: Answer
    inputs:
      output: |
        # Research Report on {{start.research_topic}}
        ## Findings
        {% for result in iteration.iteration_results %}
        - {{result.snippet}}
        {% endfor %}
        ## References
        {% for result in iteration.iteration_results %}
        - {{result.title}}: {{result.url}}
        {% endfor %}
    outputs:
      report: string

Save the file: save it as dify/workflows/research_bot.yaml.
Import the workflow:
- In the Dify dashboard, click "Import Workflow" and upload research_bot.yaml.
- Verify the node connections and configuration.

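10. Test and debug

Preview the workflow:
- Click "Preview" and enter:
```
research_topic: AI in healthcare
max_loop: 5
```
- Check the generated report.
Debugging tools:
- Use the "Workflow Process" panel to inspect each node's inputs and outputs.
- Verify variable values (such as search_results and nextSearchTopic).
Common problems:
- Inaccurate search results: check the API key or the keywords.
- The iteration does not stop: check max_loop or shouldContinue.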

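11. Publish and integrate

Publish the application:
- Click "Publish" to generate a WebApp link (for example, http://localhost/apps/research_bot).

API integration:

Call the API endpoint. The workflow inputs go under "inputs", and the request also carries a response mode and a user identifier (demo-user is a placeholder):

curl -X POST http://localhost:5001/v1/workflows/run \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"inputs": {"research_topic": "AI in healthcare", "max_loop": 5}, "response_mode": "blocking", "user": "demo-user"}'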
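
The same call can be made from Python with the standard library alone. This is a sketch under stated assumptions: the body shape (inputs, response_mode, user) reflects my understanding of Dify's workflow run API and should be checked against the API page of your Dify version; the function names, the demo-user value, and the timeout are illustrative.

```python
import json
import urllib.request


def build_run_request(base_url: str, api_key: str, inputs: dict, user: str) -> urllib.request.Request:
    """Build a POST request for the workflow run endpoint in blocking mode."""
    body = {"inputs": inputs, "response_mode": "blocking", "user": user}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/workflows/run",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_workflow(base_url: str, api_key: str, inputs: dict, user: str = "demo-user") -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_run_request(base_url, api_key, inputs, user)
    # Research runs can take a while, so allow a generous timeout.
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like run_workflow("http://localhost:5001/v1", "YOUR_API_KEY", {"research_topic": "AI in healthcare", "max_loop": 5}), with the placeholder key replaced by the one issued for the published app.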

Sample output

# Research Report on AI in Healthcare
## Findings
- AI improves diagnostics through image analysis, such as MRI and CT scans, achieving over 90% accuracy.
- Predictive models reduce hospital readmissions by 15%.
## References
- AI in MRI Analysis: https://pubmed.ncbi.nlm.nih.gov/...
- WHO Report on AI in Healthcare: https://www.who.int/...