AI-Based Software Interfaces: Building Intelligent Interactive Apps with CopilotKit - 夜雨聆风

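Introduction: Bringing AI Agents and User Interfaces Together

With the rapid development of artificial intelligence, AI agents are becoming part of everyday life. They are no longer just algorithms running quietly in the background; they increasingly interact with users in direct, friendly ways. Yet combining powerful AI capabilities with a smooth user interface (UI) remains a challenge for developers.

This article looks at CopilotKit, a toolkit that makes it much easier to build UI-based AI agents. Using a practical example, the "Chengdu City Card" app, we walk step by step through how to use CopilotKit to combine AI capabilities with a UI.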

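I. Theoretical Foundations: A2UI and AG-UI, the Future of Intelligent Interaction

Before getting hands-on, let us go over a few core concepts that underpin intelligent interactive applications:

1. A2UI (Agent-to-User Interface): UI Generated by AI

The core idea of A2UI is to let an AI model generate, or help generate, user interface elements directly. Imagine telling the AI that you want "a card showing city information" and having it produce, from that intent, a UI component with fields for city name, population, description, and so on. A2UI does this declaratively: the agent describes the UI, and the client renders that description with its own components.

In traditional development, frontend developers write UI code by hand. A2UI aims to automate part of this process: by relying on the AI's ability to understand and generate, it can make UI development more efficient and flexible. An AI agent can then adjust and present the most suitable interface for each context and user need.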
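
To make the idea concrete, here is a minimal sketch of what a declarative card description could look like. The field names (`type`, `title`, `fields`) and the `build_city_card` helper are hypothetical and chosen only for illustration; they are not the actual A2UI schema, and the population figure is a rough illustrative value.

```python
from typing import Any


def build_city_card(name: str, population: int, description: str) -> dict[str, Any]:
    """Return a declarative description of a city card (hypothetical schema)."""
    return {
        "type": "card",
        "title": name,
        "fields": [
            # Format the population with thousands separators for display.
            {"label": "Population", "value": f"{population:,}"},
            {"label": "Description", "value": description},
        ],
    }


# Illustrative values only, not authoritative data.
card = build_city_card("Chengdu", 20_000_000, "Capital of Sichuan Province.")
```

The point of the sketch is that the agent produces data describing the UI, not UI code; the client decides how a `card` is actually drawn.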

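2. AG-UI (Agent-User Interaction Protocol): Agent-Driven User Interfaces

AG-UI complements A2UI. It is an open, event-based protocol that connects an AI agent to a frontend application, so the agent can drive and manage the user interface over its whole lifecycle. The agent does not only generate UI; it can also update, remove, or add UI elements in response to its internal logic, user input, or external events.

In our "Chengdu City Card" example, when the user asks for the "Chengdu card", the agent generates a city information card. If the user then asks about the weather, the agent calls the weather tool and presents the result as a new UI component below the card. The agent coordinates and controls the whole process.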
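
The lifecycle described above can be sketched as a small state update function. This is only an illustration of the add/update/remove idea: the event shape (`op`, `id`, `props`) and the `apply_ui_event` helper are hypothetical and are not the event types defined by the AG-UI protocol.

```python
from typing import Any

UiState = dict[str, dict[str, Any]]


def apply_ui_event(ui: UiState, event: dict[str, Any]) -> UiState:
    """Return a new UI state after applying one agent event (hypothetical shape)."""
    new_ui = dict(ui)  # shallow copy so the previous state is left untouched
    op, component_id = event["op"], event["id"]
    props = event.get("props", {})
    if op == "add":
        new_ui[component_id] = dict(props)
    elif op == "update":
        # Merge new props into the existing component.
        new_ui[component_id] = {**new_ui[component_id], **props}
    elif op == "remove":
        new_ui.pop(component_id, None)
    else:
        raise ValueError(f"unknown op: {op}")
    return new_ui


# The user asks for the card, then for the weather; the agent emits these events.
events = [
    {"op": "add", "id": "city_card", "props": {"city": "Chengdu"}},
    {"op": "add", "id": "weather", "props": {"city": "Chengdu", "status": "loading"}},
    {"op": "update", "id": "weather", "props": {"status": "complete"}},
]

ui: UiState = {}
for event in events:
    ui = apply_ui_event(ui, event)
```

After the loop, `ui` holds the city card followed by the weather component in its final state, which mirrors how the weather panel appears below the card in the example.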

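3. CopilotKit: The Bridge Between AI and UI

CopilotKit is a full-stack framework for developing AI agents. It provides tools and libraries that simplify building and deploying agents, especially those that need deep integration with a user interface. It puts the ideas behind A2UI and AG-UI into practice through the following key features: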

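Tools: CopilotKit lets you define tools, which can be API calls, database queries, or even complex business logic. The AI agent selects and runs the appropriate tool based on the user's intent.

UI Components: CopilotKit can map tool output to renderable UI components. When the agent runs a tool and gets a result, that result can be turned into a user-friendly interface element.

Session Management: CopilotKit helps manage the session state between the agent and the user, so the agent can keep track of context and hold multi-turn conversations.

Integrations: CopilotKit integrates with frontend frameworks (such as React) and backend services (such as the OpenAI API).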

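II. Hands-On: Building Your "Chengdu City Card" AI Agent

Let us now work through a concrete example and use CopilotKit to build an AI agent that can display a "Chengdu City Card" and weather information.

Goal: create a web app in which users chat with an AI agent. When the user asks for the "Chengdu card", the agent shows a card with information about Chengdu; when the user asks about the weather, the agent shows Chengdu's current weather.

1. Preparation: Setting Up the Environment

First, make sure your development environment is ready.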

pip install google-adk  # https://google.github.io/adk-docs/
pip install copilotkit openai fastapi uvicorn python-dotenv

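2. Building the Backend Agent (Python FastAPI)

We will create a simple FastAPI server as the backend. It contains the AI agent's logic and the tool definitions. Note that the code imports ag_ui_adk and ADK's LiteLlm model wrapper, so the packages that provide them must also be installed.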

import os
from typing import Any

from fastapi import FastAPI
from ag_ui_adk import ADKAgent, add_adk_fastapi_endpoint
from google.adk.agents import LlmAgent
from google.adk.models import lite_llm as adk_lite_llm
from google.adk.models.lite_llm import LiteLlm
from dotenv import load_dotenv

from tools import get_city_card, get_weather

# Load MODEL_NAME, OPENAI_API_KEY, and PORT from a local .env file if present.
load_dotenv()

MODEL_NAME = os.getenv("MODEL_NAME", "openai/mistral-7b-instruct")

# Fail fast if an OpenAI-style model is selected without an API key.
if MODEL_NAME.startswith("openai/") and not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError(
        "Missing OPENAI_API_KEY. Set it before running, for example:\n"
        "export OPENAI_API_KEY='your_key'\n"
        "uv run main.py"
    )


def patch_litellm_tool_call_messages() -> None:
    """Drops assistant text when a LiteLLM message also contains tool calls.

    Some LiteLLM streaming responses surface an assistant message with both
    `content` and `tool_calls`. OpenAI-compatible chat endpoints reject that
    shape on the next turn, requiring exactly one of the two fields.

    ADK converts that message back into chat history, so we normalize it here by
    preserving the tool calls and discarding the assistant text for that turn.
    """
    # Apply the patch only once per process.
    if getattr(adk_lite_llm, "_my_agent2_tool_call_patch", False):
        return

    # Note: this wraps a private ADK helper, so it may break across ADK versions.
    original_split = adk_lite_llm._split_message_content_and_tool_calls

    def _patched_split_message_content_and_tool_calls(
        message: Any,
    ) -> tuple[Any, list[Any]]:
        content, tool_calls = original_split(message)
        if tool_calls:
            return None, tool_calls
        return content, tool_calls

    adk_lite_llm._split_message_content_and_tool_calls = (
        _patched_split_message_content_and_tool_calls
    )
    adk_lite_llm._my_agent2_tool_call_patch = True


patch_litellm_tool_call_messages()

agent = LlmAgent(
    name="assistant",
    # model="gemini-2.5-flash",
    model=LiteLlm(model=MODEL_NAME),
    instruction=(
        "Be helpful and fun. "
        "When user asks for a city card/city profile, do not invent city data inside tools. "
        "Generate city facts with the model, then call get_city_card with fields: "
        "city_name, country, population, description, accent_color. "
        "For weather-only requests, call get_weather."
    ),
    # Write these two tool functions yourself; they only need to return a fixed format.
    tools=[get_weather, get_city_card],
)

# Wrap the ADK agent so it can be served to the frontend.
adk_agent = ADKAgent(
    adk_agent=agent,
    app_name="demo_app",
    user_id="demo_user",
    session_timeout_seconds=3600,
    use_in_memory_services=True,
)

app = FastAPI()
add_adk_fastapi_endpoint(app, adk_agent, path="/")

if __name__ == "__main__":
    import uvicorn

    port = int(os.getenv("PORT", "8000"))
    uvicorn.run(app, host="localhost", port=port)

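The backend imports `get_city_card` and `get_weather` from a `tools` module that the article leaves to the reader. Below is a minimal sketch of what such a `tools.py` could contain. Only the `get_city_card` field names come from the agent instruction; the parameter types, the `city` parameter of `get_weather`, the `type` key, and the placeholder weather values are all assumptions.

```python
"""tools.py: a hypothetical sketch of the two tools imported by the backend."""
from typing import Any


def get_city_card(
    city_name: str,
    country: str,
    population: str,
    description: str,
    accent_color: str,
) -> dict[str, Any]:
    """Package model-generated city facts into a fixed card structure.

    The tool invents no data; it only echoes back what the model supplies.
    """
    return {
        "type": "city_card",  # assumed discriminator for the frontend
        "city_name": city_name,
        "country": country,
        "population": population,
        "description": description,
        "accent_color": accent_color,
    }


def get_weather(city: str) -> dict[str, Any]:
    """Return weather for a city in a fixed structure.

    Placeholder values only: a real implementation would query a weather
    service here instead of returning constants.
    """
    return {
        "type": "weather",
        "city": city,
        "temperature_c": None,  # unknown until a real data source is wired in
        "condition": "unknown",
    }
```

Keeping both return values as flat dictionaries with stable keys is what makes them easy to map onto UI templates on the frontend.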
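3. Building the Frontend App (React)

The frontend code is the same as before, because it does not depend on the backend language and talks to the backend only over HTTP. (The full listing is too long to include here; see the linked examples at the end of the article.)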

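1. Install the frontend dependencies: @copilotkit/react-ui, @copilotkit/react-core, @copilotkit/runtime, and @ag-ui/client, so that the chat UI, the runtime, and the agent connection are all in place.
2. Set up the CopilotKit API route: in Next.js, create /api/copilotkit and use CopilotRuntime + HttpAgent({ url: "http://localhost:8000/" }) to forward frontend requests to your backend agent.
3. Add a chat entry point to the page: place a CopilotSidebar (or another CopilotKit UI component of your choice) on the page so that users can first complete the basic "ask → respond" loop.
4. Capture tool results, not just text: use useDefaultRenderTool to obtain name / parameters / status / result. This is the key data source for building a visual UI.
5. Add a normalization layer: convert tool results into a single frontend data model (for example toolName + status + resultText + business fields), so that inconsistent return formats across tools do not cause messy rendering.
6. Render from templates: switch templates by toolName (for example a weather template and a city template), and let one shared renderer bind fields by path to draw the card.
7. Add interactions: define button actions in the template (such as showWeather). When the page receives an action, it updates state and triggers a second render, giving you "chat + actionable UI".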
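
Steps 5 and 6 are the least obvious, so here is a sketch of the idea. It is written in Python to match the backend code; in the actual app this logic would live in the React frontend. All names (`normalize_tool_result`, `resolve_path`, `render`, `TEMPLATES`), the dotted-path convention, and the sample weather payload are hypothetical.

```python
import json
from typing import Any

# Hypothetical templates: display label -> dotted path into the normalized model.
TEMPLATES: dict[str, dict[str, str]] = {
    "get_city_card": {"City": "data.city_name", "Country": "data.country"},
    "get_weather": {"City": "data.city", "Condition": "data.condition"},
}


def normalize_tool_result(name: str, status: str, result: Any) -> dict[str, Any]:
    """Coerce a raw tool result (dict or JSON string) into one common shape."""
    data: Any = result
    if isinstance(result, str):
        try:
            data = json.loads(result)
        except json.JSONDecodeError:
            data = {}
    if not isinstance(data, dict):
        data = {}  # anything that is not an object carries no business fields
    result_text = result if isinstance(result, str) else json.dumps(result, default=str)
    return {"toolName": name, "status": status, "resultText": result_text, "data": data}


def resolve_path(model: dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'data.city'; return None if a key is missing."""
    current: Any = model
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def render(model: dict[str, Any]) -> list[str]:
    """Pick a template by toolName and bind each label to its path."""
    template = TEMPLATES.get(model["toolName"], {})
    return [f"{label}: {resolve_path(model, path)}" for label, path in template.items()]


# Illustrative payload only; a real result would come from the tool call.
model = normalize_tool_result(
    "get_weather", "complete", '{"city": "Chengdu", "condition": "cloudy"}'
)
lines = render(model)
```

Because every tool result passes through the same normalized shape, adding a new tool only requires a new template entry rather than new rendering code.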

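III. Summary and Outlook

In this post we covered A2UI and AG-UI, two emerging concepts in AI interaction, and used CopilotKit to build a UI-based AI agent. We saw how CopilotKit bridges AI logic and the frontend UI, letting developers focus on the agent's core intelligence rather than on complex UI rendering logic.

As AI technology matures, agent-based UIs will become more intelligent, personalized, and dynamic. Tools like CopilotKit open the door to that future.

You now have the basics of CopilotKit and some hands-on experience with it, so go and try building more creative AI agent applications. The next world-changing intelligent app may well start with you.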

References and examples:

https://docs.ag-ui.com/introduction

https://github.com/CopilotKit/with-a2a-a2ui

https://a2ui-composer.ag-ui.com/

https://docs.copilotkit.ai/a2a/generative-ui/declarative-a2ui

https://dojo.ag-ui.com/adk-middleware/feature/shared_state?openCopilot=true&file=page.tsx