Harness Engineering, Dissecting an AI Coding Assistant (Part 2): The Full Journey of One Instruction, from Pressing Enter to a Finished Fix

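In the previous installment we met CoreCoder, an open-source project that reconstructs the skeleton of Claude Code's roughly 500,000 lines of source in about 1,500 lines of Python. We covered its 7 "load-bearing walls", from search-and-replace editing to sub-agent isolation, each a design that an AI coding agent cannot do without.

Today we get hands-on. We follow one real user instruction, read main.py and fix the broken import, and watch what happens inside the system between pressing Enter and the fix being done.

Cold start: the few seconds before you press Enter

When you type corecoder -m gpt-4o, the system runs a tight sequence of initialization steps.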

User input: corecoder -m gpt-4o
    │
    ▼
__main__.py (3 lines) ── calls cli.main()
    │
    ▼
cli.py: _parse_args() ── parses -m, -p, -r, --base-url, --api-key
    │
    ▼
config.py: Config.from_env() ── loads .env (searching upward from the CWD toward the home directory)
    │
    ▼
cli.py: creates LLM(model, api_key, base_url) ── initializes the OpenAI client
    │
    ▼
cli.py: creates Agent(llm=llm) ── registers 7 tools, builds the system prompt
    │
    ▼
cli.py: _repl(agent, config) ── enters the interactive loop and waits for user input

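Let's take each stage in turn.

Step 1: Entry point (__main__.py)

When you start CoreCoder with python -m corecoder or the corecoder command, the Python interpreter first runs __main__.py. The file is only 3 lines long: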

# corecoder/__main__.py
from corecoder.cli import main

main()

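That is all there is to it. No argument checks, no exception handling, no logging: it calls cli.main() directly and hands control to the CLI layer.

Step 2: Argument parsing (cli.py: _parse_args())

The first thing main() does is parse the command-line arguments: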

# corecoder/cli.py:23-38
def _parse_args():
    p = argparse.ArgumentParser(
        prog="corecoder",
        description="Minimal AI coding agent. Works with any OpenAI-compatible LLM.",
    )
    p.add_argument("-m", "--model", help="Model name (default: $CORECODER_MODEL or gpt-4o)")
    p.add_argument("--base-url", help="API base URL (default: $OPENAI_BASE_URL)")
    p.add_argument("--api-key", help="API key (default: $OPENAI_API_KEY)")
    p.add_argument("-p", "--prompt", help="One-shot prompt (non-interactive mode)")
    p.add_argument("-r", "--resume", metavar="ID", help="Resume a saved session")
    p.add_argument("-v", "--version", action="version", version=f"%(prog)s {__version__}")
    return p.parse_args()

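💡 For developers: this defines CoreCoder's two usage modes:

Interactive mode (default): starts a REPL loop for an ongoing conversation

One-shot mode (the -p flag): runs a single instruction and exits immediately, which suits script integration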
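The dispatch between the two modes is not quoted in this article, so here is a minimal sketch of how it could work, assuming the mode is chosen purely by whether -p was given. The helper names parse and pick_mode are hypothetical and only two of the real flags are reproduced.

```python
import argparse


def parse(argv):
    """Parse a reduced version of the CLI: only -m and -p are modeled here."""
    p = argparse.ArgumentParser(prog="corecoder")
    p.add_argument("-m", "--model")
    p.add_argument("-p", "--prompt")
    return p.parse_args(argv)


def pick_mode(args):
    """Assumed rule: a non-empty -p value means one-shot, otherwise interactive."""
    return "one-shot" if args.prompt else "interactive"


if __name__ == "__main__":
    print(pick_mode(parse(["-m", "gpt-4o"])))          # interactive
    print(pick_mode(parse(["-p", "fix the import"])))  # one-shot
```

In the real project the two branches would presumably lead to _repl() and _run_once(), the two functions named in the architecture overview later in this article.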

Step 3: Loading configuration (config.py: Config.from_env())

# corecoder/config.py:37-55
@classmethod
def from_env(cls) -> "Config":
    # load .env if present (won't override existing env vars)
    _load_dotenv()

    # pick up common env vars automatically
    api_key = (
        os.getenv("CORECODER_API_KEY")
        or os.getenv("OPENAI_API_KEY")
        or os.getenv("DEEPSEEK_API_KEY")
        or ""
    )
    return cls(
        model=os.getenv("CORECODER_MODEL", "gpt-4o"),
        api_key=api_key,
        base_url=os.getenv("OPENAI_BASE_URL") or os.getenv("CORECODER_BASE_URL"),
        ...
    )

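This code does one neat thing: it sets a lookup priority for the API key.

It tries three environment variables in order (CORECODER_API_KEY → OPENAI_API_KEY → DEEPSEEK_API_KEY), so the key is picked up whether you set it under OpenAI's name, DeepSeek's, or CoreCoder's own. Because the chain is built with or, a variable that is set but empty counts as missing and the lookup falls through to the next one.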
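To make the fallback behavior concrete, here is a small sketch that applies the same priority order to a plain dictionary instead of the real environment. The function name resolve_api_key is made up for this illustration; only the three variable names come from the code above.

```python
import os

# Priority order taken from Config.from_env() above.
KEY_VARS = ("CORECODER_API_KEY", "OPENAI_API_KEY", "DEEPSEEK_API_KEY")


def resolve_api_key(env=None):
    """Return the first non-empty key in priority order, or "" if none is set."""
    env = os.environ if env is None else env
    for name in KEY_VARS:
        value = env.get(name)
        if value:  # empty strings are skipped, matching the `or` chain
            return value
    return ""


if __name__ == "__main__":
    print(resolve_api_key({"OPENAI_API_KEY": "sk-openai", "DEEPSEEK_API_KEY": "sk-ds"}))
```

Passing a dictionary keeps the example testable without touching real secrets; called with no argument it reads os.environ.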

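The .env lookup is interesting too. Rather than checking only the current directory, it walks upward from the current working directory one level at a time toward the user's home directory (the loop stops once it reaches home, so the home directory itself is not searched):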

# corecoder/config.py:8-25
def _load_dotenv():
    """Load .env from cwd, walking up to home dir. No-op if python-dotenv missing."""
    try:
        from dotenv import load_dotenv

        env_path = Path(".env")
        if not env_path.exists():
            cur = Path.cwd()
            home = Path.home()
            while cur != home and cur != cur.parent:
                candidate = cur / ".env"
                if candidate.exists():
                    env_path = candidate
                    break
                cur = cur.parent
        load_dotenv(env_path, override=False)
    except ImportError:
        pass  # python-dotenv not installed, silently skip

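🔧 Note: this upward search is practical. Put .env in the project root and corecoder finds it no matter which subdirectory you start from. override=False ensures that environment variables already set are not overwritten by .env, which matters in CI/CD, where secrets are usually injected through environment variables and should not be replaced by a file.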
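Both ideas can be tried without python-dotenv installed. The sketch below re-implements the upward walk as a standalone function and imitates override=False with dict.setdefault. The names find_dotenv, apply_dotenv and demo are hypothetical, and the stop directory stands in for Path.home().

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_dotenv(start: Path, stop: Path) -> Optional[Path]:
    """Return the nearest .env at or above `start`; `stop` itself is not searched."""
    cur, stop = start.resolve(), stop.resolve()
    while cur != stop and cur != cur.parent:
        candidate = cur / ".env"
        if candidate.exists():
            return candidate
        cur = cur.parent
    return None


def apply_dotenv(values: dict, env: dict) -> None:
    """Imitate override=False: only fill in variables that are not already set."""
    for key, value in values.items():
        env.setdefault(key, value)


def demo() -> bool:
    """Create project/src/pkg in a temp dir and search upward from the leaf."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        leaf = root / "project" / "src" / "pkg"
        leaf.mkdir(parents=True)
        (root / "project" / ".env").write_text("CORECODER_MODEL=gpt-4o\n")
        return find_dotenv(leaf, stop=root) == root / "project" / ".env"
```

With env = {"OPENAI_API_KEY": "from-ci"}, calling apply_dotenv({"OPENAI_API_KEY": "from-file"}, env) leaves the CI value in place, which is the property the note above relies on.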

Step 4: Creating the LLM and the Agent

Back in main() in cli.py: once configuration is loaded, the next step is to create the core components:

# corecoder/cli.py:69-76
llm = LLM(
    model=config.model,
    api_key=config.api_key,
    base_url=config.base_url,
    temperature=config.temperature,
    max_tokens=config.max_tokens,
)
agent = Agent(llm=llm, max_context_tokens=config.max_context_tokens)

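These two statements are the heart of the startup sequence. Creating the LLM object initializes an OpenAI client internally. Creating the Agent object runs a constructor that does four things: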

# corecoder/agent.py:22-34
def __init__(self, llm, tools=None, max_context_tokens=128_000, max_rounds=50):
    self.llm = llm
    self.tools = tools if tools is not None else ALL_TOOLS        # ① register the 7 tools
    self.messages = []                                            # ② start with an empty message list
    self.context = ContextManager(max_tokens=max_context_tokens)  # ③ create the context manager
    self.max_rounds = max_rounds
    self._system = system_prompt(self.tools)                      # ④ build the system prompt

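💡 In short, startup does four things: (1) finds your API key, (2) sets up the client for the LLM service, (3) prepares the toolbox (7 tools), and (4) generates a "work manual" (the system prompt) that tells the AI who it is, where it is, and which tools it has.

Step 5: Entering the REPL

Finally, control passes to _repl(), which prints a welcome panel and enters a while True loop to wait for user input: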

# corecoder/cli.py:116-126
def _repl(agent, config):
    console.print(
        Panel(
            f"[bold]CoreCoder[/bold] v{__version__}\n"
            f"Model: [cyan]{config.model}[/cyan]"
            + (f"  Base: [dim]{config.base_url}[/dim]" if config.base_url else "")
            + "\nType [bold]/help[/bold] for commands, [bold]Ctrl+C[/bold] to cancel, [bold]quit[/bold] to exit.",
            border_style="blue",
        )
    )
    # ... while True loop waiting for input

The main act: one instruction, three rounds

Now you type this into the terminal:

read main.py and fix the broken import

The message enters Agent.chat(), and a three-round "think → act → observe" loop begins.

User: "read main.py and fix the broken import"
    │
    ▼
Agent.chat(user_input)
    ① append the user message to the messages list
    ② call context.maybe_compress() to check whether the context needs compressing
    │
    ▼
┌─────────── LLM call, round 1 ───────────
│  LLM sees:    user message + system prompt + tool list
│  LLM thinks:  I need to read the file before I can fix it
│  LLM returns: tool_calls = [read_file(file_path='main.py')]
└─────────────────────────────────────────
    │
    ▼
Agent._exec_tool() ── looks up read_file ── runs it ── returns the file contents
    │
    ▼
┌─────────── LLM call, round 2 ───────────
│  LLM sees:    user message + tool-call record + file contents
│  LLM thinks:  I see `from utils import halper`; it should be helper
│  LLM returns: tool_calls = [edit_file(
│                   file_path='main.py',
│                   old_string='from utils import halper',
│                   new_string='from utils import helper'
│               )]
└─────────────────────────────────────────
    │
    ▼
Agent._exec_tool() ── looks up edit_file ── search-and-replace ── returns a diff
    │
    ▼
┌─────────── LLM call, round 3 ───────────
│  LLM sees:    user message + read result + edit diff
│  LLM thinks:  the edit succeeded, I can tell the user
│  LLM returns: plain text "Fixed: halper → helper."
└─────────────────────────────────────────
    │
    ▼
No tool_calls → Agent.chat() returns the text to the user

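Round 1: read before acting

The Agent appends the user message to the conversation history, then calls the LLM for the first time.

At this point the LLM can see three things: the system prompt (which tells it "you are CoreCoder, the working directory is /xxx, and you have 7 tools"), the user's message, and the JSON Schema descriptions of the 7 tools.

The LLM's reasoning goes roughly like this: the user wants me to read a file and fix an import, so I need to look at the file first. It returns: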

tool_calls = [read_file(file_path='main.py')]

The Agent takes this tool call, finds the read_file tool, runs it, and appends the file contents to the conversation history. Round 1 is over.
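For readers who want to see what "appended to the conversation history" looks like on the wire, here is a sketch of the history at the end of round 1 in the OpenAI chat-completions message format. This assumes CoreCoder stores messages in that standard shape; the ID call_1, the file contents, and the helper unanswered_tool_calls are made up for illustration.

```python
import json

history = [
    {"role": "user", "content": "read main.py and fix the broken import"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",  # made-up ID; real ones are generated by the API
                "type": "function",
                "function": {
                    "name": "read_file",
                    # arguments travel as a JSON-encoded string, not a dict
                    "arguments": json.dumps({"file_path": "main.py"}),
                },
            }
        ],
    },
    # the tool result points back at the call it answers
    {"role": "tool", "tool_call_id": "call_1", "content": "from utils import halper\n"},
]


def unanswered_tool_calls(messages):
    """Return IDs of tool calls that have no matching tool result yet."""
    pending = []
    for msg in messages:
        if msg["role"] == "assistant":
            pending += [tc["id"] for tc in msg.get("tool_calls") or []]
        elif msg["role"] == "tool" and msg["tool_call_id"] in pending:
            pending.remove(msg["tool_call_id"])
    return pending
```

The tool_call_id link is what lets the model match each result to the request that produced it, which becomes important once several tools run in the same round.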

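Notice a detail? The LLM did not start by changing code. Its first step was to observe, just as a human programmer would: you don't edit code you haven't read.

Round 2: find the bug, make a precise edit

Now the LLM sees the full conversation history: the user message plus the result of the read in round 1.

It scans the file and finds the problem: from utils import halper. halper should be helper, a classic typo.

So it returns a second tool call: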

tool_calls = [edit_file(
    file_path='main.py',
    old_string='from utils import halper',
    new_string='from utils import helper'
)]

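The Agent performs the search-and-replace edit: it finds from utils import halper in main.py, replaces it with from utils import helper, and returns the resulting diff. Round 2 is over.

There is a key design decision here: the edit tool uses search-and-replace rather than line numbers. Why? Line numbers are fragile: they stop matching as soon as the file shifts a little. Search-and-replace matches on content and applies the change only when old_string occurs exactly once in the file, which is far more robust.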
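The uniqueness rule is easy to sketch. The function below is not CoreCoder's edit tool, only a minimal illustration of the same idea under the assumption that zero matches and multiple matches are both rejected; search_replace is a hypothetical name, and the diff is produced with the standard library's difflib.

```python
import difflib


def search_replace(text: str, old_string: str, new_string: str) -> tuple[str, str]:
    """Replace exactly one occurrence of old_string; return (new_text, unified diff)."""
    count = text.count(old_string)
    if count == 0:
        raise ValueError("old_string not found")
    if count > 1:
        # Ambiguous match: refuse instead of guessing which occurrence was meant.
        raise ValueError(f"old_string matches {count} places; include more context")
    new_text = text.replace(old_string, new_string, 1)
    diff = "".join(
        difflib.unified_diff(
            text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile="before",
            tofile="after",
        )
    )
    return new_text, diff


if __name__ == "__main__":
    source = "import os\nfrom utils import halper\n\nhalper.run()\n"
    _, diff = search_replace(source, "from utils import halper", "from utils import helper")
    print(diff)
```

Note that passing just "halper" as old_string would be rejected in this example because it appears twice. Forcing the caller to quote enough surrounding context is what makes the edit unambiguous.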

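Round 3: confirming completion

The LLM now sees the user message, the file contents, and the edit diff. It judges the fix successful and returns plain text: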

Fixed: halper → helper.

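This time the LLM returned no tool_calls.

With no tool_calls, the Agent loop ends. This termination condition is elegant: rather than hard-coding "at most 3 tool calls", it lets the LLM decide when to stop. Once it considers the task done, it stops calling tools and replies in text. The Agent detects that signal, exits the loop, and hands the result to the user.

From pressing Enter to seeing the fix may take only a few seconds, depending on model latency. Internally, though, that covers three LLM calls, two tool executions, and one complete read → analyze → fix → confirm flow.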

The real logic at the code level

That was the high-level view. Here is the actual code of the chat() method in agent.py:

# corecoder/agent.py:47-91
def chat(self, user_input: str, on_token=None, on_tool=None) -> str:
    """Process one user message. May involve multiple LLM/tool rounds."""
    self.messages.append({"role": "user", "content": user_input})
    self.context.maybe_compress(self.messages, self.llm)

    for _ in range(self.max_rounds):          # at most 50 rounds
        resp = self.llm.chat(                 # call the LLM
            messages=self._full_messages(),   # [system prompt] + [history]
            tools=self._tool_schemas(),       # JSON Schemas of the 7 tools
            on_token=on_token,                # streaming output callback
        )

        # no tool_calls → the LLM is done, return its text
        if not resp.tool_calls:
            self.messages.append(resp.message)
            return resp.content

        # tool_calls present → run the tools
        self.messages.append(resp.message)

        if len(resp.tool_calls) == 1:
            # single tool: run it directly
            tc = resp.tool_calls[0]
            if on_tool:
                on_tool(tc.name, tc.arguments)
            result = self._exec_tool(tc)
            self.messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": result,
            })
        else:
            # several tools: run them in parallel
            results = self._exec_tools_parallel(resp.tool_calls, on_tool)
            for tc, result in zip(resp.tool_calls, results):
                self.messages.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": result,
                })

        # after running tools, check whether the context needs compressing
        self.context.maybe_compress(self.messages, self.llm)

    return "(reached maximum tool-call rounds)"

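💡 Key insight: an Agent does not finish in a single call. Like a human programmer, it reads the code, changes the code, then confirms the result. Each round is one "think → act → observe" cycle. In the research literature this is the ReAct pattern (Reasoning + Acting), currently one of the most widely used architectural paradigms for AI agents.

The loop has a natural termination condition: when the LLM stops returning tool_calls, it considers the task complete and replies to the user in text. The max_rounds cap is only a safety net for runs that never reach that point.

🔧 Going deeper: chat() loops with for _ in range(self.max_rounds) (50 rounds at most). A round that uses tools consists of one LLM call plus one or more tool executions; the final round is an LLM call alone. self._full_messages() reassembles the complete message list (system prompt + all history) on every iteration, so the LLM always has the full context.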
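The loop can be exercised without any API key by swapping the model for a scripted stand-in. The sketch below is a stripped-down imitation of chat(), not the project's code: ToolCall, Reply, run_loop, scripted_llm and the in-memory files dictionary are all hypothetical, and context compression and parallel execution are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict


@dataclass
class Reply:
    content: str = ""
    tool_calls: list = field(default_factory=list)


def run_loop(llm_step, tools, user_input, max_rounds=50):
    """Call llm_step until it replies without tool calls; return (text, rounds, messages)."""
    messages = [{"role": "user", "content": user_input}]
    for round_no in range(1, max_rounds + 1):
        reply = llm_step(messages)
        messages.append({"role": "assistant", "content": reply.content})
        if not reply.tool_calls:  # termination signal: plain text only
            return reply.content, round_no, messages
        for tc in reply.tool_calls:
            result = tools[tc.name](**tc.arguments)
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
    return "(reached maximum tool-call rounds)", max_rounds, messages


files = {"main.py": "from utils import halper\n"}  # in-memory stand-in for the disk


def read_file(file_path):
    return files[file_path]


def edit_file(file_path, old_string, new_string):
    files[file_path] = files[file_path].replace(old_string, new_string, 1)
    return f"edited {file_path}"


TOOLS = {"read_file": read_file, "edit_file": edit_file}


def scripted_llm(messages):
    """Fake model: picks its next move from how many tool results it has seen."""
    seen = sum(1 for m in messages if m["role"] == "tool")
    if seen == 0:
        return Reply(tool_calls=[ToolCall("c1", "read_file", {"file_path": "main.py"})])
    if seen == 1:
        return Reply(tool_calls=[ToolCall("c2", "edit_file", {
            "file_path": "main.py",
            "old_string": "from utils import halper",
            "new_string": "from utils import helper",
        })])
    return Reply(content="Fixed: halper → helper.")
```

Running run_loop(scripted_llm, TOOLS, "read main.py and fix the broken import") walks through the same three rounds described above and leaves six messages in the history: one user message, three assistant messages, and two tool results.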

Parallel tool execution

When the LLM returns several tool calls at once (say, reading 3 files together), CoreCoder runs them in parallel:

# corecoder/agent.py:105-118
def _exec_tools_parallel(self, tool_calls, on_tool=None) -> list[str]:
    """Run multiple tool calls concurrently using threads.

    This is inspired by Claude Code's StreamingToolExecutor which starts
    executing tools while the model is still generating. We simplify to:
    when the model returns N tool calls at once, run them in parallel.
    """
    for tc in tool_calls:
        if on_tool:
            on_tool(tc.name, tc.arguments)

    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(self._exec_tool, tc) for tc in tool_calls]
        return [f.result() for f in futures]

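🔧 This uses ThreadPoolExecutor rather than asyncio because the operations inside the tools (file I/O, subprocess calls) are all blocking. The pool's maximum concurrency is 8, a rule-of-thumb value: in most cases the LLM will not request more than 8 tools at once, and any extra calls simply wait for a free worker. Results are collected in submission order, so they line up with resp.tool_calls even when the tools finish in a different order. Claude Code's StreamingToolExecutor is more aggressive: it starts executing tools while the LLM is still streaming its response (streaming tool execution), at the cost of much higher implementation complexity.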
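The ordering property is worth seeing in isolation. In this sketch three fake blocking "tools" sleep for different durations; run_parallel and slow_read are hypothetical names, and time.sleep stands in for file or subprocess I/O.

```python
import concurrent.futures
import time


def run_parallel(funcs):
    """Run zero-argument callables in threads; results keep submission order."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(fn) for fn in funcs]
        # Iterating over `futures` (not as_completed) preserves the input order.
        return [f.result() for f in futures]


def slow_read(name, delay):
    """Build a fake blocking tool call that sleeps, then returns a label."""
    def call():
        time.sleep(delay)
        return f"contents of {name}"
    return call


if __name__ == "__main__":
    start = time.perf_counter()
    results = run_parallel([slow_read("a.py", 0.3), slow_read("b.py", 0.1), slow_read("c.py", 0.2)])
    print(results, f"{time.perf_counter() - start:.2f}s")
```

Although b.py finishes first, the results come back in the order a.py, b.py, c.py, and the wall time should be close to the slowest call rather than the sum of all three.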

System architecture overview

Putting everything above together, CoreCoder's full architecture looks like this:

CLI layer (cli.py, 280 lines)
│   REPL loop · command handling (/help, /save, /compact...)
│   Argument parsing · output rendering (Rich Markdown) · input handling
│
│   Two modes:
│   · _repl()     → interactive mode, while True loop
│   · _run_once() → one-shot mode, runs one instruction and exits
│
│ user_input
▼
Agent layer (agent.py, 122 lines)
│   Message orchestration · tool dispatch · loop control · parallel execution
│
│   chat() core loop:
│   for _ in range(50):
│       resp = LLM.chat(messages + tools)
│       if no tool_calls → return text
│       else → exec tools (parallel if >1) → append
│
├── Prompt (prompt.py, 33 lines)
│       Builds the system prompt dynamically at runtime
│
├── Context Manager (context.py, 196 lines)
│       L1: Snip (50% threshold)      → truncates overly long tool output
│       L2: Summarize (70% threshold) → LLM summarizes older conversation
│       L3: Collapse (90% threshold)  → emergency compression, keeps only summary + recent
│
├── LLM (llm.py, 199 lines)
│       Streaming responses · exponential-backoff retries
│       Cost estimation · tool-call parsing
│
├── Session (session.py, 68 lines)
│       save_session()  → JSON → ~/.corecoder/
│       load_session()  → restores message history + model
│       list_sessions() → lists the 20 most recent sessions
│
▼
Tools layer (7 tools, 478 lines total)
    bash  (115 lines)  shell execution · dangerous-command blocking · cd tracking
    edit  (89 lines)   search-and-replace · uniqueness check · diff generation
    read  (53 lines)   file reading · line-number display · paginated truncation
    write (38 lines)   file creation · automatic directory creation
    glob  (47 lines)   filename search · recursive matching
    grep  (78 lines)   content search · regex matching
    agent (58 lines)   sub-agent spawning · isolated context · no recursive agents

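Responsibilities of each layer

The CLI layer (cli.py) is the bridge between the user and the system. It has two jobs: pass the user's input to the Agent, and present the Agent's output attractively. It also handles all slash commands (/help, /save, /compact, and so on), which do not go through the LLM and run locally instead.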
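The routing rule fits in a few lines. This is a guess at the shape of the logic rather than a copy of cli.py: route, COMMANDS and echo_agent are hypothetical names, and the only assumption taken from the article is that input starting with a slash never reaches the LLM.

```python
def route(user_input, agent_chat, commands):
    """Return (handled_by, output): slash commands stay local, the rest goes to the agent."""
    text = user_input.strip()
    if text.startswith("/"):
        name = text.split()[0]
        handler = commands.get(name)
        if handler is None:
            return "cli", f"Unknown command: {name}"
        return "cli", handler()
    return "agent", agent_chat(text)


# Hypothetical command table with a single entry.
COMMANDS = {"/help": lambda: "Available commands: /help"}


def echo_agent(text):
    """Stand-in for Agent.chat(); a real call would hit the LLM."""
    return f"agent saw: {text}"


if __name__ == "__main__":
    print(route("/help", echo_agent, COMMANDS))
    print(route("read main.py and fix the broken import", echo_agent, COMMANDS))
```

Keeping commands out of the model's path means they cost no tokens and respond instantly, which is presumably why the CLI layer owns them.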

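💡 If you are a product manager: the CLI layer is the easiest one to grasp. Think of it as the "front desk": every request arrives there first, and the front desk decides whether to handle it itself (slash commands) or pass it to "engineering" (the Agent).

The Agent layer (agent.py) is the system's "brain". It maintains the conversation history (self.messages) and decides when to call the LLM, when to run tools, and when to end the loop. The Agent does not care how a tool is implemented. Its job is dispatch: take the LLM's tool call, find the matching tool, run it, and feed the result back to the LLM.

The tools layer is the lowest-level unit of execution. Each tool is independent and cares about exactly one thing (reading a file, writing a file, running a command, and so on). All tools inherit from the same Tool base class (tools/base.py, 27 lines) and implement a uniform interface: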

# corecoder/tools/base.py
class Tool(ABC):
    """Minimal tool interface. Subclass this to add new capabilities."""

    name: str
    description: str
    parameters: dict  # JSON Schema for the function args

    @abstractmethod
    def execute(self, **kwargs) -> str:
        """Run the tool and return a text result."""

    def schema(self) -> dict:
        """OpenAI function-calling schema."""
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }

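🔧 This design follows the Strategy pattern. The Agent neither knows nor cares which tools exist: it looks a tool up with get_tool(name) and runs it with tool.execute(**kwargs). To add a tool (HTTP requests, database queries, and so on), subclass Tool, implement execute, and register it in tools/__init__.py. That is why the README says adding a custom tool takes only about 20 lines.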
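As a rough check on the "about 20 lines" claim, here is what a new tool might look like. The Tool base class is repeated so the snippet runs on its own; WordCountTool, the REGISTRY dictionary and this get_tool are hypothetical, since the article does not show how tools/__init__.py actually registers tools.

```python
from abc import ABC, abstractmethod


class Tool(ABC):
    """Same shape as the base class quoted above, repeated for self-containment."""

    name: str
    description: str
    parameters: dict

    @abstractmethod
    def execute(self, **kwargs) -> str:
        """Run the tool and return a text result."""

    def schema(self) -> dict:
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }


class WordCountTool(Tool):
    """Hypothetical example tool: counts whitespace-separated words."""

    name = "word_count"
    description = "Count the whitespace-separated words in a piece of text."
    parameters = {
        "type": "object",
        "properties": {"text": {"type": "string", "description": "Text to count"}},
        "required": ["text"],
    }

    def execute(self, text: str) -> str:
        return str(len(text.split()))  # tools always hand text back to the LLM


# Assumed registration mechanism: a name → instance lookup table.
REGISTRY = {tool.name: tool for tool in [WordCountTool()]}


def get_tool(name):
    return REGISTRY.get(name)
```

The subclass itself is 13 lines, most of it the JSON Schema that tells the model which arguments to send. Returning a string even for a number keeps every tool's output directly usable as a tool message.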

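Data flow summary

The data flow of the whole system boils down to one formula:

User input → [system prompt + message history + tool list] → LLM → {tool call? → execute → loop | text reply? → return}

On every round the message list (self.messages) grows: user messages, LLM replies, tool-call records, tool results, and so on. Together they form the LLM's "working memory". When that memory gets too long, ContextManager steps in to compress it, which is the subject of the next installment.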

Project repository: https://github.com/he-yufeng/CoreCoder

Follow along to catch the rest of the series. See you in the next installment.