Claude Code 源码深度拆解⑩ | 抄作业的正确姿势:从源码到Agent工程实践-夜雨聆风

Claude Code 源码深度拆解⑩ | 抄作业的正确姿势:从源码到Agent工程实践

追着fork永远追不上，理解原理才能跑出自己的路线。这51万行源码的真正价值，不是可以复制的代码，而是可以迁移的思想。

一、引言：十期拆解，我们学到了什么？

从第一期到第九期，我们完成了对Claude Code泄露源码的系统性拆解：

期数	主题	核心收获
①	六层架构全景	脚手架随模型能力提升而变薄
②	工具系统	Bash作为通用适配器，40个工具不如一个shell
③	TAOR循环	运行时越“笨”，架构越稳定
④	三层压缩	Context是需要主动管理的稀缺资源
⑤	多Agent编排	Coordinator-Worker+邮箱模式
⑥	安全架构	六级权限+22个验证器+AST解析
⑦	记忆系统	只记偏好不记代码，记忆是索引不是存储
⑧	Feature Flag	KAIROS、ULTRAPLAN揭示产品演进方向
⑨	工程细节	启动优化、遥测设计、print.ts的反面教材

现在，到了收官的时候。本期我们将回答一个根本问题：如何把这些知识转化为你自己的Agent工程能力？

这不是一篇“总结”，而是一篇“行动指南”。我们将从四条实践建议出发，给出可直接落地的工程方案，并探讨“抄”的边界——什么该学，什么不该学。

二、核心收获回顾：Harness设计哲学

2.1 什么是Harness？

在拆解过程中，我们反复提到一个概念：Harness。

Harness（马具/ harness）在软件工程中指的是包裹在大模型外层的运行时框架——它不替代模型的智能，而是为模型提供“身体”，让智能能够在现实世界中行动。

┌─────────────────────────────────────────────────────────────┐
│                    Harness 的核心组成                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────┐      ┌─────────┐      ┌─────────┐            │
│   │  工具   │ ───→ │  循环   │ ───→ │  记忆   │            │
│   │ 系统   │      │  引擎   │      │  系统   │            │
│   └─────────┘      └─────────┘      └─────────┘            │
│        │                │                │                  │
│        ▼                ▼                ▼                  │
│   “手脚”            “心脏”           “大脑皮层”              │
│   让模型行动        让模型持续       让模型记住               │
│                                                             │
│   ┌─────────┐      ┌─────────┐      ┌─────────┐            │
│   │  安全   │ ───→ │  编排   │ ───→ │  压缩   │            │
│   │  架构   │      │  系统   │      │  系统   │            │
│   └─────────┘      └─────────┘      └─────────┘            │
│        │                │                │                  │
│        ▼                ▼                ▼                  │
│   “免疫系统”        “管理团队”        “垃圾回收”             │
│   防止越界          多人协作          保持清爽               │
│                                                             │
└─────────────────────────────────────────────────────────────┘

2.2 核心设计哲学

经过十期拆解，Claude Code的Harness设计哲学可以浓缩为三句话：

哲学一：把智能下沉给模型，把确定性留给框架

TAOR循环的核心只有50行，因为所有决策都交给模型。框架不试图“聪明”，只提供可靠的执行环境。这恰恰是最聪明的设计——当模型能力升级时，框架几乎不需要修改。

哲学二：模型之外的工程化才是真正的难点

51万行代码中，与模型推理直接相关的不到5%。剩下的95%都是工程化：工具集成、权限管理、上下文压缩、多Agent编排、记忆系统、遥测监控。这些才是决定产品体验下限的因素。

哲学三：产品设计体现在每一个细节里

从工具描述的字斟句酌，到错误消息的“为什么+怎么办”，到硬编码字数限制的A/B测试——产品力不是宏大叙事，是无数细节的累积。

三、四条可落地的实践建议

3.1 建议一：工具描述就是产品力

Claude Code的做法：

每个工具的描述不是简单的函数签名，而是包含：

什么时候用（WHEN TO USE）
什么时候不用（WHEN NOT TO USE）
副作用说明（持久化shell会话）
输出格式说明
使用示例

你可以这样做：

// ❌ 糟糕的工具描述
const badToolDescription = "Execute a bash command.";

// ✅ 精心设计的工具描述
const goodToolDescription = `
## Bash

Execute a bash command in a persistent shell session.

### WHEN TO USE
- Running any CLI tool: git, npm, docker, python, etc.
- Inspecting the file system: ls, cat, find
- Installing dependencies or running scripts
- Any task that requires command execution

### WHEN NOT TO USE
- Simple file reading — use FileRead (faster and safer)
- Code searching — use Grep (optimized for search)

### IMPORTANT
- Commands run in a persistent shell. Environment variables and 
  working directory persist between calls.
- Use && to chain commands, ; for sequential execution.
- Long-running commands will be terminated after 30 seconds.

### OUTPUT
- Returns stdout, stderr, and exit code
- Stdout truncated at 10,000 characters
- Stderr is NOT automatically an error — check exit code
`;

// 在系统提示词中注入
const systemPrompt = `
You have access to the following tools:

${goodToolDescription}

When using tools, always follow the WHEN TO USE / WHEN NOT TO USE guidance.
`;

预期效果：同样的模型，工具描述写得好，成功率能提升一个档次。模型知道边界在哪里，就不会滥用或误用工具。

3.2 建议二：记忆架构比模型参数更影响体验

Claude Code的做法：

六层记忆架构，分层加载
只记偏好和决策，不记代码（能从代码推导的不记）
MEMORY.md作为索引，详情按需加载
Auto Dream后台清理矛盾和陈旧记忆

你可以这样做：

// 一个简化的三层记忆系统

interface MemorySystem {
  // Layer 1: 会话记忆（当前会话，最高优先级）
  sessionMemory: {
    recentFiles: string[];
    currentTask: string;
    userFeedback: string[];
  };

  // Layer 2: 项目记忆（.agent/memory.md，项目级）
  projectMemory: {
    conventions: string[];      // "使用tabs缩进"
    decisions: string[];         // "认证用JWT不用session"
    gotchas: string[];          // "测试不能用mock数据库"
  };

  // Layer 3: 用户记忆（~/.agent/memory.md，全局）
  userMemory: {
    preferences: string[];      // "偏好函数式风格"
    aliases: Record<string, string>;
  };
}

// 加载顺序：用户 → 项目 → 会话（后加载的覆盖先加载的）
async function buildContext(memory: MemorySystem): Promise<string> {
  const parts: string[] = [];

  // 用户全局偏好
  if (memory.userMemory.preferences.length > 0) {
    parts.push("## Your Preferences\n" + memory.userMemory.preferences.join('\n'));
  }

  // 项目约定（优先级更高，覆盖用户偏好）
  if (memory.projectMemory.conventions.length > 0) {
    parts.push("## Project Conventions\n" + memory.projectMemory.conventions.join('\n'));
  }

  // 当前会话上下文（最高优先级）
  parts.push("## Current Session\n" + 
    `Working on: ${memory.sessionMemory.currentTask}\n` +
    `Recent files: ${memory.sessionMemory.recentFiles.join(', ')}`
  );

  return parts.join('\n\n---\n\n');
}

// 记忆提取（只在满足条件时触发）
async function maybeExtractMemory(
  conversation: Message[],
  memory: MemorySystem
): Promise<void> {
  // 触发条件：新增≥8条消息 且 距上次提取≥10分钟
  if (!shouldExtract(conversation)) return;

  const prompt = `
    Extract information worth remembering from this conversation.

    Only extract what CANNOT be derived from reading the code:
    - User preferences (e.g., "use tabs not spaces")
    - Project decisions (e.g., "we chose JWT over session")
    - Gotchas and lessons learned (e.g., "don't mock the database in tests")

    Do NOT extract anything that can be found in the codebase.
  `;

  const extracted = await callAI(prompt);

  // 写入记忆文件
  await appendMemory(memory.projectMemory, extracted);

  // 保持索引精简
  await compactMemoryIndex();
}

预期效果：Agent越用越“懂你”，不需要每次都重复交代偏好和约定。这是用户留存的核心机制。

3.3 建议三：上下文是需要主动管理的资源

Claude Code的做法：

三层压缩策略：MicroCompact → SessionMemory Compact → Full Compact
熔断器防止无限压缩循环
压缩后重新注入关键环境信息
子Agent上下文隔离（执行过程不入主上下文）

你可以这样做：

class ContextManager {
  private maxTokens: number;
  private warningThreshold: number;

  constructor(maxTokens: number = 100000) {
    this.maxTokens = maxTokens;
    this.warningThreshold = maxTokens * 0.8;
  }

  // 每轮对话前检查token预算
  async ensureBudget(messages: Message[]): Promise<Message[]> {
    const currentTokens = this.estimateTokens(messages);

    if (currentTokens < this.warningThreshold) {
      return messages;  // 空间充足，无需压缩
    }

    console.log(`Context budget: ${currentTokens}/${this.maxTokens}, compacting...`);

    // 第一层：轻量级清理（清空旧的工具结果）
    messages = this.microCompact(messages);
    if (this.estimateTokens(messages) < this.warningThreshold) {
      return messages;
    }

    // 第二层：使用缓存摘要
    const cached = await this.getCachedSummary();
    if (cached) {
      messages = this.buildWithSummary(messages, cached);
      return messages;
    }

    // 第三层：调用AI生成摘要
    const summary = await this.generateSummary(messages);
    messages = this.buildWithSummary(messages, summary);

    // 重新注入环境信息
    messages = await this.reinjectContext(messages);

    return messages;
  }

  private microCompact(messages: Message[]): Message[] {
    // 清空超过5分钟的旧工具结果
    const FIVE_MINUTES = 5 * 60 * 1000;

    return messages.map(msg => {
      if (msg.role === 'tool' && Date.now() - msg.timestamp > FIVE_MINUTES) {
        return { ...msg, content: '[Content cleared to save context]' };
      }
      return msg;
    });
  }

  private async reinjectContext(messages: Message[]): Promise<Message[]> {
    // 压缩后重新告诉模型项目环境
    const context = {
      cwd: process.cwd(),
      recentFiles: this.getRecentFiles(5),
      activeTask: this.getActiveTask(),
    };

    messages.unshift({
      role: 'system',
      content: `Current context: ${JSON.stringify(context)}`,
    });

    return messages;
  }

  // 熔断器：防止压缩无限重试
  private compactionFailures = 0;
  private readonly MAX_FAILURES = 3;

  private async generateSummary(messages: Message[]): Promise<string> {
    if (this.compactionFailures >= this.MAX_FAILURES) {
      throw new Error('Compaction circuit breaker open — too many failures');
    }

    try {
      const summary = await this.callAIForSummary(messages);
      this.compactionFailures = 0;
      return summary;
    } catch (error) {
      this.compactionFailures++;
      throw error;
    }
  }
}

预期效果：Agent可以处理任意长的会话，不会因token超限而崩溃或表现下降。

3.4 建议四：情绪感知是工程问题，不是AI问题

Claude Code的做法：

用正则匹配挫败感关键词（ffs, shitty, wtf…）
追踪“continue”输入频率检测Agent卡住
这些是领先指标，能在用户流失前预警

你可以这样做：

class UserExperienceMonitor {
  private frustrationPatterns = [
    /ffs/i, /shitty/i, /wtf/i, /fuck/i,
    /come\s+on/i, /useless/i, /broken/i,
  ];

  private continueStreak = 0;
  private readonly STUCK_THRESHOLD = 3;

  // 每个用户输入都经过这里
  processInput(input: string): void {
    // 1. 检测挫败感
    const frustrationMatches = this.frustrationPatterns.filter(p => p.test(input));
    if (frustrationMatches.length > 0) {
      this.recordFrustration(frustrationMatches.map(p => p.source));
    }

    // 2. 检测Agent卡住
    if (input.trim().toLowerCase() === 'continue') {
      this.continueStreak++;
      if (this.continueStreak >= this.STUCK_THRESHOLD) {
        this.recordAgentStuck();
      }
    } else {
      this.continueStreak = 0;
    }

    // 3. 检测重复失败
    // (需要结合工具调用结果)
  }

  private recordFrustration(patterns: string[]): void {
    // 发送到遥测系统
    telemetry.record('user.frustration', {
      patterns,  // 只传模式名，不传原始文本（隐私保护）
      session_id: this.sessionId,
      timestamp: Date.now(),
    });

    // 本地也可以触发一些安抚行为
    // 比如在下一次回复中加入 "I notice you might be frustrated..."
  }

  private recordAgentStuck(): void {
    telemetry.record('agent.stuck', {
      continue_streak: this.continueStreak,
      session_id: this.sessionId,
      last_action: this.getLastAgentAction(),
    });
  }
}

// 在系统提示词中加入情绪感知
const systemPromptWithEmpathy = `
...

If you detect the user is frustrated (based on their language), 
acknowledge it and offer help. For example:
"I can see this is frustrating. Let me try a different approach..."

If you find yourself stuck in a loop, be honest about it:
"I seem to be stuck. Could you clarify what you'd like me to do next?"
`;

预期效果：在用户真正流失之前发现问题。当挫败感指标飙升时，产品团队可以立即调查最近的更新是否引入了bug。

四、“抄”的边界：什么该学，什么不该学

4.1 应该学的：架构思想

该学的	为什么	如何学
Harness设计哲学	模型之外的工程化才是难点	理解六层架构的职责划分
工具系统的模块化	自包含、无共享状态	每个工具独立Schema+权限+执行
TAOR循环的简洁性	运行时越笨越稳定	把决策权完全交给模型
上下文的分层压缩	能省则省，迫不得已才调用AI	Micro→SessionMemory→Full
权限的精细分级	信任是可组合的	五档信任光谱
记忆的索引化	记忆是索引不是存储	MEMORY.md只存指针

4.2 不应该学的：具体实现

不该学的	为什么	替代方案
直接复制源码	版权问题，且代码可能已过时	理解原理后自己实现
print.ts的写法	3000行单函数，反面教材	设计良好的抽象分层
旧解析器的遗留	安全漏洞的根源	统一使用AST解析
特定的API调用方式	Anthropic特有，不通用	设计自己的适配层

4.3 需要警惕的：潜在风险

风险点	说明	应对
Undercover Mode	隐藏AI身份有伦理争议	在产品中明确标注AI生成
过度依赖正则做安全	正则容易被绕过	使用AST解析
技术债务拖延	print.ts的TODO拖了4个月	定期偿还技术债务
内存安全	51万行TypeScript可能有内存泄漏	做好监控和压测

五、对中国AI编程工具赛道的判断

5.1 差距在哪里？

这次泄露事件让国内开发者能够一窥世界顶级AI编程工具的内部设计。客观地说，差距不在“能不能用”，而在“工程精细度”。

维度	Claude Code	国内同类产品	差距分析
工具系统	40+工具，Bash作为通用适配器	通常10-20个专项工具	抽象层次的设计哲学差异
上下文管理	三层压缩+熔断器	通常简单截断	对边界情况的处理深度
多Agent	Coordinator-Worker+邮箱	多数尚未支持	架构复杂度
安全	22个验证器+AST解析	通常基于正则	安全投入的差距
记忆系统	六层架构+Auto Dream	简单的历史记录	对“记忆”理解的深度
遥测	领先指标+情绪监控	通常只有基础DAU	数据驱动的精细化程度

5.2 这个差距是可以弥合的

好消息是：这些差距不是模型能力的差距，是工程能力的差距。

模型能力需要长期的研发积累，但工程能力可以通过学习快速提升。Claude Code的源码泄露，恰好提供了一份“参考答案”——不是让你抄，而是让你理解什么是一个生产级Agent系统应有的工程标准。

三条追赶路径：

理解Harness设计哲学

：把智能下沉给模型，把确定性留给框架
补齐工程短板

：上下文压缩、安全验证、多Agent编排
建立数据驱动的迭代闭环

：追踪领先指标，用数据指导优化

六、Agent工程化的未来趋势

基于Claude Code的Feature Flag和产品演进方向，我们可以预判Agent工程化的几个趋势：

6.1 趋势一：从“响应式”到“主动式”

KAIROS代表的方向：Agent不再等待用户指令，而是主动监控、主动提醒、主动建议。

对开发者的启示：在设计Agent系统时，考虑加入后台任务和事件驱动能力。

6.2 趋势二：从“单Agent”到“多Agent编排”

Coordinator-Worker架构已经在生产环境运行。未来的复杂任务必然由多个Agent协作完成。

对开发者的启示：提前考虑多Agent的通信协议、冲突解决、结果合并。

6.3 趋势三：从“无记忆”到“分层记忆”

Claude Code的六层记忆架构证明了“记忆”不是简单的历史记录，而是需要精细设计的分层系统。

对开发者的启示：设计记忆系统时，区分会话记忆、项目记忆、用户记忆，并考虑记忆的自动清理和去重。

6.4 趋势四：从“黑盒”到“可观测”

Claude Code的遥测系统追踪挫败感、卡住频率、工具失败率等领先指标。

对开发者的启示：在Agent系统中埋入足够的观测点，用数据驱动优化。

七、一个可运行的Agent骨架

最后，我们提供一个简化但完整可运行的Agent骨架，整合了本系列的核心设计思想：

// simple-agent.ts
// 一个融合了Claude Code设计哲学的简化Agent实现

import { readFile, writeFile } from 'fs/promises';
import { exec } from 'child_process';
import { promisify } from 'util';

const execAsync = promisify(exec);

// ===== 1. 工具系统：自包含的模块化设计 =====
interface Tool {
  name: string;
  description: string;  // 精心设计的描述
  inputSchema: Record<string, any>;
  execute: (input: any) => Promise<ToolResult>;
}

const tools: Tool[] = [
  {
    name: 'Bash',
    description: `
Execute a bash command. Use for any CLI operation.
WHEN TO USE: running git, npm, reading files with cat, etc.
WHEN NOT TO USE: when a specialized tool exists (FileRead, Grep).
`,
    inputSchema: { command: 'string' },
    execute: async ({ command }) => {
      // 安全检查（简化版）
      if (command.includes('rm -rf /')) {
        return { success: false, error: 'Dangerous command blocked' };
      }
      const { stdout, stderr } = await execAsync(command);
      return { success: true, output: stdout, error: stderr };
    },
  },
  {
    name: 'FileRead',
    description: 'Read a file. Use for inspecting file contents.',
    inputSchema: { path: 'string' },
    execute: async ({ path }) => {
      const content = await readFile(path, 'utf-8');
      return { success: true, output: content };
    },
  },
];

// ===== 2. 记忆系统：分层设计 =====
interface Memory {
  session: { recentFiles: string[]; currentTask: string };
  project: { conventions: string[] };
}

const memory: Memory = {
  session: { recentFiles: [], currentTask: '' },
  project: { conventions: [] },
};

async function loadMemory(): Promise<string> {
  // 加载项目记忆
  try {
    const memContent = await readFile('.agent/memory.md', 'utf-8');
    memory.project.conventions = memContent.split('\n').filter(l => l.startsWith('- '));
  } catch {
    // 文件不存在，忽略
  }

  // 构建上下文
  return `
## Project Conventions
${memory.project.conventions.join('\n')}

## Current Session
Working on: ${memory.session.currentTask}
Recent files: ${memory.session.recentFiles.join(', ')}
`;
}

// ===== 3. TAOR循环：运行时越笨越稳定 =====
async function taorLoop(
  userQuery: string,
  model: (msgs: Message[]) => Promise<ModelResponse>
): Promise<string> {

  const messages: Message[] = [];

  // 注入记忆
  const memoryContext = await loadMemory();
  messages.push({ role: 'system', content: memoryContext });

  // 注入工具描述
  const toolDescriptions = tools.map(t => `## ${t.name}\n${t.description}`).join('\n\n');
  messages.push({ role: 'system', content: `Available tools:\n${toolDescriptions}` });

  // 用户输入
  messages.push({ role: 'user', content: userQuery });

  let iterations = 0;
  const MAX_ITERATIONS = 20;

  while (iterations < MAX_ITERATIONS) {
    // Think
    const response = await model(messages);

    // 检查是否需要调用工具
    if (response.toolCalls && response.toolCalls.length > 0) {
      // 记录工具调用
      messages.push({ role: 'assistant', tool_calls: response.toolCalls });

      // Act & Observe
      for (const call of response.toolCalls) {
        const tool = tools.find(t => t.name === call.name);
        if (!tool) continue;

        const result = await tool.execute(call.input);

        messages.push({
          role: 'tool',
          tool_call_id: call.id,
          content: JSON.stringify(result),
        });

        // 更新记忆（如果是文件读取）
        if (call.name === 'FileRead') {
          memory.session.recentFiles.push(call.input.path);
        }
      }

      iterations++;
      continue;
    }

    // 没有工具调用，返回最终答案
    messages.push({ role: 'assistant', content: response.content });
    return response.content;
  }

  throw new Error('Max iterations exceeded');
}

// ===== 4. 上下文管理：主动检查预算 =====
class ContextBudgetManager {
  private maxTokens: number;

  constructor(maxTokens: number = 100000) {
    this.maxTokens = maxTokens;
  }

  estimateTokens(messages: Message[]): number {
    // 简化估算：每个字符约0.25 token
    return messages.reduce((sum, m) => 
      sum + JSON.stringify(m).length * 0.25, 0
    );
  }

  async ensureBudget(messages: Message[]): Promise<Message[]> {
    const tokens = this.estimateTokens(messages);

    if (tokens < this.maxTokens * 0.8) {
      return messages;  // 空间充足
    }

    console.log(`Context budget warning: ${tokens}/${this.maxTokens}, compacting...`);

    // 轻量级清理：清空旧的工具结果
    return messages.map(m => {
      if (m.role === 'tool' && m.content.length > 1000) {
        return { ...m, content: '[Content cleared]' };
      }
      return m;
    });
  }
}

// ===== 5. 情绪监控：追踪用户体验 =====
class UXMonitor {
  private frustrationPatterns = [/ffs/i, /wtf/i, /useless/i];
  private continueStreak = 0;

  processInput(input: string): void {
    // 挫败感检测
    const matches = this.frustrationPatterns.filter(p => p.test(input));
    if (matches.length > 0) {
      console.warn('[Monitor] User frustration detected');
      // 发送到遥测...
    }

    // 卡住检测
    if (input.trim().toLowerCase() === 'continue') {
      this.continueStreak++;
      if (this.continueStreak >= 3) {
        console.warn('[Monitor] Agent may be stuck');
      }
    } else {
      this.continueStreak = 0;
    }
  }
}

// ===== 使用示例 =====
async function main() {
  const monitor = new UXMonitor();

  const result = await taorLoop(
    '帮我找出项目中使用lodash的地方',
    async (msgs) => {
      // 这里接入实际的LLM API
      return { content: '...', toolCalls: [] };
    }
  );

  console.log(result);
}

这个骨架虽然简化，但包含了本系列的核心设计思想：模块化工具、分层记忆、TAOR循环、上下文预算管理、用户体验监控。你可以基于此扩展，构建自己的生产级Agent系统。

八、结语：从“看热闹”到“看门道”

十期拆解，51万行源码，我们从“看热闹”走到了“看门道”。

最重要的收获是什么？

不是某个具体的技术点，而是一种认知：Agent系统的真正难点，不在模型，在工程。

模型能力每半年翻一番，但工程能力的积累是以年为单位的。Claude Code的泄露，让我们看到了一个世界级团队如何解决Agent工程化的问题——哪些设计是深思熟虑的，哪些地方是技术债务，哪些功能还在路上。

对于正在构建Agent系统的你，这份源码不是“答案”，而是“参考答案”。理解它，消化它，然后走出你自己的路线。

记住：

追着fork永远追不上
理解原理才能跑出自己的路线
差距是可以弥合的

系列完

全系列回顾：

致谢：感谢所有参与讨论、提出问题的读者。Agent工程化是一个快速演进的领域，本系列只是一个起点。期待看到你们基于这些思想构建的下一代Agent系统。