OpenClaw Tool Teardown: web_fetch + image_generate

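I. The web_fetch Tool

1.1 Tool Overview

Purpose: fetch web page content (HTML → Markdown/Text)
Core features:

• Two extraction modes (markdown/text)
• Firecrawl integration
• Caching with a configurable TTL
• Redirect following (at most 3 by default)
• Custom User-Agent

1.2 Schema Definition

Location: line 113617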

const WebFetchSchema = Type.Object({
    url: Type.String({ description: "HTTP or HTTPS URL to fetch." }),
    extractMode: Type.Optional(stringEnum(EXTRACT_MODES, {
        description: 'Extraction mode ("markdown" or "text").',
        default: "markdown"
    })),
    maxChars: Type.Optional(Type.Number({
        description: "Maximum characters to return (truncates when exceeded).",
        minimum: 100
    }))
});
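
The schema marks `extractMode` and `maxChars` as optional, with a default of "markdown" and a minimum of 100. As a rough illustration, the following sketch normalizes raw arguments along those lines in plain TypeScript, without TypeBox. The helper `normalizeWebFetchArgs` and its error message are hypothetical and not part of OpenClaw, and clamping `maxChars` up to 100 (rather than rejecting smaller values) is an assumption of this sketch.

```typescript
// Hypothetical helper: mirrors the constraints stated in WebFetchSchema,
// but is not taken from the OpenClaw source.
type ExtractMode = "markdown" | "text";

interface WebFetchArgs {
  url: string;
  extractMode: ExtractMode;
  maxChars?: number;
}

function normalizeWebFetchArgs(raw: Record<string, unknown>): WebFetchArgs {
  const url = typeof raw.url === "string" ? raw.url.trim() : "";
  // The schema describes the URL as "HTTP or HTTPS".
  if (!/^https?:\/\//i.test(url)) {
    throw new Error("url must be an HTTP or HTTPS URL");
  }
  // Anything other than "text" falls back to the schema default "markdown".
  const extractMode: ExtractMode = raw.extractMode === "text" ? "text" : "markdown";
  const args: WebFetchArgs = { url, extractMode };
  if (typeof raw.maxChars === "number" && Number.isFinite(raw.maxChars)) {
    // Assumption: values below the schema minimum (100) are raised to it.
    args.maxChars = Math.max(100, Math.floor(raw.maxChars));
  }
  return args;
}
```

Falling back to "markdown" for any value other than "text" matches how the tool's `execute` function reads `extractMode`.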

1.3 Full Execution Code

Location: line 113588

function createWebFetchTool(options) {
    // 1. Resolve the fetch configuration
    const fetch = resolveFetchConfig(options?.config);

    // 2. Check whether the tool is enabled
    if (!resolveFetchEnabled({ fetch, sandboxed: options?.sandboxed })) {
        return null;  // tool unavailable
    }

    // 3. Resolve Readability support
    const readabilityEnabled = resolveFetchReadabilityEnabled(fetch);

    // 4. Resolve the Firecrawl configuration
    const firecrawl = resolveFirecrawlConfig(fetch);
    const runtimeFirecrawlActive = options?.runtimeFirecrawl?.active;
    // The API key is only resolved when Firecrawl is not switched off
    const firecrawlApiKey = (runtimeFirecrawlActive === void 0 ? firecrawl?.enabled !== false : runtimeFirecrawlActive) ?
        resolveFirecrawlApiKey(firecrawl) : void 0;
    // A runtime flag, when present, overrides the config-derived value
    const firecrawlEnabled = runtimeFirecrawlActive ?? resolveFirecrawlEnabled({ firecrawl, apiKey: firecrawlApiKey });
    const firecrawlBaseUrl = resolveFirecrawlBaseUrl(firecrawl);
    const firecrawlOnlyMainContent = resolveFirecrawlOnlyMainContent(firecrawl);
    const firecrawlMaxAgeMs = resolveFirecrawlMaxAgeMsOrDefault(firecrawl);
    const firecrawlTimeoutSeconds = resolveTimeoutSeconds(firecrawl?.timeoutSeconds ?? fetch?.timeoutSeconds, 30);

    // 5. Resolve the remaining settings
    const userAgent = fetch && "userAgent" in fetch && typeof fetch.userAgent === "string" && fetch.userAgent || DEFAULT_FETCH_USER_AGENT;
    const maxResponseBytes = resolveFetchMaxResponseBytes(fetch);

    return {
        label: "Web Fetch",
        name: "web_fetch",
        description: "Fetch and extract readable content from a URL (HTML → markdown/text). Use for lightweight page access without browser automation.",
        parameters: WebFetchSchema,
        execute: async (_toolCallId, args) => {
            const params = args;

            // 6. Parse the call parameters
            const url = readStringParam$1(params, "url", { required: true });
            const extractMode = readStringParam$1(params, "extractMode") === "text" ? "text" : "markdown";
            const maxChars = readNumberParam(params, "maxChars", { integer: true });
            const maxCharsCap = resolveFetchMaxCharsCap(fetch);

            // 7. Run the fetch
            return jsonResult(await runWebFetch({
                url,
                extractMode,
                maxChars: resolveMaxChars(maxChars ?? fetch?.maxChars, DEFAULT_FETCH_MAX_CHARS, maxCharsCap),
                maxResponseBytes,
                maxRedirects: resolveMaxRedirects(fetch?.maxRedirects, DEFAULT_FETCH_MAX_REDIRECTS),
                timeoutSeconds: resolveTimeoutSeconds(fetch?.timeoutSeconds, 30),
                cacheTtlMs: resolveCacheTtlMs$1(fetch?.cacheTtlMinutes, 15),
                userAgent,
                readabilityEnabled,
                firecrawlEnabled,
                firecrawlApiKey,
                firecrawlBaseUrl,
                firecrawlOnlyMainContent,
                firecrawlMaxAgeMs,
                firecrawlProxy: "auto",
                firecrawlStoreInCache: true,
                firecrawlTimeoutSeconds
            }));
        }
    };
}
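
The call `resolveMaxChars(maxChars ?? fetch?.maxChars, DEFAULT_FETCH_MAX_CHARS, maxCharsCap)` takes a requested value, a fallback, and a cap, but the excerpt does not show its body. One plausible reading is a clamp, sketched below. `resolveMaxCharsSketch` is a hypothetical stand-in, the lower bound of 100 is borrowed from the schema, and the value 50000 is taken from the configuration table rather than from the bundle.

```typescript
// Assumption: 50000 matches the documented default for maxChars.
const DEFAULT_FETCH_MAX_CHARS = 50000;

// Hypothetical stand-in for resolveMaxChars(value, fallback, cap).
function resolveMaxCharsSketch(value: unknown, fallback: number, cap: number): number {
  // Use the fallback when no usable number was supplied.
  const parsed = typeof value === "number" && Number.isFinite(value) ? value : fallback;
  // Keep the result at or above the schema minimum of 100...
  const atLeastMinimum = Math.max(100, Math.floor(parsed));
  // ...and never above the configured cap.
  return Math.min(atLeastMinimum, cap);
}

console.log(resolveMaxCharsSketch(undefined, DEFAULT_FETCH_MAX_CHARS, 50000)); // 50000
console.log(resolveMaxCharsSketch(1000000, DEFAULT_FETCH_MAX_CHARS, 50000));   // 50000
console.log(resolveMaxCharsSketch(20, DEFAULT_FETCH_MAX_CHARS, 50000));        // 100
```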

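1.4 Configuration Parameters

Parameter	Default	Description
`extractMode`	`markdown`	Extraction mode (markdown/text)
`maxChars`	50000	Maximum number of characters returned
`maxResponseBytes`	2000000	Maximum response size in bytes (2 MB)
`maxRedirects`	3	Maximum number of redirects
`timeoutSeconds`	30	Timeout (seconds)
`cacheTtlMinutes`	15	Cache TTL (minutes)
`userAgent`	Chrome 122	User-Agent string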
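
To make the defaults concrete, here is a minimal sketch of merging a partial configuration with the numeric values from the table and converting the TTL from minutes to milliseconds, as the `cacheTtlMs` argument in the tool code suggests. The names `FetchConfig`, `FETCH_DEFAULTS`, and `resolveFetchSettings` are hypothetical. The User-Agent is left out because the table gives only "Chrome 122" and not the full string.

```typescript
interface FetchConfig {
  maxChars?: number;
  maxResponseBytes?: number;
  maxRedirects?: number;
  timeoutSeconds?: number;
  cacheTtlMinutes?: number;
}

// Defaults copied from the configuration table.
const FETCH_DEFAULTS: Required<FetchConfig> = {
  maxChars: 50000,
  maxResponseBytes: 2000000,
  maxRedirects: 3,
  timeoutSeconds: 30,
  cacheTtlMinutes: 15,
};

// Hypothetical helper: fill in missing fields and derive the TTL in ms.
function resolveFetchSettings(cfg: FetchConfig = {}) {
  const cacheTtlMinutes = cfg.cacheTtlMinutes ?? FETCH_DEFAULTS.cacheTtlMinutes;
  return {
    maxChars: cfg.maxChars ?? FETCH_DEFAULTS.maxChars,
    maxResponseBytes: cfg.maxResponseBytes ?? FETCH_DEFAULTS.maxResponseBytes,
    maxRedirects: cfg.maxRedirects ?? FETCH_DEFAULTS.maxRedirects,
    timeoutSeconds: cfg.timeoutSeconds ?? FETCH_DEFAULTS.timeoutSeconds,
    // Negative TTLs are treated as "no caching" in this sketch.
    cacheTtlMs: Math.max(0, cacheTtlMinutes) * 60 * 1000,
  };
}

console.log(resolveFetchSettings().cacheTtlMs);                   // 900000
console.log(resolveFetchSettings({ maxRedirects: 5 }).maxRedirects); // 5
```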

1.5 Firecrawl Integration

// Firecrawl configuration
const firecrawl = {
    enabled: true,
    apiKey: "${FIRECRAWL_API_KEY}",
    baseUrl: "https://api.firecrawl.dev",
    onlyMainContent: true,
    maxAgeMs: 172800000,  // 48 hours
    timeoutSeconds: 30
};

// Firecrawl advantages:
// - Better content extraction
// - Handles JavaScript-rendered pages
// - Supports screenshots
// - Supports structured data
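
The tool code decides in two steps whether Firecrawl is used: the API key is resolved only when Firecrawl is not switched off, and a runtime flag, when present, overrides the config-derived result. The sketch below reproduces that gating in isolation. The bodies of `resolveFirecrawlApiKey` and `resolveFirecrawlEnabled` are not shown in the excerpt, so reading the key straight from the config and requiring a non-empty key are assumptions, and `resolveFirecrawlState` is a hypothetical name.

```typescript
interface FirecrawlConfig {
  enabled?: boolean;
  apiKey?: string;
}

// Hypothetical condensation of the Firecrawl gating in createWebFetchTool.
function resolveFirecrawlState(firecrawl: FirecrawlConfig | undefined, runtimeActive?: boolean) {
  // Without a runtime flag, only an explicit `enabled: false` blocks key lookup.
  const shouldResolveKey = runtimeActive === undefined ? firecrawl?.enabled !== false : runtimeActive;
  // Assumption: the key is read from the config only; empty strings count as missing.
  const apiKey = shouldResolveKey ? firecrawl?.apiKey?.trim() || undefined : undefined;
  // Assumption: with no runtime flag, Firecrawl needs both "not disabled" and a key.
  const enabled = runtimeActive ?? (firecrawl?.enabled !== false && apiKey !== undefined);
  return { enabled, apiKey };
}

console.log(resolveFirecrawlState({ enabled: true, apiKey: "k" }));  // enabled: true, apiKey: "k"
console.log(resolveFirecrawlState({ apiKey: "k" }, false));          // enabled: false, apiKey: undefined
```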

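1.6 Execution Flow

web_fetch tool call
    ↓
1. Resolve configuration
    ↓
2. Check whether the tool is enabled
    ↓
3. Resolve Firecrawl configuration
    ↓
4. Parse parameters (url/extractMode/maxChars)
    ↓
5. Run the fetch
    ├─ Check the cache
    ├─ Send the HTTP request
    ├─ Follow redirects
    ├─ Extract content (Readability/Firecrawl)
    ├─ Store in the cache
    └─ Return the result
    ↓
6. Return the result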
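
The cache steps in the flow (check the cache, fetch on a miss, store the result) can be sketched as a small TTL cache wrapped around an injected fetch function. This is an illustration only: `TtlCache` and `fetchWithCache` are hypothetical names, the cache here is keyed on the URL alone, and the real `runWebFetch` may key on more than that.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so that expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Check the cache first; only call the fetcher on a miss, then store the result.
async function fetchWithCache(
  url: string,
  cache: TtlCache<string>,
  fetcher: (url: string) => Promise<string>,
): Promise<{ content: string; cached: boolean }> {
  const hit = cache.get(url);
  if (hit !== undefined) return { content: hit, cached: true };
  const content = await fetcher(url);
  cache.set(url, content);
  return { content, cached: false };
}
```

The `cached` flag returned here corresponds to the `cached` field in the tool's result.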

1.7 Result Format

Success (Markdown):

{
    "url": "https://example.com",
    "title": "Page title",
    "content": "# Page title\n\nPage content...",
    "extractMode": "markdown",
    "cached": false,
    "timestamp": 1711716000000
}

Success (Text):

{
    "url": "https://example.com",
    "title": "Page title",
    "content": "Page title\n\nPage content...",
    "extractMode": "text",
    "cached": false,
    "timestamp": 1711716000000
}

Failure:

{
    "error": "Failed to fetch URL: Connection timeout"
}
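
A caller has to tell the success shape from the failure shape. One way to do that in TypeScript is a union type with a type guard, sketched below. The field names follow the example payloads in this section; the type names and the `summarize` helper are hypothetical.

```typescript
interface WebFetchSuccess {
  url: string;
  title: string;
  content: string;
  extractMode: "markdown" | "text";
  cached: boolean;
  timestamp: number;
}

interface WebFetchFailure {
  error: string;
}

type WebFetchResult = WebFetchSuccess | WebFetchFailure;

// Failure payloads are the only ones that carry an `error` field.
function isWebFetchFailure(result: WebFetchResult): result is WebFetchFailure {
  return "error" in result;
}

function summarize(result: WebFetchResult): string {
  if (isWebFetchFailure(result)) return `fetch failed: ${result.error}`;
  const source = result.cached ? "cache" : "network";
  return `${result.title} (${result.content.length} chars, ${result.extractMode}, from ${source})`;
}
```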

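II. The image_generate Tool

2.1 Tool Overview

Purpose: generate images (text-to-image and image-to-image)
Core features:

• Multiple providers (OpenAI, Google, Fal, and others)
• Text-to-image and image-to-image
• Configurable size, aspect ratio, and resolution
• Batch generation (up to 10 images)
• Reference images (up to 10)

2.2 Schema Definition

Location: line 27139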

const ImageGenerateToolSchema = Type.Object({
    action: Type.Optional(Type.String({
        description: 'Optional action: "generate" (default) or "list" to inspect available providers/models.'
    })),
    prompt: Type.Optional(Type.String({ description: "Image generation prompt." })),
    image: Type.Optional(Type.String({ description: "Optional reference image path or URL for edit mode." })),
    images: Type.Optional(Type.Array(Type.String(), {
        description: `Optional reference images for edit mode (up to ${MAX_INPUT_IMAGES}).`
    })),
    model: Type.Optional(Type.String({ description: "Optional provider/model override, e.g. openai/gpt-image-1." })),
    filename: Type.Optional(Type.String({ description: "Optional output filename hint." })),
    size: Type.Optional(Type.String({ description: "Optional size hint like 1024x1024, 1536x1024, 1024x1536, 1024x1792, or 1792x1024." })),
    aspectRatio: Type.Optional(Type.String({ description: "Optional aspect ratio hint: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, or 21:9." })),
    resolution: Type.Optional(Type.String({ description: "Optional resolution hint: 1K, 2K, or 4K." })),
    count: Type.Optional(Type.Number({
        description: `Optional number of images to request (1-${MAX_COUNT}).`,
        minimum: 1,
        maximum: MAX_COUNT
    }))
});
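
The schema accepts both a single `image` and an `images` array, an optional `action`, and a bounded `count`. The following sketch shows one way these could be normalized before use. The three helpers are hypothetical stand-ins for `resolveAction`, `normalizeReferenceImages`, and `resolveRequestedCount`, whose bodies the article does not show. The value 10 for both limits comes from the tool overview, and de-duplicating reference images is an assumption.

```typescript
const MAX_INPUT_IMAGES = 10; // from the tool overview: up to 10 reference images
const MAX_COUNT = 10;        // from the tool overview: up to 10 generated images

type RawArgs = Record<string, unknown>;

// "list" must be requested explicitly; everything else means "generate".
function resolveActionSketch(raw: RawArgs): "generate" | "list" {
  return raw.action === "list" ? "list" : "generate";
}

// Merge `image` and `images` into one trimmed, de-duplicated list.
function normalizeReferenceImagesSketch(raw: RawArgs): string[] {
  const candidates: unknown[] = [raw.image, ...(Array.isArray(raw.images) ? raw.images : [])];
  const seen = new Set<string>();
  for (const candidate of candidates) {
    if (typeof candidate !== "string") continue;
    const trimmed = candidate.trim();
    if (trimmed) seen.add(trimmed);
  }
  if (seen.size > MAX_INPUT_IMAGES) {
    throw new Error(`at most ${MAX_INPUT_IMAGES} reference images are allowed`);
  }
  return Array.from(seen);
}

// Default to one image; reject values outside the schema's 1..MAX_COUNT range.
function resolveRequestedCountSketch(raw: RawArgs): number {
  if (raw.count === undefined) return 1;
  const count = raw.count;
  if (typeof count !== "number" || !Number.isInteger(count) || count < 1 || count > MAX_COUNT) {
    throw new Error(`count must be an integer between 1 and ${MAX_COUNT}`);
  }
  return count;
}
```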

2.3 Full Execution Code (Excerpt)

Location: line 27339

function createImageGenerateTool(options) {
    // 1. Resolve configuration
    const cfg = options?.config ?? loadConfig();
    const imageGenerationModelConfig = resolveImageGenerationModelConfigForTool({
        cfg,
        agentDir: options?.agentDir
    });

    // No usable image-generation model: the tool is unavailable
    if (!imageGenerationModelConfig) return null;

    const effectiveCfg = applyImageGenerationModelConfigDefaults(cfg, imageGenerationModelConfig) ?? cfg;

    // 2. Resolve the sandbox configuration
    const sandboxConfig = options?.sandbox && options.sandbox.root.trim() ? {
        root: options.sandbox.root.trim(),
        bridge: options.sandbox.bridge,
        workspaceOnly: options.fsPolicy?.workspaceOnly === true
    } : null;

    return {
        label: "Image Generation",
        name: "image_generate",
        description: "Generate new images or edit reference images with the configured or inferred image-generation model.",
        parameters: ImageGenerateToolSchema,
        execute: async (_toolCallId, args) => {
            const params = args;

            // 3. Resolve the action
            const action = resolveAction(args);

            // 4. action: list → list the available providers
            if (action === "list") {
                const providers = listRuntimeImageGenerationProviders({ config: cfg });
                return jsonResult({
                    status: "ok",
                    providers: providers.map((p) => ({
                        id: p.id,
                        defaultModel: p.defaultModel,
                        capabilities: p.capabilities,
                        authEnvVars: getImageGenerationProviderAuthEnvVars(p.id)
                    }))
                });
            }

            // 5. action: generate → generate images
            // Read the prompt
            const prompt = readStringParam$1(params, "prompt");

            // Collect reference images; any reference image means edit mode
            const imageInputs = normalizeReferenceImages(args);
            const isEdit = imageInputs.length > 0;

            // Resolve the requested count
            const count = resolveRequestedCount(args);

            // Resolve the provider/model
            const provider = resolveSelectedImageGenerationProvider({
                config: cfg,
                modelOverride: readStringParam$1(params, "model"),
                imageGenerationModelConfig
            });

            // Validate the request against the provider's capabilities
            validateImageGenerationCapabilities({
                provider,
                count,
                inputImageCount: imageInputs.length,
                size: readStringParam$1(params, "size"),
                aspectRatio: readStringParam$1(params, "aspectRatio"),
                resolution: readStringParam$1(params, "resolution")
            });

            // Load the reference images
            const loadedImages = isEdit ? await loadReferenceImages({
                imageInputs,
                config: cfg,
                sandboxConfig,
                workspaceDir: options?.workspaceDir,
                maxBytes: Math.floor(resolveImageGenerationMaxBytesMb(cfg) * 1024 * 1024)
            }) : [];

            // Infer the resolution from the reference images when none was given
            const resolution = readStringParam$1(params, "resolution") ||
                (loadedImages.length > 0 ? await inferResolutionFromInputImages(loadedImages) : void 0);

            // 6. Call the generation API
            const result = await runImageGeneration({
                provider,
                prompt,
                count,
                size: readStringParam$1(params, "size"),
                aspectRatio: readStringParam$1(params, "aspectRatio"),
                resolution,
                referenceImages: loadedImages.map((img) => img.sourceImage),
                filename: readStringParam$1(params, "filename"),
                config: effectiveCfg,
                sandboxConfig
            });

            // 7. Return the result
            return jsonResult({
                status: "ok",
                images: result.images.map((img) => ({
                    path: img.path,
                    url: img.url,
                    width: img.width,
                    height: img.height,
                    prompt: img.prompt
                })),
                provider: provider.id,
                model: result.model
            });
        }
    };
}
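
The excerpt calls `validateImageGenerationCapabilities` without showing it. As a rough idea of what such a check involves, the sketch below compares a request against a provider's capabilities and collects the mismatches. The capability shape is taken from the `list` output documented in this article; `findCapabilityProblems`, the `ImageRequest` type, and the messages are hypothetical, and the real function may throw instead of returning a list.

```typescript
interface ProviderCapabilities {
  generate: { enabled: boolean; maxCount: number };
  edit: { enabled: boolean; maxInputImages: number };
  geometry: { sizes: string[]; aspectRatios: string[]; resolutions: string[] };
}

interface ImageRequest {
  count: number;
  inputImageCount: number;
  size?: string;
  aspectRatio?: string;
  resolution?: string;
}

// Hypothetical check: returns human-readable problems, empty when the request fits.
function findCapabilityProblems(caps: ProviderCapabilities, req: ImageRequest): string[] {
  const problems: string[] = [];
  const isEdit = req.inputImageCount > 0;
  if (isEdit && !caps.edit.enabled) {
    problems.push("provider does not support reference images");
  } else if (isEdit && req.inputImageCount > caps.edit.maxInputImages) {
    problems.push(`too many reference images: ${req.inputImageCount} > ${caps.edit.maxInputImages}`);
  }
  if (!isEdit && !caps.generate.enabled) {
    problems.push("provider does not support text-to-image");
  }
  if (req.count > caps.generate.maxCount) {
    problems.push(`count ${req.count} exceeds maxCount ${caps.generate.maxCount}`);
  }
  // Geometry hints are only checked when the caller supplied them.
  const checks: Array<[string, string | undefined, string[]]> = [
    ["size", req.size, caps.geometry.sizes],
    ["aspectRatio", req.aspectRatio, caps.geometry.aspectRatios],
    ["resolution", req.resolution, caps.geometry.resolutions],
  ];
  for (const [label, value, allowed] of checks) {
    if (value !== undefined && allowed.indexOf(value) === -1) {
      problems.push(`unsupported ${label}: ${value}`);
    }
  }
  return problems;
}
```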

2.4 Supported Providers

Provider	Text-to-image	Image-to-image	Max count	API key
OpenAI	✅	✅	10	Required
Google	✅	✅	10	Required
Fal	✅	✅	10	Required
Replicate	✅	❌	10	Required

2.5 Sizes and Aspect Ratios

Supported sizes:

• 1024x1024 (1:1)
• 1536x1024 (3:2)
• 1024x1536 (2:3)
• 1024x1792 (4:7)
• 1792x1024 (7:4)
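
The ratio listed next to each size is the width and height divided by their greatest common divisor. A short sketch that parses a `WIDTHxHEIGHT` string and reduces it this way is shown below; `parseSize` and `aspectRatioOf` are illustrative helpers, not OpenClaw functions.

```typescript
// Euclid's algorithm for the greatest common divisor.
function gcd(a: number, b: number): number {
  while (b !== 0) {
    [a, b] = [b, a % b];
  }
  return a;
}

// Parse a size hint such as "1024x1792" into numeric dimensions.
function parseSize(size: string): { width: number; height: number } {
  const match = /^(\d+)x(\d+)$/.exec(size.trim());
  if (!match) throw new Error(`invalid size: ${size}`);
  const width = Number(match[1]);
  const height = Number(match[2]);
  if (width === 0 || height === 0) throw new Error(`invalid size: ${size}`);
  return { width, height };
}

// Reduce the dimensions to the smallest whole-number ratio.
function aspectRatioOf(size: string): string {
  const { width, height } = parseSize(size);
  const divisor = gcd(width, height);
  return `${width / divisor}:${height / divisor}`;
}

console.log(aspectRatioOf("1024x1024")); // 1:1
console.log(aspectRatioOf("1536x1024")); // 3:2
console.log(aspectRatioOf("1024x1792")); // 4:7
```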

Supported aspect ratios:

• 1:1 (square)
• 2:3, 3:2 (portrait/landscape)
• 3:4, 4:3 (portrait/landscape)
• 4:5, 5:4 (portrait/landscape)
• 9:16, 16:9 (vertical/horizontal screen)
• 21:9 (ultrawide)

Supported resolutions:

• 1K (about 1000px)
• 2K (about 2000px)
• 4K (about 4000px)
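
When reference images are supplied without an explicit `resolution`, the tool infers one through `inferResolutionFromInputImages`, whose logic the article does not show. A very rough sketch of one possible rule is to pick the label whose nominal size is closest to the image's longest edge. Both that rule and the helper name `nearestResolution` are assumptions made for illustration; the nominal sizes come from the list above.

```typescript
type Resolution = "1K" | "2K" | "4K";

// Nominal pixel sizes taken from the list of supported resolutions.
const NOMINAL_SIZES: Array<[Resolution, number]> = [
  ["1K", 1000],
  ["2K", 2000],
  ["4K", 4000],
];

// Assumption: choose the label closest to the longest edge; ties go to the smaller label.
function nearestResolution(width: number, height: number): Resolution {
  const longestEdge = Math.max(width, height);
  let best: Resolution = "1K";
  let bestDistance = Infinity;
  for (const [label, nominal] of NOMINAL_SIZES) {
    const distance = Math.abs(nominal - longestEdge);
    if (distance < bestDistance) {
      best = label;
      bestDistance = distance;
    }
  }
  return best;
}

console.log(nearestResolution(1024, 1024)); // 1K
console.log(nearestResolution(1024, 1792)); // 2K
console.log(nearestResolution(3840, 2160)); // 4K
```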

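2.6 Execution Flow

image_generate tool call
    ↓
1. Resolve the action
    ↓
2. action=list → list providers
    ↓
3. action=generate → generate images
    ├─ Read the prompt
    ├─ Collect reference images
    ├─ Resolve the count
    ├─ Resolve the provider/model
    ├─ Validate capabilities
    ├─ Load reference images
    ├─ Infer the resolution
    └─ Call the generation API
    ↓
4. Return the result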

2.7 Result Format

list success:

{
    "status": "ok",
    "providers": [
        {
            "id": "openai",
            "defaultModel": "dall-e-3",
            "capabilities": {
                "generate": { "enabled": true, "maxCount": 10 },
                "edit": { "enabled": true, "maxInputImages": 10 },
                "geometry": {
                    "sizes": ["1024x1024", "1536x1024", "1024x1536"],
                    "aspectRatios": ["1:1", "3:2", "2:3"],
                    "resolutions": ["1K", "2K", "4K"]
                }
            },
            "authEnvVars": ["OPENAI_API_KEY"]
        }
    ]
}

generate success:

{
    "status": "ok",
    "images": [
        {
            "path": "/tmp/image-xxx.png",
            "url": "https://...",
            "width": 1024,
            "height": 1024,
            "prompt": "A cute cat"
        }
    ],
    "provider": "openai",
    "model": "dall-e-3"
}
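
The schema describes the `model` argument as a `provider/model` override such as `openai/gpt-image-1`, and the `list` output reports each provider's `defaultModel`. The sketch below combines the two: split the override at the first slash and fall back to the provider's default model. It is only an illustration. `resolveProviderAndModel` is a hypothetical helper, and choosing the first listed provider when no override is given is an assumption; the real selection also consults the configured model.

```typescript
interface ProviderInfo {
  id: string;
  defaultModel: string;
}

// Hypothetical resolver for a "provider/model" override string.
function resolveProviderAndModel(
  providers: ProviderInfo[],
  override?: string,
): { provider: string; model: string } {
  if (override) {
    const slash = override.indexOf("/");
    const providerId = slash === -1 ? override : override.slice(0, slash);
    const model = slash === -1 ? "" : override.slice(slash + 1);
    const provider = providers.find((p) => p.id === providerId);
    if (!provider) throw new Error(`unknown provider: ${providerId}`);
    // A bare provider id (no "/model" part) uses that provider's default model.
    return { provider: provider.id, model: model || provider.defaultModel };
  }
  // Assumption: with no override, use the first listed provider.
  const first = providers[0];
  if (!first) throw new Error("no image generation providers available");
  return { provider: first.id, model: first.defaultModel };
}

const listed: ProviderInfo[] = [{ id: "openai", defaultModel: "dall-e-3" }];
console.log(resolveProviderAndModel(listed, "openai/gpt-image-1")); // provider "openai", model "gpt-image-1"
console.log(resolveProviderAndModel(listed));                       // provider "openai", model "dall-e-3"
```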

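III. Key Mechanisms Compared

3.1 Functional Role

Feature	web_fetch	image_generate
Purpose	Fetch web pages	Generate images
Input	URL	Prompt + reference images
Output	Markdown/Text	Image files

3.2 Provider Support

Feature	web_fetch	image_generate
Number of providers	4+	4+
No API key required	DuckDuckGo	None (all require a key)
Firecrawl integration	Supported	Not supported

3.3 Configuration Requirements

Feature	web_fetch	image_generate
API key	Depends on the provider	Required
Sandbox support	Supported	Supported
Cache support	Supported	Not supported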

IV. Usage Examples

4.1 Calling web_fetch

User: Fetch the OpenClaw documentation home page

Model response:

{
    "tool_call": {
        "name": "web_fetch",
        "arguments": {
            "url": "https://docs.openclaw.ai",
            "extractMode": "markdown"
        }
    }
}

Execution result:

{
    "url": "https://docs.openclaw.ai",
    "title": "OpenClaw Documentation",
    "content": "# OpenClaw Documentation\n\nWelcome to OpenClaw...",
    "extractMode": "markdown",
    "cached": false,
    "timestamp": 1711716000000
}

4.2 Calling image_generate

User: Generate a picture of a cute cat

Model response:

{
    "tool_call": {
        "name": "image_generate",
        "arguments": {
            "prompt": "A cute fluffy cat, photorealistic, high quality",
            "size": "1024x1024",
            "count": 1
        }
    }
}

Execution result:

{
    "status": "ok",
    "images": [
        {
            "path": "/tmp/image-xxx.png",
            "url": "https://...",
            "width": 1024,
            "height": 1024,
            "prompt": "A cute fluffy cat, photorealistic, high quality"
        }
    ],
    "provider": "openai",
    "model": "dall-e-3"
}
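
Both examples follow the same pattern: the model emits a `tool_call` with a `name` and `arguments`, and the runtime routes it to the matching tool's `execute(toolCallId, args)`. A minimal dispatcher along those lines might look like the sketch below. The `Tool` and `ToolCall` types, the `dispatchToolCall` helper, and the error payload are hypothetical. The only details taken from the article are the `name`/`execute` shape and the fact that the tool factories return `null` when a tool is unavailable.

```typescript
interface Tool {
  name: string;
  execute: (toolCallId: string, args: Record<string, unknown>) => Promise<unknown>;
}

interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Route a tool call to the tool with the matching name.
async function dispatchToolCall(
  tools: Array<Tool | null>,
  call: ToolCall,
  toolCallId: string,
): Promise<unknown> {
  // Factories such as createWebFetchTool return null when the tool is disabled.
  const tool = tools.find((candidate) => candidate !== null && candidate.name === call.name);
  if (!tool) return { error: `Unknown or disabled tool: ${call.name}` };
  return tool.execute(toolCallId, call.arguments);
}

// Usage with a stub tool standing in for web_fetch.
const stubFetchTool: Tool = {
  name: "web_fetch",
  execute: async (_id, args) => ({ url: args.url, cached: false }),
};

dispatchToolCall([stubFetchTool, null], { name: "web_fetch", arguments: { url: "https://docs.openclaw.ai" } }, "call-1")
  .then((result) => console.log(JSON.stringify(result)));
```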