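Letting AI Query Data Hands-On: The Magic of Going from Natural Language to SQL

The "Data Is Wrong" Puzzle

Three months after "小众清单" (Niche Lists) launched, its user count passed ten thousand. Lin Yuan received all kinds of feedback every day, and what gave him the biggest headache were vague reports along the lines of "the data is wrong."

"The data in my list is wrong," a user would say.
"What exactly is wrong?" Lin Yuan would ask.
"It's just wrong."
"Could you be more specific? Which record, for example? How does it differ from what you expected?"
"It's wrong, that's all. Take a look yourself."

Then Lin Yuan would have to open the database and dig through it layer by layer. Sometimes the user had misunderstood something, sometimes it really was a bug, and most often some edge case had not been handled properly.

He had built a data query panel, but users could not figure it out: it required filling in forms, picking fields, and setting conditions. He thought about teaching users to write SQL, but feedback that "the learning curve is too steep" made him drop the idea.

"If only people could query the data in plain natural language."

The idea kept circling in his head for a long time. Then one day, browsing the Claude Cookbooks, he came across the tutorials on Tool Use and Text to SQL.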

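Tool Use: Teaching AI to "Take Action"

In the previous chapter, Lin Yuan let the AI "see" screenshots. But that was only passive intake of information: the AI could understand what was in an image, yet it could not actively do anything.

Tool Use is a different matter. It lets the AI "take action": call external tools, perform operations, and return results.

The official examples include a calculator tool: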

from anthropic import Anthropic

client = Anthropic()

# One tool definition: a name, a description, and a JSON Schema for its input
tools = [
    {
        "name": "calculator",
        "description": "A simple calculator that performs basic arithmetic operations",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "The mathematical expression to evaluate, e.g. '2 + 3 * 4'"
                }
            },
            "required": ["expression"]
        }
    }
]

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "What is 1984135 multiplied by 9343116?"
    }],
    tools=tools
)

print(message.content)

Running this code, Lin Yuan saw a result that excited him. Claude's response contained a content block of type tool_use.
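For orientation, a tool_use block has roughly the shape sketched below, written here as a plain Python dict. The field names follow the Messages API; the id value is a made-up placeholder, and describe_tool_call is a hypothetical helper added only for this illustration.

```python
# Illustrative shape of a tool_use content block, written as a plain dict.
# The id is a placeholder; real ids are generated by the API.
example_tool_use_block = {
    "type": "tool_use",
    "id": "toolu_placeholder_id",
    "name": "calculator",
    "input": {"expression": "1984135 * 9343116"},
}

def describe_tool_call(block):
    """Return a one-line summary of a tool_use block given as a dict."""
    return f"{block['name']}({block['input']})"

print(describe_tool_call(example_tool_use_block))
```

The SDK returns such blocks as objects rather than dicts, so the same fields are read as attributes (block.name, block.input).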

That block tells you which tool to call and which arguments to pass. The complete handling flow looks like this:

import re

def calculate(expression):
    # Drop every character that is not a digit, an operator, a parenthesis, or a dot
    expression = re.sub(r"[^0-9+\-*/().]", "", expression)
    try:
        result = eval(expression)  # Note: production code should use a safer parser
        return str(result)
    except Exception:
        return "Error: Invalid expression"

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is 1984135 multiplied by 9343116?"}],
    tools=tools
)

if message.stop_reason == "tool_use":
    # Pick the first tool_use block out of the response content
    tool_use = next(block for block in message.content if block.type == "tool_use")
    tool_name = tool_use.name
    tool_input = tool_use.input

    print(f"Claude wants to call: {tool_name}")
    print(f"Arguments: {tool_input}")

    result = calculate(tool_input["expression"])
    print(f"Result: {result}")
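The comment next to eval above notes that production code should use a safer parser. One possible approach, sketched here under the assumption that only +, -, *, / and parentheses need to be supported, is to walk the expression's syntax tree with Python's ast module and allow only whitelisted node types. The name safe_calculate is hypothetical and not part of the Cookbooks example.

```python
import ast
import operator

# Whitelisted operator node types mapped to their implementations.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_calculate(expression):
    """Evaluate basic arithmetic on numbers without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept plain int/float literals only (bool is deliberately excluded).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expression.strip(), mode="eval")
        return str(_eval(tree))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return "Error: Invalid expression"

print(safe_calculate("2 + 3 * 4"))
print(safe_calculate("__import__('os')"))
```

Anything outside the whitelist, such as function calls, names, or attribute access, is rejected instead of executed, which is the property the regex filter alone cannot guarantee.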

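The AI does not give the answer directly. It knows it might get the arithmetic wrong, so it asks to call the calculator tool instead. You provide the tool's implementation; the AI decides when to call it and which arguments to pass.

That is the core of Tool Use: the AI does not need to know how to do everything, it only needs to know when to use which tool.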
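The handling code above stops once the tool has run locally. To let Claude phrase the final answer, the result is normally sent back in a follow-up request as a tool_result block that references the id of the tool_use block. The sketch below only builds that message list so it can be run offline; build_followup_messages is a hypothetical helper, and the id is a placeholder.

```python
def build_followup_messages(user_text, assistant_content, tool_use_id, tool_result):
    """Build the message list for the second request, which carries the tool output."""
    return [
        {"role": "user", "content": user_text},
        # Echo back the assistant turn that contained the tool_use block.
        {"role": "assistant", "content": assistant_content},
        # Answer it with a tool_result block that references the same id.
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use_id,
                    "content": str(tool_result),
                }
            ],
        },
    ]

# Offline demonstration with a hand-written assistant turn (placeholder id).
assistant_turn = [
    {"type": "tool_use", "id": "toolu_placeholder_id", "name": "calculator",
     "input": {"expression": "1984135 * 9343116"}}
]
messages = build_followup_messages(
    "What is 1984135 multiplied by 9343116?",
    assistant_turn,
    "toolu_placeholder_id",
    1984135 * 9343116,
)
print(messages[-1])
```

With the real SDK, message.content and tool_use.id from the first response would take the place of the hand-written values, and the list would be passed to client.messages.create together with the same tools argument.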

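Lin Yuan quickly grasped the power of this pattern. The calculator is only an example; a real tool can be anything: SQL that queries a database, a function that calls an API, a script that touches the file system...

"Wait, isn't this exactly the idea behind querying data in natural language?"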

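Text to SQL: Querying the Database in Plain Language

Lin Yuan found the Text to SQL tutorial in the Cookbooks. It is a complete guide to turning natural language into SQL queries.

The core idea is clear:

1. Tell Claude the database schema

2. The user asks a question in natural language

3. Claude generates the SQL

4. Execute the SQL and return the result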

Step 1: Read the database structure

import sqlite3

def get_schema_info(db_path):
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    schema_info = []

    # List every table, then describe its columns
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
    tables = cursor.fetchall()

    for (table_name,) in tables:
        cursor.execute(f"PRAGMA table_info({table_name})")
        columns = cursor.fetchall()

        # col[1] is the column name, col[2] its declared type
        table_info = f"Table: {table_name}\n"
        table_info += "\n".join(f" - {col[1]} ({col[2]})" for col in columns)
        schema_info.append(table_info)

    conn.close()
    return "\n\n".join(schema_info)

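Step 2: Build the prompt

def generate_sql_prompt(schema, query):
    return f"""
You are an AI assistant that converts natural-language queries into SQL.

Database schema:
<schema>
{schema}
</schema>

Convert the following natural-language query into SQL:
<query>
{query}
</query>

Provide the SQL query inside <sql> tags.
"""

schema = get_schema_info("myapp.db")
user_query = "Which employees are in the Engineering department?"
prompt = generate_sql_prompt(schema, user_query)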

Step 3: Have Claude generate the SQL

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": prompt}]
)

# Keep only the text between <sql> and </sql>
sql = response.content[0].text.split("<sql>")[1].split("</sql>")[0].strip()
print(sql)

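Step 4: Execute the SQL and return the result

import pandas as pd

def run_sql(db_path, sql):
    conn = sqlite3.connect(db_path)
    try:
        # Load the query result into a DataFrame
        return pd.read_sql_query(sql, conn)
    finally:
        # Close the connection even if the query fails
        conn.close()

result = run_sql("myapp.db", sql)
print(result)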
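The four steps can also be chained and tried offline. The sketch below is a minimal version under stated assumptions: the model call is replaced by a stub (fake_model) that returns a canned reply, and the employees and departments tables are sample data invented to match the example question. The function names get_schema, build_prompt, and answer are hypothetical.

```python
import sqlite3

def get_schema(conn):
    """Step 1: describe every table as 'Table: name' plus its columns."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'"
    ).fetchall()
    parts = []
    for (table,) in tables:
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        lines = "\n".join(f" - {c[1]} ({c[2]})" for c in cols)
        parts.append(f"Table: {table}\n{lines}")
    return "\n\n".join(parts)

def build_prompt(schema, question):
    """Step 2: wrap the schema and the question in tags."""
    return (f"<schema>\n{schema}\n</schema>\n<query>\n{question}\n</query>\n"
            "Provide the SQL inside <sql> tags.")

def fake_model(prompt):
    """Step 3 stand-in: returns a canned reply instead of calling the API."""
    return ("<sql>SELECT e.name FROM employees e "
            "JOIN departments d ON e.department_id = d.id "
            "WHERE d.name = 'Engineering' ORDER BY e.name</sql>")

def answer(conn, question, model=fake_model):
    """Run steps 1 to 4 and return the result rows."""
    reply = model(build_prompt(get_schema(conn), question))
    sql = reply.split("<sql>")[1].split("</sql>")[0].strip()
    return conn.execute(sql).fetchall()  # Step 4

# Sample data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'HR');
    INSERT INTO employees VALUES (1, 'Ada', 1), (2, 'Grace', 1), (3, 'Linus', 2);
""")
rows = answer(conn, "Which employees are in the Engineering department?")
print(rows)
```

Passing the model as a parameter keeps the pipeline testable: a function that calls client.messages.create and returns the response text could be swapped in for fake_model without touching the other steps.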

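The whole flow runs smoothly. The user says "Which employees are in the Engineering department?" and the system generates the SQL, runs the query, and returns the result automatically.

Lin Yuan immediately tried a few queries on his local database:

• "How many new users signed up this month?" → a correct COUNT query

• "The five best-selling products" → a correct top-five query

• "Order totals for each of the past seven days" → a correct query grouped by date

"This is so powerful. I can throw away the query panel I wrote."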

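Going Further: Making SQL Generation More Reliable

Reading on, Lin Yuan found that the Cookbooks also describe several techniques for improving reliability.

1. Few-shot Prompting: Give Examples

examples = """
Example 1:
<query>List all employees in the HR department</query>
<sql>SELECT e.name FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE d.name = 'HR';</sql>

Example 2:
<query>What is the average salary of employees hired in 2022?</query>
<sql>SELECT AVG(salary) FROM employees
WHERE strftime('%Y', hire_date) = '2022';</sql>
"""

With a few examples like these in the prompt, the SQL the AI generates becomes more accurate, because the examples show the expected table names, join style, and output format.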
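One way to place such examples in the prompt is sketched below. The function name generate_sql_prompt_with_examples and the <examples> wrapper tag are assumptions made for this illustration; the schema string and the example are shortened stand-ins.

```python
def generate_sql_prompt_with_examples(schema, examples, query):
    """Assemble schema, worked examples, and the user question into one prompt."""
    return (
        "You are an AI assistant that converts natural-language questions into SQL.\n\n"
        f"<schema>\n{schema}\n</schema>\n\n"
        f"<examples>\n{examples.strip()}\n</examples>\n\n"
        f"<query>\n{query}\n</query>\n\n"
        "Provide the SQL statement inside <sql> tags."
    )

# Shortened stand-ins for the real schema and example list.
demo_examples = """
<query>List all employees in the HR department</query>
<sql>SELECT e.name FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE d.name = 'HR';</sql>
"""
prompt = generate_sql_prompt_with_examples(
    "Table: employees\n - name (TEXT)\n - department_id (INTEGER)",
    demo_examples,
    "Which employees are in the Engineering department?",
)
print(prompt)
```

Putting the examples between the schema and the actual question keeps the question last, so the reply continues directly from it.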

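2. Chain-of-Thought: Have the AI Show Its Reasoning

def generate_sql_with_cot(schema, query):
    return f"""
Convert the following query into SQL, and show your reasoning inside <thought_process> tags.

<schema>
{schema}
</schema>

<query>{query}</query>

Output format:
<thought_process>
1. Work out which tables the query needs
2. Determine the join conditions between the tables
3. Choose the columns to return
4. Build the complete SQL
</thought_process>
<sql>
The final SQL statement
</sql>
"""

The AI first analyzes what the query requires and only then generates the SQL. This intermediate step helps it handle more complex queries.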
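Once the reply contains both a reasoning section and the SQL, the two parts need to be separated before the SQL is executed. A small regex-based helper is one way to do that; extract_tag is a hypothetical name, and sample_response is a hand-written reply used only to exercise it.

```python
import re

def extract_tag(text, tag):
    """Return the stripped contents of the first <tag>...</tag> pair, or None."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    return match.group(1).strip() if match else None

# Hand-written reply in the requested output format.
sample_response = """
<thought_process>
1. The question needs the employees and departments tables.
2. They join on employees.department_id = departments.id.
</thought_process>
<sql>
SELECT e.name FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE d.name = 'Engineering';
</sql>
"""

print(extract_tag(sample_response, "thought_process"))
print(extract_tag(sample_response, "sql"))
```

Returning None for a missing tag lets the caller decide whether to retry or report an error, rather than failing with an IndexError as a bare split would.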

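3. Self-Improvement: Correcting Its Own Errors

A more advanced approach is to let the AI fix its own mistakes:

def generate_sql_with_retry(query, max_attempts=3):
    # generate_sql(prompt) -> SQL string and
    # execute_sql(sql) -> (success, result, error)
    # are helpers assumed to be defined elsewhere
    prompt = query
    for attempt in range(max_attempts):
        sql = generate_sql(prompt)
        success, result, error = execute_sql(sql)
        if success:
            return sql, result

        # Feed the failed SQL and its error into the next attempt
        prompt = f"""
Original question: {query}

The previous SQL failed to execute:
SQL: {sql}
Error: {error}

Analyze the error and provide the corrected SQL.
"""
    return None, None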
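The retry loop can be exercised without calling the API by passing the generator in as a parameter. In the sketch below, fake_generate_sql is a stand-in that returns a deliberately wrong statement first and a corrected one second, and the lists table is invented for the demonstration; the signatures of execute_sql and generate_sql_with_retry are assumptions of this sketch.

```python
import sqlite3

def execute_sql(conn, sql):
    """Run sql and return (success, rows, error_message)."""
    try:
        return True, conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return False, None, str(exc)

def generate_sql_with_retry(conn, question, generate_sql, max_attempts=3):
    """Ask generate_sql for SQL, feeding any execution error into the next attempt."""
    prompt = question
    for _ in range(max_attempts):
        sql = generate_sql(prompt)
        success, rows, error = execute_sql(conn, sql)
        if success:
            return sql, rows
        prompt = (
            f"Question: {question}\n"
            f"The previous SQL failed.\nSQL: {sql}\nError: {error}\n"
            "Analyze the error and return corrected SQL."
        )
    return None, None

# Stand-in for the model: misspelled column first, corrected on the second call.
canned_answers = iter([
    "SELECT titel FROM lists",
    "SELECT title FROM lists",
])
seen_prompts = []

def fake_generate_sql(prompt):
    seen_prompts.append(prompt)
    return next(canned_answers)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lists (title TEXT)")
conn.execute("INSERT INTO lists VALUES ('Zines')")
sql, rows = generate_sql_with_retry(conn, "List all titles", fake_generate_sql)
print(sql, rows)
```

The second prompt carries both the failed statement and the database's error message, which is the information the model needs to correct itself.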

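Lin Yuan applied this pattern in his project. When a user reported that "the data is wrong," he first had the AI pull the relevant data and then went through it together with the user. Often the problem surfaced during that process.

In Practice: Building a Natural-Language Query Tool

Lin Yuan decided to build a natural-language query feature for "小众清单". The core code looks like this: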

import sqlite3
from anthropic import Anthropic
import pandas as pd

class NaturalLanguageQuery:
    def __init__(self, db_path):
        self.db_path = db_path
        self.client = Anthropic()
        # Read the schema once and reuse it for every question
        self.schema = self._get_schema()

    def _get_schema(self):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
        tables = cursor.fetchall()

        schema_parts = []
        for (table_name,) in tables:
            cursor.execute(f"PRAGMA table_info({table_name})")
            columns = cursor.fetchall()
            cols = "\n".join(f" - {col[1]}: {col[2]}" for col in columns)
            schema_parts.append(f"Table: {table_name}\nColumns:\n{cols}")

        conn.close()
        return "\n\n".join(schema_parts)

    def query(self, question):
        prompt = f"""
Database schema:
{self.schema}

Question: {question}

Generate a SQL query and wrap it in <sql> tags.
Return only a SELECT statement; do not modify any data.
"""
        response = self.client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1000,
            messages=[{"role": "user", "content": prompt}]
        )

        text = response.content[0].text
        if "<sql>" in text:
            sql = text.split("<sql>")[1].split("</sql>")[0].strip()
            try:
                conn = sqlite3.connect(self.db_path)
                try:
                    result = pd.read_sql_query(sql, conn)
                finally:
                    # Close the connection even if the query fails
                    conn.close()
                return {"success": True, "sql": sql, "data": result}
            except Exception as e:
                return {"success": False, "error": str(e), "sql": sql}

        return {"success": False, "error": "Could not generate SQL"}

nlq = NaturalLanguageQuery("xiaozhongqingdan.db")
result = nlq.query("How many lists were created in the past seven days?")

# "data" is only present when the query succeeded
if result["success"]:
    print(result["data"])
else:
    print(result["error"])

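Lin Yuan integrated this feature into the admin back end. Now he can query the data directly in plain language:

• "Number of active users yesterday"

• "How many lists are in each category"

• "Breakdown of sign-up sources for new users in the past month"

"Finally, no more writing SQL by hand." Lin Yuan let out a long breath.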

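From "Able to Act" to "Able to Act on Its Own"

That evening, Lin Yuan updated his development log:

Second capability unlocked: Tool Use.

Querying data used to mean writing SQL; now one sentence is enough.

But that is still not enough. The AI is passive for now: it only queries when I tell it to.

A real AI alter ego should be able to plan on its own. I give it a goal, and it breaks the task down, calls the tools, and reports the results.

The Cookbooks call this capability the Agent SDK. I need to dig into it properly in the next chapter.

What Lin Yuan did not know was that his dream of an AI that could "research competitors' latest moves by itself" was no longer far away.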

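Code References

The complete code for this article comes from the Claude Cookbooks: https://github.com/anthropics/claude-cookbooks/

• tool_use/calculator_tool.ipynb - introductory Tool Use example

• tool_use/customer_service_agent.ipynb - example combining multiple tools

• capabilities/text_to_sql/guide.ipynb - complete Text to SQL guide

Coming up next: type "research the latest competitor updates," and the AI handles the whole flow of searching, reading, and summarizing on its own. The Agent SDK makes AI truly autonomous...

This article is part of the series "The Evolution of an Indie Developer's AI Alter Ego," which documents how an independent developer unlocks the capabilities of the Claude API one step at a time.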