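# Tool Calling System Design
## Overview
This document describes the design of NanoClaw's tool calling system. It uses a simplified factory pattern to cut out unnecessary class hierarchy.
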
---
## 1. Core Class Diagram
```mermaid
classDiagram
direction TB
class ToolDefinition {
<<dataclass>>
+str name
+str description
+dict parameters
+Callable handler
+str category
+dict to_openai_format()
}
class ToolRegistry {
-dict _tools
+register(ToolDefinition tool) void
+get(str name) ToolDefinition?
+list_all() list~dict~
+execute(str name, dict args) dict
}
class ToolExecutor {
-ToolRegistry registry
+process_tool_calls(list tool_calls) list~dict~
+build_request(list messages) dict
}
class ToolResult {
<<dataclass>>
+bool success
+Any data
+str? error
+dict to_dict()
}
ToolRegistry "1" --> "*" ToolDefinition : manages
ToolExecutor "1" --> "1" ToolRegistry : uses
ToolDefinition ..> ToolResult : returns
```
---
## 2. Tool Definition Factory
Tools are created with factory functions, which avoids a complex class inheritance tree:
```mermaid
classDiagram
direction LR
class ToolFactory {
<<module>>
+tool(name, description, parameters)$ decorator
+register(name, handler, description, parameters)$ void
+create_crawler_tools()$ list~ToolDefinition~
+create_data_tools()$ list~ToolDefinition~
+create_file_tools()$ list~ToolDefinition~
}
class ToolDefinition {
+str name
+str description
+dict parameters
+Callable handler
}
ToolFactory ..> ToolDefinition : creates
```
---
## 3. Core Class Implementation
### 3.1 ToolDefinition
```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToolDefinition:
    """Tool definition."""

    name: str
    description: str
    parameters: dict  # JSON Schema
    handler: Callable[[dict], Any]
    category: str = "general"

    def to_openai_format(self) -> dict:
        """Return the tool in the OpenAI function-calling format."""
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }
```
### 3.2 ToolResult
```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolResult:
    """Result of a tool execution."""

    success: bool
    data: Any = None
    error: Optional[str] = None

    def to_dict(self) -> dict:
        return {
            "success": self.success,
            "data": self.data,
            "error": self.error,
        }

    @classmethod
    def ok(cls, data: Any) -> "ToolResult":
        return cls(success=True, data=data)

    @classmethod
    def fail(cls, error: str) -> "ToolResult":
        return cls(success=False, error=error)
```
### 3.3 ToolRegistry
```python
from typing import Dict, List, Optional

# ToolDefinition and ToolResult are the classes from sections 3.1 and 3.2.


class ToolRegistry:
    """Tool registry (singleton)."""

    _instance = None

    def __new__(cls):
        # Create the shared instance, and its tool table, only once.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._tools: Dict[str, ToolDefinition] = {}
        return cls._instance

    def register(self, tool: ToolDefinition) -> None:
        self._tools[tool.name] = tool

    def get(self, name: str) -> Optional[ToolDefinition]:
        return self._tools.get(name)

    def list_all(self) -> List[dict]:
        return [t.to_openai_format() for t in self._tools.values()]

    def execute(self, name: str, arguments: dict) -> dict:
        tool = self.get(name)
        if not tool:
            return ToolResult.fail(f"Tool not found: {name}").to_dict()
        try:
            result = tool.handler(arguments)
            # Handlers may return a ToolResult or a plain value.
            if isinstance(result, ToolResult):
                return result.to_dict()
            return ToolResult.ok(result).to_dict()
        except Exception as e:
            return ToolResult.fail(str(e)).to_dict()


# Global registry
registry = ToolRegistry()
```
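`execute` folds three outcomes into one dictionary shape: a normal return value, a handler exception, and an unknown tool name. The sketch below illustrates this with a condensed, non-singleton stand-in; `MiniRegistry` and the `echo` and `boom` handlers are hypothetical names used only for this example and are not part of NanoClaw.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class ToolResult:
    success: bool
    data: Any = None
    error: Optional[str] = None

    def to_dict(self) -> dict:
        return {"success": self.success, "data": self.data, "error": self.error}


class MiniRegistry:
    """Condensed, non-singleton stand-in for ToolRegistry (illustration only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], Any]] = {}

    def register(self, name: str, handler: Callable[[dict], Any]) -> None:
        self._handlers[name] = handler

    def execute(self, name: str, arguments: dict) -> dict:
        handler = self._handlers.get(name)
        if handler is None:
            return ToolResult(False, error=f"Tool not found: {name}").to_dict()
        try:
            result = handler(arguments)
            if isinstance(result, ToolResult):
                return result.to_dict()
            return ToolResult(True, data=result).to_dict()
        except Exception as e:
            # Handler errors become a failed result instead of propagating.
            return ToolResult(False, error=str(e)).to_dict()


reg = MiniRegistry()
reg.register("echo", lambda args: args["text"])
reg.register("boom", lambda args: 1 / 0)

ok = reg.execute("echo", {"text": "hi"})
failed = reg.execute("boom", {})
missing = reg.execute("nope", {})
```

Because callers always receive the same three keys, the executor can serialize the result without inspecting which case occurred.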
### 3.4 ToolExecutor
```python
import json
from typing import List, Optional

# ToolRegistry is the class from section 3.3.


class ToolExecutor:
    """Tool executor."""

    def __init__(self, registry: Optional[ToolRegistry] = None):
        self.registry = registry or ToolRegistry()

    def process_tool_calls(self, tool_calls: List[dict]) -> List[dict]:
        """Run each tool call and return the resulting list of messages."""
        results = []
        for call in tool_calls:
            name = call["function"]["name"]
            # The model sends arguments as a JSON-encoded string.
            args = json.loads(call["function"]["arguments"])
            call_id = call["id"]
            result = self.registry.execute(name, args)
            results.append({
                "role": "tool",
                "tool_call_id": call_id,
                "name": name,
                "content": json.dumps(result, ensure_ascii=False),
            })
        return results

    def build_request(self, messages: List[dict], **kwargs) -> dict:
        """Build the API request."""
        return {
            "model": kwargs.get("model", "glm-5"),
            "messages": messages,
            "tools": self.registry.list_all(),
            "tool_choice": "auto",
        }
```
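To make the message shapes concrete, here is a minimal sketch of one round through `process_tool_calls`, written as a free function so it runs on its own. `StubRegistry` and its `add` tool are hypothetical stand-ins. The `except json.JSONDecodeError` branch is not in the design above; it is one possible hardening, added on the assumption that malformed arguments should be reported back to the model rather than raised.

```python
import json
from typing import List


class StubRegistry:
    """Hypothetical stand-in with the same execute() contract as ToolRegistry."""

    def execute(self, name: str, arguments: dict) -> dict:
        if name == "add":
            return {"success": True, "data": arguments["a"] + arguments["b"], "error": None}
        return {"success": False, "data": None, "error": f"Tool not found: {name}"}


def process_tool_calls(registry, tool_calls: List[dict]) -> List[dict]:
    """Turn assistant tool calls into `role: tool` messages."""
    messages = []
    for call in tool_calls:
        name = call["function"]["name"]
        try:
            args = json.loads(call["function"]["arguments"])
            result = registry.execute(name, args)
        except json.JSONDecodeError as e:
            # Assumption: report bad arguments as a failed result.
            result = {"success": False, "data": None, "error": f"Invalid arguments: {e}"}
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "name": name,
            "content": json.dumps(result, ensure_ascii=False),
        })
    return messages


tool_calls = [
    {"id": "call_1", "type": "function",
     "function": {"name": "add", "arguments": '{"a": 2, "b": 3}'}},
    {"id": "call_2", "type": "function",
     "function": {"name": "add", "arguments": '{"a": 2, '}},  # truncated JSON
]
messages = process_tool_calls(StubRegistry(), tool_calls)
```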
---
## 4. Tool Factory Pattern
### 4.1 Decorator Registration
```python
# backend/tools/factory.py
from .core import ToolDefinition, registry


def tool(name: str, description: str, parameters: dict, category: str = "general"):
    """Decorator that registers a function as a tool."""
    def decorator(func):
        tool_def = ToolDefinition(
            name=name,
            description=description,
            parameters=parameters,
            handler=func,
            category=category,
        )
        registry.register(tool_def)
        # Return the original function so it stays directly callable.
        return func
    return decorator
```
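The sketch below shows the decorator at work in a single file. It makes two simplifications for the sake of a runnable example: a plain `TOOLS` dict stands in for the global registry, and the `text_process` handler is a hypothetical implementation (the tool name appears in the tool list, but its behavior is not specified in this document).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ToolDefinition:
    name: str
    description: str
    parameters: dict
    handler: Callable[[dict], Any]
    category: str = "general"


# Plain dict standing in for the global ToolRegistry (assumption for brevity).
TOOLS: Dict[str, ToolDefinition] = {}


def tool(name: str, description: str, parameters: dict, category: str = "general"):
    def decorator(func):
        TOOLS[name] = ToolDefinition(name, description, parameters, func, category)
        return func  # the function itself is returned unchanged
    return decorator


@tool(
    name="text_process",
    description="Apply a simple operation to a piece of text",
    parameters={
        "type": "object",
        "properties": {
            "text": {"type": "string"},
            "operation": {"type": "string", "enum": ["upper", "lower", "word_count"]},
        },
        "required": ["text", "operation"],
    },
    category="data",
)
def text_process(arguments: dict) -> dict:
    # Hypothetical behavior, chosen only to keep the example small.
    text, operation = arguments["text"], arguments["operation"]
    if operation == "upper":
        return {"result": text.upper()}
    if operation == "lower":
        return {"result": text.lower()}
    if operation == "word_count":
        return {"result": len(text.split())}
    raise ValueError(f"Unknown operation: {operation}")


direct = text_process({"text": "Nano Claw", "operation": "upper"})
via_registry = TOOLS["text_process"].handler({"text": "a b c", "operation": "word_count"})
```

Since the decorator returns the original function, a tool can be unit-tested by calling it directly, without going through the registry.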
### 4.2 Usage Examples
```python
# backend/tools/builtin/crawler.py
from ..factory import tool


# Web search tool
@tool(
    name="web_search",
    description="Search the internet for information",
    parameters={
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search keywords"},
            "max_results": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
    category="crawler",
)
def web_search(arguments: dict) -> dict:
    from ..services import SearchService

    query = arguments["query"]
    max_results = arguments.get("max_results", 5)
    service = SearchService()
    results = service.search(query, max_results)
    return {"results": results}


# Page fetch tool
@tool(
    name="fetch_page",
    description="Fetch the content of a given web page",
    parameters={
        "type": "object",
        "properties": {
            "url": {"type": "string", "description": "Page URL"},
            "extract_type": {"type": "string", "enum": ["text", "links", "structured"]},
        },
        "required": ["url"],
    },
    category="crawler",
)
def fetch_page(arguments: dict) -> dict:
    from ..services import FetchService

    url = arguments["url"]
    extract_type = arguments.get("extract_type", "text")
    service = FetchService()
    result = service.fetch(url, extract_type)
    return result
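

# Calculator tool
@tool(
    name="calculator",
    description="Evaluate a math expression",
    parameters={
        "type": "object",
        "properties": {
            "expression": {"type": "string", "description": "Math expression"},
        },
        "required": ["expression"],
    },
    category="data",
)
def calculator(arguments: dict) -> dict:
    import ast
    import operator

    expr = arguments["expression"]
    # Safe evaluation: walk the AST and allow only numeric literals and the
    # operators listed here. Calling eval() on the compiled tree, even with
    # empty builtins, would still run arbitrary expressions.
    bin_ops = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
    }
    unary_ops = {ast.UAdd: operator.pos, ast.USub: operator.neg}

    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in bin_ops:
            return bin_ops[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in unary_ops:
            return unary_ops[type(node.op)](evaluate(node.operand))
        raise ValueError(f"Unsupported expression: {expr}")

    return {"result": evaluate(ast.parse(expr, mode="eval"))}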
```
---
## 5. Supporting Service Classes
The services that tools depend on stay independent and are not coupled to the tool code:
```mermaid
classDiagram
direction LR
class SearchService {
-SearchEngine engine
+search(str query, int limit) list~dict~
}
class FetchService {
+fetch(str url, str type) dict
+fetch_batch(list urls) dict
}
class ContentExtractor {
+extract_text(html) str
+extract_links(html) list
+extract_structured(html) dict
}
FetchService --> ContentExtractor : uses
```
```python
# backend/tools/services.py
class SearchService:
    """Search service."""

    def __init__(self, engine=None):
        # Import ddgs only when no engine is injected.
        if engine is None:
            from ddgs import DDGS
            engine = DDGS()
        self.engine = engine

    def search(self, query: str, max_results: int = 5) -> list:
        results = list(self.engine.text(query, max_results=max_results))
        return [
            {"title": r["title"], "url": r["href"], "snippet": r["body"]}
            for r in results
        ]


class FetchService:
    """Page fetch service."""

    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout

    def fetch(self, url: str, extract_type: str = "text") -> dict:
        import httpx
        from bs4 import BeautifulSoup

        resp = httpx.get(url, timeout=self.timeout, follow_redirects=True)
        soup = BeautifulSoup(resp.text, "html.parser")
        extractor = ContentExtractor(soup)
        if extract_type == "text":
            return {"text": extractor.extract_text()}
        elif extract_type == "links":
            return {"links": extractor.extract_links()}
        else:
            return extractor.extract_structured()


class ContentExtractor:
    """Content extractor."""

    def __init__(self, soup):
        self.soup = soup

    def extract_text(self) -> str:
        # Remove scripts and styles before extracting text.
        for tag in self.soup(["script", "style"]):
            tag.decompose()
        return self.soup.get_text(separator="\n", strip=True)

    def extract_links(self) -> list:
        return [
            {"text": a.get_text(strip=True), "href": a.get("href")}
            for a in self.soup.find_all("a", href=True)
        ]

    def extract_structured(self) -> dict:
        return {
            "title": self.soup.title.string if self.soup.title else "",
            "text": self.extract_text(),
            "links": self.extract_links()[:20],  # keep at most 20 links
        }
```
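Because `SearchService` accepts an `engine` argument, it can be exercised without network access by injecting a test double. The sketch below assumes a hypothetical `FakeEngine` that exposes only the `text(query, max_results=...)` method the service calls; the `SearchService` shown is a condensed copy that imports `ddgs` only when no engine is supplied.

```python
from typing import Dict, List


class FakeEngine:
    """Hypothetical test double exposing the one method SearchService calls."""

    def text(self, query: str, max_results: int = 5) -> List[Dict[str, str]]:
        rows = [
            {"title": f"{query} #{i}", "href": f"https://example.com/{i}", "body": f"snippet {i}"}
            for i in range(10)
        ]
        return rows[:max_results]


class SearchService:
    """Condensed copy of the service; DDGS is needed only when no engine is passed."""

    def __init__(self, engine=None):
        if engine is None:
            from ddgs import DDGS  # third-party; skipped when an engine is injected
            engine = DDGS()
        self.engine = engine

    def search(self, query: str, max_results: int = 5) -> list:
        results = list(self.engine.text(query, max_results=max_results))
        # Rename the engine's fields to the keys the tool returns.
        return [{"title": r["title"], "url": r["href"], "snippet": r["body"]} for r in results]


results = SearchService(engine=FakeEngine()).search("nanoclaw", max_results=2)
```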
---
## 6. Tool Initialization
```python
# backend/tools/__init__.py
from .core import ToolDefinition, ToolResult, ToolRegistry, registry, ToolExecutor
from .factory import tool


def init_tools():
    """Initialize all built-in tools."""
    # Importing the modules runs their @tool decorators, which registers the tools.
    from .builtin import crawler, data, weather


# At the call site
init_tools()
```
---
## 7. Tool List
| Category | Tool name | Description | Service dependency |
| ------- | --------------- | ---- | ------------- |
| crawler | `web_search` | Web search | SearchService |
| crawler | `fetch_page` | Single-page fetch | FetchService |
| crawler | `crawl_batch` | Batch crawl | FetchService |
| data | `calculator` | Math calculation | CalculatorService |
| data | `text_process` | Text processing | - |
| data | `json_process` | JSON processing | - |
| weather | `get_weather` | Weather lookup | - (mock data) |
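
Each `ToolDefinition` carries a `category`, while `list_all()` returns every registered tool. If a request should expose only one category, a filter along these lines might be enough. This is a sketch under stated assumptions: `ToolInfo`, `CATALOG`, and `names_in_category` are hypothetical names, and the catalog simply mirrors the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToolInfo:
    """Hypothetical reduced view of ToolDefinition: just name and category."""
    name: str
    category: str


# Entries copied from the tool list table.
CATALOG = [
    ToolInfo("web_search", "crawler"),
    ToolInfo("fetch_page", "crawler"),
    ToolInfo("crawl_batch", "crawler"),
    ToolInfo("calculator", "data"),
    ToolInfo("text_process", "data"),
    ToolInfo("json_process", "data"),
    ToolInfo("get_weather", "weather"),
]


def names_in_category(tools: List[ToolInfo], category: str) -> List[str]:
    """Return tool names in the given category, in registration order."""
    return [t.name for t in tools if t.category == category]


crawler_tools = names_in_category(CATALOG, "crawler")
```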
---
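## 8. Comparison with the Old Design
| Aspect | Old design | New design |
| ----- | ----------------- | --------- |
| Number of classes | 30+ | ~10 |
| Tool definition | Subclass `BaseTool` | Decorator + function |
| Intermediate abstraction layers | 5 (`CrawlerTool`, etc.) | None |
| Extension mechanism | Create a subclass | Write a function and add the decorator |
| Code size | More | Less |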
---
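## 9. Summary
The simplified design:
1. **Core classes**: `ToolDefinition`, `ToolRegistry`, `ToolExecutor`, `ToolResult`
2. **Factory pattern**: tools are registered with the `@tool` decorator
3. **Service separation**: the services that tools depend on are independent and not coupled to the tool code
4. **Easy to extend**: adding a tool only requires writing a function and applying the decorator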