feat: Phase 6.3 visual understanding: multimodal image input + OCR/Vision tool + image encoding pipeline

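- LLMMessage gains an Images field to support multimodal content arrays
- OpenAIProvider supports image_url content parts
- VisionTool: image reading + base64 encoding + OCR / scene description / combined analysis
- The conversation pipeline passes the images parameter end to end (Gateway->Orchestrator->Synthesizer->LLM)
- Content is built automatically as text-only or multimodal, depending on whether images are present

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>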
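
The last bullet above (text-only versus multimodal content) can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: apart from the `LLMMessage` and `Images` names, every field, type, and function here (`Text`, `contentPart`, `BuildContent`, the `mime` parameter) is a hypothetical stand-in. It assumes images are held as raw bytes and sent as base64 data URIs in OpenAI-style `image_url` content parts.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// LLMMessage is a simplified stand-in for the message type named in the
// commit. Only the Images field name comes from the commit message; the
// other fields and the [][]byte representation are assumptions.
type LLMMessage struct {
	Role   string
	Text   string
	Images [][]byte // raw image bytes (hypothetical representation)
}

// imageURL and contentPart model one element of an OpenAI-style
// multimodal content array.
type imageURL struct {
	URL string `json:"url"`
}

type contentPart struct {
	Type     string    `json:"type"`
	Text     string    `json:"text,omitempty"`
	ImageURL *imageURL `json:"image_url,omitempty"`
}

// BuildContent (hypothetical helper) returns a plain string when the
// message has no images, and a content array with one text part followed
// by one image_url part per image otherwise.
func BuildContent(m LLMMessage, mime string) any {
	if len(m.Images) == 0 {
		return m.Text
	}
	parts := []contentPart{{Type: "text", Text: m.Text}}
	for _, img := range m.Images {
		// Each image is embedded as a base64 data URI.
		uri := "data:" + mime + ";base64," + base64.StdEncoding.EncodeToString(img)
		parts = append(parts, contentPart{Type: "image_url", ImageURL: &imageURL{URL: uri}})
	}
	return parts
}

func main() {
	msg := LLMMessage{
		Role:   "user",
		Text:   "What is in this picture?",
		Images: [][]byte{[]byte("abc")}, // placeholder bytes, not a real image
	}
	body := map[string]any{"role": msg.Role, "content": BuildContent(msg, "image/png")}
	out, err := json.Marshal(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Returning a plain string in the no-image case keeps requests to text-only models unchanged, which is presumably why the commit switches on whether images are present.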
2026-05-23 22:28:42 +08:00
parent 38b36fc5ad
commit 9a8fb8d0ce
7 changed files with 205 additions and 24 deletions
@@ -135,6 +135,7 @@ func DefaultAutonomousToolPolicy() *AutonomousToolPolicy {
 			"iot_query", "iot_control", "memory_search", "web_search",
 			"calculator", "datetime", "web_fetch",
 			"host_exec", "host_file", "host_system",
+			"vision_analyze",
 		},
 		MaxToolCallsPerRound: 5,
 		MaxHighRiskPerHour:   10,