# Competition Optimization Five-Pack (比赛优化五件套) Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Implement the 5 confirmed competition optimizations: ① performance self-test card (SME2 evidence); ② Q&A retrieval visualization; ③ report summary pre-generation + trend insight caching; ④ OCR-text-assisted report recognition (Qwen3.5-2B multimodal; QwenVL-4B is deprecated); ⑤ AIRuntime priority gate + MNN KV cache investigation.

**Architecture:** The §3.1 module boundaries (UI → Service → AIRuntime) stay unchanged. AIRuntime gains a `GenerateStats` normalized across both backends and a cooperative priority gate (interactive requests jump the queue; background work yields at the next token). The HealthExportService event stream gains `.retrieved(RetrievalSummary)`. Three lightweight services are added: BenchmarkService, ReportInsightService, and TrendInsightService. OCR reference text is injected into the VL prompt. **Note: visual inference is now handled by Qwen3.5-2B multimodal (MNN Omni as the primary path, MLX VLSession as the fallback, both loaded from the `.llm`/`.mnnLLM` directories); there is no separate VL model.**

**Tech Stack:** SwiftUI + SwiftData (iOS 17+), MNN (ObjC++ bridge), MLX Swift (mlx-swift-lm 2.31.3, `GenerateCompletionInfo`), Vision OCR, Swift Testing (康康Tests).

**User item → task mapping:** user item 1 → Task 1+2; item 2 → Task 3; item 5 → Task 4+8; item 3 → Task 5+6; item 4 → Task 7. Task 4 (priority gate) comes early because the background pre-generation in Task 5 depends on `priority: .background`.

**Build/test commands (shared by all tasks):**

```bash
# Unit tests (simulator; on first run, confirm the device name with: xcrun simctl list devices available | grep iPhone)
cd /Users/xuhuayong/apps/康康
xcodebuild test -project 康康.xcodeproj -scheme 康康 \
  -destination 'platform=iOS Simulator,name=iPhone 17' \
  -derivedDataPath build/cli-dd \
  -only-testing:'康康Tests/<TestClass>' 2>&1 | tail -25

# Device build verification (the real MNNLLMBridge branch only compiles in the device slice)
xcodebuild build -project 康康.xcodeproj -scheme 康康 \
  -destination 'generic/platform=iOS' \
  -derivedDataPath build/cli-dd CODE_SIGNING_ALLOWED=NO 2>&1 | tail -15
```

**Red lines (keep in mind before writing every line of code):** never use the word 「患者」; all new UI font sizes go through `Font.tjScaled`; colors come only from `Tj.Palette.*`; the UI never calls AIRuntime directly (the self-test page is an existing exception; new code goes through a Service); every prompt carries few-shot examples + `/no_think` + a failure fallback; do not touch Localizable.xcstrings (git status already shows uncommitted changes there; leave them alone).

---

### Task 1: GenerateStats, normalized stats across both backends

**Files:**
- Create: `康康/AI/GenerateStats.swift`
- Modify: `康康/AI/MNNBackend.swift`
- Modify: `康康/AI/LLMSession.swift`
- Modify: `康康/AI/AIRuntime.swift`

- [x] **Step 1.1: Create GenerateStats.swift**

```swift
import Foundation

/// Performance stats for a single generation, normalized across both backends (MNN / MLX).
/// MNN values come from LlmContext (prefill_us / decode_us); MLX values come from GenerateCompletionInfo.
struct GenerateStats: Sendable, Equatable {
    var promptTokens: Int
    var genTokens: Int
    /// Prefill (prompt ingestion) time, in seconds.
    var prefillSeconds: Double
    /// Decode (token-by-token generation) time, in seconds.
    var decodeSeconds: Double

    var prefillTokensPerSecond: Double {
        prefillSeconds > 0 ? Double(promptTokens) / prefillSeconds : 0
    }
    var decodeTokensPerSecond: Double {
        decodeSeconds > 0 ? Double(genTokens) / decodeSeconds : 0
    }
}
```

- [x] **Step 1.2: Capture stats in MNNBackend**

Add state and a method to the actor:

```swift
/// Stats from the most recent generation (AIRuntime reads them after the stream ends; used by the performance self-test).
private(set) var lastStats: GenerateStats?
private func record(_ s: GenerateStats) { lastStats = s }
```

Change the detached Task in `generate` to the following (MNNGenerateStats is an ObjC object, so convert it to the Sendable GenerateStats before crossing the actor boundary):

```swift
let task = Task.detached(priority: .userInitiated) {
    let stats = box.value.generateText(prompt, maxTokens: Int32(maxTokens)) { piece in
        let rate = meter.tick()
        continuation.yield(TokenChunk(text: piece, decodeRate: rate))
    }
    await self.record(GenerateStats(
        promptTokens: Int(stats.promptTokens),
        genTokens: Int(stats.genTokens),
        prefillSeconds: stats.prefillMs / 1000.0,
        decodeSeconds: stats.decodeMs / 1000.0
    ))
    continuation.finish()
}
```

`analyze` gets the same treatment: change `_ = try box.value.analyzeImages(...)` to capture the returned stats, and call `await self.record(...)` before `cont.resume(returning:)` (note that analyzeImages returns an optional, so unwrap with `if let s = ...` before recording).

- [x] **Step 1.3: Capture the .info stats in LLMSession**

Add to the actor:

```swift
/// Stats from the most recent generation (taken from the .info completion event at the end of the stream; used by the performance self-test).
private(set) var lastStats: GenerateStats?
private func record(_ s: GenerateStats) { lastStats = s }
```

Change the `.info` branch of the switch inside `generate` to:

```swift
case .info(let info): // Generation-complete stats; this is the last event in the stream
    await self.record(GenerateStats(
        promptTokens: info.promptTokenCount,
        genTokens: info.generationTokenCount,
        prefillSeconds: info.promptTime,
        decodeSeconds: info.generateTime
    ))
```

- [x] **Step 1.4: Expose the stats and a backend label from AIRuntime**

Add to the actor:

```swift
/// Performance stats of the most recent text generation (consumed by the performance self-test page; normalized across both backends).
private(set) var lastGenerateStats: GenerateStats?

/// Label of the backend actually in effect (for the performance self-test / PPT screenshots).
var activeBackendLabel: String {
    if InferenceEngine.current == .mnn, mnnStatus == .ready {
        return InferenceEngine.cpuSupportsSME2 ? "MNN · SME2" : "MNN · NEON"
    }
    #if targetEnvironment(simulator)
    return "MLX · CPU(模拟器)"
    #else
    return "MLX · GPU"
    #endif
}
```

In the MLX branch of `generate`, after the `for try await` loop and before `continuation.finish()`, add:

```swift
self.lastGenerateStats = await session.lastStats
```

Add at the same position in `mnnGenerate`:

```swift
self.lastGenerateStats = await self.mnn.lastStats
```

- [x] **Step 1.5: Run the device build verification (command at the top) and confirm there are no errors. Commit**

```bash
git add 康康/AI/GenerateStats.swift 康康/AI/MNNBackend.swift 康康/AI/LLMSession.swift 康康/AI/AIRuntime.swift
git commit -m "feat(AI): 两后端归一的 GenerateStats(prefill/decode 实测统计)"
```

---

### Task 2: Performance self-test card (user item 1)

**Files:**
- Create: `康康/Services/BenchmarkService.swift`
- Create: `康康Tests/BenchmarkStoreTests.swift`
- Modify: `康康/Features/Me/ModelSelfTestView.swift` (full rework)
- Modify: `康康/Features/Me/ModelManagementView.swift:31-42` (entry condition + copy)

- [x] **Step 2.1: Write the failing test BenchmarkStoreTests.swift**

```swift
import Testing
import Foundation
@testable import 康康

struct BenchmarkStoreTests {
    private func freshDefaults() -> UserDefaults {
        let d = UserDefaults(suiteName: "test.kk.benchmark")!
        d.removePersistentDomain(forName: "test.kk.benchmark")
        return d
    }

    @Test func savesAndLoadsPerBackend() {
        let d = freshDefaults()
        let mnn = BenchmarkResult(backendLabel: "MNN · SME2", promptTokens: 30, genTokens: 80,
                                  prefillTokensPerSecond: 120, decodeTokensPerSecond: 25,
                                  totalSeconds: 4.2, date: .now)
        let mlx = BenchmarkResult(backendLabel: "MLX · GPU", promptTokens: 30, genTokens: 80,
                                  prefillTokensPerSecond: 300, decodeTokensPerSecond: 40,
                                  totalSeconds: 2.5, date: .now)
        BenchmarkService.save(mnn, defaults: d)
        BenchmarkService.save(mlx, defaults: d)
        let all = BenchmarkService.load(defaults: d)
        #expect(all.count == 2)
        #expect(all["MNN · SME2"]?.decodeTokensPerSecond == 25)
    }

    @Test func overwritesSameBackend() {
        let d = freshDefaults()
        let old = BenchmarkResult(backendLabel: "MLX · GPU", promptTokens: 1, genTokens: 1,
                                  prefillTokensPerSecond: 1, decodeTokensPerSecond: 1,
                                  totalSeconds: 1, date: .now)
        var new = old; new.decodeTokensPerSecond = 99
        BenchmarkService.save(old, defaults: d)
        BenchmarkService.save(new, defaults: d)
        #expect(BenchmarkService.load(defaults: d)["MLX · GPU"]?.decodeTokensPerSecond == 99)
    }

    @Test func loadOnEmptyReturnsEmpty() {
        #expect(BenchmarkService.load(defaults: freshDefaults()).isEmpty)
    }
}
```

- [x] **Step 2.2: Run the tests and confirm they fail to compile (BenchmarkService does not exist yet)**

- [x] **Step 2.3: Create BenchmarkService.swift**

```swift
import Foundation

/// Result of one performance self-test run. Archived per backend label for the
/// "MNN·SME2 vs MLX·GPU" comparison view (§12 selling points 2/6).
struct BenchmarkResult: Codable, Equatable {
    var backendLabel: String
    var promptTokens: Int
    var genTokens: Int
    var prefillTokensPerSecond: Double
    var decodeTokensPerSecond: Double
    var totalSeconds: Double
    var date: Date
}

/// Performance self-test service: runs a fixed prompt, reads AIRuntime's normalized stats,
/// and stores them in UserDefaults keyed by backend label.
/// The UI (ModelSelfTestView) reaches AIRuntime only through this service (§3.1).
@MainActor
struct BenchmarkService {
    static let shared = BenchmarkService()
    private init() {}

    static let storeKey = "kk.benchmark.results"

    /// Fixed test prompt: the precondition for results being comparable across devices/engines.
    static let fixedPrompt = "用中文一句话介绍肝功能里 ALT 这个指标。"

    /// Runs one self-test. onToken hands the streamed output to the UI for display.
    func run(onToken: @escaping @MainActor (String, Double) -> Void) async throws -> BenchmarkResult {
        try await AIRuntime.shared.prepare()
        let start = Date()
        let stream = await AIRuntime.shared.generate(prompt: Self.fixedPrompt, maxTokens: 128)
        for try await chunk in stream {
            onToken(chunk.text, chunk.decodeRate)
        }
        let total = Date().timeIntervalSince(start)
        let label = await AIRuntime.shared.activeBackendLabel
        let stats = await AIRuntime.shared.lastGenerateStats
        let result = BenchmarkResult(
            backendLabel: label,
            promptTokens: stats?.promptTokens ?? 0,
            genTokens: stats?.genTokens ?? 0,
            prefillTokensPerSecond: stats?.prefillTokensPerSecond ?? 0,
            decodeTokensPerSecond: stats?.decodeTokensPerSecond ?? 0,
            totalSeconds: total,
            date: .now
        )
        Self.save(result)
        return result
    }

    // MARK: - Persistence (static pure functions, covered by unit tests)

    static func save(_ result: BenchmarkResult, defaults: UserDefaults = .standard) {
        var all = load(defaults: defaults)
        all[result.backendLabel] = result
        if let data = try? JSONEncoder().encode(all) {
            defaults.set(data, forKey: storeKey)
        }
    }

    static func load(defaults: UserDefaults = .standard) -> [String: BenchmarkResult] {
        guard let data = defaults.data(forKey: storeKey),
              let all = try? JSONDecoder().decode([String: BenchmarkResult].self, from: data)
        else { return [:] }
        return all
    }
}
```

- [x] **Step 2.4: Run the tests and confirm they pass**

- [x] **Step 2.5: Rework ModelSelfTestView**

Keep the original prompt card / status row / output box skeleton. Changes: `run()` now goes through BenchmarkService; add a current-result card (backend badge + prefill/decode tok/s + total time) and a history comparison card (one row per backend, plus the hint that switching engines and running again enables a comparison); wrap the content in a ScrollView; change the title to 「性能自检」. Full code:

```swift
import SwiftUI

/// Performance self-test: runs a fixed prompt and shows the measured prefill / decode speed of the
/// current backend (MNN·SME2 / MNN·NEON / MLX·GPU), archived per backend for comparison.
/// This is the visible evidence for the challenge's assessment criteria (§12 selling points 2/6).
struct ModelSelfTestView: View {
    @State private var output = ""
    @State private var phase: Phase = .idle
    @State private var rate: Double = 0
    @State private var lastResult: BenchmarkResult?
    @State private var history: [String: BenchmarkResult] = [:]

    private enum Phase: Equatable {
        case idle, loading, running, done, failed(String)
        var label: String {
            switch self {
            case .idle: return String(appLoc: "未开始")
            case .loading: return String(appLoc: "加载模型…")
            case .running: return String(appLoc: "推理中…")
            case .done: return String(appLoc: "完成 ✓")
            case .failed(let m): return String(appLoc: "失败:\(m)")
            }
        }
    }

    private var isBusy: Bool { phase == .loading || phase == .running }

    private var statusColor: Color {
        switch phase {
        case .failed: return Tj.Palette.brick
        case .done: return Tj.Palette.leaf
        default: return Tj.Palette.text2
        }
    }

    var body: some View {
        ScrollView {
            VStack(alignment: .leading, spacing: 16) {
                promptCard

                HStack {
                    Text(phase.label)
                        .font(.tjScaled(13, weight: .medium))
                        .foregroundStyle(statusColor)
                        .lineLimit(1)
                    Spacer()
                    if rate > 0 {
                        Text(String(format: "%.1f tok/s", rate))
                            .font(.tjScaled(12, design: .monospaced))
                            .foregroundStyle(Tj.Palette.text3)
                    }
                }

                Button {
                    Task { await run() }
                } label: {
                    Text(isBusy ? "运行中…" : "运行性能自检").frame(maxWidth: .infinity)
                }
                .buttonStyle(TjPrimaryButton())
                .disabled(isBusy)

                if isBusy { AIFlowBar() }
                if let r = lastResult { statsCard(r) }
                outputCard
                if !history.isEmpty { historyCard }
            }
            .padding(16)
        }
        .background(Tj.Palette.sand.ignoresSafeArea())
        .navigationTitle("性能自检")
        .navigationBarTitleDisplayMode(.inline)
        .onAppear { history = BenchmarkService.load() }
    }

    private var promptCard: some View {
        VStack(alignment: .leading, spacing: 6) {
            Text("测试 PROMPT")
                .font(.tjScaled(11, weight: .semibold))
                .tracking(0.5)
                .foregroundStyle(Tj.Palette.text3)
            Text(BenchmarkService.fixedPrompt)
                .font(.tjScaled(14))
                .foregroundStyle(Tj.Palette.text)
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .tjCard()
    }

    private func statsCard(_ r: BenchmarkResult) -> some View {
        VStack(alignment: .leading, spacing: 10) {
            HStack {
                Text("本次结果")
                    .font(.tjScaled(12, weight: .semibold))
                    .foregroundStyle(Tj.Palette.text2)
                Spacer()
                TjBadge(text: r.backendLabel, style: .leaf)
            }
            HStack(spacing: 0) {
                metric(String(appLoc: "读入"),
                       r.prefillTokensPerSecond > 0
                           ? String(format: "%.0f tok/s", r.prefillTokensPerSecond) : "—")
                metric(String(appLoc: "生成"),
                       String(format: "%.1f tok/s", r.decodeTokensPerSecond))
                metric(String(appLoc: "总耗时"),
                       String(format: "%.1fs", r.totalSeconds))
            }
            Text(String(appLoc: "prompt \(r.promptTokens) tok · 生成 \(r.genTokens) tok · 100% 本地"))
                .font(.tjScaled(10, design: .monospaced))
                .foregroundStyle(Tj.Palette.text3)
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .tjCard()
    }

    private func metric(_ label: String, _ value: String) -> some View {
        VStack(spacing: 3) {
            Text(value)
                .font(.tjScaled(15, weight: .semibold, design: .monospaced))
                .foregroundStyle(Tj.Palette.text)
            Text(label)
                .font(.tjScaled(10))
                .foregroundStyle(Tj.Palette.text3)
        }
        .frame(maxWidth: .infinity)
    }

    private var outputCard: some View {
        ScrollView {
            Text(output.isEmpty ? "(暂无输出)" : output)
                .font(.system(.footnote, design: .monospaced))
                .foregroundStyle(Tj.Palette.text)
                .frame(maxWidth: .infinity, alignment: .leading)
                .textSelection(.enabled)
                .padding(12)
        }
        .frame(maxHeight: 220)
        .background(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .fill(Tj.Palette.paper)
        )
        .overlay(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
        )
    }

    private var historyCard: some View {
        VStack(alignment: .leading, spacing: 10) {
            Text("各引擎实测对比")
                .font(.tjScaled(12, weight: .semibold))
                .foregroundStyle(Tj.Palette.text2)
            ForEach(history.keys.sorted(), id: \.self) { key in
                if let r = history[key] {
                    HStack {
                        Text(key)
                            .font(.tjScaled(12, weight: .medium))
                            .foregroundStyle(Tj.Palette.text)
                        Spacer()
                        Text(String(format: "生成 %.1f tok/s", r.decodeTokensPerSecond))
                            .font(.tjScaled(12, design: .monospaced))
                            .foregroundStyle(Tj.Palette.leaf)
                        Text(r.date.formatted(.dateTime.month().day()))
                            .font(.tjScaled(10))
                            .foregroundStyle(Tj.Palette.text3)
                    }
                }
            }
            Text("在「我的 · 推理引擎」切换引擎后再跑一次,即可对比 SME2 与 GPU。")
                .font(.tjScaled(10))
                .foregroundStyle(Tj.Palette.text3)
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .tjCard()
    }

    @MainActor
    private func run() async {
        output = ""
        rate = 0
        lastResult = nil
        phase = .loading
        do {
            let result = try await BenchmarkService.shared.run { piece, r in
                output += piece
                if r > 0 { rate = r }
                if phase == .loading { phase = .running }
            }
            lastResult = result
            history = BenchmarkService.load()
            phase = .done
        } catch {
            phase = .failed(error.localizedDescription)
        }
    }
}

#Preview {
    NavigationStack { ModelSelfTestView() }
}
```

- [x] **Step 2.6: Relax the ModelManagementView entry condition + update the copy**

At `ModelManagementView.swift:31`, change

```swift
if service.states[.mnnLLM]?.phase == .ready {
```

to

```swift
if service.states[.mnnLLM]?.phase == .ready || service.states[.llm]?.phase == .ready {
```

Change `Text("运行推理自检")` to `Text("性能自检")`, and the icon from `"play.circle"` to `"gauge.with.needle"`.

- [x] **Step 2.7: Run BenchmarkStoreTests and confirm the simulator build passes. Commit**

```bash
git add 康康/Services/BenchmarkService.swift 康康Tests/BenchmarkStoreTests.swift 康康/Features/Me/ModelSelfTestView.swift 康康/Features/Me/ModelManagementView.swift
git commit -m "feat(Me): 性能自检卡 — 后端标识 + prefill/decode 实测 + 引擎对比存档"
```

---

### Task 3: Retrieval visualization (user item 2)

**Files:**
- Modify: `康康/Services/HealthExportService.swift` (RetrievalSummary + Event.retrieved + event-based answer)
- Modify: `康康/Features/Archive/HealthExportSheet.swift` (chips UI)
- Create: `康康Tests/RetrievalSummaryTests.swift`

- [x] **Step 3.1: Write the failing test RetrievalSummaryTests.swift**

```swift
import Testing
@testable import 康康

struct RetrievalSummaryTests {
    @Test func groupsAndCountsPreservingOrder() {
        let chips = HealthExportService.RetrievalSummary.groupedChips(
            ["血压", "血糖", "血压", "血压", "体重"], cap: 8)
        #expect(chips == ["血压 ×3", "血糖", "体重"])
    }

    @Test func capsAndAppendsOverflow() {
        let names = (1...12).map { "指标\($0)" }
        let chips = HealthExportService.RetrievalSummary.groupedChips(names, cap: 8)
        #expect(chips.count == 9)
        #expect(chips.last == "+4")
    }

    @Test func emptyInputGivesEmptyChips() {
        #expect(HealthExportService.RetrievalSummary.groupedChips([], cap: 8).isEmpty)
    }
}
```

- [x] **Step 3.2: Run the tests and confirm they fail to compile**

- [x] **Step 3.3: Add RetrievalSummary + the Event case to HealthExportService**

Add above `enum Event`:

```swift
/// Retrieval summary: surfaces "what the local RAG found" to the UI (§12 selling point 3).
struct RetrievalSummary: Sendable, Equatable {
    var chips: [String]
    var indicatorCount: Int
    var reportCount: Int
    var symptomCount: Int
    var diaryCount: Int
    var totalCount: Int { indicatorCount + reportCount + symptomCount + diaryCount }

    /// Merges same-named indicators with a count (keeping the retrieval's newest-to-oldest order);
    /// anything beyond cap collapses into "+N". Pure function, covered by unit tests.
    static func groupedChips(_ names: [String], cap: Int = 8) -> [String] {
        var order: [String] = []
        var counts: [String: Int] = [:]
        for n in names {
            if counts[n] == nil { order.append(n) }
            counts[n, default: 0] += 1
        }
        var chips = order.map { name -> String in
            let c = counts[name] ?? 1
            return c > 1 ? "\(name) ×\(c)" : name
        }
        if chips.count > cap {
            let overflow = chips.count - cap
            chips = Array(chips.prefix(cap)) + ["+\(overflow)"]
        }
        return chips
    }

    @MainActor
    static func from(snapshot: Snapshot) -> RetrievalSummary {
        var chips = groupedChips(snapshot.indicators.map(\.name), cap: 8)
        chips += snapshot.reports.prefix(3).map(\.title)
        chips += snapshot.symptoms.prefix(3).map(\.name)
        if !snapshot.diaries.isEmpty {
            chips.append(String(appLoc: "日记 ×\(snapshot.diaries.count)"))
        }
        return RetrievalSummary(
            chips: chips,
            indicatorCount: snapshot.indicators.count,
            reportCount: snapshot.reports.count,
            symptomCount: snapshot.symptoms.count,
            diaryCount: snapshot.diaries.count
        )
    }
}
```

Add a case to `enum Event` (placed after phaseChanged):

```swift
case retrieved(RetrievalSummary)
```

- [x] **Step 3.4: Yield .retrieved from all three flows**

`export(prompt:in:)`: after `let snapshot = Self.retrieve(...)` and before `try Task.checkCancellation()`, add:

```swift
continuation.yield(.retrieved(RetrievalSummary.from(snapshot: snapshot)))
```

`export(conversation:in:)`: add the same line after `let snapshot = Self.retrieveDialogueSnapshot(...)`.

`answer(question:conversation:in:)`: change the return type from `AsyncThrowingStream<TokenChunk, Error>` to `AsyncThrowingStream<Event, Error>`; after `let snapshot = ...`, add `continuation.yield(.retrieved(RetrievalSummary.from(snapshot: snapshot)))`; inside the loop, change `continuation.yield(TokenChunk(...))` to `continuation.yield(.token(TokenChunk(...)))`.

- [x] **Step 3.5: Consume the events in HealthExportSheet + chips UI**

Add state:

```swift
@State private var retrieval: HealthExportService.RetrievalSummary?
@State private var turnRetrievals: [UUID: HealthExportService.RetrievalSummary] = [:]
```

Change the consuming loop in `sendQuestion()` to:

```swift
task = Task { @MainActor in
    do {
        for try await event in stream {
            switch event {
            case .retrieved(let summary):
                withAnimation(.snappy(duration: 0.25)) {
                    turnRetrievals[assistantTurn.id] = summary
                }
            case .token(let chunk):
                appendToTurn(id: assistantTurn.id, text: chunk.text)
                if chunk.decodeRate > 0 { rate = chunk.decodeRate }
            case .phaseChanged, .completed:
                break
            }
        }
        answeringTurnID = nil
        questionFocused = true
    } catch {
        answeringTurnID = nil
        appendToTurn(id: assistantTurn.id, text: error.localizedDescription)
        questionFocused = true
    }
}
```

At the start of `startReportGeneration()`, reset `retrieval = nil`, and add to the event loop:

```swift
case .retrieved(let summary):
    withAnimation(.snappy(duration: 0.25)) { retrieval = summary }
```

Add `retrieval = nil` to both `stopGeneration()` and `reset()` (reset additionally sets `turnRetrievals = [:]`).

In the inner VStack of `dialogueBubble` (after the role label), insert:

```swift
if !isUser, let summary = turnRetrievals[turn.id] {
    RetrievalChipsView(summary: summary)
}
```

Also switch the copy in the empty-text waiting area depending on whether a summary has arrived:

```swift
if turn.id == answeringTurnID && turn.text.isEmpty {
    VStack(alignment: .leading, spacing: 8) {
        Text(turnRetrievals[turn.id] == nil ? "正在查看本地记录…" : "正在根据这些记录回答…")
            .font(.tjScaled(13))
            .foregroundStyle(Tj.Palette.text3)
        AIFlowBar()
    }
} else {
```

In the VStack of `phaseIndicator` (after the pills row), insert:

```swift
if let retrieval {
    RetrievalChipsView(summary: retrieval)
}
```

At the bottom of the file (before MarkdownView), add the component:

```swift
// MARK: - Retrieval result chips (local RAG visualization)

private struct RetrievalChipsView: View {
    let summary: HealthExportService.RetrievalSummary

    var body: some View {
        VStack(alignment: .leading, spacing: 6) {
            if summary.totalCount == 0 {
                Text("本地档案中暂无相关记录,将仅按你的描述整理")
                    .font(.tjScaled(11))
                    .foregroundStyle(Tj.Palette.text3)
            } else {
                Text(String(appLoc: "已在本地档案中找到 \(summary.totalCount) 条相关记录"))
                    .font(.tjScaled(11, weight: .medium))
                    .foregroundStyle(Tj.Palette.leaf)
                ScrollView(.horizontal, showsIndicators: false) {
                    HStack(spacing: 6) {
                        ForEach(Array(summary.chips.enumerated()), id: \.offset) { _, chip in
                            Text(chip)
                                .font(.tjScaled(11))
                                .foregroundStyle(Tj.Palette.text2)
                                .lineLimit(1)
                                .padding(.horizontal, 8)
                                .padding(.vertical, 4)
                                .background(Capsule().fill(Tj.Palette.sand2))
                                .overlay(Capsule().strokeBorder(Tj.Palette.lineSoft, lineWidth: 1))
                        }
                    }
                    .padding(.vertical, 1)
                }
            }
        }
        .transition(.opacity.combined(with: .move(edge: .top)))
    }
}
```

- [x] **Step 3.6: Run RetrievalSummaryTests plus the existing HealthExport-related tests; all must pass. Commit**

```bash
git add 康康/Services/HealthExportService.swift 康康/Features/Archive/HealthExportSheet.swift 康康Tests/RetrievalSummaryTests.swift
git commit -m "feat(Ask): 检索过程可视化 — RAG 命中记录以 chips 展示,生成前先看见"
```

---

### Task 4: AIRuntime priority gate (user item 5a)

**Files:**
- Modify: `康康/AI/AIRuntime.swift` (gate rework + generate signature)
- Create: `康康Tests/InferencePriorityTests.swift`

- [x] **Step 4.1: Write the failing test**

```swift
import Testing
@testable import 康康

struct InferencePriorityTests {
    @Test func interactiveJumpsAheadOfBackground() {
        let idx = AIRuntime.gateInsertionIndex(of: .interactive,
                                               in: [.interactive, .background, .background])
        #expect(idx == 1)
    }

    @Test func interactiveKeepsFIFOAmongInteractive() {
        let idx = AIRuntime.gateInsertionIndex(of:
            .interactive, in: [.interactive, .interactive])
        #expect(idx == 2)
    }

    @Test func backgroundAlwaysAppends() {
        let idx = AIRuntime.gateInsertionIndex(of: .background,
                                               in: [.interactive, .background])
        #expect(idx == 2)
    }

    @Test func emptyQueueInsertsAtZero() {
        #expect(AIRuntime.gateInsertionIndex(of: .interactive, in: []) == 0)
        #expect(AIRuntime.gateInsertionIndex(of: .background, in: []) == 0)
    }
}
```

- [x] **Step 4.2: Run the tests and confirm they fail to compile**

- [x] **Step 4.3: Rework the AIRuntime gate**

At the top of the file (outside the actor), add:

```swift
/// Inference priority. interactive = the user is waiting in front of the screen (recognition / Q&A / self-test);
/// background = pre-generation (report summaries, etc.), which queues behind others and can be cooperatively preempted mid-decode.
nonisolated enum InferencePriority: Sendable, Equatable {
    case interactive
    case background
}
```

Gate section (replaces the original `gateBusy`/`gateWaiters`/`acquireGate`/`releaseGate`; keep the body of the original comments and extend them):

```swift
private struct GateWaiter {
    let priority: InferencePriority
    let cont: CheckedContinuation<Void, Never>
}

private var gateBusy = false
private var gateHolderPriority: InferencePriority = .interactive
private var preemptRequested = false
private var gateWaiters: [GateWaiter] = []

/// interactive goes ahead of every waiting background request; FIFO is kept within the same priority.
/// Pure function, covered by unit tests.
nonisolated static func gateInsertionIndex(of priority: InferencePriority,
                                           in waiting: [InferencePriority]) -> Int {
    guard priority == .interactive else { return waiting.count }
    return waiting.firstIndex(of: .background) ?? waiting.count
}

private func acquireGate(_ priority: InferencePriority = .interactive) async {
    if !gateBusy {
        gateBusy = true
        gateHolderPriority = priority
        return
    }
    // A foreground request hit a background holder: ask it to yield.
    // The background decode loop throws CancellationError at the next token.
    if priority == .interactive, gateHolderPriority == .background {
        preemptRequested = true
    }
    await withCheckedContinuation { (cont: CheckedContinuation<Void, Never>) in
        let idx = Self.gateInsertionIndex(of: priority, in: gateWaiters.map(\.priority))
        gateWaiters.insert(GateWaiter(priority: priority, cont: cont), at: idx)
    }
    // When woken by releaseGate, the caller already holds the gate (gateBusy stays true).
}

private func releaseGate() {
    preemptRequested = false
    if gateWaiters.isEmpty {
        gateBusy = false
    } else {
        // Hand the gate straight to the first waiter; gateBusy stays true, leaving no open window.
        let next = gateWaiters.removeFirst()
        gateHolderPriority = next.priority
        next.cont.resume()
    }
}

/// A background holder checks this once per received token: yield if a foreground request is queued.
private func shouldPreempt(_ priority: InferencePriority) -> Bool {
    priority == .background && preemptRequested
}
```

- [x] **Step 4.4: Add the priority parameter to generate + the preemption check**

`generate` signature:

```swift
func generate(prompt: String, maxTokens: Int = 256,
              priority: InferencePriority = .interactive) -> AsyncThrowingStream<TokenChunk, Error> {
    if InferenceEngine.current == .mnn, mnnStatus == .ready {
        return mnnGenerate(prompt: prompt, maxTokens: maxTokens, priority: priority)
    }
```

In the Task body of the MLX branch, change `await self.acquireGate()` to `await self.acquireGate(priority)`; inside the loop, after `try Task.checkCancellation()`, add:

```swift
if self.shouldPreempt(priority) { throw CancellationError() }
```

Split the catch (so that cancellation/preemption propagates as CancellationError and callers can tell it apart from other failures):

```swift
} catch is CancellationError {
    continuation.finish(throwing: CancellationError())
} catch {
    continuation.finish(throwing: AIRuntimeError.inferenceFailed("\(error)"))
}
```

Apply exactly the same three changes to `mnnGenerate(prompt:maxTokens:priority:)`. The `acquireGate()` calls in `prepare`/`prepareMNN`/`prepareVL`/`analyzeReport` take no argument (default interactive; model loading must not be preempted).

- [x] **Step 4.5: Run InferencePriorityTests + the device build. Commit**

```bash
git add 康康/AI/AIRuntime.swift \
  康康Tests/InferencePriorityTests.swift
git commit -m "feat(AI): 推理闸门双优先级 — 前台插队,后台预生成按 token 让位"
```

---

### Task 5: Report summary pre-generation (user item 3a)

**Files:**
- Create: `康康/AI/Prompts/InsightPrompts.swift`
- Create: `康康/Services/ReportInsightService.swift`
- Create: `康康Tests/InsightPromptsTests.swift`
- Modify: `康康/Features/Capture/UnifiedCaptureFlow.swift:313` (attach the background task after saving)
- Modify: `康康/Features/Timeline/TimelineEntryDetailView.swift:260-267` (extract the summary card into a component + fallback trigger)

- [x] **Step 5.1: Write the failing test InsightPromptsTests.swift**

```swift
import Testing
@testable import 康康

struct InsightPromptsTests {
    @Test func reportSummaryPromptCarriesDataAndGuards() {
        let p = InsightPrompts.reportPlainSummary(
            title: "春季体检", typeLabel: "体检报告",
            indicatorLines: "血红蛋白 118 g/L(参考 130-175)low")
        #expect(p.contains("春季体检"))
        #expect(p.contains("血红蛋白 118"))
        #expect(p.contains("/no_think"))
        #expect(p.contains("不诊断"))
        #expect(!p.contains("患者"))
    }

    @Test func trendPromptCarriesDataAndGuards() {
        let p = InsightPrompts.trendInsight(
            title: "空腹血糖", unit: "mmol/L", rangeText: ",参考 3.9-6.1",
            dataLines: "2026-05-01 5.2 / 2026-06-01 5.8")
        #expect(p.contains("空腹血糖"))
        #expect(p.contains("2026-06-01 5.8"))
        #expect(p.contains("/no_think"))
        #expect(!p.contains("患者"))
    }
}
```

- [x] **Step 5.2: Run the tests and confirm they fail to compile**

- [x] **Step 5.3: Create InsightPrompts.swift**

```swift
import Foundation

/// Local insight prompts: plain-language report summary + one-line trend insight.
/// Red lines: no diagnosis, no drug recommendations; address the user as 「你」 and never use 「患者」
/// (product positioning: a personal health record).
nonisolated enum InsightPrompts {
    /// Plain-language summary of a whole report (pre-generated in the background after archiving; written back to Report.summary).
    static func reportPlainSummary(title: String, typeLabel: String, indicatorLines: String) -> String {
        """
        你是健康档案助手。下面是一份报告的指标列表,请用大白话给本人(称「你」)写 2~3 句整体解读:
        - 第 1 句:总体情况(共几项、几项异常)。
        - 之后:点名最值得留意的异常项,用生活化语言说明偏高/偏低意味着什么方向。
        - 不诊断疾病、不推荐药物或剂量;异常较多时建议「带上报告咨询医生」。
        - 只输出正文文字,不要标题、列表、JSON、markdown。

        示例:
        输入:血常规(化验单),指标:白细胞 5.2 (3.5-9.5) normal;血红蛋白 118 (130-175) low;血小板 210 (125-350) normal
        输出:这份血常规共 3 项,2 项正常,血红蛋白略低于参考范围。血红蛋白偏低通常与贫血方向有关,平时可以多补充含铁食物;如果还伴随乏力头晕,建议带上报告咨询医生。

        现在的报告:\(title)(\(typeLabel))
        指标:
        \(indicatorLines)

        只输出 2~3 句正文。/no_think
        """
    }

    /// One-line trend insight (TrendDetailView; cached by data fingerprint).
    static func trendInsight(title: String, unit: String, rangeText: String, dataLines: String) -> String {
        """
        你是健康档案助手。下面是「\(title)」的历史记录(单位 \(unit)\(rangeText)),请用大白话给本人(称「你」)写 1~2 句趋势解读:
        - 说清整体走向(上升/下降/平稳/波动)和当前值与参考范围的关系。
        - 不诊断疾病、不推荐药物;持续异常时温和建议「复查或咨询医生」。
        - 只输出正文文字,不要标题、列表、JSON。

        示例:
        输入:体重,单位 kg,记录:2026-04-01 72.5 / 2026-04-15 71.8 / 2026-05-01 71.2
        输出:近一个月你的体重稳步下降了约 1.3kg,节奏平缓,继续保持现在的习惯就好。

        现在的记录:
        \(dataLines)

        只输出 1~2 句正文。/no_think
        """
    }
}
```

- [x] **Step 5.4: Run the tests and confirm they pass**

- [x] **Step 5.5: Create ReportInsightService.swift**

```swift
import Foundation
import SwiftData

/// Pre-generates plain-language report summaries (§3.1: flows reach AIRuntime through this service; the UI never calls it directly).
/// Timing: runs in the background right after an archive save (finishing while the user keeps working);
/// the detail page retries as a fallback when it opens.
/// Write-back policy: generate only when summary is empty; never overwrite a summary produced by VL or edited by the user.
@MainActor
final class ReportInsightService {
    static let shared = ReportInsightService()
    private init() {}

    /// IDs of reports currently in flight, so the post-save background task and the detail-page fallback do not both fire.
    private var inFlight: Set<String> = []

    func pregenerateIfNeeded(report: Report, in ctx: ModelContext) async {
        guard (report.summary ?? "").isEmpty, !report.indicators.isEmpty else { return }
        let key = String(describing: report.persistentModelID)
        guard !inFlight.contains(key) else { return }
        inFlight.insert(key)
        defer { inFlight.remove(key) }

        do {
            try await AIRuntime.shared.prepare()
        } catch {
            return // Model not ready: give up silently; retry the next time the detail page opens
        }

        let prompt = InsightPrompts.reportPlainSummary(
            title: report.title,
            typeLabel: report.type.label,
            indicatorLines: Self.indicatorLines(for: report.indicators)
        )
        var collected = ""
        do {
            let stream = await AIRuntime.shared.generate(
                prompt: prompt, maxTokens: 200, priority: .background)
            for try await chunk in stream { collected += chunk.text }
        } catch {
            return // Preempted by a foreground task (CancellationError) or inference failed: give up; the fallback path retries
        }

        let text = HealthExportService.stripThinkBlocks(collected)
            .trimmingCharacters(in: .whitespacesAndNewlines)
        guard !text.isEmpty, (report.summary ?? "").isEmpty else { return }
        report.summary = text
        try?
            ctx.save()
    }

    /// One line per indicator, in the form "name value unit(参考 range)status"; abnormal items come first,
    /// capped at 15 lines to bound the prompt size.
    static func indicatorLines(for indicators: [Indicator]) -> String {
        let sorted = indicators.sorted {
            ($0.status == .normal ? 1 : 0) < ($1.status == .normal ? 1 : 0)
        }
        return sorted.prefix(15).map { i in
            var line = "\(i.name) \(i.value)"
            if !i.unit.isEmpty { line += " \(i.unit)" }
            if !i.range.isEmpty { line += "(参考 \(i.range))" }
            line += " \(i.status.rawValue)"
            return line
        }.joined(separator: "\n")
    }
}
```

- [x] **Step 5.6: Attach the background task in UnifiedCaptureFlow.saveAll**

At the end of `saveAll`, change

```swift
try? ctx.save()
onClose()
```

to

```swift
try? ctx.save()
// Pre-generate the plain-language summary in the background: the user keeps working, and the detail page opens instantly.
// Low priority: any foreground AI task (another capture / Q&A) makes it yield at the next token.
Task { await ReportInsightService.shared.pregenerateIfNeeded(report: report, in: ctx) }
onClose()
```

- [x] **Step 5.7: Extract the summary card in TimelineEntryDetailView into a component**

In `reportBody`, replace

```swift
if let sum = r.summary, !sum.isEmpty {
    card {
        Text(String(appLoc: "摘要"))
            .font(.tjScaled(12, weight: .semibold)).foregroundStyle(Tj.Palette.text2)
        Text(sum).font(.tjScaled(14)).foregroundStyle(Tj.Palette.text)
            .fixedSize(horizontal: false, vertical: true)
    }
}
```

with

```swift
ReportSummaryCard(report: r)
```

Add the component at the end of the file (the card container style matches this file's `card` helper):

```swift
// MARK: - Report summary card (falls back to background pre-generation when there is no summary)

/// Shows the summary directly when present. With no summary but with indicators, it triggers the
/// background pre-generation (a fallback in case the post-archive run was preempted), shows the
/// flow bar while generating, and lets SwiftData observation refresh the text on completion.
private struct ReportSummaryCard: View {
    @Environment(\.modelContext) private var ctx
    let report: Report
    @State private var generating = false

    var body: some View {
        Group {
            if let sum = report.summary, !sum.isEmpty {
                container {
                    Text(String(appLoc: "摘要"))
                        .font(.tjScaled(12, weight: .semibold)).foregroundStyle(Tj.Palette.text2)
                    Text(sum).font(.tjScaled(14)).foregroundStyle(Tj.Palette.text)
                        .fixedSize(horizontal: false, vertical: true)
                }
            } else if generating {
                container {
                    Text("本地 AI 正在解读这份报告…")
                        .font(.tjScaled(12)).foregroundStyle(Tj.Palette.text3)
                    AIFlowBar()
                }
            }
        }
        .task {
            guard (report.summary ?? "").isEmpty, !report.indicators.isEmpty else { return }
            generating = true
            await ReportInsightService.shared.pregenerateIfNeeded(report: report, in: ctx)
            generating = false
        }
    }

    private func container<C: View>(@ViewBuilder _ body: () -> C) -> some View {
        VStack(alignment: .leading, spacing: 10) { body() }
            .padding(14)
            .frame(maxWidth: .infinity, alignment: .leading)
            .background(
                RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                    .fill(Tj.Palette.paper)
            )
            .overlay(
                RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                    .strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
            )
    }
}
```

- [x] **Step 5.8: Simulator build + the full existing test suite shows no regressions. Commit**

```bash
git add 康康/AI/Prompts/InsightPrompts.swift 康康/Services/ReportInsightService.swift 康康Tests/InsightPromptsTests.swift 康康/Features/Capture/UnifiedCaptureFlow.swift 康康/Features/Timeline/TimelineEntryDetailView.swift
git commit -m "feat(Capture): 归档后后台预生成大白话摘要,详情页秒开 + 兜底重试"
```

---

### Task 6: Trend AI insight + fingerprint cache (user item 3b)

**Files:**
- Create: `康康/Services/TrendInsightService.swift`
- Create: `康康Tests/TrendInsightCacheTests.swift`
- Modify: `康康/Features/Trends/TrendDetailView.swift:72,321-340` (replace the placeholder with the real card)

- [x] **Step 6.1: Write the failing test TrendInsightCacheTests.swift**

```swift
import Testing
import SwiftUI
@testable import 康康

@MainActor
struct TrendInsightCacheTests {
    private func bucket(values: [Double]) -> SeriesBucket {
        let points = values.enumerated().map { i, v in
            SeriesBucket.Point(id: "p\(i)", date: Date(timeIntervalSince1970: Double(i) * 86_400),
                               value: v, status: .normal)
        }
        let line = SeriesBucket.SeriesLine(id: "glucose.fasting", seriesKey: "glucose.fasting",
                                           label: nil, color: .blue, points: points,
                                           referenceRange: 3.9...6.1)
        return SeriesBucket(id: "glucose.fasting", title: "空腹血糖", unit: "mmol/L",
                            lines: [line], latestDate: .now, kind: .monitor)
    }

    @Test func fingerprintStableForSameData() {
        let a = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
        let b = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
        #expect(a == b)
    }

    @Test func fingerprintChangesWhenDataChanges() {
        let a = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
        let b = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5, 6.0]))
        #expect(a != b)
    }

    @Test func dataLinesFormatsDateAndValue() {
        let lines = TrendInsightService.dataLines(for: bucket(values: [5.2, 5.5]))
        #expect(lines.contains("1970-01-01 5.2"))
        #expect(lines.contains("1970-01-02 5.5"))
    }

    @Test func rangeTextRendersReference() {
        #expect(TrendInsightService.rangeText(for: bucket(values: [5.2])) == ",参考 3.9-6.1")
    }
}
```

- [x] **Step 6.2: Run the tests and confirm they fail to compile**

- [x] **Step 6.3: Create TrendInsightService.swift**

```swift
import Foundation

/// One-sentence AI trend insight: small budget (≤140 tokens) + cache keyed by data fingerprint (UserDefaults).
/// Unchanged data is not recomputed, so the trend detail page opens instantly. Adding a record,
/// or editing one in a way that shifts the count, endpoints, last value, or min/max, changes the
/// fingerprint and triggers regeneration.
@MainActor
final class TrendInsightService {
    static let shared = TrendInsightService()
    private init() {}

    struct Cached: Codable, Equatable {
        var fingerprint: String
        var text: String
        var generatedAt: Date
    }

    static let storePrefix = "kk.trendInsight."

    /// Data fingerprint: per line, key + point count + first/last timestamp + last value and min/max.
    /// Small enough to use directly as the fingerprint string.
    static func fingerprint(for bucket: SeriesBucket) -> String {
        var parts: [String] = [bucket.id]
        for line in bucket.lines {
            let pts = line.points
            let first = pts.first.map { Int($0.date.timeIntervalSince1970) } ?? 0
            let last = pts.last.map { Int($0.date.timeIntervalSince1970) } ?? 0
            let lastV = pts.last?.value ?? 0
            let minV = pts.map(\.value).min() ?? 0
            let maxV = pts.map(\.value).max() ?? 0
            parts.append("\(line.seriesKey)#\(pts.count)#\(first)#\(last)#\(lastV)#\(minV)#\(maxV)")
        }
        return parts.joined(separator: "|")
    }

    /// Returns the cached text on a hit (fingerprint matches), otherwise nil.
    func cachedText(for bucket: SeriesBucket) -> String? {
        guard let data = UserDefaults.standard.data(forKey: Self.storePrefix + bucket.id),
              let c = try?
                  JSONDecoder().decode(Cached.self, from: data),
              c.fingerprint == Self.fingerprint(for: bucket)
        else { return nil }
        return c.text
    }

    /// Generates one insight on the spot and writes it to the cache. Throws when the model is not
    /// ready or the output is empty; the UI then shows "unavailable + retry".
    func generate(for bucket: SeriesBucket) async throws -> String {
        try await AIRuntime.shared.prepare()
        let prompt = InsightPrompts.trendInsight(
            title: bucket.title,
            unit: bucket.unit,
            rangeText: Self.rangeText(for: bucket),
            dataLines: Self.dataLines(for: bucket)
        )
        var collected = ""
        let stream = await AIRuntime.shared.generate(prompt: prompt, maxTokens: 140)
        for try await chunk in stream { collected += chunk.text }
        let text = HealthExportService.stripThinkBlocks(collected)
            .trimmingCharacters(in: .whitespacesAndNewlines)
        guard !text.isEmpty else { throw AIRuntimeError.inferenceFailed("空输出") }
        let cached = Cached(fingerprint: Self.fingerprint(for: bucket),
                            text: text, generatedAt: .now)
        if let data = try? JSONEncoder().encode(cached) {
            UserDefaults.standard.set(data, forKey: Self.storePrefix + bucket.id)
        }
        return text
    }

    /// Joins the latest 24 points of each line as "yyyy-MM-dd value"; multi-line series
    /// (blood pressure) get one row each with a label prefix.
    static func dataLines(for bucket: SeriesBucket) -> String {
        let df = DateFormatter()
        df.locale = Locale(identifier: "en_US_POSIX")
        df.timeZone = TimeZone(identifier: "UTC")
        df.dateFormat = "yyyy-MM-dd"
        var lines: [String] = []
        for line in bucket.lines {
            let pts = line.points.suffix(24)
            let prefix = bucket.lines.count > 1 ? "\(line.label ?? line.seriesKey):" : ""
            let series = pts.map { "\(df.string(from: $0.date)) \(fmt($0.value))" }
                .joined(separator: " / ")
            lines.append(prefix + series)
        }
        return lines.joined(separator: "\n")
    }

    /// Returns ",参考 lo-hi", or an empty string when there is no reference range
    /// (the whole segment is omitted).
    static func rangeText(for bucket: SeriesBucket) -> String {
        guard let r = bucket.lines.first?.referenceRange else { return "" }
        return ",参考 \(fmt(r.lowerBound))-\(fmt(r.upperBound))"
    }

    private static func fmt(_ v: Double) -> String {
        v.truncatingRemainder(dividingBy: 1) == 0 ?
            String(format: "%.0f", v) : String(format: "%.1f", v)
    }
}
```

Note: `dataLines` uses the UTC time zone so the tests do not depend on the device time zone (the dates are only there to help the model; being a few hours off does not matter).

- [x] **Step 6.4: Run the tests and confirm they pass**

- [x] **Step 6.5: Swap the card into TrendDetailView**

In `body`, replace `aiPlaceholder` with `TrendInsightCard(bucket: bucket)`; delete `// MARK: AI 解读占位` and the whole `aiPlaceholder` block; at the end of the file (before `enum TrendRange`) add:

```swift
// MARK: - AI trend insight card

/// On entry, checks the fingerprint cache first: a hit shows instantly; a miss computes locally
/// (through TrendInsightService, §3.1).
private struct TrendInsightCard: View {
    let bucket: SeriesBucket
    @State private var text: String?
    @State private var running = false
    @State private var failedMessage: String?

    var body: some View {
        VStack(alignment: .leading, spacing: 8) {
            HStack(spacing: 6) {
                Image(systemName: "sparkles")
                    .font(.tjScaled(12))
                    .foregroundStyle(Tj.Palette.ink)
                Text("AI 解读")
                    .font(.tjScaled(12, weight: .semibold))
                    .foregroundStyle(Tj.Palette.text2)
                Spacer()
            }
            if let text {
                Text(text)
                    .font(.tjScaled(13))
                    .lineSpacing(3)
                    .foregroundStyle(Tj.Palette.text)
                    .fixedSize(horizontal: false, vertical: true)
                AIDisclaimerFooter()
            } else if running {
                Text("本地 AI 解读中…")
                    .font(.tjScaled(12))
                    .foregroundStyle(Tj.Palette.text3)
                AIFlowBar()
            } else if let failedMessage {
                HStack {
                    Text(failedMessage)
                        .font(.tjScaled(12))
                        .foregroundStyle(Tj.Palette.text3)
                    Spacer()
                    Button("重试") { Task { await load(force: true) } }
                        .font(.tjScaled(12, weight: .medium))
                        .foregroundStyle(Tj.Palette.ink)
                }
            }
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .background(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .fill(Tj.Palette.paper)
        )
        .overlay(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
        )
        .task(id: bucket.id) { await load(force: false) }
    }

    @MainActor
    private func load(force: Bool) async {
        if !force, let cached = TrendInsightService.shared.cachedText(for: bucket) {
            text = cached
            return
        }
        running = true
        failedMessage = nil
        do {
            text = try await TrendInsightService.shared.generate(for: bucket)
        } catch {
            failedMessage
                = String(appLoc: "AI 解读暂不可用(模型未就绪或繁忙)")
        }
        running = false
    }
}
```

- [x] **Step 6.6: Run TrendInsightCacheTests + SeriesBucketTests with no regressions. Commit**

```bash
git add 康康/Services/TrendInsightService.swift 康康Tests/TrendInsightCacheTests.swift 康康/Features/Trends/TrendDetailView.swift
git commit -m "feat(Trends): AI 趋势解读上线 — 数据指纹缓存,秒开不重算"
```

---

### Task 7: OCR-text-assisted report recognition (user item 4)

> **Important: QwenVL-4B is deprecated.** "Report recognition" here is handled by Qwen3.5-2B multimodal (MNN Omni `mnn.analyze` as the primary path / MLX `VLSession` as the fallback). OCR reference text is especially useful when the 2B vision model reads dense, small digits.

**Files:**
- Modify: `康康/AI/Prompts/VLPrompts.swift:34-89` (reportExtraction gains ocrText + a template placeholder + a clip function)
- Modify: `康康/Services/CaptureService.swift:137-161` (runVL injects OCR)
- Create: `康康Tests/VLPromptsOCRTests.swift`

- [x] **Step 7.1: Write the failing test VLPromptsOCRTests.swift**

```swift
import Testing
@testable import 康康

struct VLPromptsOCRTests {
    @Test func emptyOCRKeepsPromptClean() {
        let p = VLPrompts.reportExtraction(ocrText: "")
        #expect(!p.contains("OCR 参考文本"))
        #expect(!p.contains("{{OCR_SECTION}}"))
        #expect(p.contains("现在请识别图片并输出 JSON"))
    }

    @Test func ocrTextIsInjectedBeforeFinalInstruction() {
        let p = VLPrompts.reportExtraction(ocrText: "尿酸 486 208-428 μmol/L")
        #expect(p.contains("OCR 参考文本"))
        #expect(p.contains("尿酸 486"))
        let ocrPos = p.range(of: "尿酸 486")!.lowerBound
        let endPos = p.range(of: "现在请识别图片并输出 JSON")!.lowerBound
        #expect(ocrPos < endPos)
    }

    @Test func clipKeepsShortTextIntact() {
        #expect(VLPrompts.clipOCR("短文本") == "短文本")
    }

    @Test func clipCutsAtLineBoundary() {
        let long = Array(repeating: "指标行 1.23 mmol/L", count: 400).joined(separator: "\n")
        let clipped = VLPrompts.clipOCR(long, limit: 200)
        #expect(clipped.count < 260)
        #expect(clipped.hasSuffix("(后续内容过长已截断)"))
        #expect(!clipped.contains("\n指标行 1.23 mmol/L(后续")) // no partial line left behind
    }
}
```

- [x] **Step 7.2: Run the tests and confirm they fail**

- [x] **Step 7.3: Rework VLPrompts**

Change `reportExtraction` to:

```swift
static func reportExtraction(today: Date = .now, ocrText: String = "") -> String {
    let f = DateFormatter()
    f.locale = Locale(identifier: "en_US_POSIX")
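    f.dateFormat = "yyyy-MM-dd"
    let todayStr = f.string(from: today)

    // OCR reference section: Vision transcribes digits more reliably than the 2B multimodal
    // model reads dense small print; the image remains authoritative for layout.
    let ocrSection: String
    if ocrText.isEmpty {
        ocrSection = ""
    } else {
        ocrSection = """
        OCR 参考文本(系统对同一报告做文字识别的结果,可能有错字、串行或漏行;版面与表格结构以图片为准,但数值、小数点以 OCR 文字更可靠):
        \(clipOCR(ocrText))
        """
    }
    return reportExtractionTemplate
        .replacingOccurrences(of: "{{TODAY}}", with: todayStr)
        .replacingOccurrences(of: "{{OCR_SECTION}}", with: ocrSection)
}

/// Truncates OCR text to limit how much enters the prompt (the 2B model has limited context).
/// Cuts at the last complete line.
static func clipOCR(_ text: String, limit: Int = 1800) -> String {
    guard text.count > limit else { return text }
    let clipped = String(text.prefix(limit))
    if let lastNewline = clipped.lastIndex(of: "\n") {
        // The source is cut off after `clipped[..`. The rest of this function is reconstructed
        // from the Step 7.1 tests (line-boundary cut, truncation suffix) and is an assumption.
        return String(clipped[..<lastNewline]) + "\n(后续内容过长已截断)"
    }
    return clipped + "\n(后续内容过长已截断)"
}
```

[… text missing in the source here: the remainder of Step 7.3 through the `runVL` signature in `CaptureService.swift` …]

```swift
// Parameter list lost in the source; the body below reads `assets`.
func runVL(…) async throws -> ParsedReport {
    do {
        try await AIRuntime.shared.prepareVL()
    } catch {
        throw CaptureError.modelNotReady
    }
    let urls = assets.map { FileVault.shared.rootURL.appendingPathComponent($0.relativePath) }

    // OCR reference (local Vision, <1s per page): acts as a digit "transcriber" for the 2B
    // multimodal model and reduces misreads of small print.
    // Any failure silently falls back to an empty string and never blocks the main recognition flow (§3.2).
    let ocr = await Self.ocrReference(for: urls)

    let raw: String
    do {
        raw = try await AIRuntime.shared.analyzeReport(
            imageURLs: urls,
            prompt: VLPrompts.reportExtraction(ocrText: ocr)
        )
    } catch {
        throw CaptureError.inferenceFailed("\(error)")
    }
    do {
        return try CaptureService.parseReportJSON(raw, pageCount: assets.count)
    } catch let CaptureError.parseFailed(msg) {
        throw CaptureError.parseFailed(msg)
    } catch {
        throw CaptureError.parseFailed("\(error)")
    }
}

/// Runs OCR page by page over the report images in the Vault and joins the reference text.
/// At most 4 pages; returns "" on failure or empty text.
private static func ocrReference(for urls: [URL]) async -> String {
    var pages: [String] = []
    for (idx, url) in urls.prefix(4).enumerated() {
        guard let src = CGImageSourceCreateWithURL(url as CFURL, nil),
              let cg = CGImageSourceCreateImageAtIndex(src, 0, nil) else { continue }
        guard let text = try? await OCRService.recognizeText(in: cg),
              !text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty else { continue }
        pages.append(urls.count > 1 ? "【第 \(idx + 1) 页】\n\(text)" : text)
    }
    return pages.joined(separator: "\n")
}
```

Add `import ImageIO` to the import section at the top of the file (UIKit is already imported).

- [x] **Step 7.6: Run VLPromptsOCRTests + CaptureServiceJSONTests with no regressions + device build. Commit**

```bash
git add 康康/AI/Prompts/VLPrompts.swift 康康/Services/CaptureService.swift 康康Tests/VLPromptsOCRTests.swift
git commit -m "feat(Capture): 报告识别注入 Vision OCR 参考文本,提升 2B 多模态数字准确率"
```

---

### Task 8: MNN KV cache research document (user item 5b)

**Files:**
- Create: `docs/research/mnn-kv-cache-prefix.md`

- [x] **Step 8.1: Write the research document**

Key points (based on the actual header `Frameworks/MNN.xcframework/ios-arm64/MNN.framework/Headers/llm/llm.hpp`):

- Conclusion: the current MNN build already exposes prefix-cache capability, so the prefill result of each scenario's fixed prompt template can be cached.
- Evidence: `bool setPrefixCacheFile(const std::string&, int flag)` (llm.hpp:161, with the private members `mPrefixCacheMode`/`mPrefixLength`/`completePrefixWrite`), `bool reuse_kv()` (llm.hpp:171, a config switch), `void syncPromptCache(const ChatMessages&)` (llm.hpp:176).
- Applicability: every call in this project is a single-turn `response()` of the form "fixed template prefix + variable data suffix", which matches the prefix-cache model. Template sizes are ~900 tok for report recognition, ~700 tok for export, and ~300 tok for intent extraction; using the prefill rate measured by the performance self-check, the estimated saving is 1~3 s per call.
- Risks: the `flag` semantics are not commented in the header; the OMNI multimodal branch is unverified; cache files are tied to the model version and need invalidation handling.
- Recommendation: integrate during the W6 polish phase, after the performance self-check card has quantified the prefill share; run an on-device A/B with 3 runs each and compare `prefill_us`; on any anomaly, delete the cache file immediately and fall back. The current bottleneck is decode, so this ranks below C1/C2/Live Activity.

- [x] **Step 8.2: Commit**

```bash
git add docs/research/mnn-kv-cache-prefix.md
git commit -m "docs(AI): MNN prefix KV cache 调研 — setPrefixCacheFile 可用,建议 W6 量化后接入"
```

---

### Task 9: Final verification

- [x] **Step 9.1: Full unit test run**

```bash
xcodebuild test -project 康康.xcodeproj -scheme 康康 \
  -destination 'platform=iOS Simulator,name=iPhone 17' \
  -derivedDataPath build/cli-dd -only-testing:'康康Tests' 2>&1 | tail -30
```

Expected: all PASS, no regressions.

- [x] **Step 9.2: Device build (real MNN branch)**

```bash
xcodebuild build -project 康康.xcodeproj -scheme 康康 \
  -destination 'generic/platform=iOS' \
  -derivedDataPath build/cli-dd CODE_SIGNING_ALLOWED=NO 2>&1 | tail -15
```

Expected: BUILD SUCCEEDED, no new warnings.

- [ ] **Step 9.3: On-device verification checklist (left to the user; cannot be completed from the code side)**

1. Performance self-check card: run MNN and MLX once each; the comparison card shows two rows of data.
2. Q&A: after asking, the 「已找到 N 条记录」 chips appear first, then the answer streams in.
3. Archive a report → wait 1 minute without opening the detail page → open the detail page and the summary is already there (opens instantly).
4. Trend detail: the first visit computes on the spot; leaving and re-entering opens instantly (cache); adding a record triggers regeneration.
5. Photograph a multi-page lab report: compare numeric accuracy before and after OCR assistance.
6. Start a Q&A immediately while a background summary is generating: the Q&A jumps the queue with no noticeable delay, and the summary completes later.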
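
Checklist item 4 depends on the fingerprint behaviour from Task 6, and that part can be rehearsed off-device before the on-device pass. The following is a minimal stand-alone sketch, not the project code: `SamplePoint`, `sketchFingerprint`, `SketchInsightCache`, and `makePoints` are hypothetical stand-ins, and an in-memory dictionary replaces the UserDefaults store. It follows the same recipe as `TrendInsightService.fingerprint` for a single line.

```swift
import Foundation

// Hypothetical stand-in for SeriesBucket.Point; only the fields the fingerprint reads.
struct SamplePoint {
    let date: Date
    let value: Double
}

// Same recipe as Task 6, for one line: key + count + first/last timestamp + last/min/max value.
func sketchFingerprint(key: String, points: [SamplePoint]) -> String {
    let first = points.first.map { Int($0.date.timeIntervalSince1970) } ?? 0
    let last = points.last.map { Int($0.date.timeIntervalSince1970) } ?? 0
    let lastV = points.last?.value ?? 0
    let minV = points.map(\.value).min() ?? 0
    let maxV = points.map(\.value).max() ?? 0
    return "\(key)#\(points.count)#\(first)#\(last)#\(lastV)#\(minV)#\(maxV)"
}

// In-memory replacement for the UserDefaults-backed store (an assumption of this sketch).
struct SketchInsightCache {
    private var store: [String: (fingerprint: String, text: String)] = [:]

    // Returns the text only when the stored fingerprint matches the current one.
    func text(for key: String, fingerprint: String) -> String? {
        guard let hit = store[key], hit.fingerprint == fingerprint else { return nil }
        return hit.text
    }

    mutating func save(_ text: String, for key: String, fingerprint: String) {
        store[key] = (fingerprint, text)
    }
}

// One point per day starting at the Unix epoch, mirroring the test helper in Step 6.1.
func makePoints(_ values: [Double]) -> [SamplePoint] {
    values.enumerated().map { i, v in
        SamplePoint(date: Date(timeIntervalSince1970: Double(i) * 86_400), value: v)
    }
}

var cache = SketchInsightCache()
let key = "glucose.fasting"
let before = sketchFingerprint(key: key, points: makePoints([5.2, 5.5]))
cache.save("placeholder insight", for: key, fingerprint: before)

// Re-entering with unchanged data hits the cache.
let hit = cache.text(for: key, fingerprint: before)

// Adding a record changes the fingerprint, so the lookup misses and the UI would regenerate.
let after = sketchFingerprint(key: key, points: makePoints([5.2, 5.5, 6.0]))
let miss = cache.text(for: key, fingerprint: after)

print(hit ?? "nil", miss ?? "nil")
```

One thing worth checking on device under this recipe: editing a middle point that is neither the minimum nor the maximum leaves the count, endpoints, last value, and extremes unchanged, so the cached text would still be served.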