kangkang/docs/superpowers/plans/2026-06-10-competition-optimizations.md
2026-06-10 07:13:24 +08:00

# Five Competition Optimizations: Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
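**Goal:** Implement five confirmed competition optimizations: ① a performance self-test card (SME2 evidence); ② Q&A retrieval visualization; ③ report-summary pre-generation plus a trend-insight cache; ④ OCR-text-assisted report recognition (Qwen3.5-2B multimodal; QwenVL-4B is deprecated); ⑤ an AIRuntime priority gate plus an MNN KV cache investigation.
**Architecture:** The module boundaries in §3.1 (UI → Service → AIRuntime) stay unchanged. AIRuntime gains a `GenerateStats` type normalized across both backends and a cooperative priority gate (interactive requests jump the queue; background requests yield at the next token). The HealthExportService event stream gains `.retrieved(RetrievalSummary)`. Three lightweight services are added: BenchmarkService, ReportInsightService, and TrendInsightService. OCR reference text is injected into the VL prompt. **Note: visual inference is now handled by the multimodal Qwen3.5-2B (MNN Omni as the primary path, MLX VLSession as the fallback, both loaded from the `.llm`/`.mnnLLM` directories); there is no separate VL model.**
**Tech Stack:** SwiftUI + SwiftData (iOS 17+), MNN (ObjC++ bridge), MLX Swift (mlx-swift-lm 2.31.3, `GenerateCompletionInfo`), Vision OCR, Swift Testing (康康Tests).
**User item → task mapping:** user item 1 → Tasks 1+2; item 2 → Task 3; item 5 → Tasks 4+8; item 3 → Tasks 5+6; item 4 → Task 7. Task 4 (the priority gate) comes early because the background pre-generation in Task 5 depends on `priority: .background`.
**Build/test commands (shared by all tasks):**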
```bash
# Unit tests (simulator; on first run, confirm the device name with: xcrun simctl list devices available | grep iPhone)
cd /Users/xuhuayong/apps/康康
xcodebuild test -project 康康.xcodeproj -scheme 康康 \
-destination 'platform=iOS Simulator,name=iPhone 17' \
-derivedDataPath build/cli-dd \
-only-testing:'康康Tests/<TestClass>' 2>&1 | tail -25
# Device build check (the real MNNLLMBridge branch only compiles for the device slice)
xcodebuild build -project 康康.xcodeproj -scheme 康康 \
-destination 'generic/platform=iOS' \
-derivedDataPath build/cli-dd CODE_SIGNING_ALLOWED=NO 2>&1 | tail -15
```
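**Red lines (keep in mind before writing any line of code):** never use the word 「患者」 ("patient"); all new UI font sizes go through `Font.tjScaled`; colors come only from `Tj.Palette.*`; the UI never calls AIRuntime directly (the self-test page is an existing exception; new code goes through a Service); every prompt carries few-shot examples, `/no_think`, and a failure fallback; do not touch Localizable.xcstrings (git status already shows uncommitted changes there; leave them alone).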
---
### Task 1: GenerateStats, statistics normalized across both backends
**Files:**
- Create: `康康/AI/GenerateStats.swift`
- Modify: `康康/AI/MNNBackend.swift`
- Modify: `康康/AI/LLMSession.swift`
- Modify: `康康/AI/AIRuntime.swift`
- [x] **Step 1.1: Create GenerateStats.swift**
```swift
import Foundation
/// Per-generation statistics, normalized across both backends (MNN / MLX).
/// MNN values come from LlmContext (prefill_us / decode_us); MLX values come from GenerateCompletionInfo.
struct GenerateStats: Sendable, Equatable {
var promptTokens: Int
var genTokens: Int
/// Prefill time (reading the prompt), in seconds.
var prefillSeconds: Double
/// Decode time (generating tokens), in seconds.
var decodeSeconds: Double
var prefillTokensPerSecond: Double {
prefillSeconds > 0 ? Double(promptTokens) / prefillSeconds : 0
}
var decodeTokensPerSecond: Double {
decodeSeconds > 0 ? Double(genTokens) / decodeSeconds : 0
}
}
```
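As a quick sanity check of the two computed properties, the sketch below uses a local copy of the struct so it runs on its own. The numbers are illustrative, not measurements.

```swift
import Foundation

// Local copy of the struct from Step 1.1 so this snippet is self-contained.
struct GenerateStats: Sendable, Equatable {
    var promptTokens: Int
    var genTokens: Int
    var prefillSeconds: Double
    var decodeSeconds: Double
    var prefillTokensPerSecond: Double {
        prefillSeconds > 0 ? Double(promptTokens) / prefillSeconds : 0
    }
    var decodeTokensPerSecond: Double {
        decodeSeconds > 0 ? Double(genTokens) / decodeSeconds : 0
    }
}

// Illustrative values only.
let sample = GenerateStats(promptTokens: 30, genTokens: 80,
                           prefillSeconds: 0.25, decodeSeconds: 4.0)
print(sample.prefillTokensPerSecond) // 30 tokens / 0.25 s
print(sample.decodeTokensPerSecond)  // 80 tokens / 4.0 s

// A backend that reports no timing yields 0 instead of dividing by zero.
let empty = GenerateStats(promptTokens: 0, genTokens: 0,
                          prefillSeconds: 0, decodeSeconds: 0)
print(empty.decodeTokensPerSecond)
```

Returning 0 for a missing timing lets the UI decide how to render "no data" without special-casing NaN or infinity.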
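- [x] **Step 1.2: MNNBackend captures statistics**
Add state and a method to the actor:
```swift
/// Statistics of the most recent generation (read by AIRuntime).
private(set) var lastStats: GenerateStats?
private func record(_ s: GenerateStats) { lastStats = s }
```
Change the detached Task in `generate` as follows (MNNGenerateStats is an ObjC object, so extract it into the Sendable GenerateStats before crossing the actor boundary):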
```swift
let task = Task.detached(priority: .userInitiated) {
let stats = box.value.generateText(prompt, maxTokens: Int32(maxTokens)) { piece in
let rate = meter.tick()
continuation.yield(TokenChunk(text: piece, decodeRate: rate))
}
await self.record(GenerateStats(
promptTokens: Int(stats.promptTokens),
genTokens: Int(stats.genTokens),
prefillSeconds: stats.prefillMs / 1000.0,
decodeSeconds: stats.decodeMs / 1000.0
))
continuation.finish()
}
```
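Do the same in `analyze`: instead of `_ = try box.value.analyzeImages(...)`, capture the returned stats and call `await self.record(...)` alongside `cont.resume(returning:)` (note that analyzeImages returns an optional, so unwrap with `if let s = ...` before recording).
- [x] **Step 1.3: LLMSession captures the `.info` statistics**
Add to the actor:
```swift
/// Statistics of the most recent generation (taken from the `.info` event).
private(set) var lastStats: GenerateStats?
private func record(_ s: GenerateStats) { lastStats = s }
```
Change the `.info` branch of the switch inside `generate` to:
```swift
case .info(let info):
// Record the normalized statistics from the completion info.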
await self.record(GenerateStats(
promptTokens: info.promptTokenCount,
genTokens: info.generationTokenCount,
prefillSeconds: info.promptTime,
decodeSeconds: info.generateTime
))
```
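- [x] **Step 1.4: AIRuntime exposes the statistics and a backend label**
Add to the actor:
```swift
/// Statistics of the most recent generation.
private(set) var lastGenerateStats: GenerateStats?
/// Label of the active backend (for the self-test card / PPT).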
var activeBackendLabel: String {
if InferenceEngine.current == .mnn, mnnStatus == .ready {
return InferenceEngine.cpuSupportsSME2 ? "MNN · SME2" : "MNN · NEON"
}
#if targetEnvironment(simulator)
return "MLX · CPU(模拟器)"
#else
return "MLX · GPU"
#endif
}
```
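The label decision can be read as a small truth table. The sketch below is a hypothetical pure-function version (the parameter names are assumptions; the real property reads `InferenceEngine.current`, `mnnStatus`, and a compile-time simulator check), which makes the fallback explicit: MNN selected but not ready reports the MLX label.

```swift
// Hypothetical pure version of the label decision, for illustration only.
func backendLabel(engineIsMNN: Bool, mnnReady: Bool,
                  cpuSupportsSME2: Bool, isSimulator: Bool) -> String {
    if engineIsMNN, mnnReady {
        return cpuSupportsSME2 ? "MNN · SME2" : "MNN · NEON"
    }
    // Either MLX is selected or MNN is not ready yet, so MLX serves the request.
    return isSimulator ? "MLX · CPU(模拟器)" : "MLX · GPU"
}

print(backendLabel(engineIsMNN: true, mnnReady: true, cpuSupportsSME2: true, isSimulator: false))
print(backendLabel(engineIsMNN: true, mnnReady: false, cpuSupportsSME2: true, isSimulator: false))
```

Because the label doubles as the storage key for benchmark results, a run taken while MNN is still loading would be filed under the MLX entry.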
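In the MLX branch of `generate`, after the `for try await` loop and before `continuation.finish()`, add:
```swift
self.lastGenerateStats = await session.lastStats
```
Add the same at the corresponding position in `mnnGenerate`:
```swift
self.lastGenerateStats = await self.mnn.lastStats
```
- [x] **Step 1.5: Run the device build check (command at the top) and confirm there are no errors. Commit**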
```bash
git add 康康/AI/GenerateStats.swift 康康/AI/MNNBackend.swift 康康/AI/LLMSession.swift 康康/AI/AIRuntime.swift
git commit -m "feat(AI): 两后端归一的 GenerateStats(prefill/decode 实测统计)"
```
---
### Task 2: Performance self-test card (user item 1)
**Files:**
- Create: `康康/Services/BenchmarkService.swift`
- Create: `康康Tests/BenchmarkStoreTests.swift`
- Modify: `康康/Features/Me/ModelSelfTestView.swift` (full rework)
- Modify: `康康/Features/Me/ModelManagementView.swift:31-42` (entry condition + copy)
- [x] **Step 2.1: Write the failing test BenchmarkStoreTests.swift**
```swift
import Testing
import Foundation
@testable import 康康
struct BenchmarkStoreTests {
private func freshDefaults() -> UserDefaults {
let d = UserDefaults(suiteName: "test.kk.benchmark")!
d.removePersistentDomain(forName: "test.kk.benchmark")
return d
}
@Test func savesAndLoadsPerBackend() {
let d = freshDefaults()
let mnn = BenchmarkResult(backendLabel: "MNN · SME2", promptTokens: 30, genTokens: 80,
prefillTokensPerSecond: 120, decodeTokensPerSecond: 25,
totalSeconds: 4.2, date: .now)
let mlx = BenchmarkResult(backendLabel: "MLX · GPU", promptTokens: 30, genTokens: 80,
prefillTokensPerSecond: 300, decodeTokensPerSecond: 40,
totalSeconds: 2.5, date: .now)
BenchmarkService.save(mnn, defaults: d)
BenchmarkService.save(mlx, defaults: d)
let all = BenchmarkService.load(defaults: d)
#expect(all.count == 2)
#expect(all["MNN · SME2"]?.decodeTokensPerSecond == 25)
}
@Test func overwritesSameBackend() {
let d = freshDefaults()
let old = BenchmarkResult(backendLabel: "MLX · GPU", promptTokens: 1, genTokens: 1,
prefillTokensPerSecond: 1, decodeTokensPerSecond: 1,
totalSeconds: 1, date: .now)
var new = old; new.decodeTokensPerSecond = 99
BenchmarkService.save(old, defaults: d)
BenchmarkService.save(new, defaults: d)
#expect(BenchmarkService.load(defaults: d)["MLX · GPU"]?.decodeTokensPerSecond == 99)
}
@Test func loadOnEmptyReturnsEmpty() {
#expect(BenchmarkService.load(defaults: freshDefaults()).isEmpty)
}
}
```
- [x] **Step 2.2: Run the tests and confirm the build fails (BenchmarkService does not exist yet)**
- [x] **Step 2.3: Create BenchmarkService.swift**
```swift
import Foundation
/// Result of one self-test run, used to compare MNN·SME2 vs MLX·GPU (§12 2/6).
struct BenchmarkResult: Codable, Equatable {
var backendLabel: String
var promptTokens: Int
var genTokens: Int
var prefillTokensPerSecond: Double
var decodeTokensPerSecond: Double
var totalSeconds: Double
var date: Date
}
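/// Performance self-test: runs a fixed prompt, reads the statistics from AIRuntime, and stores the result in UserDefaults.
/// Keeps the UI (ModelSelfTestView) from calling AIRuntime directly (§3.1).
@MainActor
struct BenchmarkService {
static let shared = BenchmarkService()
private init() {}
static let storeKey = "kk.benchmark.results"
/// Fixed prompt, so that runs are comparable.
static let fixedPrompt = "用中文一句话介绍肝功能里 ALT 这个指标。"
/// `onToken` streams each piece and the current decode rate to the UI.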
func run(onToken: @escaping @MainActor (String, Double) -> Void) async throws -> BenchmarkResult {
try await AIRuntime.shared.prepare()
let start = Date()
let stream = await AIRuntime.shared.generate(prompt: Self.fixedPrompt, maxTokens: 128)
for try await chunk in stream {
onToken(chunk.text, chunk.decodeRate)
}
let total = Date().timeIntervalSince(start)
let label = await AIRuntime.shared.activeBackendLabel
let stats = await AIRuntime.shared.lastGenerateStats
let result = BenchmarkResult(
backendLabel: label,
promptTokens: stats?.promptTokens ?? 0,
genTokens: stats?.genTokens ?? 0,
prefillTokensPerSecond: stats?.prefillTokensPerSecond ?? 0,
decodeTokensPerSecond: stats?.decodeTokensPerSecond ?? 0,
totalSeconds: total,
date: .now
)
Self.save(result)
return result
}
// MARK: - Persistence (static, with injectable defaults for tests)
static func save(_ result: BenchmarkResult, defaults: UserDefaults = .standard) {
var all = load(defaults: defaults)
all[result.backendLabel] = result
if let data = try? JSONEncoder().encode(all) {
defaults.set(data, forKey: storeKey)
}
}
static func load(defaults: UserDefaults = .standard) -> [String: BenchmarkResult] {
guard let data = defaults.data(forKey: storeKey),
let all = try? JSONDecoder().decode([String: BenchmarkResult].self, from: data) else {
return [:]
}
return all
}
}
```
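- [x] **Step 2.4: Run the tests and confirm they pass**
- [x] **Step 2.5: Rework ModelSelfTestView**
Keep the original skeleton (prompt card, status row, output box). Changes: `run()` now goes through BenchmarkService; add a current-result card (backend badge + prefill/decode tok/s + total time) and a history comparison card (one row per backend plus a hint that switching engines and running again enables a comparison); wrap everything in a ScrollView; change the title to 「性能自检」. Full code:
```swift
import SwiftUI
/// Performance self-test: runs a fixed prompt and shows the active backend (MNN·SME2 / MNN·NEON / MLX·GPU)
/// with measured prefill / decode speeds, plus a per-engine comparison (§12 2/6).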
struct ModelSelfTestView: View {
@State private var output = ""
@State private var phase: Phase = .idle
@State private var rate: Double = 0
@State private var lastResult: BenchmarkResult?
@State private var history: [String: BenchmarkResult] = [:]
private enum Phase: Equatable {
case idle, loading, running, done, failed(String)
var label: String {
switch self {
case .idle: return String(appLoc: "未开始")
case .loading: return String(appLoc: "加载模型…")
case .running: return String(appLoc: "推理中…")
case .done: return String(appLoc: "完成 ✓")
case .failed(let m): return String(appLoc: "失败:\(m)")
}
}
}
private var isBusy: Bool { phase == .loading || phase == .running }
private var statusColor: Color {
switch phase {
case .failed: return Tj.Palette.brick
case .done: return Tj.Palette.leaf
default: return Tj.Palette.text2
}
}
var body: some View {
ScrollView {
VStack(alignment: .leading, spacing: 16) {
promptCard
HStack {
Text(phase.label)
.font(.tjScaled( 13, weight: .medium))
.foregroundStyle(statusColor)
.lineLimit(1)
Spacer()
if rate > 0 {
Text(String(format: "%.1f tok/s", rate))
.font(.tjScaled( 12, design: .monospaced))
.foregroundStyle(Tj.Palette.text3)
}
}
Button {
Task { await run() }
} label: {
Text(isBusy ? "运行中…" : "运行性能自检").frame(maxWidth: .infinity)
}
.buttonStyle(TjPrimaryButton())
.disabled(isBusy)
if isBusy { AIFlowBar() }
if let r = lastResult { statsCard(r) }
outputCard
if !history.isEmpty { historyCard }
}
.padding(16)
}
.background(Tj.Palette.sand.ignoresSafeArea())
.navigationTitle("性能自检")
.navigationBarTitleDisplayMode(.inline)
.onAppear { history = BenchmarkService.load() }
}
private var promptCard: some View {
VStack(alignment: .leading, spacing: 6) {
Text("测试 PROMPT")
.font(.tjScaled( 11, weight: .semibold))
.tracking(0.5)
.foregroundStyle(Tj.Palette.text3)
Text(BenchmarkService.fixedPrompt)
.font(.tjScaled( 14))
.foregroundStyle(Tj.Palette.text)
}
.padding(14)
.frame(maxWidth: .infinity, alignment: .leading)
.tjCard()
}
private func statsCard(_ r: BenchmarkResult) -> some View {
VStack(alignment: .leading, spacing: 10) {
HStack {
Text("本次结果")
.font(.tjScaled( 12, weight: .semibold))
.foregroundStyle(Tj.Palette.text2)
Spacer()
TjBadge(text: r.backendLabel, style: .leaf)
}
HStack(spacing: 0) {
metric(String(appLoc: "读入"), r.prefillTokensPerSecond > 0
? String(format: "%.0f tok/s", r.prefillTokensPerSecond) : "")
metric(String(appLoc: "生成"), String(format: "%.1f tok/s", r.decodeTokensPerSecond))
metric(String(appLoc: "总耗时"), String(format: "%.1fs", r.totalSeconds))
}
Text(String(appLoc: "prompt \(r.promptTokens) tok · 生成 \(r.genTokens) tok · 100% 本地"))
.font(.tjScaled( 10, design: .monospaced))
.foregroundStyle(Tj.Palette.text3)
}
.padding(14)
.frame(maxWidth: .infinity, alignment: .leading)
.tjCard()
}
private func metric(_ label: String, _ value: String) -> some View {
VStack(spacing: 3) {
Text(value)
.font(.tjScaled( 15, weight: .semibold, design: .monospaced))
.foregroundStyle(Tj.Palette.text)
Text(label)
.font(.tjScaled( 10))
.foregroundStyle(Tj.Palette.text3)
}
.frame(maxWidth: .infinity)
}
private var outputCard: some View {
ScrollView {
Text(output.isEmpty ? "(暂无输出)" : output)
.font(.system(.footnote, design: .monospaced))
.foregroundStyle(Tj.Palette.text)
.frame(maxWidth: .infinity, alignment: .leading)
.textSelection(.enabled)
.padding(12)
}
.frame(maxHeight: 220)
.background(
RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
.fill(Tj.Palette.paper)
)
.overlay(
RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
.strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
)
}
private var historyCard: some View {
VStack(alignment: .leading, spacing: 10) {
Text("各引擎实测对比")
.font(.tjScaled( 12, weight: .semibold))
.foregroundStyle(Tj.Palette.text2)
ForEach(history.keys.sorted(), id: \.self) { key in
if let r = history[key] {
HStack {
Text(key)
.font(.tjScaled( 12, weight: .medium))
.foregroundStyle(Tj.Palette.text)
Spacer()
Text(String(format: "生成 %.1f tok/s", r.decodeTokensPerSecond))
.font(.tjScaled( 12, design: .monospaced))
.foregroundStyle(Tj.Palette.leaf)
Text(r.date.formatted(.dateTime.month().day()))
.font(.tjScaled( 10))
.foregroundStyle(Tj.Palette.text3)
}
}
}
Text("在「我的 · 推理引擎」切换引擎后再跑一次,即可对比 SME2 与 GPU。")
.font(.tjScaled( 10))
.foregroundStyle(Tj.Palette.text3)
}
.padding(14)
.frame(maxWidth: .infinity, alignment: .leading)
.tjCard()
}
@MainActor
private func run() async {
output = ""
rate = 0
lastResult = nil
phase = .loading
do {
let result = try await BenchmarkService.shared.run { piece, r in
output += piece
if r > 0 { rate = r }
if phase == .loading { phase = .running }
}
lastResult = result
history = BenchmarkService.load()
phase = .done
} catch {
phase = .failed(error.localizedDescription)
}
}
}
#Preview {
NavigationStack { ModelSelfTestView() }
}
```
- [x] **Step 2.6: Relax the entry condition in ModelManagementView and update the copy**
`ModelManagementView.swift:31`:
```swift
if service.states[.mnnLLM]?.phase == .ready {
```
becomes
```swift
if service.states[.mnnLLM]?.phase == .ready || service.states[.llm]?.phase == .ready {
```
Change `Text("运行推理自检")` to `Text("性能自检")` and the icon `"play.circle"` to `"gauge.with.needle"`.
- [x] **Step 2.7: Run BenchmarkStoreTests and confirm the simulator build passes. Commit**
```bash
git add 康康/Services/BenchmarkService.swift 康康Tests/BenchmarkStoreTests.swift 康康/Features/Me/ModelSelfTestView.swift 康康/Features/Me/ModelManagementView.swift
git commit -m "feat(Me): 性能自检卡 — 后端标识 + prefill/decode 实测 + 引擎对比存档"
```
---
### Task 3: Retrieval visualization (user item 2)
**Files:**
- Modify: `康康/Services/HealthExportService.swift` (RetrievalSummary + Event.retrieved + event-based `answer`)
- Modify: `康康/Features/Archive/HealthExportSheet.swift` (chips UI)
- Create: `康康Tests/RetrievalSummaryTests.swift`
- [x] **Step 3.1: Write the failing test RetrievalSummaryTests.swift**
```swift
import Testing
@testable import 康康
struct RetrievalSummaryTests {
@Test func groupsAndCountsPreservingOrder() {
let chips = HealthExportService.RetrievalSummary.groupedChips(
["血压", "血糖", "血压", "血压", "体重"], cap: 8)
#expect(chips == ["血压 ×3", "血糖", "体重"])
}
@Test func capsAndAppendsOverflow() {
let names = (1...12).map { "指标\($0)" }
let chips = HealthExportService.RetrievalSummary.groupedChips(names, cap: 8)
#expect(chips.count == 9)
#expect(chips.last == "+4")
}
@Test func emptyInputGivesEmptyChips() {
#expect(HealthExportService.RetrievalSummary.groupedChips([], cap: 8).isEmpty)
}
}
```
- [x] **Step 3.2: Run the tests and confirm the build fails**
- [x] **Step 3.3: Add RetrievalSummary and the Event case to HealthExportService**
Add above `enum Event`:
```swift
/// Summary of the RAG retrieval hits, shown in the UI (§12 3).
struct RetrievalSummary: Sendable, Equatable {
var chips: [String]
var indicatorCount: Int
var reportCount: Int
var symptomCount: Int
var diaryCount: Int
var totalCount: Int { indicatorCount + reportCount + symptomCount + diaryCount }
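/// Groups names in order of first appearance (repeats become "name ×N"); anything beyond `cap` collapses into a trailing "+N".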
static func groupedChips(_ names: [String], cap: Int = 8) -> [String] {
var order: [String] = []
var counts: [String: Int] = [:]
for n in names {
if counts[n] == nil { order.append(n) }
counts[n, default: 0] += 1
}
var chips = order.map { name -> String in
let c = counts[name] ?? 1
return c > 1 ? "\(name) ×\(c)" : name
}
if chips.count > cap {
let overflow = chips.count - cap
chips = Array(chips.prefix(cap)) + ["+\(overflow)"]
}
return chips
}
@MainActor
static func from(snapshot: Snapshot) -> RetrievalSummary {
var chips = groupedChips(snapshot.indicators.map(\.name), cap: 8)
chips += snapshot.reports.prefix(3).map(\.title)
chips += snapshot.symptoms.prefix(3).map(\.name)
if !snapshot.diaries.isEmpty {
chips.append(String(appLoc: "日记 ×\(snapshot.diaries.count)"))
}
return RetrievalSummary(
chips: chips,
indicatorCount: snapshot.indicators.count,
reportCount: snapshot.reports.count,
symptomCount: snapshot.symptoms.count,
diaryCount: snapshot.diaries.count
)
}
}
```
Add a case to `enum Event` (after phaseChanged):
```swift
case retrieved(RetrievalSummary)
```
- [x] **Step 3.4: All three flows yield `.retrieved`**
`export(prompt:in:)`: after `let snapshot = Self.retrieve(...)` and before `try Task.checkCancellation()`, add:
```swift
continuation.yield(.retrieved(RetrievalSummary.from(snapshot: snapshot)))
```
`export(conversation:in:)`: add the same line after `let snapshot = Self.retrieveDialogueSnapshot(...)`.
`answer(question:conversation:in:)`: change the return type from `AsyncThrowingStream<TokenChunk, Error>` to `AsyncThrowingStream<Event, Error>`; after `let snapshot = ...`, add `continuation.yield(.retrieved(RetrievalSummary.from(snapshot: snapshot)))`; inside the loop, change `continuation.yield(TokenChunk(...))` to `continuation.yield(.token(TokenChunk(...)))`.
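The shape of the event-based stream can be sketched in isolation. The types below are hypothetical, trimmed-down stand-ins for the service types (only two event cases, no model call); the point is the ordering: the retrieval summary is yielded once, before any token, so the consumer can show chips while generation is still starting.

```swift
// Hypothetical stand-ins for the real service types.
struct TokenChunk: Sendable { var text: String }
struct RetrievalSummary: Sendable, Equatable { var chips: [String] }
enum Event: Sendable {
    case retrieved(RetrievalSummary)
    case token(TokenChunk)
}

// Producer: retrieval result first, then the generated pieces.
func answer() -> AsyncThrowingStream<Event, Error> {
    AsyncThrowingStream { continuation in
        continuation.yield(.retrieved(RetrievalSummary(chips: ["血压 ×3", "血糖"])))
        for piece in ["你的", "血压", "平稳"] {
            continuation.yield(.token(TokenChunk(text: piece)))
        }
        continuation.finish()
    }
}

// Consumer: one switch handles both kinds of event.
func collect() async throws -> (chips: [String], text: String) {
    var chips: [String] = []
    var text = ""
    for try await event in answer() {
        switch event {
        case .retrieved(let summary): chips = summary.chips
        case .token(let chunk): text += chunk.text
        }
    }
    return (chips, text)
}

let result = try await collect()
print(result.chips, result.text)
```

Changing the element type forces every caller of `answer` to switch over the event, which is why the sheet's consuming loop has to change in the same commit.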
- [x] **Step 3.5: HealthExportSheet consumes the events and shows the chips UI**
New state:
```swift
@State private var retrieval: HealthExportService.RetrievalSummary?
@State private var turnRetrievals: [UUID: HealthExportService.RetrievalSummary] = [:]
```
Change the consuming loop in `sendQuestion()` to:
```swift
task = Task { @MainActor in
do {
for try await event in stream {
switch event {
case .retrieved(let summary):
withAnimation(.snappy(duration: 0.25)) {
turnRetrievals[assistantTurn.id] = summary
}
case .token(let chunk):
appendToTurn(id: assistantTurn.id, text: chunk.text)
if chunk.decodeRate > 0 { rate = chunk.decodeRate }
case .phaseChanged, .completed:
break
}
}
answeringTurnID = nil
questionFocused = true
} catch {
answeringTurnID = nil
appendToTurn(id: assistantTurn.id, text: error.localizedDescription)
questionFocused = true
}
}
```
At the start of `startReportGeneration()`, set `retrieval = nil`, and add to the event loop:
```swift
case .retrieved(let summary):
withAnimation(.snappy(duration: 0.25)) { retrieval = summary }
```
In `stopGeneration()` and `reset()`, add `retrieval = nil` (`reset` additionally sets `turnRetrievals = [:]`).
In the inner VStack of `dialogueBubble` (after the role label), insert:
```swift
if !isUser, let summary = turnRetrievals[turn.id] {
RetrievalChipsView(summary: summary)
}
```
Also switch the copy in the empty-text waiting area depending on whether a summary has already arrived:
```swift
if turn.id == answeringTurnID && turn.text.isEmpty {
VStack(alignment: .leading, spacing: 8) {
Text(turnRetrievals[turn.id] == nil
? "正在查看本地记录…"
: "正在根据这些记录回答…")
.font(.tjScaled( 13))
.foregroundStyle(Tj.Palette.text3)
AIFlowBar()
}
} else {
```
In the VStack of `phaseIndicator` (after the pills row), insert:
```swift
if let retrieval {
RetrievalChipsView(summary: retrieval)
}
```
At the bottom of the file (before MarkdownView), add the component:
```swift
// MARK: - Retrieval chips (RAG hits)
private struct RetrievalChipsView: View {
let summary: HealthExportService.RetrievalSummary
var body: some View {
VStack(alignment: .leading, spacing: 6) {
if summary.totalCount == 0 {
Text("本地档案中暂无相关记录,将仅按你的描述整理")
.font(.tjScaled( 11))
.foregroundStyle(Tj.Palette.text3)
} else {
Text(String(appLoc: "已在本地档案中找到 \(summary.totalCount) 条相关记录"))
.font(.tjScaled( 11, weight: .medium))
.foregroundStyle(Tj.Palette.leaf)
ScrollView(.horizontal, showsIndicators: false) {
HStack(spacing: 6) {
ForEach(Array(summary.chips.enumerated()), id: \.offset) { _, chip in
Text(chip)
.font(.tjScaled( 11))
.foregroundStyle(Tj.Palette.text2)
.lineLimit(1)
.padding(.horizontal, 8)
.padding(.vertical, 4)
.background(Capsule().fill(Tj.Palette.sand2))
.overlay(Capsule().strokeBorder(Tj.Palette.lineSoft, lineWidth: 1))
}
}
.padding(.vertical, 1)
}
}
}
.transition(.opacity.combined(with: .move(edge: .top)))
}
}
```
- [x] **Step 3.6: Run RetrievalSummaryTests plus the existing HealthExport tests; all pass. Commit**
```bash
git add 康康/Services/HealthExportService.swift 康康/Features/Archive/HealthExportSheet.swift 康康Tests/RetrievalSummaryTests.swift
git commit -m "feat(Ask): 检索过程可视化 — RAG 命中记录以 chips 展示,生成前先看见"
```
---
### Task 4: AIRuntime priority gate (user item 5a)
**Files:**
- Modify: `康康/AI/AIRuntime.swift` (gate rework + `generate` signature)
- Create: `康康Tests/InferencePriorityTests.swift`
- [x] **Step 4.1: Write the failing test**
```swift
import Testing
@testable import 康康
struct InferencePriorityTests {
@Test func interactiveJumpsAheadOfBackground() {
let idx = AIRuntime.gateInsertionIndex(of: .interactive,
in: [.interactive, .background, .background])
#expect(idx == 1)
}
@Test func interactiveKeepsFIFOAmongInteractive() {
let idx = AIRuntime.gateInsertionIndex(of: .interactive,
in: [.interactive, .interactive])
#expect(idx == 2)
}
@Test func backgroundAlwaysAppends() {
let idx = AIRuntime.gateInsertionIndex(of: .background,
in: [.interactive, .background])
#expect(idx == 2)
}
@Test func emptyQueueInsertsAtZero() {
#expect(AIRuntime.gateInsertionIndex(of: .interactive, in: []) == 0)
#expect(AIRuntime.gateInsertionIndex(of: .background, in: []) == 0)
}
}
```
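- [x] **Step 4.2: Run the tests and confirm the build fails**
- [x] **Step 4.3: Rework the AIRuntime gate**
At the top of the file (outside the actor), add:
```swift
/// interactive = work the user is actively waiting on (Q&A / generation / recognition);
/// background = pre-generation work, which yields to interactive requests.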
nonisolated enum InferencePriority: Sendable, Equatable {
case interactive
case background
}
```
Gate section (replaces the original `gateBusy`/`gateWaiters`/`acquireGate`/`releaseGate`; keep the body of the original comments and extend them):
```swift
private struct GateWaiter {
let priority: InferencePriority
let cont: CheckedContinuation<Void, Never>
}
private var gateBusy = false
private var gateHolderPriority: InferencePriority = .interactive
private var preemptRequested = false
private var gateWaiters: [GateWaiter] = []
/// interactive is inserted ahead of every waiting background request; within the same priority the order stays FIFO.
nonisolated static func gateInsertionIndex(of priority: InferencePriority,
in waiting: [InferencePriority]) -> Int {
guard priority == .interactive else { return waiting.count }
return waiting.firstIndex(of: .background) ?? waiting.count
}
private func acquireGate(_ priority: InferencePriority = .interactive) async {
if !gateBusy {
gateBusy = true
gateHolderPriority = priority
return
}
// Request preemption: the background holder throws CancellationError at its next token.
if priority == .interactive, gateHolderPriority == .background {
preemptRequested = true
}
await withCheckedContinuation { (cont: CheckedContinuation<Void, Never>) in
let idx = Self.gateInsertionIndex(of: priority, in: gateWaiters.map(\.priority))
gateWaiters.insert(GateWaiter(priority: priority, cont: cont), at: idx)
}
// Resumed by releaseGate, which hands the gate over directly (gateBusy stays true).
}
private func releaseGate() {
preemptRequested = false
if gateWaiters.isEmpty {
gateBusy = false
} else {
// Hand over directly: gateBusy stays true, so no newcomer can slip in between.
let next = gateWaiters.removeFirst()
gateHolderPriority = next.priority
next.cont.resume()
}
}
/// Checked at every token: should the current generation yield?
private func shouldPreempt(_ priority: InferencePriority) -> Bool {
priority == .background && preemptRequested
}
```
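To see how the insertion rule shapes the waiting queue, the sketch below replays a mixed sequence of arrivals. `gateInsertionIndex` is copied from the step above; the `Waiter` struct and the `replay` helper are hypothetical and exist only for this illustration.

```swift
enum InferencePriority: Equatable { case interactive, background }

// Same rule as gateInsertionIndex above, copied so the snippet runs on its own.
func gateInsertionIndex(of priority: InferencePriority,
                        in waiting: [InferencePriority]) -> Int {
    guard priority == .interactive else { return waiting.count }
    return waiting.firstIndex(of: .background) ?? waiting.count
}

// Hypothetical waiter record: a label for printing plus its priority.
struct Waiter {
    let label: String
    let priority: InferencePriority
}

// Replays arrivals into a waiting queue and returns the resulting order.
func replay(_ arrivals: [Waiter]) -> [String] {
    var queue: [Waiter] = []
    for waiter in arrivals {
        let index = gateInsertionIndex(of: waiter.priority,
                                       in: queue.map { $0.priority })
        queue.insert(waiter, at: index)
    }
    return queue.map { $0.label }
}

let order = replay([
    Waiter(label: "bg1", priority: .background),
    Waiter(label: "ask1", priority: .interactive),
    Waiter(label: "bg2", priority: .background),
    Waiter(label: "ask2", priority: .interactive),
])
print(order) // interactive requests first, FIFO inside each priority
```

Since interactive waiters always sit in front of background ones, the first background entry marks exactly the spot where the next interactive request belongs.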
- [x] **Step 4.4: Add a `priority` parameter to `generate`, plus the preemption check**
`generate` signature:
```swift
func generate(prompt: String,
maxTokens: Int = 256,
priority: InferencePriority = .interactive) -> AsyncThrowingStream<TokenChunk, Error> {
if InferenceEngine.current == .mnn, mnnStatus == .ready {
return mnnGenerate(prompt: prompt, maxTokens: maxTokens, priority: priority)
}
```
In the Task body of the MLX branch, change `await self.acquireGate()` to `await self.acquireGate(priority)`; inside the loop, after `try Task.checkCancellation()`, add:
```swift
if self.shouldPreempt(priority) { throw CancellationError() }
```
Split the catch so that cancellation and preemption propagate as CancellationError, which callers can tell apart from other failures:
```swift
} catch is CancellationError {
continuation.finish(throwing: CancellationError())
} catch {
continuation.finish(throwing: AIRuntimeError.inferenceFailed("\(error)"))
}
```
Apply exactly the same three changes to `mnnGenerate(prompt:maxTokens:priority:)`. The `acquireGate()` calls in `prepare`/`prepareMNN`/`prepareVL`/`analyzeReport` take no argument (they default to interactive; model loading must not be preempted).
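The cooperative part of the gate is that nothing interrupts a running generation; the holder checks a flag between tokens. A minimal synchronous simulation of that check, assuming a hypothetical `GateState` class in place of the actor state and a plain token array in place of the model stream:

```swift
enum InferencePriority: Equatable { case interactive, background }

// Hypothetical stand-in for the actor's gate state.
final class GateState {
    var preemptRequested = false
    func shouldPreempt(_ priority: InferencePriority) -> Bool {
        priority == .background && preemptRequested
    }
}

// Simulated decode loop. `interactiveArrivesAt` is the token index at which
// an interactive request shows up and sets the preempt flag.
func decode(tokens: [String], priority: InferencePriority,
            gate: GateState, interactiveArrivesAt: Int) throws -> String {
    var output = ""
    for (index, token) in tokens.enumerated() {
        if index == interactiveArrivesAt { gate.preemptRequested = true }
        // Token boundary: a background run gives up here, an interactive run never does.
        if gate.shouldPreempt(priority) { throw CancellationError() }
        output += token
    }
    return output
}

func wasPreempted(_ priority: InferencePriority) -> Bool {
    do {
        _ = try decode(tokens: ["a", "b", "c", "d"], priority: priority,
                       gate: GateState(), interactiveArrivesAt: 2)
        return false
    } catch {
        return error is CancellationError
    }
}

print(wasPreempted(.background))
print(wasPreempted(.interactive))
```

The worst-case delay for an interactive request is therefore one token of background decoding plus the hand-over, at the cost of discarding the partial background output.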
- [x] **Step 4.5: Run InferencePriorityTests and the device build. Commit**
```bash
git add 康康/AI/AIRuntime.swift 康康Tests/InferencePriorityTests.swift
git commit -m "feat(AI): 推理闸门双优先级 — 前台插队,后台预生成按 token 让位"
```
---
### Task 5: Report summary pre-generation (user item 3a)
**Files:**
- Create: `康康/AI/Prompts/InsightPrompts.swift`
- Create: `康康/Services/ReportInsightService.swift`
- Create: `康康Tests/InsightPromptsTests.swift`
- Modify: `康康/Features/Capture/UnifiedCaptureFlow.swift:313` (attach the background task after saving)
- Modify: `康康/Features/Timeline/TimelineEntryDetailView.swift:260-267` (extract the summary card into a component + fallback trigger)
- [x] **Step 5.1: Write the failing test InsightPromptsTests.swift**
```swift
import Testing
@testable import 康康
struct InsightPromptsTests {
@Test func reportSummaryPromptCarriesDataAndGuards() {
let p = InsightPrompts.reportPlainSummary(
title: "春季体检", typeLabel: "体检报告",
indicatorLines: "血红蛋白 118 g/L(参考 130-175)low")
#expect(p.contains("春季体检"))
#expect(p.contains("血红蛋白 118"))
#expect(p.contains("/no_think"))
#expect(p.contains("不诊断"))
#expect(!p.contains("患者"))
}
@Test func trendPromptCarriesDataAndGuards() {
let p = InsightPrompts.trendInsight(
title: "空腹血糖", unit: "mmol/L", rangeText: ",参考 3.9-6.1",
dataLines: "2026-05-01 5.2 / 2026-06-01 5.8")
#expect(p.contains("空腹血糖"))
#expect(p.contains("2026-06-01 5.8"))
#expect(p.contains("/no_think"))
#expect(!p.contains("患者"))
}
}
```
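- [x] **Step 5.2: Run the tests and confirm the build fails**
- [x] **Step 5.3: Create InsightPrompts.swift**
```swift
import Foundation
/// Insight prompts: plain-language report summary + trend insight.
/// Red lines: never diagnose or recommend medication; address the reader as 「你」.
nonisolated enum InsightPrompts {
/// Plain-language report summary (pre-generated, stored in Report.summary).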
static func reportPlainSummary(title: String, typeLabel: String, indicatorLines: String) -> String {
"""
你是健康档案助手。下面是一份报告的指标列表,请用大白话给本人(称「你」)写 2~3 句整体解读:
- 第 1 句:总体情况(共几项、几项异常)。
- 之后:点名最值得留意的异常项,用生活化语言说明偏高/偏低意味着什么方向。
- 不诊断疾病、不推荐药物或剂量;异常较多时建议「带上报告咨询医生」。
- 只输出正文文字,不要标题、列表、JSON、markdown。
示例:
输入:血常规(化验单),指标:白细胞 5.2 (3.5-9.5) normal;血红蛋白 118 (130-175) low;血小板 210 (125-350) normal
输出:这份血常规共 3 项,2 项正常,血红蛋白略低于参考范围。血红蛋白偏低通常与贫血方向有关,平时可以多补充含铁食物;如果还伴随乏力头晕,建议带上报告咨询医生。
现在的报告:\(title)(\(typeLabel))
指标:
\(indicatorLines)
只输出 2~3 句正文。/no_think
"""
}
/// Trend insight (shown in TrendDetailView, cached).
static func trendInsight(title: String, unit: String, rangeText: String, dataLines: String) -> String {
"""
你是健康档案助手。下面是「\(title)」的历史记录(单位 \(unit)\(rangeText)),请用大白话给本人(称「你」)写 1~2 句趋势解读:
- 说清整体走向(上升/下降/平稳/波动)和当前值与参考范围的关系。
- 不诊断疾病、不推荐药物;持续异常时温和建议「复查或咨询医生」。
- 只输出正文文字,不要标题、列表、JSON。
示例:
输入:体重,单位 kg,记录:2026-04-01 72.5 / 2026-04-15 71.8 / 2026-05-01 71.2
输出:近一个月你的体重稳步下降了约 1.3kg,节奏平缓,继续保持现在的习惯就好。
现在的记录:
\(dataLines)
只输出 1~2 句正文。/no_think
"""
}
}
```
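- [x] **Step 5.4: Run the tests and confirm they pass**
- [x] **Step 5.5: Create ReportInsightService.swift**
```swift
import Foundation
import SwiftData
/// Report insight service (§3.1: the Service calls AIRuntime, the UI does not).
/// Pre-generation runs at background priority (it yields to interactive requests);
/// it only fills `summary` when it is empty and never overwrites an existing one.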
@MainActor
final class ReportInsightService {
static let shared = ReportInsightService()
private init() {}
/// IDs of reports currently being generated, to avoid duplicate work.
private var inFlight: Set<String> = []
func pregenerateIfNeeded(report: Report, in ctx: ModelContext) async {
guard (report.summary ?? "").isEmpty, !report.indicators.isEmpty else { return }
let key = String(describing: report.persistentModelID)
guard !inFlight.contains(key) else { return }
inFlight.insert(key)
defer { inFlight.remove(key) }
do {
try await AIRuntime.shared.prepare()
} catch {
return // Model not ready: skip silently.
}
let prompt = InsightPrompts.reportPlainSummary(
title: report.title,
typeLabel: report.type.label,
indicatorLines: Self.indicatorLines(for: report.indicators)
)
var collected = ""
do {
let stream = await AIRuntime.shared.generate(
prompt: prompt, maxTokens: 200, priority: .background)
for try await chunk in stream { collected += chunk.text }
} catch {
return // Preempted or failed (CancellationError): skip silently; the detail page retries as a fallback.
}
let text = HealthExportService.stripThinkBlocks(collected)
.trimmingCharacters(in: .whitespacesAndNewlines)
guard !text.isEmpty, (report.summary ?? "").isEmpty else { return }
report.summary = text
try? ctx.save()
}
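/// One line per indicator: name, value, unit, (reference range), status; abnormal items first, capped at 15 to keep the prompt short.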
static func indicatorLines(for indicators: [Indicator]) -> String {
let sorted = indicators.sorted {
($0.status == .normal ? 1 : 0) < ($1.status == .normal ? 1 : 0)
}
return sorted.prefix(15).map { i in
var line = "\(i.name) \(i.value)"
if !i.unit.isEmpty { line += " \(i.unit)" }
if !i.range.isEmpty { line += "(参考 \(i.range))" }
line += " \(i.status.rawValue)"
return line
}.joined(separator: "\n")
}
}
```
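The ordering step in `indicatorLines` can also be written as a partition. The sketch below is an alternative under stated assumptions (`Indicator` and `Status` are hypothetical, trimmed-down stand-ins for the SwiftData models): filtering into two groups keeps the original order inside each group without depending on how the sort treats equal elements.

```swift
// Hypothetical, trimmed-down stand-ins for the real model types.
enum Status: String { case normal, low, high }
struct Indicator {
    var name: String
    var value: String
    var unit: String
    var range: String
    var status: Status
}

// Abnormal indicators first, original order preserved inside each group, capped.
func orderedForPrompt(_ indicators: [Indicator], cap: Int = 15) -> [Indicator] {
    let abnormal = indicators.filter { $0.status != .normal }
    let normal = indicators.filter { $0.status == .normal }
    return Array((abnormal + normal).prefix(cap))
}

// Same line format as the service: name, value, optional unit and range, status.
func promptLines(for indicators: [Indicator], cap: Int = 15) -> String {
    orderedForPrompt(indicators, cap: cap).map { i in
        var line = "\(i.name) \(i.value)"
        if !i.unit.isEmpty { line += " \(i.unit)" }
        if !i.range.isEmpty { line += "(参考 \(i.range))" }
        line += " \(i.status.rawValue)"
        return line
    }.joined(separator: "\n")
}

let sampleIndicators = [
    Indicator(name: "WBC", value: "5.2", unit: "", range: "3.5-9.5", status: .normal),
    Indicator(name: "HGB", value: "118", unit: "g/L", range: "130-175", status: .low),
    Indicator(name: "RBC", value: "4.6", unit: "", range: "", status: .normal),
    Indicator(name: "PLT", value: "380", unit: "", range: "125-350", status: .high),
]
print(promptLines(for: sampleIndicators))
```

With a cap of 15, putting abnormal items first guarantees they survive truncation on long reports, which matters because the prompt asks the model to name the abnormal items.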
- [x] **Step 5.6: Attach the background task in UnifiedCaptureFlow.saveAll**
At the end of `saveAll`, change
```swift
try? ctx.save()
onClose()
```
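to
```swift
try? ctx.save()
// Pre-generate the plain-language summary in the background so the detail page opens instantly.
// Foreground AI work (Q&A / recognition) preempts it at the next token.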
Task { await ReportInsightService.shared.pregenerateIfNeeded(report: report, in: ctx) }
onClose()
```
- [x] **Step 5.7: Extract the summary card in TimelineEntryDetailView into a component**
In `reportBody`, replace
```swift
if let sum = r.summary, !sum.isEmpty {
card {
Text(String(appLoc: "摘要"))
.font(.tjScaled( 12, weight: .semibold)).foregroundStyle(Tj.Palette.text2)
Text(sum).font(.tjScaled( 14)).foregroundStyle(Tj.Palette.text)
.fixedSize(horizontal: false, vertical: true)
}
}
```
with
```swift
ReportSummaryCard(report: r)
```
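Add the component at the end of the file (the card container style matches this file's `card` helper):
```swift
// MARK: - Report summary card (with fallback generation)
/// Shows the summary when present; if it is missing (pre-generation preempted or not yet run), triggers generation when the card appears.
/// The result is written back through SwiftData.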
private struct ReportSummaryCard: View {
@Environment(\.modelContext) private var ctx
let report: Report
@State private var generating = false
var body: some View {
Group {
if let sum = report.summary, !sum.isEmpty {
container {
Text(String(appLoc: "摘要"))
.font(.tjScaled( 12, weight: .semibold)).foregroundStyle(Tj.Palette.text2)
Text(sum).font(.tjScaled( 14)).foregroundStyle(Tj.Palette.text)
.fixedSize(horizontal: false, vertical: true)
}
} else if generating {
container {
Text("本地 AI 正在解读这份报告…")
.font(.tjScaled( 12)).foregroundStyle(Tj.Palette.text3)
AIFlowBar()
}
}
}
.task {
guard (report.summary ?? "").isEmpty, !report.indicators.isEmpty else { return }
generating = true
await ReportInsightService.shared.pregenerateIfNeeded(report: report, in: ctx)
generating = false
}
}
private func container<C: View>(@ViewBuilder _ body: () -> C) -> some View {
VStack(alignment: .leading, spacing: 10) { body() }
.padding(14)
.frame(maxWidth: .infinity, alignment: .leading)
.background(
RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
.fill(Tj.Palette.paper)
)
.overlay(
RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
.strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
)
}
}
```
- [x] **Step 5.8: Simulator build passes and the full existing test suite shows no regressions. Commit**
```bash
git add 康康/AI/Prompts/InsightPrompts.swift 康康/Services/ReportInsightService.swift 康康Tests/InsightPromptsTests.swift 康康/Features/Capture/UnifiedCaptureFlow.swift 康康/Features/Timeline/TimelineEntryDetailView.swift
git commit -m "feat(Capture): 归档后后台预生成大白话摘要,详情页秒开 + 兜底重试"
```
---
### Task 6: Trend AI insight + fingerprint cache (user item 3b)
**Files:**
- Create: `康康/Services/TrendInsightService.swift`
- Create: `康康Tests/TrendInsightCacheTests.swift`
- Modify: `康康/Features/Trends/TrendDetailView.swift:72,321-340` (replace the placeholder with the real card)
- [x] **Step 6.1: Write the failing test TrendInsightCacheTests.swift**
```swift
import Testing
import SwiftUI
@testable import 康康
@MainActor
struct TrendInsightCacheTests {
private func bucket(values: [Double]) -> SeriesBucket {
let points = values.enumerated().map { i, v in
SeriesBucket.Point(id: "p\(i)",
date: Date(timeIntervalSince1970: Double(i) * 86_400),
value: v, status: .normal)
}
let line = SeriesBucket.SeriesLine(id: "glucose.fasting", seriesKey: "glucose.fasting",
label: nil, color: .blue, points: points,
referenceRange: 3.9...6.1)
return SeriesBucket(id: "glucose.fasting", title: "空腹血糖", unit: "mmol/L",
lines: [line], latestDate: .now, kind: .monitor)
}
@Test func fingerprintStableForSameData() {
let a = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
let b = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
#expect(a == b)
}
@Test func fingerprintChangesWhenDataChanges() {
let a = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
let b = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5, 6.0]))
#expect(a != b)
}
@Test func dataLinesFormatsDateAndValue() {
let lines = TrendInsightService.dataLines(for: bucket(values: [5.2, 5.5]))
#expect(lines.contains("1970-01-01 5.2"))
#expect(lines.contains("1970-01-02 5.5"))
}
@Test func rangeTextRendersReference() {
#expect(TrendInsightService.rangeText(for: bucket(values: [5.2]))
== ",参考 3.9-6.1")
}
}
```
- [x] **Step 6.2: Run the tests and confirm the build fails**
- [x] **Step 6.3: Create TrendInsightService.swift**
```swift
import Foundation

/// Trend AI insight: short generation (at most 140 tokens) plus a fingerprint cache (UserDefaults).
/// Cached text is served while the data fingerprint is unchanged; a changed fingerprint triggers regeneration.
@MainActor
final class TrendInsightService {
    static let shared = TrendInsightService()
    private init() {}

    struct Cached: Codable, Equatable {
        var fingerprint: String
        var text: String
        var generatedAt: Date
    }

    static let storePrefix = "kk.trendInsight."

    /// Data fingerprint: bucket id, then per line the series key + point count + first/last date
    /// + last/min/max value. A change in any of these yields a new fingerprint.
    static func fingerprint(for bucket: SeriesBucket) -> String {
        var parts: [String] = [bucket.id]
        for line in bucket.lines {
            let pts = line.points
            let first = pts.first.map { Int($0.date.timeIntervalSince1970) } ?? 0
            let last = pts.last.map { Int($0.date.timeIntervalSince1970) } ?? 0
            let lastV = pts.last?.value ?? 0
            let minV = pts.map(\.value).min() ?? 0
            let maxV = pts.map(\.value).max() ?? 0
            parts.append("\(line.seriesKey)#\(pts.count)#\(first)#\(last)#\(lastV)#\(minV)#\(maxV)")
        }
        return parts.joined(separator: "|")
    }

    /// Returns the cached text when the stored fingerprint matches the current data, otherwise nil.
    func cachedText(for bucket: SeriesBucket) -> String? {
        guard let data = UserDefaults.standard.data(forKey: Self.storePrefix + bucket.id),
              let c = try? JSONDecoder().decode(Cached.self, from: data),
              c.fingerprint == Self.fingerprint(for: bucket) else {
            return nil
        }
        return c.text
    }

    /// Generates (or regenerates) the insight and stores it with the current fingerprint.
    /// The UI calls this on a cache miss or on retry and shows its own loading and error states.
    func generate(for bucket: SeriesBucket) async throws -> String {
        try await AIRuntime.shared.prepare()
        let prompt = InsightPrompts.trendInsight(
            title: bucket.title,
            unit: bucket.unit,
            rangeText: Self.rangeText(for: bucket),
            dataLines: Self.dataLines(for: bucket)
        )
        var collected = ""
        let stream = await AIRuntime.shared.generate(prompt: prompt, maxTokens: 140)
        for try await chunk in stream { collected += chunk.text }
        let text = HealthExportService.stripThinkBlocks(collected)
            .trimmingCharacters(in: .whitespacesAndNewlines)
        guard !text.isEmpty else { throw AIRuntimeError.inferenceFailed("空输出") }
        let cached = Cached(fingerprint: Self.fingerprint(for: bucket), text: text, generatedAt: .now)
        if let data = try? JSONEncoder().encode(cached) {
            UserDefaults.standard.set(data, forKey: Self.storePrefix + bucket.id)
        }
        return text
    }

    /// Last 24 points of each line as "yyyy-MM-dd value", joined by " / ";
    /// buckets with more than one line prefix each line with its label.
    static func dataLines(for bucket: SeriesBucket) -> String {
        let df = DateFormatter()
        df.locale = Locale(identifier: "en_US_POSIX")
        df.timeZone = TimeZone(identifier: "UTC")
        df.dateFormat = "yyyy-MM-dd"
        var lines: [String] = []
        for line in bucket.lines {
            let pts = line.points.suffix(24)
            let prefix = bucket.lines.count > 1 ? "\(line.label ?? line.seriesKey):" : ""
            let series = pts.map { "\(df.string(from: $0.date)) \(fmt($0.value))" }
                .joined(separator: " / ")
            lines.append(prefix + series)
        }
        return lines.joined(separator: "\n")
    }

    /// Reference range rendered as ",参考 lo-hi" (empty string when the first line has no range).
    static func rangeText(for bucket: SeriesBucket) -> String {
        guard let r = bucket.lines.first?.referenceRange else { return "" }
        return ",参考 \(fmt(r.lowerBound))-\(fmt(r.upperBound))"
    }

    /// Whole numbers print without decimals; everything else prints with one decimal place.
    private static func fmt(_ v: Double) -> String {
        v.truncatingRemainder(dividingBy: 1) == 0
            ? String(format: "%.0f", v)
            : String(format: "%.1f", v)
    }
}
```
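Note: `dataLines` formats dates in UTC so the tests do not depend on the device time zone (the dates only give the model context, so an offset of a few hours has no effect).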
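To make the cache key concrete, the sketch below reproduces the fingerprint recipe for a single series with a stand-in point type. `DemoPoint`, `demoFingerprint`, and `demoPoints` are hypothetical names used only for illustration and are not part of the project; the recipe itself mirrors `TrendInsightService.fingerprint`.

```swift
import Foundation

// Hypothetical stand-in for SeriesBucket.Point: only the fields the fingerprint reads.
struct DemoPoint {
    let date: Date
    let value: Double
}

// Same recipe as TrendInsightService.fingerprint, reduced to one series.
func demoFingerprint(bucketID: String, seriesKey: String, points: [DemoPoint]) -> String {
    let first = points.first.map { Int($0.date.timeIntervalSince1970) } ?? 0
    let last = points.last.map { Int($0.date.timeIntervalSince1970) } ?? 0
    let lastV = points.last?.value ?? 0
    let minV = points.map(\.value).min() ?? 0
    let maxV = points.map(\.value).max() ?? 0
    return "\(bucketID)|\(seriesKey)#\(points.count)#\(first)#\(last)#\(lastV)#\(minV)#\(maxV)"
}

// One point per day starting at the Unix epoch, like the test helper.
func demoPoints(_ values: [Double]) -> [DemoPoint] {
    values.enumerated().map { i, v in
        DemoPoint(date: Date(timeIntervalSince1970: Double(i) * 86_400), value: v)
    }
}

let key = "glucose.fasting"
let before = demoFingerprint(bucketID: key, seriesKey: key, points: demoPoints([5.2, 5.5]))
let after = demoFingerprint(bucketID: key, seriesKey: key, points: demoPoints([5.2, 5.5, 6.0]))
print(before)  // glucose.fasting|glucose.fasting#2#0#86400#5.5#5.2#5.5
print(after)   // glucose.fasting|glucose.fasting#3#0#172800#6.0#5.2#6.0

// An interior edit that is neither the min nor the max keeps the same fingerprint.
let editedA = demoFingerprint(bucketID: key, seriesKey: key, points: demoPoints([5.2, 5.4, 5.5]))
let editedB = demoFingerprint(bucketID: key, seriesKey: key, points: demoPoints([5.2, 5.3, 5.5]))
print(editedA == editedB)  // true
```

Appending a record changes the count, the last date, and the last value, so the cached text is invalidated. The last comparison shows one property of this recipe: correcting an interior value that is neither the minimum nor the maximum leaves the fingerprint unchanged, so the cached text would still be served in that case.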
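- [x] **Step 6.4: Run the tests and confirm they pass**
- [x] **Step 6.5: Swap in the card in TrendDetailView**
In `body`, replace `aiPlaceholder` with `TrendInsightCard(bucket: bucket)`; delete the whole `// MARK: AI 解读占位` / `aiPlaceholder` block; at the end of the file (before `enum TrendRange`) add: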
```swift
// MARK: - AI insight card

/// Trend insight card: shows cached text immediately, otherwise generates on demand
/// (always through TrendInsightService, per the §3.1 module boundary).
private struct TrendInsightCard: View {
    let bucket: SeriesBucket
    @State private var text: String?
    @State private var running = false
    @State private var failedMessage: String?

    var body: some View {
        VStack(alignment: .leading, spacing: 8) {
            HStack(spacing: 6) {
                Image(systemName: "sparkles")
                    .font(.tjScaled(12))
                    .foregroundStyle(Tj.Palette.ink)
                Text("AI 解读")
                    .font(.tjScaled(12, weight: .semibold))
                    .foregroundStyle(Tj.Palette.text2)
                Spacer()
            }
            if let text {
                Text(text)
                    .font(.tjScaled(13))
                    .lineSpacing(3)
                    .foregroundStyle(Tj.Palette.text)
                    .fixedSize(horizontal: false, vertical: true)
                AIDisclaimerFooter()
            } else if running {
                Text("本地 AI 解读中…")
                    .font(.tjScaled(12))
                    .foregroundStyle(Tj.Palette.text3)
                AIFlowBar()
            } else if let failedMessage {
                HStack {
                    Text(failedMessage)
                        .font(.tjScaled(12))
                        .foregroundStyle(Tj.Palette.text3)
                    Spacer()
                    Button("重试") { Task { await load(force: true) } }
                        .font(.tjScaled(12, weight: .medium))
                        .foregroundStyle(Tj.Palette.ink)
                }
            }
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .background(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .fill(Tj.Palette.paper)
        )
        .overlay(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
        )
        .task(id: bucket.id) { await load(force: false) }
    }

    /// Serves the cache unless `force` is set; otherwise generates and records any failure for the retry row.
    @MainActor
    private func load(force: Bool) async {
        if !force, let cached = TrendInsightService.shared.cachedText(for: bucket) {
            text = cached
            return
        }
        running = true
        failedMessage = nil
        do {
            text = try await TrendInsightService.shared.generate(for: bucket)
        } catch {
            failedMessage = String(appLoc: "AI 解读暂不可用(模型未就绪或繁忙)")
        }
        running = false
    }
}
```
- [x] **Step 6.6: Run TrendInsightCacheTests + SeriesBucketTests and confirm no regressions. Commit**
```bash
git add 康康/Services/TrendInsightService.swift 康康Tests/TrendInsightCacheTests.swift 康康/Features/Trends/TrendDetailView.swift
git commit -m "feat(Trends): AI 趋势解读上线 — 数据指纹缓存,秒开不重算"
```
---
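### Task 7: OCR-text-assisted report recognition (user item 4)
> **Important: QwenVL-4B is deprecated.** "Report recognition" here is handled by the multimodal Qwen3.5-2B (MNN Omni `mnn.analyze` as the primary path / MLX `VLSession` as the fallback). OCR reference text is especially helpful when the 2B vision model has to read dense, small digits.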
**Files:**
- Modify: `康康/AI/Prompts/VLPrompts.swift:34-89` (add `ocrText` to `reportExtraction` + a template placeholder + a clip function)
- Modify: `康康/Services/CaptureService.swift:137-161` (inject OCR into `runVL`)
- Create: `康康Tests/VLPromptsOCRTests.swift`
- [x] **Step 7.1: Write the failing test VLPromptsOCRTests.swift**
```swift
import Testing
@testable import 康康

struct VLPromptsOCRTests {
    @Test func emptyOCRKeepsPromptClean() {
        let p = VLPrompts.reportExtraction(ocrText: "")
        #expect(!p.contains("OCR 参考文本"))
        #expect(!p.contains("{{OCR_SECTION}}"))
        #expect(p.contains("现在请识别图片并输出 JSON"))
    }

    @Test func ocrTextIsInjectedBeforeFinalInstruction() {
        let p = VLPrompts.reportExtraction(ocrText: "尿酸 486 208-428 μmol/L")
        #expect(p.contains("OCR 参考文本"))
        #expect(p.contains("尿酸 486"))
        let ocrPos = p.range(of: "尿酸 486")!.lowerBound
        let endPos = p.range(of: "现在请识别图片并输出 JSON")!.lowerBound
        #expect(ocrPos < endPos)
    }

    @Test func clipKeepsShortTextIntact() {
        #expect(VLPrompts.clipOCR("短文本") == "短文本")
    }

    @Test func clipCutsAtLineBoundary() {
        let long = Array(repeating: "指标行 1.23 mmol/L", count: 400).joined(separator: "\n")
        let clipped = VLPrompts.clipOCR(long, limit: 200)
        #expect(clipped.count < 260)
        #expect(clipped.hasSuffix("(后续内容过长已截断)"))
        // The truncation marker sits on its own line and is never glued to a data row.
        #expect(!clipped.contains("\n指标行 1.23 mmol/L(后续"))
    }
}
```
- [x] **Step 7.2: Run the tests and confirm they fail**
- [x] **Step 7.3: Rework VLPrompts**
Change `reportExtraction` to:
```swift
static func reportExtraction(today: Date = .now, ocrText: String = "") -> String {
    let f = DateFormatter()
    f.locale = Locale(identifier: "en_US_POSIX")
    f.dateFormat = "yyyy-MM-dd"
    let todayStr = f.string(from: today)
    // OCR reference: Vision text helps the 2B model read digits; when empty, the section is omitted entirely.
    let ocrSection: String
    if ocrText.isEmpty {
        ocrSection = ""
    } else {
        ocrSection = """
        OCR 参考文本(系统对同一报告做文字识别的结果,可能有错字、串行或漏行;版面与表格结构以图片为准,但数值、小数点以 OCR 文字更可靠):
        \(clipOCR(ocrText))
        """
    }
    return reportExtractionTemplate
        .replacingOccurrences(of: "{{TODAY}}", with: todayStr)
        .replacingOccurrences(of: "{{OCR_SECTION}}", with: ocrSection)
}

/// Clips OCR text so the prompt stays short for the 2B model, cutting at the last line boundary within `limit`.
static func clipOCR(_ text: String, limit: Int = 1800) -> String {
    guard text.count > limit else { return text }
    let clipped = String(text.prefix(limit))
    if let lastNewline = clipped.lastIndex(of: "\n") {
        return String(clipped[..<lastNewline]) + "\n(后续内容过长已截断)"
    }
    return clipped + "\n(后续内容过长已截断)"
}
```
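At the end of `reportExtractionTemplate`, immediately before the line
```
现在请识别图片并输出 JSON:
```
insert a line containing `{{OCR_SECTION}}` (that is, after example 2 and before the final instruction).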
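The placement can be illustrated with a miniature template. This is only a sketch: `miniTemplate` and `renderMini` are hypothetical names, and the two English lines stand in for the real template body, which is much longer.

```swift
import Foundation

// Hypothetical miniature of reportExtractionTemplate; only the placeholder layout matters here.
let miniTemplate = """
Today is {{TODAY}}.
(examples 1 and 2 omitted)
{{OCR_SECTION}}
现在请识别图片并输出 JSON:
"""

// Same two substitutions that reportExtraction performs on the real template.
func renderMini(today: String, ocrSection: String) -> String {
    miniTemplate
        .replacingOccurrences(of: "{{TODAY}}", with: today)
        .replacingOccurrences(of: "{{OCR_SECTION}}", with: ocrSection)
}

let withOCR = renderMini(today: "2026-06-10",
                         ocrSection: "OCR 参考文本:\n尿酸 486 208-428 μmol/L")
let withoutOCR = renderMini(today: "2026-06-10", ocrSection: "")
print(withOCR)
print(withoutOCR)
```

With an empty section the placeholder line collapses to a blank line, so the prompt contains neither the placeholder nor the OCR heading, and the final instruction stays last in both cases.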
- [x] **Step 7.4: Run the tests and confirm they pass**
- [x] **Step 7.5: Inject OCR in CaptureService.runVL**
Change `runVL` to:
```swift
private func runVL(on assets: [FileVault.SavedAsset]) async throws -> ParsedReport {
    do {
        try await AIRuntime.shared.prepareVL()
    } catch {
        throw CaptureError.modelNotReady
    }
    let urls = assets.map { FileVault.shared.rootURL.appendingPathComponent($0.relativePath) }
    // OCR reference (on-device Vision, <1 s per page): gives the 2B model text to check digits against.
    // On failure this degrades to an empty string and recognition proceeds from the image alone (§3.2).
    let ocr = await Self.ocrReference(for: urls)
    let raw: String
    do {
        raw = try await AIRuntime.shared.analyzeReport(
            imageURLs: urls,
            prompt: VLPrompts.reportExtraction(ocrText: ocr)
        )
    } catch {
        throw CaptureError.inferenceFailed("\(error)")
    }
    do {
        return try CaptureService.parseReportJSON(raw, pageCount: assets.count)
    } catch let CaptureError.parseFailed(msg) {
        throw CaptureError.parseFailed(msg)
    } catch {
        throw CaptureError.parseFailed("\(error)")
    }
}

/// Runs OCR on up to the first 4 Vault images; unreadable or empty pages are skipped,
/// and the result is "" when nothing is recognized.
private static func ocrReference(for urls: [URL]) async -> String {
    var pages: [String] = []
    for (idx, url) in urls.prefix(4).enumerated() {
        guard let src = CGImageSourceCreateWithURL(url as CFURL, nil),
              let cg = CGImageSourceCreateImageAtIndex(src, 0, nil) else { continue }
        guard let text = try? await OCRService.recognizeText(in: cg),
              !text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty else { continue }
        pages.append(urls.count > 1 ? "【第 \(idx + 1) 页】\n\(text)" : text)
    }
    return pages.joined(separator: "\n")
}
```
Add `import ImageIO` to the import section at the top of the file (UIKit is already imported).
- [x] **Step 7.6: Run VLPromptsOCRTests + CaptureServiceJSONTests and confirm no regressions, then build for device. Commit**
```bash
git add 康康/AI/Prompts/VLPrompts.swift 康康/Services/CaptureService.swift 康康Tests/VLPromptsOCRTests.swift
git commit -m "feat(Capture): 报告识别注入 Vision OCR 参考文本,提升 2B 多模态数字准确率"
```
---
### Task 8: MNN KV cache research document (user item 5b)
**Files:**
- Create: `docs/research/mnn-kv-cache-prefix.md`
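- [x] **Step 8.1: Write the research document**
Key points (based on the actual header at `Frameworks/MNN.xcframework/ios-arm64/MNN.framework/Headers/llm/llm.hpp`):
- Conclusion: the current MNN build already exposes a prefix cache capability, so the prefill result of each scenario's fixed prompt template can be cached.
- Evidence: `bool setPrefixCacheFile(const std::string&, int flag)` (llm.hpp:161, with the supporting private members `mPrefixCacheMode`/`mPrefixLength`/`completePrefixWrite`), `bool reuse_kv()` (llm.hpp:171, a config switch), and `void syncPromptCache(const ChatMessages&)` (llm.hpp:176).
- Applicability: every call in this project is a single-turn `response()` of the form "fixed template prefix + variable data suffix", which matches the prefix cache model. Template sizes are roughly 900 tok for report recognition / 700 tok for export / 300 tok for intent extraction; at the prefill rate measured by the performance self-check, that is an estimated saving of 1~3 s per call.
- Risks: the semantics of `flag` are undocumented; the OMNI multimodal branch is unverified; cache files are tied to the model version and need invalidation handling.
- Recommendation: integrate during the W6 polish phase, after the performance self-check card has quantified the prefill share; on a real device, run A/B 3 times each and compare `prefill_us`; on any anomaly, delete the cache file immediately and fall back. The current bottleneck is decode, so this ranks below C1/C2/Live Activity.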
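The per-call saving quoted above is prefix length divided by prefill rate. The sketch below works through that arithmetic and adds a median helper for comparing repeated `prefill_us` readings. The function names are hypothetical, and the 300 tok/s rate is a placeholder chosen only to keep the numbers easy to follow; it is not a measurement and should be replaced by the value from the performance self-check card.

```swift
import Foundation

/// Seconds of prefill avoided when a cached prefix of `prefixTokens` tokens is reused.
func prefillSavingSeconds(prefixTokens: Int, prefillTokensPerSecond: Double) -> Double {
    Double(prefixTokens) / prefillTokensPerSecond
}

/// Median of a small sample, e.g. the three `prefill_us` readings from one arm of an A/B run.
func median(_ samples: [Double]) -> Double {
    precondition(!samples.isEmpty, "need at least one sample")
    let sorted = samples.sorted()
    let mid = sorted.count / 2
    return sorted.count % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

// ASSUMPTION: placeholder prefill rate in tokens per second, not a measured value.
let assumedPrefillRate = 300.0

// Template sizes as listed in the research notes.
let templates: [(name: String, tokens: Int)] = [
    ("report recognition", 900),
    ("export", 700),
    ("intent extraction", 300),
]

for template in templates {
    let saved = prefillSavingSeconds(prefixTokens: template.tokens,
                                     prefillTokensPerSecond: assumedPrefillRate)
    print("\(template.name): about \(saved) s of prefill per call")
}
```

Under this placeholder rate the three templates land between 1 s and 3 s, which is the shape of the estimate in the notes. Comparing the medians of the two arms, rather than single runs, keeps one slow outlier from deciding the A/B result.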
- [x] **Step 8.2: Commit**
```bash
git add docs/research/mnn-kv-cache-prefix.md
git commit -m "docs(AI): MNN prefix KV cache 调研 — setPrefixCacheFile 可用,建议 W6 量化后接入"
```
---
### Task 9: Final verification
- [x] **Step 9.1: Full unit test run**
```bash
xcodebuild test -project 康康.xcodeproj -scheme 康康 \
-destination 'platform=iOS Simulator,name=iPhone 17' \
-derivedDataPath build/cli-dd -only-testing:'康康Tests' 2>&1 | tail -30
```
Expected: all PASS, no regressions.
- [x] **Step 9.2: Device build (real MNN branch)**
```bash
xcodebuild build -project 康康.xcodeproj -scheme 康康 \
-destination 'generic/platform=iOS' \
-derivedDataPath build/cli-dd CODE_SIGNING_ALLOWED=NO 2>&1 | tail -15
```
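Expected: BUILD SUCCEEDED, no new warnings.
- [ ] **Step 9.3: On-device verification checklist (left to the user; cannot be completed from the code side)**
1. Performance self-check card: run MNN and MLX once each; the comparison card shows two rows of data.
2. Q&A: after asking a question, the 「已找到 N 条记录」 ("Found N records") chips appear first, then the answer streams.
3. Archive a report → wait 1 minute without opening the detail page → open the detail page and the summary is already there (opens instantly).
4. Trend detail: the insight is computed on first entry; leave and re-enter and it opens instantly (cache); after adding a record it is regenerated.
5. Photograph a multi-page lab report: compare numeric accuracy before and after OCR assistance.
6. Start a Q&A immediately while a background summary is being generated: the Q&A jumps the queue without noticeable delay, and the summary completes later.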