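# Competition Optimization Five-Piece Set (比赛优化五件套) Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Implement the 5 confirmed competition optimizations: ① performance self-test card (SME2 evidence); ② Q&A retrieval visualization; ③ report summary pre-generation + trend-insight caching; ④ OCR-text-assisted report recognition (Qwen3.5-2B multimodal; QwenVL-4B is deprecated); ⑤ AIRuntime priority gate + MNN KV cache investigation.

**Architecture:** The module boundaries from §3.1 (UI → Service → AIRuntime) stay unchanged. AIRuntime gains a `GenerateStats` type unified across both backends and a cooperative priority gate (interactive requests jump the queue; background work yields at the next token). The HealthExportService event stream gains `.retrieved(RetrievalSummary)`. Three lightweight services are added: BenchmarkService, ReportInsightService, and TrendInsightService. OCR reference text is injected into the VL prompt. **Note: visual inference is now handled by Qwen3.5-2B multimodal (MNN Omni as the primary path, MLX VLSession as the fallback, both loaded from the `.llm`/`.mnnLLM` directories); there is no separate VL model.**

**Tech Stack:** SwiftUI + SwiftData (iOS 17+), MNN (ObjC++ bridge), MLX Swift (mlx-swift-lm 2.31.3, `GenerateCompletionInfo`), Vision OCR, Swift Testing (康康Tests).

**User item → task mapping:** User item 1 → Task 1+2; item 2 → Task 3; item 5 → Task 4+8; item 3 → Task 5+6; item 4 → Task 7. Task 4 (priority gate) is moved earlier because the background pre-generation in Task 5 depends on `priority: .background`.

**Build/test commands (shared by all tasks):**
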
```bash
# Unit tests (simulator; on first run, confirm the device name with: xcrun simctl list devices available | grep iPhone)
cd /Users/xuhuayong/apps/康康
xcodebuild test -project 康康.xcodeproj -scheme 康康 \
  -destination 'platform=iOS Simulator,name=iPhone 17' \
  -derivedDataPath build/cli-dd \
  -only-testing:'康康Tests/<TestClass>' 2>&1 | tail -25

# Device build check (the real MNNLLMBridge branch only compiles for the device slice)
xcodebuild build -project 康康.xcodeproj -scheme 康康 \
  -destination 'generic/platform=iOS' \
  -derivedDataPath build/cli-dd CODE_SIGNING_ALLOWED=NO 2>&1 | tail -15
```

**Red lines (keep in mind before writing every line of code):** never use the word 「患者」 (patient); all new UI font sizes use `Font.tjScaled`; colors come only from `Tj.Palette.*`; UI does not call AIRuntime directly (the self-test page is an established exception; new code goes through a Service); every prompt carries few-shot examples + `/no_think` + a failure fallback; do not touch Localizable.xcstrings (`git status` already shows uncommitted changes there; leave them alone).

---

### Task 1: GenerateStats, unified statistics across both backends

**Files:**
- Create: `康康/AI/GenerateStats.swift`
- Modify: `康康/AI/MNNBackend.swift`
- Modify: `康康/AI/LLMSession.swift`
- Modify: `康康/AI/AIRuntime.swift`

- [x] **Step 1.1: Create GenerateStats.swift**

```swift
import Foundation

/// Performance statistics for a single generation, unified across both backends (MNN / MLX).
/// MNN values come from LlmContext (prefill_us / decode_us); MLX values come from GenerateCompletionInfo.
struct GenerateStats: Sendable, Equatable {
    var promptTokens: Int
    var genTokens: Int
    /// Prefill (reading the prompt) duration, in seconds.
    var prefillSeconds: Double
    /// Decode (token-by-token generation) duration, in seconds.
    var decodeSeconds: Double

    var prefillTokensPerSecond: Double {
        prefillSeconds > 0 ? Double(promptTokens) / prefillSeconds : 0
    }
    var decodeTokensPerSecond: Double {
        decodeSeconds > 0 ? Double(genTokens) / decodeSeconds : 0
    }
}
```
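
As a quick sanity check on the derived rates, the sketch below repeats a trimmed copy of the struct and feeds it made-up numbers. The token counts and durations are illustrative assumptions, not measurements from any device. It also exercises the zero-duration guard, which returns 0 instead of dividing by zero when a backend reports no timing.

```swift
import Foundation

// Trimmed copy of GenerateStats from Step 1.1, repeated so this sketch compiles on its own.
struct GenerateStats: Sendable, Equatable {
    var promptTokens: Int
    var genTokens: Int
    var prefillSeconds: Double
    var decodeSeconds: Double

    var prefillTokensPerSecond: Double {
        prefillSeconds > 0 ? Double(promptTokens) / prefillSeconds : 0
    }
    var decodeTokensPerSecond: Double {
        decodeSeconds > 0 ? Double(genTokens) / decodeSeconds : 0
    }
}

// Illustrative numbers only (assumed, not measured).
let sample = GenerateStats(promptTokens: 30, genTokens: 100,
                           prefillSeconds: 0.25, decodeSeconds: 4.0)
print(sample.prefillTokensPerSecond) // 30 tokens / 0.25 s
print(sample.decodeTokensPerSecond)  // 100 tokens / 4.0 s

// No timing reported: both rates fall back to 0.
let empty = GenerateStats(promptTokens: 0, genTokens: 0,
                          prefillSeconds: 0, decodeSeconds: 0)
print(empty.decodeTokensPerSecond)
```

Keeping the rates as computed properties means each backend only has to report raw counts and durations, so the MNN and MLX paths cannot disagree on how tok/s is derived.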

- [x] **Step 1.2: Capture statistics in MNNBackend**

Add state and a method to the actor:

```swift
/// Statistics from the most recent generation (AIRuntime reads them after the stream ends; used by the performance self-test).
private(set) var lastStats: GenerateStats?

private func record(_ s: GenerateStats) { lastStats = s }
```

Change the detached Task in `generate` as follows (MNNGenerateStats is an ObjC object, so extract it into a Sendable GenerateStats before crossing the actor boundary):

```swift
let task = Task.detached(priority: .userInitiated) {
    let stats = box.value.generateText(prompt, maxTokens: Int32(maxTokens)) { piece in
        let rate = meter.tick()
        continuation.yield(TokenChunk(text: piece, decodeRate: rate))
    }
    await self.record(GenerateStats(
        promptTokens: Int(stats.promptTokens),
        genTokens: Int(stats.genTokens),
        prefillSeconds: stats.prefillMs / 1000.0,
        decodeSeconds: stats.decodeMs / 1000.0
    ))
    continuation.finish()
}
```

Do the same in `analyze`: change `_ = try box.value.analyzeImages(...)` to capture the returned stats, and call `await self.record(...)` before `cont.resume(returning:)` (note that analyzeImages returns an optional, so unwrap with `if let s = ...` before recording).

- [x] **Step 1.3: Capture .info statistics in LLMSession**

Add to the actor:

```swift
/// Statistics from the most recent generation (taken from the .info completion event at the end of the stream; used by the performance self-test).
private(set) var lastStats: GenerateStats?

private func record(_ s: GenerateStats) { lastStats = s }
```

Change the `.info` branch of the switch inside `generate` to:

```swift
case .info(let info):
    // Generation-complete statistics; this is the last event of the stream
    await self.record(GenerateStats(
        promptTokens: info.promptTokenCount,
        genTokens: info.generationTokenCount,
        prefillSeconds: info.promptTime,
        decodeSeconds: info.generateTime
    ))
```

- [x] **Step 1.4: Expose statistics and the backend label from AIRuntime**

Add to the actor:

```swift
/// Performance statistics of the most recent text generation (consumed by the performance self-test page; unified across both backends).
private(set) var lastGenerateStats: GenerateStats?

/// Label of the backend actually in effect (for the performance self-test / PPT screenshots).
var activeBackendLabel: String {
    if InferenceEngine.current == .mnn, mnnStatus == .ready {
        return InferenceEngine.cpuSupportsSME2 ? "MNN · SME2" : "MNN · NEON"
    }
    #if targetEnvironment(simulator)
    return "MLX · CPU(模拟器)"
    #else
    return "MLX · GPU"
    #endif
}
```

In the MLX branch of `generate`, after the `for try await` loop and before `continuation.finish()`, add:

```swift
self.lastGenerateStats = await session.lastStats
```

Add the following at the same position in `mnnGenerate`:

```swift
self.lastGenerateStats = await self.mnn.lastStats
```

- [x] **Step 1.5: Device build check (command at the top); confirm there are no errors. Commit**

```bash
git add 康康/AI/GenerateStats.swift 康康/AI/MNNBackend.swift 康康/AI/LLMSession.swift 康康/AI/AIRuntime.swift
git commit -m "feat(AI): 两后端归一的 GenerateStats(prefill/decode 实测统计)"
```

---

### Task 2: Performance self-test card (user item 1)

**Files:**
- Create: `康康/Services/BenchmarkService.swift`
- Create: `康康Tests/BenchmarkStoreTests.swift`
- Modify: `康康/Features/Me/ModelSelfTestView.swift` (full rework)
- Modify: `康康/Features/Me/ModelManagementView.swift:31-42` (entry condition + copy)

- [x] **Step 2.1: Write the failing test BenchmarkStoreTests.swift**

```swift
import Testing
import Foundation
@testable import 康康

struct BenchmarkStoreTests {

    private func freshDefaults() -> UserDefaults {
        let d = UserDefaults(suiteName: "test.kk.benchmark")!
        d.removePersistentDomain(forName: "test.kk.benchmark")
        return d
    }

    @Test func savesAndLoadsPerBackend() {
        let d = freshDefaults()
        let mnn = BenchmarkResult(backendLabel: "MNN · SME2", promptTokens: 30, genTokens: 80,
                                  prefillTokensPerSecond: 120, decodeTokensPerSecond: 25,
                                  totalSeconds: 4.2, date: .now)
        let mlx = BenchmarkResult(backendLabel: "MLX · GPU", promptTokens: 30, genTokens: 80,
                                  prefillTokensPerSecond: 300, decodeTokensPerSecond: 40,
                                  totalSeconds: 2.5, date: .now)
        BenchmarkService.save(mnn, defaults: d)
        BenchmarkService.save(mlx, defaults: d)
        let all = BenchmarkService.load(defaults: d)
        #expect(all.count == 2)
        #expect(all["MNN · SME2"]?.decodeTokensPerSecond == 25)
    }

    @Test func overwritesSameBackend() {
        let d = freshDefaults()
        let old = BenchmarkResult(backendLabel: "MLX · GPU", promptTokens: 1, genTokens: 1,
                                  prefillTokensPerSecond: 1, decodeTokensPerSecond: 1,
                                  totalSeconds: 1, date: .now)
        var new = old; new.decodeTokensPerSecond = 99
        BenchmarkService.save(old, defaults: d)
        BenchmarkService.save(new, defaults: d)
        #expect(BenchmarkService.load(defaults: d)["MLX · GPU"]?.decodeTokensPerSecond == 99)
    }

    @Test func loadOnEmptyReturnsEmpty() {
        #expect(BenchmarkService.load(defaults: freshDefaults()).isEmpty)
    }
}
```

- [x] **Step 2.2: Run the tests and confirm compilation fails (BenchmarkService does not exist yet)**

- [x] **Step 2.3: Create BenchmarkService.swift**

```swift
import Foundation

/// Result of a single performance self-test. Archived per backend label for the "MNN·SME2 vs MLX·GPU" comparison view (§12 selling points 2/6).
struct BenchmarkResult: Codable, Equatable {
    var backendLabel: String
    var promptTokens: Int
    var genTokens: Int
    var prefillTokensPerSecond: Double
    var decodeTokensPerSecond: Double
    var totalSeconds: Double
    var date: Date
}

/// Performance self-test service: runs a fixed prompt, reads the unified statistics from AIRuntime, and stores them in UserDefaults per backend label.
/// The UI (ModelSelfTestView) reaches AIRuntime only through this service (§3.1).
@MainActor
struct BenchmarkService {
    static let shared = BenchmarkService()
    private init() {}

    static let storeKey = "kk.benchmark.results"

    /// Fixed test prompt: the precondition for comparing results across devices/engines.
    static let fixedPrompt = "用中文一句话介绍肝功能里 ALT 这个指标。"

    /// Runs one self-test. onToken hands the streamed output to the UI for display.
    func run(onToken: @escaping @MainActor (String, Double) -> Void) async throws -> BenchmarkResult {
        try await AIRuntime.shared.prepare()
        let start = Date()
        let stream = await AIRuntime.shared.generate(prompt: Self.fixedPrompt, maxTokens: 128)
        for try await chunk in stream {
            onToken(chunk.text, chunk.decodeRate)
        }
        let total = Date().timeIntervalSince(start)
        let label = await AIRuntime.shared.activeBackendLabel
        let stats = await AIRuntime.shared.lastGenerateStats
        let result = BenchmarkResult(
            backendLabel: label,
            promptTokens: stats?.promptTokens ?? 0,
            genTokens: stats?.genTokens ?? 0,
            prefillTokensPerSecond: stats?.prefillTokensPerSecond ?? 0,
            decodeTokensPerSecond: stats?.decodeTokensPerSecond ?? 0,
            totalSeconds: total,
            date: .now
        )
        Self.save(result)
        return result
    }

    // MARK: - Persistence (static pure functions, covered by unit tests)

    static func save(_ result: BenchmarkResult, defaults: UserDefaults = .standard) {
        var all = load(defaults: defaults)
        all[result.backendLabel] = result
        if let data = try? JSONEncoder().encode(all) {
            defaults.set(data, forKey: storeKey)
        }
    }

    static func load(defaults: UserDefaults = .standard) -> [String: BenchmarkResult] {
        guard let data = defaults.data(forKey: storeKey),
              let all = try? JSONDecoder().decode([String: BenchmarkResult].self, from: data) else {
            return [:]
        }
        return all
    }
}
```

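- [x] **Step 2.4: Run the tests and confirm they pass**

- [x] **Step 2.5: Rework ModelSelfTestView**

Keep the original skeleton (prompt card / status row / output box) and change the following: `run()` now goes through BenchmarkService; add a current-result card (backend badge + prefill/decode tok/s + total time) and a history comparison card (one row per backend plus the hint 「切换引擎后再跑一次即可对比」); wrap everything in a ScrollView; change the title to 「性能自检」. Full code:
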
```swift
import SwiftUI

/// Performance self-test: runs the fixed prompt, shows the measured prefill / decode speed of the
/// current backend (MNN·SME2 / MNN·NEON / MLX·GPU), and archives results per backend for comparison.
/// This is the visible evidence for the challenge's evaluation criteria (§12 selling points 2/6).
struct ModelSelfTestView: View {
    @State private var output = ""
    @State private var phase: Phase = .idle
    @State private var rate: Double = 0
    @State private var lastResult: BenchmarkResult?
    @State private var history: [String: BenchmarkResult] = [:]

    private enum Phase: Equatable {
        case idle, loading, running, done, failed(String)

        var label: String {
            switch self {
            case .idle: return String(appLoc: "未开始")
            case .loading: return String(appLoc: "加载模型…")
            case .running: return String(appLoc: "推理中…")
            case .done: return String(appLoc: "完成 ✓")
            case .failed(let m): return String(appLoc: "失败:\(m)")
            }
        }
    }

    private var isBusy: Bool { phase == .loading || phase == .running }

    private var statusColor: Color {
        switch phase {
        case .failed: return Tj.Palette.brick
        case .done: return Tj.Palette.leaf
        default: return Tj.Palette.text2
        }
    }

    var body: some View {
        ScrollView {
            VStack(alignment: .leading, spacing: 16) {
                promptCard

                HStack {
                    Text(phase.label)
                        .font(.tjScaled(13, weight: .medium))
                        .foregroundStyle(statusColor)
                        .lineLimit(1)
                    Spacer()
                    if rate > 0 {
                        Text(String(format: "%.1f tok/s", rate))
                            .font(.tjScaled(12, design: .monospaced))
                            .foregroundStyle(Tj.Palette.text3)
                    }
                }

                Button {
                    Task { await run() }
                } label: {
                    Text(isBusy ? "运行中…" : "运行性能自检").frame(maxWidth: .infinity)
                }
                .buttonStyle(TjPrimaryButton())
                .disabled(isBusy)

                if isBusy { AIFlowBar() }

                if let r = lastResult { statsCard(r) }

                outputCard

                if !history.isEmpty { historyCard }
            }
            .padding(16)
        }
        .background(Tj.Palette.sand.ignoresSafeArea())
        .navigationTitle("性能自检")
        .navigationBarTitleDisplayMode(.inline)
        .onAppear { history = BenchmarkService.load() }
    }

    private var promptCard: some View {
        VStack(alignment: .leading, spacing: 6) {
            Text("测试 PROMPT")
                .font(.tjScaled(11, weight: .semibold))
                .tracking(0.5)
                .foregroundStyle(Tj.Palette.text3)
            Text(BenchmarkService.fixedPrompt)
                .font(.tjScaled(14))
                .foregroundStyle(Tj.Palette.text)
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .tjCard()
    }

    private func statsCard(_ r: BenchmarkResult) -> some View {
        VStack(alignment: .leading, spacing: 10) {
            HStack {
                Text("本次结果")
                    .font(.tjScaled(12, weight: .semibold))
                    .foregroundStyle(Tj.Palette.text2)
                Spacer()
                TjBadge(text: r.backendLabel, style: .leaf)
            }
            HStack(spacing: 0) {
                metric(String(appLoc: "读入"), r.prefillTokensPerSecond > 0
                       ? String(format: "%.0f tok/s", r.prefillTokensPerSecond) : "—")
                metric(String(appLoc: "生成"), String(format: "%.1f tok/s", r.decodeTokensPerSecond))
                metric(String(appLoc: "总耗时"), String(format: "%.1fs", r.totalSeconds))
            }
            Text(String(appLoc: "prompt \(r.promptTokens) tok · 生成 \(r.genTokens) tok · 100% 本地"))
                .font(.tjScaled(10, design: .monospaced))
                .foregroundStyle(Tj.Palette.text3)
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .tjCard()
    }

    private func metric(_ label: String, _ value: String) -> some View {
        VStack(spacing: 3) {
            Text(value)
                .font(.tjScaled(15, weight: .semibold, design: .monospaced))
                .foregroundStyle(Tj.Palette.text)
            Text(label)
                .font(.tjScaled(10))
                .foregroundStyle(Tj.Palette.text3)
        }
        .frame(maxWidth: .infinity)
    }

    private var outputCard: some View {
        ScrollView {
            Text(output.isEmpty ? "(暂无输出)" : output)
                .font(.system(.footnote, design: .monospaced))
                .foregroundStyle(Tj.Palette.text)
                .frame(maxWidth: .infinity, alignment: .leading)
                .textSelection(.enabled)
                .padding(12)
        }
        .frame(maxHeight: 220)
        .background(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .fill(Tj.Palette.paper)
        )
        .overlay(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
        )
    }

    private var historyCard: some View {
        VStack(alignment: .leading, spacing: 10) {
            Text("各引擎实测对比")
                .font(.tjScaled(12, weight: .semibold))
                .foregroundStyle(Tj.Palette.text2)
            ForEach(history.keys.sorted(), id: \.self) { key in
                if let r = history[key] {
                    HStack {
                        Text(key)
                            .font(.tjScaled(12, weight: .medium))
                            .foregroundStyle(Tj.Palette.text)
                        Spacer()
                        Text(String(format: "生成 %.1f tok/s", r.decodeTokensPerSecond))
                            .font(.tjScaled(12, design: .monospaced))
                            .foregroundStyle(Tj.Palette.leaf)
                        Text(r.date.formatted(.dateTime.month().day()))
                            .font(.tjScaled(10))
                            .foregroundStyle(Tj.Palette.text3)
                    }
                }
            }
            Text("在「我的 · 推理引擎」切换引擎后再跑一次,即可对比 SME2 与 GPU。")
                .font(.tjScaled(10))
                .foregroundStyle(Tj.Palette.text3)
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .tjCard()
    }

    @MainActor
    private func run() async {
        output = ""
        rate = 0
        lastResult = nil
        phase = .loading
        do {
            let result = try await BenchmarkService.shared.run { piece, r in
                output += piece
                if r > 0 { rate = r }
                if phase == .loading { phase = .running }
            }
            lastResult = result
            history = BenchmarkService.load()
            phase = .done
        } catch {
            phase = .failed(error.localizedDescription)
        }
    }
}

#Preview {
    NavigationStack { ModelSelfTestView() }
}
```

- [x] **Step 2.6: Relax the ModelManagementView entry condition + update the copy**

In `ModelManagementView.swift:31`, change

```swift
if service.states[.mnnLLM]?.phase == .ready {
```

to

```swift
if service.states[.mnnLLM]?.phase == .ready || service.states[.llm]?.phase == .ready {
```

Change `Text("运行推理自检")` to `Text("性能自检")`, and the icon `"play.circle"` to `"gauge.with.needle"`.

- [x] **Step 2.7: Run BenchmarkStoreTests and confirm the simulator build passes. Commit**

```bash
git add 康康/Services/BenchmarkService.swift 康康Tests/BenchmarkStoreTests.swift 康康/Features/Me/ModelSelfTestView.swift 康康/Features/Me/ModelManagementView.swift
git commit -m "feat(Me): 性能自检卡 — 后端标识 + prefill/decode 实测 + 引擎对比存档"
```

---

### Task 3: Retrieval visualization (user item 2)

**Files:**
- Modify: `康康/Services/HealthExportService.swift` (RetrievalSummary + Event.retrieved + event-based `answer`)
- Modify: `康康/Features/Archive/HealthExportSheet.swift` (chips UI)
- Create: `康康Tests/RetrievalSummaryTests.swift`

- [x] **Step 3.1: Write the failing test RetrievalSummaryTests.swift**

```swift
import Testing
@testable import 康康

struct RetrievalSummaryTests {

    @Test func groupsAndCountsPreservingOrder() {
        let chips = HealthExportService.RetrievalSummary.groupedChips(
            ["血压", "血糖", "血压", "血压", "体重"], cap: 8)
        #expect(chips == ["血压 ×3", "血糖", "体重"])
    }

    @Test func capsAndAppendsOverflow() {
        let names = (1...12).map { "指标\($0)" }
        let chips = HealthExportService.RetrievalSummary.groupedChips(names, cap: 8)
        #expect(chips.count == 9)
        #expect(chips.last == "+4")
    }

    @Test func emptyInputGivesEmptyChips() {
        #expect(HealthExportService.RetrievalSummary.groupedChips([], cap: 8).isEmpty)
    }
}
```

- [x] **Step 3.2: Run the tests and confirm compilation fails**

- [x] **Step 3.3: Add RetrievalSummary + the Event case to HealthExportService**

Add above `enum Event`:

```swift
/// Retrieval result summary: lets the UI show what the local RAG found (§12 selling point 3).
struct RetrievalSummary: Sendable, Equatable {
    var chips: [String]
    var indicatorCount: Int
    var reportCount: Int
    var symptomCount: Int
    var diaryCount: Int

    var totalCount: Int { indicatorCount + reportCount + symptomCount + diaryCount }

    /// Merges same-named indicators into a count (keeping the retrieval order, newest → oldest);
    /// anything beyond `cap` collapses into "+N". Pure function, covered by unit tests.
    static func groupedChips(_ names: [String], cap: Int = 8) -> [String] {
        var order: [String] = []
        var counts: [String: Int] = [:]
        for n in names {
            if counts[n] == nil { order.append(n) }
            counts[n, default: 0] += 1
        }
        var chips = order.map { name -> String in
            let c = counts[name] ?? 1
            return c > 1 ? "\(name) ×\(c)" : name
        }
        if chips.count > cap {
            let overflow = chips.count - cap
            chips = Array(chips.prefix(cap)) + ["+\(overflow)"]
        }
        return chips
    }

    @MainActor
    static func from(snapshot: Snapshot) -> RetrievalSummary {
        var chips = groupedChips(snapshot.indicators.map(\.name), cap: 8)
        chips += snapshot.reports.prefix(3).map(\.title)
        chips += snapshot.symptoms.prefix(3).map(\.name)
        if !snapshot.diaries.isEmpty {
            chips.append(String(appLoc: "日记 ×\(snapshot.diaries.count)"))
        }
        return RetrievalSummary(
            chips: chips,
            indicatorCount: snapshot.indicators.count,
            reportCount: snapshot.reports.count,
            symptomCount: snapshot.symptoms.count,
            diaryCount: snapshot.diaries.count
        )
    }
}
```

Add a case to `enum Event` (place it after phaseChanged):

```swift
case retrieved(RetrievalSummary)
```

- [x] **Step 3.4: Yield .retrieved from all three flows**

`export(prompt:in:)`: after `let snapshot = Self.retrieve(...)` and before `try Task.checkCancellation()`, add:

```swift
continuation.yield(.retrieved(RetrievalSummary.from(snapshot: snapshot)))
```

`export(conversation:in:)`: add the same line after `let snapshot = Self.retrieveDialogueSnapshot(...)`.

`answer(question:conversation:in:)`: change the return type from `AsyncThrowingStream<TokenChunk, Error>` to `AsyncThrowingStream<Event, Error>`; after `let snapshot = ...`, add `continuation.yield(.retrieved(RetrievalSummary.from(snapshot: snapshot)))`; inside the loop, change `continuation.yield(TokenChunk(...))` to `continuation.yield(.token(TokenChunk(...)))`.
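
The shape of the event-based `answer` stream can be sketched in isolation. Everything below is a stand-in written for illustration: `TokenChunk`, `RetrievalSummary`, and `Event` are trimmed hypothetical copies, and `answerSketch` / `collect` are made-up helper names, not functions in HealthExportService. The point is only the ordering: the retrieval summary is yielded once, before any token.

```swift
import Foundation

// Hypothetical, trimmed stand-ins for the project's real types.
struct TokenChunk: Sendable, Equatable {
    var text: String
    var decodeRate: Double
}
struct RetrievalSummary: Sendable, Equatable {
    var chips: [String]
}
enum Event: Sendable, Equatable {
    case retrieved(RetrievalSummary)
    case token(TokenChunk)
}

/// Yields the retrieval summary first, then wraps every generated piece in `.token`.
func answerSketch(pieces: [String]) -> AsyncThrowingStream<Event, Error> {
    AsyncThrowingStream { continuation in
        continuation.yield(.retrieved(RetrievalSummary(chips: ["血压 ×3", "血糖"])))
        for piece in pieces {
            continuation.yield(.token(TokenChunk(text: piece, decodeRate: 0)))
        }
        continuation.finish()
    }
}

/// Consumer side: collects events in arrival order.
func collect(_ stream: AsyncThrowingStream<Event, Error>) async throws -> [Event] {
    var events: [Event] = []
    for try await event in stream { events.append(event) }
    return events
}
```

Because the summary arrives as an ordinary stream element, the sheet can render the chips while the model is still in prefill, without a second callback channel.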

- [x] **Step 3.5: Wire events into HealthExportSheet + chips UI**

Add state:

```swift
@State private var retrieval: HealthExportService.RetrievalSummary?
@State private var turnRetrievals: [UUID: HealthExportService.RetrievalSummary] = [:]
```

Change the consuming loop in `sendQuestion()` to:

```swift
task = Task { @MainActor in
    do {
        for try await event in stream {
            switch event {
            case .retrieved(let summary):
                withAnimation(.snappy(duration: 0.25)) {
                    turnRetrievals[assistantTurn.id] = summary
                }
            case .token(let chunk):
                appendToTurn(id: assistantTurn.id, text: chunk.text)
                if chunk.decodeRate > 0 { rate = chunk.decodeRate }
            case .phaseChanged, .completed:
                break
            }
        }
        answeringTurnID = nil
        questionFocused = true
    } catch {
        answeringTurnID = nil
        appendToTurn(id: assistantTurn.id, text: error.localizedDescription)
        questionFocused = true
    }
}
```

At the start of `startReportGeneration()`, clear `retrieval = nil`, and add to the event loop:

```swift
case .retrieved(let summary):
    withAnimation(.snappy(duration: 0.25)) { retrieval = summary }
```

In `stopGeneration()` and `reset()`, add `retrieval = nil` (`reset` additionally sets `turnRetrievals = [:]`).

In the inner VStack of `dialogueBubble` (after the role label), insert:

```swift
if !isUser, let summary = turnRetrievals[turn.id] {
    RetrievalChipsView(summary: summary)
}
```

Also switch the copy of the empty-text waiting area depending on whether a summary has already arrived:

```swift
if turn.id == answeringTurnID && turn.text.isEmpty {
    VStack(alignment: .leading, spacing: 8) {
        Text(turnRetrievals[turn.id] == nil
             ? "正在查看本地记录…"
             : "正在根据这些记录回答…")
            .font(.tjScaled(13))
            .foregroundStyle(Tj.Palette.text3)
        AIFlowBar()
    }
} else {
```

In the VStack of `phaseIndicator` (after the pills row), insert:

```swift
if let retrieval {
    RetrievalChipsView(summary: retrieval)
}
```

At the bottom of the file (before MarkdownView), add the component:

```swift
// MARK: - Retrieval result chips (local RAG visualization)

private struct RetrievalChipsView: View {
    let summary: HealthExportService.RetrievalSummary

    var body: some View {
        VStack(alignment: .leading, spacing: 6) {
            if summary.totalCount == 0 {
                Text("本地档案中暂无相关记录,将仅按你的描述整理")
                    .font(.tjScaled(11))
                    .foregroundStyle(Tj.Palette.text3)
            } else {
                Text(String(appLoc: "已在本地档案中找到 \(summary.totalCount) 条相关记录"))
                    .font(.tjScaled(11, weight: .medium))
                    .foregroundStyle(Tj.Palette.leaf)
                ScrollView(.horizontal, showsIndicators: false) {
                    HStack(spacing: 6) {
                        ForEach(Array(summary.chips.enumerated()), id: \.offset) { _, chip in
                            Text(chip)
                                .font(.tjScaled(11))
                                .foregroundStyle(Tj.Palette.text2)
                                .lineLimit(1)
                                .padding(.horizontal, 8)
                                .padding(.vertical, 4)
                                .background(Capsule().fill(Tj.Palette.sand2))
                                .overlay(Capsule().strokeBorder(Tj.Palette.lineSoft, lineWidth: 1))
                        }
                    }
                    .padding(.vertical, 1)
                }
            }
        }
        .transition(.opacity.combined(with: .move(edge: .top)))
    }
}
```

- [x] **Step 3.6: Run RetrievalSummaryTests plus the existing HealthExport-related tests; all pass. Commit**

```bash
git add 康康/Services/HealthExportService.swift 康康/Features/Archive/HealthExportSheet.swift 康康Tests/RetrievalSummaryTests.swift
git commit -m "feat(Ask): 检索过程可视化 — RAG 命中记录以 chips 展示,生成前先看见"
```

---

### Task 4: AIRuntime priority gate (user item 5a)

**Files:**
- Modify: `康康/AI/AIRuntime.swift` (gate rework + `generate` signature)
- Create: `康康Tests/InferencePriorityTests.swift`

- [x] **Step 4.1: Write the failing test**

```swift
import Testing
@testable import 康康

struct InferencePriorityTests {

    @Test func interactiveJumpsAheadOfBackground() {
        let idx = AIRuntime.gateInsertionIndex(of: .interactive,
                                               in: [.interactive, .background, .background])
        #expect(idx == 1)
    }

    @Test func interactiveKeepsFIFOAmongInteractive() {
        let idx = AIRuntime.gateInsertionIndex(of: .interactive,
                                               in: [.interactive, .interactive])
        #expect(idx == 2)
    }

    @Test func backgroundAlwaysAppends() {
        let idx = AIRuntime.gateInsertionIndex(of: .background,
                                               in: [.interactive, .background])
        #expect(idx == 2)
    }

    @Test func emptyQueueInsertsAtZero() {
        #expect(AIRuntime.gateInsertionIndex(of: .interactive, in: []) == 0)
        #expect(AIRuntime.gateInsertionIndex(of: .background, in: []) == 0)
    }
}
```

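- [x] **Step 4.2: Run the tests and confirm compilation fails**

- [x] **Step 4.3: Rework the AIRuntime gate**

At the top of the file (outside the actor), add:

```swift
/// Inference priority. interactive = the user is waiting in front of the screen (recognition / Q&A / self-test);
/// background = pre-generation (report summaries, etc.); it queues behind interactive work and can be cooperatively preempted mid-decode.
nonisolated enum InferencePriority: Sendable, Equatable {
    case interactive
    case background
}
```

Gate section (replaces the original `gateBusy`/`gateWaiters`/`acquireGate`/`releaseGate`; keep the body of the original comments and extend them):
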
```swift
private struct GateWaiter {
    let priority: InferencePriority
    let cont: CheckedContinuation<Void, Never>
}
private var gateBusy = false
private var gateHolderPriority: InferencePriority = .interactive
private var preemptRequested = false
private var gateWaiters: [GateWaiter] = []

/// interactive goes ahead of every background waiter; FIFO within the same priority. Pure function, covered by unit tests.
nonisolated static func gateInsertionIndex(of priority: InferencePriority,
                                           in waiting: [InferencePriority]) -> Int {
    guard priority == .interactive else { return waiting.count }
    return waiting.firstIndex(of: .background) ?? waiting.count
}

private func acquireGate(_ priority: InferencePriority = .interactive) async {
    if !gateBusy {
        gateBusy = true
        gateHolderPriority = priority
        return
    }
    // A foreground request hit a background holder: ask it to yield.
    // The background decode loop throws CancellationError at the next token.
    if priority == .interactive, gateHolderPriority == .background {
        preemptRequested = true
    }
    await withCheckedContinuation { (cont: CheckedContinuation<Void, Never>) in
        let idx = Self.gateInsertionIndex(of: priority, in: gateWaiters.map(\.priority))
        gateWaiters.insert(GateWaiter(priority: priority, cont: cont), at: idx)
    }
    // When resumed by releaseGate, the caller already holds the gate (gateBusy stays true).
}

private func releaseGate() {
    preemptRequested = false
    if gateWaiters.isEmpty {
        gateBusy = false
    } else {
        // Hand the gate directly to the first waiter; gateBusy stays true, leaving no gap.
        let next = gateWaiters.removeFirst()
        gateHolderPriority = next.priority
        next.cont.resume()
    }
}

/// A background holder checks this on every token: yield if a foreground request is queued.
private func shouldPreempt(_ priority: InferencePriority) -> Bool {
    priority == .background && preemptRequested
}
```
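
To see the queueing rule in action without the actor or continuations, the sketch below copies `gateInsertionIndex` as a free function and replays a mixed arrival order. `Waiter`, `enqueue`, and the waiter names are hypothetical helpers invented for this illustration; they are not part of AIRuntime.

```swift
enum InferencePriority: Sendable, Equatable {
    case interactive
    case background
}

// Same rule as AIRuntime.gateInsertionIndex, copied as a free function for this sketch.
func gateInsertionIndex(of priority: InferencePriority,
                        in waiting: [InferencePriority]) -> Int {
    guard priority == .interactive else { return waiting.count }
    return waiting.firstIndex(of: .background) ?? waiting.count
}

// Hypothetical labelled waiter so the resulting order is easy to read.
struct Waiter: Equatable {
    let name: String
    let priority: InferencePriority
}

func enqueue(_ waiter: Waiter, into queue: inout [Waiter]) {
    let idx = gateInsertionIndex(of: waiter.priority, in: queue.map(\.priority))
    queue.insert(waiter, at: idx)
}

// Arrival order: background, interactive, background, interactive.
var queue: [Waiter] = []
enqueue(Waiter(name: "summary-A", priority: .background), into: &queue)
enqueue(Waiter(name: "ask-1", priority: .interactive), into: &queue)
enqueue(Waiter(name: "summary-B", priority: .background), into: &queue)
enqueue(Waiter(name: "ask-2", priority: .interactive), into: &queue)

// Interactive waiters end up first, in their own arrival order; background keeps FIFO behind them.
print(queue.map(\.name))
```

Keeping the rule a pure function of the waiting priorities is what lets Step 4.1 test it without spinning up the actor.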

- [x] **Step 4.4: Add a priority parameter to generate + the preemption check**

`generate` signature:

```swift
func generate(prompt: String,
              maxTokens: Int = 256,
              priority: InferencePriority = .interactive) -> AsyncThrowingStream<TokenChunk, Error> {
    if InferenceEngine.current == .mnn, mnnStatus == .ready {
        return mnnGenerate(prompt: prompt, maxTokens: maxTokens, priority: priority)
    }
```

In the MLX branch's Task body, change `await self.acquireGate()` to `await self.acquireGate(priority)`; inside the loop, after `try Task.checkCancellation()`, add:

```swift
if self.shouldPreempt(priority) { throw CancellationError() }
```

Split the catch (so that cancellation/preemption propagates as CancellationError and callers can tell it apart):

```swift
} catch is CancellationError {
    continuation.finish(throwing: CancellationError())
} catch {
    continuation.finish(throwing: AIRuntimeError.inferenceFailed("\(error)"))
}
```

`mnnGenerate(prompt:maxTokens:priority:)` gets exactly the same three modifications. The `acquireGate()` calls in `prepare`/`prepareMNN`/`prepareVL`/`analyzeReport` take no argument (default interactive; model loading must not be preempted).
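
From the caller's side, the split catch is what makes "preempted" distinguishable from "failed". The following is a minimal sketch under stated assumptions: `SketchError` stands in for `AIRuntimeError.inferenceFailed`, and `fakeGenerate`, `consume`, and `Outcome` are hypothetical names used only here, not APIs from the project.

```swift
import Foundation

// Hypothetical stand-in for AIRuntimeError.inferenceFailed.
enum SketchError: Error {
    case inferenceFailed(String)
}

enum Outcome: Equatable {
    case finished(String)
    case preempted
    case failed
}

/// Hypothetical producer: yields two pieces, then ends normally (nil) or with the given error.
func fakeGenerate(endWith error: Error?) -> AsyncThrowingStream<String, Error> {
    AsyncThrowingStream { continuation in
        continuation.yield("partial")
        continuation.yield(" output")
        continuation.finish(throwing: error)
    }
}

/// Caller side: a CancellationError means "preempted or cancelled, retry later";
/// any other error is a real inference failure.
func consume(_ stream: AsyncThrowingStream<String, Error>) async -> Outcome {
    var text = ""
    do {
        for try await piece in stream { text += piece }
        return .finished(text)
    } catch is CancellationError {
        return .preempted
    } catch {
        return .failed
    }
}
```

A background caller can treat `.preempted` as a signal to drop the partial text and let a later trigger retry, while still surfacing `.failed` differently if it wants to log it.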

- [x] **Step 4.5: Run InferencePriorityTests + the device build. Commit**

```bash
git add 康康/AI/AIRuntime.swift 康康Tests/InferencePriorityTests.swift
git commit -m "feat(AI): 推理闸门双优先级 — 前台插队,后台预生成按 token 让位"
```

---

### Task 5: Report summary pre-generation (user item 3a)

**Files:**
- Create: `康康/AI/Prompts/InsightPrompts.swift`
- Create: `康康/Services/ReportInsightService.swift`
- Create: `康康Tests/InsightPromptsTests.swift`
- Modify: `康康/Features/Capture/UnifiedCaptureFlow.swift:313` (attach a background task after saving)
- Modify: `康康/Features/Timeline/TimelineEntryDetailView.swift:260-267` (turn the summary card into a component + fallback trigger)

- [x] **Step 5.1: Write the failing test InsightPromptsTests.swift**

```swift
import Testing
@testable import 康康

struct InsightPromptsTests {

    @Test func reportSummaryPromptCarriesDataAndGuards() {
        let p = InsightPrompts.reportPlainSummary(
            title: "春季体检", typeLabel: "体检报告",
            indicatorLines: "血红蛋白 118 g/L(参考 130-175)low")
        #expect(p.contains("春季体检"))
        #expect(p.contains("血红蛋白 118"))
        #expect(p.contains("/no_think"))
        #expect(p.contains("不诊断"))
        #expect(!p.contains("患者"))
    }

    @Test func trendPromptCarriesDataAndGuards() {
        let p = InsightPrompts.trendInsight(
            title: "空腹血糖", unit: "mmol/L", rangeText: ",参考 3.9-6.1",
            dataLines: "2026-05-01 5.2 / 2026-06-01 5.8")
        #expect(p.contains("空腹血糖"))
        #expect(p.contains("2026-06-01 5.8"))
        #expect(p.contains("/no_think"))
        #expect(!p.contains("患者"))
    }
}
```

- [x] **Step 5.2: Run the tests and confirm compilation fails**

- [x] **Step 5.3: Create InsightPrompts.swift**

```swift
import Foundation

/// Local insight prompts: plain-language report summary + one-sentence trend insight.
/// Red lines: no diagnosis, no drug recommendations; address the user as 「你」 and never use 「患者」 (product positioning: personal health records).
nonisolated enum InsightPrompts {

    /// Plain-language summary of the whole report (pre-generated in the background after archiving, written back to Report.summary).
    static func reportPlainSummary(title: String, typeLabel: String, indicatorLines: String) -> String {
        """
        你是健康档案助手。下面是一份报告的指标列表,请用大白话给本人(称「你」)写 2~3 句整体解读:
        - 第 1 句:总体情况(共几项、几项异常)。
        - 之后:点名最值得留意的异常项,用生活化语言说明偏高/偏低意味着什么方向。
        - 不诊断疾病、不推荐药物或剂量;异常较多时建议「带上报告咨询医生」。
        - 只输出正文文字,不要标题、列表、JSON、markdown。

        示例:
        输入:血常规(化验单),指标:白细胞 5.2 (3.5-9.5) normal;血红蛋白 118 (130-175) low;血小板 210 (125-350) normal
        输出:这份血常规共 3 项,2 项正常,血红蛋白略低于参考范围。血红蛋白偏低通常与贫血方向有关,平时可以多补充含铁食物;如果还伴随乏力头晕,建议带上报告咨询医生。

        现在的报告:\(title)(\(typeLabel))
        指标:
        \(indicatorLines)
        只输出 2~3 句正文。/no_think
        """
    }

    /// One-sentence trend insight (TrendDetailView, cached by data fingerprint).
    static func trendInsight(title: String, unit: String, rangeText: String, dataLines: String) -> String {
        """
        你是健康档案助手。下面是「\(title)」的历史记录(单位 \(unit)\(rangeText)),请用大白话给本人(称「你」)写 1~2 句趋势解读:
        - 说清整体走向(上升/下降/平稳/波动)和当前值与参考范围的关系。
        - 不诊断疾病、不推荐药物;持续异常时温和建议「复查或咨询医生」。
        - 只输出正文文字,不要标题、列表、JSON。

        示例:
        输入:体重,单位 kg,记录:2026-04-01 72.5 / 2026-04-15 71.8 / 2026-05-01 71.2
        输出:近一个月你的体重稳步下降了约 1.3kg,节奏平缓,继续保持现在的习惯就好。

        现在的记录:
        \(dataLines)
        只输出 1~2 句正文。/no_think
        """
    }
}
```

- [x] **Step 5.4: Run the tests and confirm they pass**

- [x] **Step 5.5: Create ReportInsightService.swift**

```swift
import Foundation
import SwiftData

/// Pre-generates the plain-language report summary (§3.1: the flow reaches AIRuntime through this service; UI never calls it directly).
/// Timing: runs in the background right after the archive is saved (finishing while the user keeps working); retried as a fallback when the detail page opens.
/// Write-back policy: generate only when summary is empty; never overwrite a summary that VL already produced or the user edited.
@MainActor
final class ReportInsightService {
    static let shared = ReportInsightService()
    private init() {}

    /// IDs of reports in flight, so the post-save background task and the detail-page fallback do not both fire.
    private var inFlight: Set<String> = []

    func pregenerateIfNeeded(report: Report, in ctx: ModelContext) async {
        guard (report.summary ?? "").isEmpty, !report.indicators.isEmpty else { return }
        let key = String(describing: report.persistentModelID)
        guard !inFlight.contains(key) else { return }
        inFlight.insert(key)
        defer { inFlight.remove(key) }

        do {
            try await AIRuntime.shared.prepare()
        } catch {
            return // Model not ready: give up silently; retry next time the detail page opens
        }

        let prompt = InsightPrompts.reportPlainSummary(
            title: report.title,
            typeLabel: report.type.label,
            indicatorLines: Self.indicatorLines(for: report.indicators)
        )
        var collected = ""
        do {
            let stream = await AIRuntime.shared.generate(
                prompt: prompt, maxTokens: 200, priority: .background)
            for try await chunk in stream { collected += chunk.text }
        } catch {
            return // Preempted by a foreground task (CancellationError) or inference failed: give up; the fallback path retries
        }
        let text = HealthExportService.stripThinkBlocks(collected)
            .trimmingCharacters(in: .whitespacesAndNewlines)
        guard !text.isEmpty, (report.summary ?? "").isEmpty else { return }
        report.summary = text
        try? ctx.save()
    }

    /// One line per indicator in the form "name value unit(参考 range)status"; abnormal items first, capped at 15 lines to bound prompt size.
    static func indicatorLines(for indicators: [Indicator]) -> String {
        let sorted = indicators.sorted {
            ($0.status == .normal ? 1 : 0) < ($1.status == .normal ? 1 : 0)
        }
        return sorted.prefix(15).map { i in
            var line = "\(i.name) \(i.value)"
            if !i.unit.isEmpty { line += " \(i.unit)" }
            if !i.range.isEmpty { line += "(参考 \(i.range))" }
            line += " \(i.status.rawValue)"
            return line
        }.joined(separator: "\n")
|
||
}
|
||
}
|
||
```
|
||
|
||
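The ordering rule in `indicatorLines` (abnormal items first, at most 15 lines) can be tried in isolation. The following is a minimal sketch, assuming a hypothetical `SketchIndicator` type in place of the project's real `Indicator` model; only the sort-then-cap logic mirrors the service code, and the line format is simplified.

```swift
import Foundation

// Hypothetical stand-in for the project's Indicator model (illustration only).
struct SketchIndicator {
    enum Status: String { case normal, low, high }
    let name: String
    let value: String
    let status: Status
}

/// Abnormal items sort ahead of normal ones; the result is capped so the prompt stays small.
func sketchIndicatorLines(_ indicators: [SketchIndicator], cap: Int = 15) -> String {
    let sorted = indicators.sorted {
        ($0.status == .normal ? 1 : 0) < ($1.status == .normal ? 1 : 0)
    }
    return sorted.prefix(cap)
        .map { "\($0.name) \($0.value) \($0.status.rawValue)" }
        .joined(separator: "\n")
}

let sample = [
    SketchIndicator(name: "白细胞", value: "5.2", status: .normal),
    SketchIndicator(name: "血红蛋白", value: "118", status: .low),
]
let lines = sketchIndicatorLines(sample)
print(lines)  // the low hemoglobin line comes first
```

Sorting on a two-valued key keeps the comparison cheap and means that, when the cap truncates a long report, normal items are dropped before abnormal ones.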
- [x] **Step 5.6: Attach the background task in UnifiedCaptureFlow.saveAll**

At the end of `saveAll`, change

```swift
try? ctx.save()
onClose()
```

to

```swift
try? ctx.save()
// Pre-generate the plain-language summary in the background: the user keeps working, and the detail page opens instantly.
// Low priority: any foreground AI task (another capture, Q&A) makes it yield at the next token.
Task { await ReportInsightService.shared.pregenerateIfNeeded(report: report, in: ctx) }
onClose()
```

- [x] **Step 5.7: Extract the summary card in TimelineEntryDetailView into a component**

In `reportBody`, replace

```swift
if let sum = r.summary, !sum.isEmpty {
    card {
        Text(String(appLoc: "摘要"))
            .font(.tjScaled( 12, weight: .semibold)).foregroundStyle(Tj.Palette.text2)
        Text(sum).font(.tjScaled( 14)).foregroundStyle(Tj.Palette.text)
            .fixedSize(horizontal: false, vertical: true)
    }
}
```

with

```swift
ReportSummaryCard(report: r)
```

Add this component at the end of the file (the container styling matches this file's `card` helper):

```swift
// MARK: - Report summary card (falls back to background pre-generation when there is no summary)

/// Shows the summary when present; with no summary but some indicators, triggers background
/// pre-generation (the fallback if the archive-time run was preempted). Shows the flow bar
/// while generating; SwiftData observation refreshes the text on completion.
private struct ReportSummaryCard: View {
    @Environment(\.modelContext) private var ctx
    let report: Report
    @State private var generating = false

    var body: some View {
        Group {
            if let sum = report.summary, !sum.isEmpty {
                container {
                    Text(String(appLoc: "摘要"))
                        .font(.tjScaled( 12, weight: .semibold)).foregroundStyle(Tj.Palette.text2)
                    Text(sum).font(.tjScaled( 14)).foregroundStyle(Tj.Palette.text)
                        .fixedSize(horizontal: false, vertical: true)
                }
            } else if generating {
                container {
                    Text("本地 AI 正在解读这份报告…")
                        .font(.tjScaled( 12)).foregroundStyle(Tj.Palette.text3)
                    AIFlowBar()
                }
            }
        }
        .task {
            guard (report.summary ?? "").isEmpty, !report.indicators.isEmpty else { return }
            generating = true
            await ReportInsightService.shared.pregenerateIfNeeded(report: report, in: ctx)
            generating = false
        }
    }

    private func container<C: View>(@ViewBuilder _ body: () -> C) -> some View {
        VStack(alignment: .leading, spacing: 10) { body() }
            .padding(14)
            .frame(maxWidth: .infinity, alignment: .leading)
            .background(
                RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                    .fill(Tj.Palette.paper)
            )
            .overlay(
                RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                    .strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
            )
    }
}
```

- [x] **Step 5.8: Build for the simulator and confirm the full existing test suite has no regressions. Commit**

```bash
git add 康康/AI/Prompts/InsightPrompts.swift 康康/Services/ReportInsightService.swift 康康Tests/InsightPromptsTests.swift 康康/Features/Capture/UnifiedCaptureFlow.swift 康康/Features/Timeline/TimelineEntryDetailView.swift
git commit -m "feat(Capture): 归档后后台预生成大白话摘要,详情页秒开 + 兜底重试"
```

---

### Task 6: AI trend insight + fingerprint cache (user item 3b)

**Files:**
- Create: `康康/Services/TrendInsightService.swift`
- Create: `康康Tests/TrendInsightCacheTests.swift`
- Modify: `康康/Features/Trends/TrendDetailView.swift:72,321-340` (replace the placeholder with the real card)

- [x] **Step 6.1: Write the failing test TrendInsightCacheTests.swift**

```swift
import Testing
import SwiftUI
@testable import 康康

@MainActor
struct TrendInsightCacheTests {

    private func bucket(values: [Double]) -> SeriesBucket {
        let points = values.enumerated().map { i, v in
            SeriesBucket.Point(id: "p\(i)",
                               date: Date(timeIntervalSince1970: Double(i) * 86_400),
                               value: v, status: .normal)
        }
        let line = SeriesBucket.SeriesLine(id: "glucose.fasting", seriesKey: "glucose.fasting",
                                           label: nil, color: .blue, points: points,
                                           referenceRange: 3.9...6.1)
        return SeriesBucket(id: "glucose.fasting", title: "空腹血糖", unit: "mmol/L",
                            lines: [line], latestDate: .now, kind: .monitor)
    }

    @Test func fingerprintStableForSameData() {
        let a = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
        let b = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
        #expect(a == b)
    }

    @Test func fingerprintChangesWhenDataChanges() {
        let a = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5]))
        let b = TrendInsightService.fingerprint(for: bucket(values: [5.2, 5.5, 6.0]))
        #expect(a != b)
    }

    @Test func dataLinesFormatsDateAndValue() {
        let lines = TrendInsightService.dataLines(for: bucket(values: [5.2, 5.5]))
        #expect(lines.contains("1970-01-01 5.2"))
        #expect(lines.contains("1970-01-02 5.5"))
    }

    @Test func rangeTextRendersReference() {
        #expect(TrendInsightService.rangeText(for: bucket(values: [5.2]))
                == ",参考 3.9-6.1")
    }
}
```

- [x] **Step 6.2: Run the tests and confirm the build fails**

- [x] **Step 6.3: Create TrendInsightService.swift**

```swift
import Foundation

/// One-line AI trend insight: small budget (≤140 tokens) + cache keyed by data fingerprint (UserDefaults).
/// Unchanged data is not recomputed, so the trend detail page opens instantly. Adding a record, or editing
/// one so that the point count, first/last timestamp, last value, or min/max changes, alters the
/// fingerprint and triggers regeneration.
@MainActor
final class TrendInsightService {
    static let shared = TrendInsightService()
    private init() {}

    struct Cached: Codable, Equatable {
        var fingerprint: String
        var text: String
        var generatedAt: Date
    }

    static let storePrefix = "kk.trendInsight."

    /// Data fingerprint: per line, the key + point count + first/last timestamp + last value and min/max.
    /// Small enough to use directly as the fingerprint string.
    static func fingerprint(for bucket: SeriesBucket) -> String {
        var parts: [String] = [bucket.id]
        for line in bucket.lines {
            let pts = line.points
            let first = pts.first.map { Int($0.date.timeIntervalSince1970) } ?? 0
            let last = pts.last.map { Int($0.date.timeIntervalSince1970) } ?? 0
            let lastV = pts.last?.value ?? 0
            let minV = pts.map(\.value).min() ?? 0
            let maxV = pts.map(\.value).max() ?? 0
            parts.append("\(line.seriesKey)#\(pts.count)#\(first)#\(last)#\(lastV)#\(minV)#\(maxV)")
        }
        return parts.joined(separator: "|")
    }

    /// Returns the cached text on a hit (fingerprint matches), otherwise nil.
    func cachedText(for bucket: SeriesBucket) -> String? {
        guard let data = UserDefaults.standard.data(forKey: Self.storePrefix + bucket.id),
              let c = try? JSONDecoder().decode(Cached.self, from: data),
              c.fingerprint == Self.fingerprint(for: bucket) else {
            return nil
        }
        return c.text
    }

    /// Generates an insight now and writes it to the cache. Throws when the model is not ready or the
    /// output is empty; the UI then shows "unavailable + retry".
    func generate(for bucket: SeriesBucket) async throws -> String {
        try await AIRuntime.shared.prepare()
        let prompt = InsightPrompts.trendInsight(
            title: bucket.title,
            unit: bucket.unit,
            rangeText: Self.rangeText(for: bucket),
            dataLines: Self.dataLines(for: bucket)
        )
        var collected = ""
        let stream = await AIRuntime.shared.generate(prompt: prompt, maxTokens: 140)
        for try await chunk in stream { collected += chunk.text }
        let text = HealthExportService.stripThinkBlocks(collected)
            .trimmingCharacters(in: .whitespacesAndNewlines)
        guard !text.isEmpty else { throw AIRuntimeError.inferenceFailed("空输出") }
        let cached = Cached(fingerprint: Self.fingerprint(for: bucket), text: text, generatedAt: .now)
        if let data = try? JSONEncoder().encode(cached) {
            UserDefaults.standard.set(data, forKey: Self.storePrefix + bucket.id)
        }
        return text
    }

    /// Joins the latest 24 points of each line as "yyyy-MM-dd value"; multi-line series (blood pressure)
    /// get one row each with a label prefix.
    static func dataLines(for bucket: SeriesBucket) -> String {
        let df = DateFormatter()
        df.locale = Locale(identifier: "en_US_POSIX")
        df.timeZone = TimeZone(identifier: "UTC")
        df.dateFormat = "yyyy-MM-dd"
        var lines: [String] = []
        for line in bucket.lines {
            let pts = line.points.suffix(24)
            let prefix = bucket.lines.count > 1 ? "\(line.label ?? line.seriesKey):" : ""
            let series = pts.map { "\(df.string(from: $0.date)) \(fmt($0.value))" }
                .joined(separator: " / ")
            lines.append(prefix + series)
        }
        return lines.joined(separator: "\n")
    }

    /// ",参考 lo-hi", or an empty string (the whole segment is omitted when there is no reference range).
    static func rangeText(for bucket: SeriesBucket) -> String {
        guard let r = bucket.lines.first?.referenceRange else { return "" }
        return ",参考 \(fmt(r.lowerBound))-\(fmt(r.upperBound))"
    }

    /// Whole numbers print without a decimal; everything else prints with one decimal place.
    private static func fmt(_ v: Double) -> String {
        v.truncatingRemainder(dividingBy: 1) == 0
            ? String(format: "%.0f", v)
            : String(format: "%.1f", v)
    }
}
```

Note: `dataLines` formats dates in UTC so the tests do not depend on the device time zone (the dates are only context for the model, so an offset of a few hours has no effect).

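To see which data changes the fingerprint recipe actually detects, the sketch below reimplements it over hypothetical `SketchPoint` / `SketchSeries` types (stand-ins, not the project's `SeriesBucket`). Under that assumption, appending a point changes the fingerprint, while editing a middle value that is neither the last value nor an extreme leaves it unchanged, so the cached text would still be served in that case.

```swift
import Foundation

// Hypothetical simplified types (illustration only).
struct SketchPoint { let day: Int; let value: Double }
struct SketchSeries { let key: String; let points: [SketchPoint] }

/// Same recipe as TrendInsightService.fingerprint: key + count + first/last time + last value + min/max.
func sketchFingerprint(_ s: SketchSeries) -> String {
    let first = s.points.first?.day ?? 0
    let last = s.points.last?.day ?? 0
    let lastV = s.points.last?.value ?? 0
    let values = s.points.map { $0.value }
    let minV = values.min() ?? 0
    let maxV = values.max() ?? 0
    return "\(s.key)#\(s.points.count)#\(first)#\(last)#\(lastV)#\(minV)#\(maxV)"
}

func series(_ values: [Double]) -> SketchSeries {
    SketchSeries(key: "glucose.fasting",
                 points: values.enumerated().map { SketchPoint(day: $0.offset, value: $0.element) })
}

let base = series([5.2, 5.5, 5.8])
let appended = series([5.2, 5.5, 5.8, 6.0])   // one more record
let midEdited = series([5.2, 5.6, 5.8])       // middle value edited; endpoints and extremes unchanged

let sameAfterAppend = sketchFingerprint(base) == sketchFingerprint(appended)
let sameAfterMidEdit = sketchFingerprint(base) == sketchFingerprint(midEdited)
print(sameAfterAppend, sameAfterMidEdit)
```

If mid-series edits need to invalidate the cache as well, one option (not part of this plan) would be to fold every value into the fingerprint, at the cost of a longer string.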
- [x] **Step 6.4: Run the tests and confirm they pass**

- [x] **Step 6.5: Swap the card in TrendDetailView**

In `body`, replace `aiPlaceholder` with `TrendInsightCard(bucket: bucket)`; delete the `// MARK: AI 解读占位` marker and the whole `aiPlaceholder` block; then add the following at the end of the file (before `enum TrendRange`):

```swift
// MARK: - AI trend insight card

/// On appear, check the fingerprint cache first: a hit displays instantly; a miss is computed
/// on-device (via TrendInsightService, §3.1).
private struct TrendInsightCard: View {
    let bucket: SeriesBucket
    @State private var text: String?
    @State private var running = false
    @State private var failedMessage: String?

    var body: some View {
        VStack(alignment: .leading, spacing: 8) {
            HStack(spacing: 6) {
                Image(systemName: "sparkles")
                    .font(.tjScaled( 12))
                    .foregroundStyle(Tj.Palette.ink)
                Text("AI 解读")
                    .font(.tjScaled( 12, weight: .semibold))
                    .foregroundStyle(Tj.Palette.text2)
                Spacer()
            }
            if let text {
                Text(text)
                    .font(.tjScaled( 13))
                    .lineSpacing(3)
                    .foregroundStyle(Tj.Palette.text)
                    .fixedSize(horizontal: false, vertical: true)
                AIDisclaimerFooter()
            } else if running {
                Text("本地 AI 解读中…")
                    .font(.tjScaled( 12))
                    .foregroundStyle(Tj.Palette.text3)
                AIFlowBar()
            } else if let failedMessage {
                HStack {
                    Text(failedMessage)
                        .font(.tjScaled( 12))
                        .foregroundStyle(Tj.Palette.text3)
                    Spacer()
                    Button("重试") { Task { await load(force: true) } }
                        .font(.tjScaled( 12, weight: .medium))
                        .foregroundStyle(Tj.Palette.ink)
                }
            }
        }
        .padding(14)
        .frame(maxWidth: .infinity, alignment: .leading)
        .background(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .fill(Tj.Palette.paper)
        )
        .overlay(
            RoundedRectangle(cornerRadius: Tj.Radius.md, style: .continuous)
                .strokeBorder(Tj.Palette.lineSoft, lineWidth: 1)
        )
        .task(id: bucket.id) { await load(force: false) }
    }

    @MainActor
    private func load(force: Bool) async {
        if !force, let cached = TrendInsightService.shared.cachedText(for: bucket) {
            text = cached
            return
        }
        running = true
        failedMessage = nil
        do {
            text = try await TrendInsightService.shared.generate(for: bucket)
        } catch {
            failedMessage = String(appLoc: "AI 解读暂不可用(模型未就绪或繁忙)")
        }
        running = false
    }
}
```

- [x] **Step 6.6: Run TrendInsightCacheTests + SeriesBucketTests and confirm no regressions. Commit**

```bash
git add 康康/Services/TrendInsightService.swift 康康Tests/TrendInsightCacheTests.swift 康康/Features/Trends/TrendDetailView.swift
git commit -m "feat(Trends): AI 趋势解读上线 — 数据指纹缓存,秒开不重算"
```

---

### Task 7: OCR-text-assisted report recognition (user item 4)

> **Important: QwenVL-4B is deprecated.** Report recognition here is handled by multimodal Qwen3.5-2B (MNN Omni `mnn.analyze` as the primary path, MLX `VLSession` as the fallback). OCR reference text is especially helpful when the 2B vision model reads dense, small digits.

**Files:**
- Modify: `康康/AI/Prompts/VLPrompts.swift:34-89` (add `ocrText` to `reportExtraction`, a template placeholder, and a clip function)
- Modify: `康康/Services/CaptureService.swift:137-161` (inject OCR in `runVL`)
- Create: `康康Tests/VLPromptsOCRTests.swift`

- [x] **Step 7.1: Write the failing test VLPromptsOCRTests.swift**

```swift
import Testing
@testable import 康康

struct VLPromptsOCRTests {

    @Test func emptyOCRKeepsPromptClean() {
        let p = VLPrompts.reportExtraction(ocrText: "")
        #expect(!p.contains("OCR 参考文本"))
        #expect(!p.contains("{{OCR_SECTION}}"))
        #expect(p.contains("现在请识别图片并输出 JSON"))
    }

    @Test func ocrTextIsInjectedBeforeFinalInstruction() {
        let p = VLPrompts.reportExtraction(ocrText: "尿酸 486 208-428 μmol/L")
        #expect(p.contains("OCR 参考文本"))
        #expect(p.contains("尿酸 486"))
        let ocrPos = p.range(of: "尿酸 486")!.lowerBound
        let endPos = p.range(of: "现在请识别图片并输出 JSON")!.lowerBound
        #expect(ocrPos < endPos)
    }

    @Test func clipKeepsShortTextIntact() {
        #expect(VLPrompts.clipOCR("短文本") == "短文本")
    }

    @Test func clipCutsAtLineBoundary() {
        let long = Array(repeating: "指标行 1.23 mmol/L", count: 400).joined(separator: "\n")
        let clipped = VLPrompts.clipOCR(long, limit: 200)
        #expect(clipped.count < 260)
        #expect(clipped.hasSuffix("(后续内容过长已截断)"))
        #expect(!clipped.contains("\n指标行 1.23 mmol/L(后续"))  // no partial line left behind
    }
}
```

- [x] **Step 7.2: Run the tests and confirm they fail**

- [x] **Step 7.3: Rework VLPrompts**

Change `reportExtraction` to:

```swift
static func reportExtraction(today: Date = .now, ocrText: String = "") -> String {
    let f = DateFormatter()
    f.locale = Locale(identifier: "en_US_POSIX")
    f.dateFormat = "yyyy-MM-dd"
    let todayStr = f.string(from: today)
    // OCR reference section: Vision copies digits more reliably than the 2B multimodal model reads
    // dense small print; layout still follows the image.
    let ocrSection: String
    if ocrText.isEmpty {
        ocrSection = ""
    } else {
        ocrSection = """


        OCR 参考文本(系统对同一报告做文字识别的结果,可能有错字、串行或漏行;版面与表格结构以图片为准,但数值、小数点以 OCR 文字更可靠):
        \(clipOCR(ocrText))

        """
    }
    return reportExtractionTemplate
        .replacingOccurrences(of: "{{TODAY}}", with: todayStr)
        .replacingOccurrences(of: "{{OCR_SECTION}}", with: ocrSection)
}

/// Truncates OCR text to bound how much enters the prompt (the 2B model's context is limited).
/// Cuts at the last complete line.
static func clipOCR(_ text: String, limit: Int = 1800) -> String {
    guard text.count > limit else { return text }
    let clipped = String(text.prefix(limit))
    if let lastNewline = clipped.lastIndex(of: "\n") {
        return String(clipped[..<lastNewline]) + "\n(后续内容过长已截断)"
    }
    return clipped + "\n(后续内容过长已截断)"
}
```

Near the end of `reportExtractionTemplate`, find the line

```
现在请识别图片并输出 JSON:
```

and insert a line containing `{{OCR_SECTION}}` immediately before it (that is, after example 2 and before the final instruction).

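The line-boundary behavior of `clipOCR` is easiest to check on a tiny input. The sketch below copies the clip rule from Step 7.3 into a free function (`sketchClipOCR` is an illustrative name, not part of the project) and runs it with an artificially small `limit`.

```swift
import Foundation

/// Copy of the clip rule from Step 7.3: keep at most `limit` characters,
/// then back up to the last complete line and append a truncation notice.
func sketchClipOCR(_ text: String, limit: Int) -> String {
    guard text.count > limit else { return text }
    let clipped = String(text.prefix(limit))
    if let lastNewline = clipped.lastIndex(of: "\n") {
        return String(clipped[..<lastNewline]) + "\n(后续内容过长已截断)"
    }
    return clipped + "\n(后续内容过长已截断)"
}

let ocr = "aaaa\nbbbb\ncccc"             // 14 characters
let out = sketchClipOCR(ocr, limit: 12)  // the 12-character prefix is "aaaa\nbbbb\ncc"
print(out)
// aaaa
// bbbb
// (后续内容过长已截断)
```

The partial third line (`cc`) is dropped rather than passed to the model, which avoids feeding it a number cut off mid-digit.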
- [x] **Step 7.4: Run the tests and confirm they pass**

- [x] **Step 7.5: Inject OCR in CaptureService.runVL**

Change `runVL` to:

```swift
private func runVL(on assets: [FileVault.SavedAsset]) async throws -> ParsedReport {
    do {
        try await AIRuntime.shared.prepareVL()
    } catch {
        throw CaptureError.modelNotReady
    }
    let urls = assets.map { FileVault.shared.rootURL.appendingPathComponent($0.relativePath) }
    // OCR reference (on-device Vision, <1 s per page): acts as a digit "transcriber" for the 2B
    // multimodal model, reducing misreads of small print.
    // Any failure falls back silently to an empty string and never blocks the main recognition flow (§3.2).
    let ocr = await Self.ocrReference(for: urls)
    let raw: String
    do {
        raw = try await AIRuntime.shared.analyzeReport(
            imageURLs: urls,
            prompt: VLPrompts.reportExtraction(ocrText: ocr)
        )
    } catch {
        throw CaptureError.inferenceFailed("\(error)")
    }
    do {
        return try CaptureService.parseReportJSON(raw, pageCount: assets.count)
    } catch let CaptureError.parseFailed(msg) {
        throw CaptureError.parseFailed(msg)
    } catch {
        throw CaptureError.parseFailed("\(error)")
    }
}

/// Runs OCR page by page over the report images in the Vault and joins the results into the
/// reference text. At most 4 pages; pages that fail or yield no text are skipped, so the result may be "".
private static func ocrReference(for urls: [URL]) async -> String {
    var pages: [String] = []
    for (idx, url) in urls.prefix(4).enumerated() {
        guard let src = CGImageSourceCreateWithURL(url as CFURL, nil),
              let cg = CGImageSourceCreateImageAtIndex(src, 0, nil) else { continue }
        guard let text = try? await OCRService.recognizeText(in: cg),
              !text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty else { continue }
        pages.append(urls.count > 1 ? "【第 \(idx + 1) 页】\n\(text)" : text)
    }
    return pages.joined(separator: "\n")
}
```

Add `import ImageIO` to the imports at the top of the file (UIKit is already imported).

- [x] **Step 7.6: Run VLPromptsOCRTests + CaptureServiceJSONTests, confirm no regressions, and build for device. Commit**

```bash
git add 康康/AI/Prompts/VLPrompts.swift 康康/Services/CaptureService.swift 康康Tests/VLPromptsOCRTests.swift
git commit -m "feat(Capture): 报告识别注入 Vision OCR 参考文本,提升 2B 多模态数字准确率"
```

---

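### Task 8: MNN KV cache research document (user item 5b)

**Files:**
- Create: `docs/research/mnn-kv-cache-prefix.md`

- [x] **Step 8.1: Write the research document**

Key points (based on the actual header at `Frameworks/MNN.xcframework/ios-arm64/MNN.framework/Headers/llm/llm.hpp`):

- Conclusion: the current MNN build already exposes prefix-cache support, so the prefill result for each scenario's fixed prompt template can be cached.
- Evidence: `bool setPrefixCacheFile(const std::string&, int flag)` (llm.hpp:161, with the companion private members `mPrefixCacheMode`/`mPrefixLength`/`completePrefixWrite`), `bool reuse_kv()` (llm.hpp:171, a config switch), and `void syncPromptCache(const ChatMessages&)` (llm.hpp:176).
- Applicability: every call in this project is a single-turn `response()` with a fixed template prefix plus a variable data suffix, which matches the prefix-cache model. Template sizes are roughly 900 tok for report recognition, 700 tok for export, and 300 tok for intent extraction; at the prefill rate measured by the performance self-check, that is an estimated 1~3 s saved per call.
- Risks: the semantics of `flag` are undocumented in the header; the OMNI multimodal branch is unverified; cache files are tied to the model version and need invalidation handling.
- Recommendation: integrate during the W6 polish phase, after the performance self-check card has quantified the prefill share; on a real device, run A/B 3 times each and compare `prefill_us`; on any anomaly, delete the cache file immediately to fall back. The current bottleneck is decode, so this ranks below C1/C2/Live Activity.
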
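The 1~3 s estimate is simply template tokens divided by prefill throughput. A minimal sketch of that arithmetic follows; the token counts are the approximate figures listed above, while the 300 tok/s rate is a placeholder assumption, to be replaced with the value the performance self-check card reports on the target device.

```swift
import Foundation

/// Estimated seconds saved per request if the whole fixed template prefix is served from cache.
func estimatedPrefillSavings(templateTokens: Int, prefillTokensPerSecond: Double) -> Double {
    Double(templateTokens) / prefillTokensPerSecond
}

// Approximate template sizes quoted in the research notes.
let templates = [("report recognition", 900), ("export", 700), ("intent extraction", 300)]
// ASSUMPTION: placeholder rate, not a measurement; substitute the measured prefill rate.
let assumedRate = 300.0

for (name, tokens) in templates {
    let seconds = estimatedPrefillSavings(templateTokens: tokens, prefillTokensPerSecond: assumedRate)
    print("\(name): ~\(String(format: "%.1f", seconds)) s")
}
```

With the placeholder rate the three estimates span 1.0 s to 3.0 s; a faster measured prefill rate shrinks the saving proportionally, which is why the recommendation is to measure before integrating.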
- [x] **Step 8.2: Commit**

```bash
git add docs/research/mnn-kv-cache-prefix.md
git commit -m "docs(AI): MNN prefix KV cache 调研 — setPrefixCacheFile 可用,建议 W6 量化后接入"
```

---

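### Task 9: Final verification

- [x] **Step 9.1: Full unit test run**

```bash
xcodebuild test -project 康康.xcodeproj -scheme 康康 \
  -destination 'platform=iOS Simulator,name=iPhone 17' \
  -derivedDataPath build/cli-dd -only-testing:'康康Tests' 2>&1 | tail -30
```

Expected: all tests PASS, no regressions.

- [x] **Step 9.2: Device build (real MNN branch)**

```bash
xcodebuild build -project 康康.xcodeproj -scheme 康康 \
  -destination 'generic/platform=iOS' \
  -derivedDataPath build/cli-dd CODE_SIGNING_ALLOWED=NO 2>&1 | tail -15
```

Expected: BUILD SUCCEEDED, no new warnings.

- [ ] **Step 9.3: On-device verification checklist (left to the user; cannot be completed from the code side)**

1. Performance self-check card: run MNN and MLX once each; the comparison card shows two rows of data.
2. Q&A: after asking a question, the 「已找到 N 条记录」 chips appear first, then the answer streams in.
3. Archive a report → wait 1 minute without opening the detail page → open the detail page and the summary is already there (instant).
4. Trend detail: the first visit computes the insight; leaving and re-entering is instant (cached); adding a record triggers regeneration.
5. Capture a multi-page lab report: compare numeric accuracy with and without OCR assistance.
6. Start a Q&A while a background summary is generating: the Q&A jumps the queue without noticeable delay, and the summary fills in later.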