docs(plan): implementation plan for voice dictation in the "Body Archive" (身体档案) input box
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
docs/superpowers/plans/2026-06-10-voice-export-composer.md
@@ -0,0 +1,296 @@
# "Body Archive" (身体档案) Input Box Voice Dictation Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add on-device voice dictation to the chat input box at the bottom of the "Body Archive" sheet (`HealthExportSheet`): tap the mic to start, recognized text streams into the input box in real time, tap again to stop. No LLM call, no auto-send.

**Architecture:** Reuse `SpeechDictationService` (held in `@State`). Add a static pure function `merge(prefix:partial:)` that joins "existing text + dictated text" (the only unit-testable logic). `HealthExportSheet` gains 6 `@State` properties, a mic button, and 3 flow functions. Spec: `docs/superpowers/specs/2026-06-10-voice-export-composer-design.md`.

**Tech Stack:** SwiftUI, Speech (via `SpeechDictationService`), Swift Testing.

**Project conventions:** Same as the "Read before executing" (「执行前必读」) section of `2026-06-10-voice-diary.md`: synchronized groups need no pbxproj edits; on the CLI use `DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer` plus `-derivedDataPath ./build/cli-dd`; full parallel test runs are unreliable, so run targeted tests with `-only-testing`; when committing, `git add` files one by one and leave out `Localizable.xcstrings`. **Current environment note:** xcode-select already points at the full Xcode and its license has not been accepted, so prefix `git` with `DEVELOPER_DIR=/Library/Developer/CommandLineTools` to work around it; before any `xcodebuild` call, the user must run `sudo xcodebuild -license accept`. Work directly on the `feat/mnn-sme2-runtime` branch (after the previous feature was merged, this branch became the integration branch; no separate branch is opened, to avoid branch mix-ups across concurrent sessions).

---

### Task 1: `merge(prefix:partial:)` (TDD)

**Files:**
- Test: `康康Tests/SpeechDictationMergeTests.swift` (new)
- Modify: `康康/Services/SpeechDictationService.swift` (add a static method after `isAvailable`)

- [ ] **Step 1: Write the failing tests**

Create `康康Tests/SpeechDictationMergeTests.swift`:

```swift
import Testing
@testable import 康康

struct SpeechDictationMergeTests {
    @Test func emptyPrefixReturnsPartial() {
        #expect(SpeechDictationService.merge(prefix: "", partial: "今天头晕") == "今天头晕")
    }

    @Test func plainPrefixJoinsWithSpace() {
        #expect(SpeechDictationService.merge(prefix: "已有内容", partial: "新听写")
                == "已有内容 新听写")
    }

    @Test func whitespaceTerminatedPrefixConcatsDirectly() {
        #expect(SpeechDictationService.merge(prefix: "第一行\n", partial: "新听写")
                == "第一行\n新听写")
    }

    @Test func emptyPartialKeepsPrefix() {
        #expect(SpeechDictationService.merge(prefix: "已有内容", partial: "") == "已有内容")
    }
}
```

- [ ] **Step 2: Run the tests and confirm the build fails**

```bash
cd /Users/xuhuayong/apps/康康
export DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer
xcodebuild test -project 康康.xcodeproj -scheme 康康 \
  -destination 'platform=iOS Simulator,name=iPhone 17' \
  -only-testing:'康康Tests/SpeechDictationMergeTests' \
  -derivedDataPath ./build/cli-dd CODE_SIGNING_ALLOWED=NO 2>&1 | grep -E "error:|TEST (SUCCEEDED|FAILED)" | head -5
```

Expected: `error: type 'SpeechDictationService' has no member 'merge'` (TEST FAILED).

- [ ] **Step 3: Implement merge**

In `康康/Services/SpeechDictationService.swift`, add the following after the `static var isAvailable` line:

```swift
    /// Dictation text merge: the input box already holds `prefix` when dictation starts; `partial` is kept appended after it.
    /// Empty prefix → use partial as is; prefix ends with whitespace/newline → concatenate directly; otherwise insert one space.
    static func merge(prefix: String, partial: String) -> String {
        guard !partial.isEmpty else { return prefix }
        guard !prefix.isEmpty else { return partial }
        if let last = prefix.unicodeScalars.last,
           CharacterSet.whitespacesAndNewlines.contains(last) {
            return prefix + partial
        }
        return prefix + " " + partial
    }
```

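The plan calls `merge` on every partial result with the same captured prefix and assigns the result to the draft rather than appending to it. The sketch below replays that pattern outside the app. It assumes each partial carries the whole transcript recognized so far (inferred from the assignment in Task 2 Step 5, not confirmed by the plan). The free-function copy of `merge` and the helper `replay` are illustrative names, not project code.

```swift
import Foundation

// Standalone copy of the plan's merge(prefix:partial:), written as a free
// function so this sketch runs outside the app target.
func merge(prefix: String, partial: String) -> String {
    guard !partial.isEmpty else { return prefix }
    guard !prefix.isEmpty else { return partial }
    if let last = prefix.unicodeScalars.last,
       CharacterSet.whitespacesAndNewlines.contains(last) {
        return prefix + partial
    }
    return prefix + " " + partial
}

// Hypothetical helper. Assumption: every partial is the full transcript so
// far, so each frame is rebuilt from the same prefix instead of being appended.
func replay(prefix: String, partials: [String]) -> [String] {
    partials.map { merge(prefix: prefix, partial: $0) }
}

let frames = replay(prefix: "已有内容", partials: ["今", "今天", "今天头晕"])
print(frames)
```

Because the prefix is fixed for the whole session, a later partial that revises earlier words simply replaces the dictated tail; the text that was in the box before dictation is never duplicated.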
- [ ] **Step 4: Run the tests and confirm they pass**

Same command as Step 2. Expected: `** TEST SUCCEEDED **`, with 4 test cases passing.

- [ ] **Step 5: Commit**

```bash
cd /Users/xuhuayong/apps/康康
DEVELOPER_DIR=/Library/Developer/CommandLineTools git add 康康Tests/SpeechDictationMergeTests.swift 康康/Services/SpeechDictationService.swift
DEVELOPER_DIR=/Library/Developer/CommandLineTools git commit -m "feat(语音听写): SpeechDictationService.merge 前缀拼接(TDD)"
```

---

### Task 2: Wire into HealthExportSheet

**Files:**
- Modify: `康康/Features/Archive/HealthExportSheet.swift` (state block :27-30, canAsk :38, canGenerateReport :49, quick-prompt chip :133, onDisappear :103, alert :104, composer :410)

- [ ] **Step 1: Add dictation state (after the "quick prompts" state block, before `init`)**

Insert after `@State private var newPromptText = ""`:

```swift
    // Voice dictation (spec 2026-06-10-voice-export-composer).
    // `dictation` must be @State: when the struct View is rebuilt, a plain `let` gets a fresh instance (a pitfall hit in the diary feature).
    @State private var dictation = SpeechDictationService()
    @State private var isDictating = false
    /// Text already in the input box when dictation starts; partials are always appended after it.
    @State private var dictationPrefix = ""
    @State private var dictationTask: Task<Void, Never>?
    @State private var dictationWatchdog: Task<Void, Never>?
    @State private var dictationDeniedAlert = false
    /// Recording cap, same as the diary (guards against a hanging microphone).
    private static let dictationMaxSeconds = 180
```

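The comment about `dictation` needing `@State` can be illustrated without SwiftUI. The sketch below is only a model: `Service`, `RebuiltView`, `StateBox`, and `StatefulView` are hypothetical stand-ins, and SwiftUI's real `@State` storage works differently under the hood. It shows why a plain stored property yields a new service object each time the view value is recreated, while storage that lives outside the view value does not.

```swift
// Hypothetical stand-ins: `Service` plays the role of SpeechDictationService,
// the two structs play the role of a SwiftUI view value. None is project code.
final class Service {}

struct RebuiltView {
    // A stored property's default value is evaluated on every initialization,
    // so each new view value owns a different Service.
    let service = Service()
}

// Models the idea behind @State: storage that outlives individual view values.
final class StateBox<Value> {
    var value: Value
    init(_ value: Value) { self.value = value }
}

struct StatefulView {
    let box: StateBox<Service>
    var service: Service { box.value }
}

let first = RebuiltView()
let second = RebuiltView()            // simulated rebuild
let plainLetKeepsInstance = first.service === second.service

let storage = StateBox(Service())
let before = StatefulView(box: storage)
let after = StatefulView(box: storage) // simulated rebuild, same storage
let boxedKeepsInstance = before.service === after.service

print(plainLetKeepsInstance, boxedKeepsInstance)
```

In the dictation case, losing the instance mid-session would mean `stop()` or `abort()` reaches a different object from the one that started recording, which is the hanging-microphone risk the plan guards against.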
- [ ] **Step 2: Disable send / generate / chips while recording**

Add a condition to `canAsk`:

```swift
    private var canAsk: Bool {
        !isAnswering &&
        !isGeneratingReport &&
        !isDictating &&
        !draftQuestion.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
    }
```

In `canGenerateReport`, add `!isDictating &&` after `!isGeneratingReport &&`.

Change the quick-prompt chip action (at `draftQuestion = p.prompt`) to:

```swift
guard !isDictating else { return }
draftQuestion = p.prompt
```

- [ ] **Step 3: Add the mic button to the composer and disable the TextField while recording**

Change the TextField's `.disabled(isAnswering || isGeneratingReport)` to `.disabled(isAnswering || isGeneratingReport || isDictating)`.

Insert between the TextField and the send Button:

```swift
if SpeechDictationService.isAvailable {
    Button { toggleDictation() } label: {
        Image(systemName: isDictating ? "stop.fill" : "mic.fill")
            .font(.tjScaled(15, weight: .semibold))
            .foregroundStyle(isDictating ? Tj.Palette.paper : Tj.Palette.brick)
            .frame(width: 40, height: 40)
            .background(Circle().fill(isDictating ? Tj.Palette.brick : Tj.Palette.brickSoft))
            .symbolEffect(.pulse, options: .repeating, isActive: isDictating)
    }
    .disabled(isAnswering || isGeneratingReport)
    .accessibilityLabel(isDictating ? String(appLoc: "停止听写") : String(appLoc: "语音输入"))
}
```

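Steps 2 and 3 together define which controls are blocked in which state. As a rough, self-contained model of that matrix (the struct `ComposerState` and its member names other than the four flags are hypothetical; in the plan the flags are `@State` properties on `HealthExportSheet`), the rules can be written as plain computed properties and checked without a UI:

```swift
import Foundation

// Hypothetical value type mirroring the flags the plan names.
struct ComposerState {
    var isAnswering = false
    var isGeneratingReport = false
    var isDictating = false
    var draftQuestion = ""

    // Same conditions as the plan's canAsk.
    var canAsk: Bool {
        !isAnswering &&
        !isGeneratingReport &&
        !isDictating &&
        !draftQuestion.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
    }

    // TextField rule from Step 3: locked while answering, generating, or dictating.
    var isTextFieldDisabled: Bool { isAnswering || isGeneratingReport || isDictating }

    // Mic rule from Step 3: not gated on isDictating, so it can stop the session.
    var isMicDisabled: Bool { isAnswering || isGeneratingReport }

    // Chip action from Step 2: ignored while dictating.
    mutating func applyChip(_ prompt: String) {
        guard !isDictating else { return }
        draftQuestion = prompt
    }
}

var state = ComposerState(draftQuestion: "今天头晕")
state.isDictating = true
state.applyChip("chip prompt")   // dropped: dictation owns the draft right now
print(state.canAsk, state.isTextFieldDisabled, state.isMicDisabled, state.draftQuestion)
```

The one asymmetry worth noticing is the mic button itself: it is the only control left enabled during dictation, since it doubles as the stop button.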
- [ ] **Step 4: Lifecycle and permission alert**

Change `.onDisappear { task?.cancel() }` to:

```swift
.onDisappear {
    task?.cancel()
    dictationTask?.cancel()
    dictationWatchdog?.cancel()
    dictation.abort()
}
```

After the closing `}` of the existing "add quick prompt" (「添加快捷问答」) alert, append:

```swift
.alert(String(appLoc: "需要麦克风与语音识别权限"), isPresented: $dictationDeniedAlert) {
    Button(String(appLoc: "前往设置")) {
        if let url = URL(string: UIApplication.openSettingsURLString) {
            UIApplication.shared.open(url)
        }
    }
    Button(String(appLoc: "取消"), role: .cancel) {}
} message: {
    Text("语音输入全程在本机完成,声音和文字都不会上传。请在设置中允许麦克风和语音识别。")
}
```

- [ ] **Step 5: Flow functions (after `// MARK: - Actions`, before `sendQuestion`)**

```swift
    // MARK: Voice dictation

    private func toggleDictation() {
        if isDictating { stopDictation() } else { startDictation() }
    }

    private func startDictation() {
        questionFocused = false
        dictationTask = Task { @MainActor in
            guard await dictation.requestAuthorization() else {
                dictationDeniedAlert = true
                return
            }
            do {
                dictationPrefix = draftQuestion
                try dictation.start { partial in
                    draftQuestion = SpeechDictationService.merge(prefix: dictationPrefix,
                                                                 partial: partial)
                }
                withAnimation(.snappy(duration: 0.2)) { isDictating = true }
                dictationWatchdog = Task { @MainActor in
                    try? await Task.sleep(nanoseconds: UInt64(Self.dictationMaxSeconds) * 1_000_000_000)
                    guard !Task.isCancelled, isDictating else { return }
                    stopDictation()
                }
            } catch {
                isDictating = false
            }
        }
    }

    private func stopDictation() {
        guard isDictating else { return }
        dictationWatchdog?.cancel()
        dictationTask = Task { @MainActor in
            let final = (await dictation.stop()).trimmingCharacters(in: .whitespacesAndNewlines)
            if !final.isEmpty {
                draftQuestion = SpeechDictationService.merge(prefix: dictationPrefix,
                                                             partial: final)
            }
            // Empty final: the partials are already in the input box, so keeping the
            // current text is the natural fallback (spec: no "didn't catch that" hint).
            withAnimation(.snappy(duration: 0.2)) { isDictating = false }
        }
    }
```

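Stripped of SwiftUI, authorization, and async calls, the flow above reduces to a small state machine: snapshot the draft once, rebuild the draft from that snapshot on each partial, and on stop either apply the trimmed final text or keep what is already there. The sketch below models just that. `DictationSession` and its method names are hypothetical, and the free-function `merge` is a standalone copy of the Task 1 function so the example runs on its own.

```swift
import Foundation

// Standalone copy of merge(prefix:partial:) from Task 1.
func merge(prefix: String, partial: String) -> String {
    guard !partial.isEmpty else { return prefix }
    guard !prefix.isEmpty else { return partial }
    if let last = prefix.unicodeScalars.last,
       CharacterSet.whitespacesAndNewlines.contains(last) {
        return prefix + partial
    }
    return prefix + " " + partial
}

// Hypothetical pure model of the Step 5 flow (no permissions, no audio).
struct DictationSession {
    private(set) var draft: String
    private(set) var isDictating = false
    private var prefix = ""

    init(draft: String = "") { self.draft = draft }

    mutating func start() {
        guard !isDictating else { return }
        prefix = draft            // snapshot taken once, before any partial arrives
        isDictating = true
    }

    mutating func receive(partial: String) {
        guard isDictating else { return }
        draft = merge(prefix: prefix, partial: partial)
    }

    mutating func stop(finalText: String) {
        guard isDictating else { return }
        let trimmed = finalText.trimmingCharacters(in: .whitespacesAndNewlines)
        if !trimmed.isEmpty {
            draft = merge(prefix: prefix, partial: trimmed)
        }
        // Empty final text: keep whatever the last partial left in the draft.
        isDictating = false
    }
}

var session = DictationSession(draft: "已有内容")
session.start()
session.receive(partial: "今天")
session.receive(partial: "今天头晕")
session.stop(finalText: "  ")     // recognizer returned nothing usable
print(session.draft)
```

Here the empty final result leaves the last live partial in place, which is the "keep the current state" fallback the spec asks for.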
- [ ] **Step 6: Touch the file to force a rebuild and verify**

```bash
cd /Users/xuhuayong/apps/康康
touch 康康/Features/Archive/HealthExportSheet.swift
export DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer
xcodebuild -project 康康.xcodeproj -scheme 康康 \
  -destination 'platform=iOS Simulator,name=iPhone 17' \
  -configuration Debug build -derivedDataPath ./build/cli-dd \
  CODE_SIGNING_ALLOWED=NO 2>&1 | grep -E "\.swift:[0-9]+:[0-9]+: (error|warning):|BUILD (SUCCEEDED|FAILED)"
```

Expected: `BUILD SUCCEEDED`, with no new warnings.

- [ ] **Step 7: Targeted regression (all voice-related tests)**

```bash
xcodebuild test -project 康康.xcodeproj -scheme 康康 \
  -destination 'platform=iOS Simulator,name=iPhone 17' \
  -only-testing:'康康Tests/SpeechDictationMergeTests' \
  -only-testing:'康康Tests/SpeechDictationAvailabilityTests' \
  -only-testing:'康康Tests/DiaryOrganizePromptTests' \
  -derivedDataPath ./build/cli-dd CODE_SIGNING_ALLOWED=NO 2>&1 | grep -E "Test case.*(passed|failed)|TEST (SUCCEEDED|FAILED)"
```

Expected: `** TEST SUCCEEDED **`, with 7 test cases passing.

- [ ] **Step 8: Commit**

```bash
cd /Users/xuhuayong/apps/康康
DEVELOPER_DIR=/Library/Developer/CommandLineTools git add 康康/Features/Archive/HealthExportSheet.swift
DEVELOPER_DIR=/Library/Developer/CommandLineTools git commit -m "feat(语音听写): 身体档案输入框听写实时上屏"
```

---

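### Task 3: On-device manual test checklist

- [ ] **Step 1: Confirm each item on a real device**

1. The mic button appears in the "Body Archive" composer (it is hidden on the simulator when on-device recognition is unsupported)
2. Tap mic → speak → text appears in the input box in real time; text already in the input box is kept and joined with a space
3. While recording: the input box, send, "Generate organized report" (「生成整理报告」), and the quick-prompt chips are all disabled; the mic shows the red stop state
4. Tap mic again → recording stops, the text settles, and tapping send runs the normal Q&A flow
5. Permission denied → the alert links to Settings
6. Close the sheet while recording → no crash, and the microphone indicator turns off
7. Recording stops automatically after 3 minutes
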
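Item 7 depends on the watchdog task from Task 2 Step 5: sleep for the cap, then stop unless the task was cancelled first. A minimal sketch of that pattern follows, assuming a non-negative cap in whole seconds. `watchdogNanoseconds` and `startWatchdog` are hypothetical helper names; the plan inlines this logic in `startDictation`.

```swift
// Converts a cap in seconds (such as the plan's dictationMaxSeconds = 180)
// into the value Task.sleep(nanoseconds:) expects.
// Assumption: seconds is non-negative; UInt64(_:) traps on negative input.
func watchdogNanoseconds(seconds: Int) -> UInt64 {
    UInt64(seconds) * 1_000_000_000
}

// Hypothetical helper: sleeps for the cap, then calls onTimeout unless the
// task was cancelled in the meantime. Cancellation makes Task.sleep throw,
// `try?` swallows that, and the isCancelled guard skips the timeout action.
func startWatchdog(seconds: Int,
                   onTimeout: @escaping @Sendable () async -> Void) -> Task<Void, Never> {
    Task {
        try? await Task.sleep(nanoseconds: watchdogNanoseconds(seconds: seconds))
        guard !Task.isCancelled else { return }
        await onTimeout()
    }
}

let cap = watchdogNanoseconds(seconds: 180)
let watchdog = startWatchdog(seconds: 180) { print("auto-stop") }
watchdog.cancel()   // what a manual stop does before the cap is reached
print(cap)
```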
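
---

## Self-Review Notes

- **Spec coverage:** merge pure function + unit tests (T1); `@State` ownership / live streaming into the input box / settling on stop / keeping the current text on an empty result (T2 S1, S5); mic hiding and the disable matrix (T2 S2-S3); permission alert + `onDisappear` abort + watchdog (T2 S4-S5); on-device checklist (T3). No gaps.
- **Placeholders:** none; every code step is given in full.
- **Type consistency:** `merge(prefix:partial:)` is defined in T1 and called in T2 S5 with the same signature; `dictationMaxSeconds`/`isDictating`/`dictationPrefix` are named consistently throughout; `SpeechDictationService.isAvailable/requestAuthorization/start/stop/abort` match the signatures of the existing implementation.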