Files
kangkang/康康/AI/Prompts/VLPrompts.swift
link2026 77a4ee1c37 缺少代码差异信息,无法生成具体的commit message。请提供code differences内容以便分析并生成符合Angular规范的提交信息。
当您提供代码差异后,我将按照以下格式生成:

```
<type>(<scope>): <subject>

<body>
```

其中type会根据更改类型选择(feat、fix、docs、style、refactor等),scope表示影响范围,subject简要描述变更内容,body详细说明修改内容。
2026-06-07 14:17:18 +08:00

203 lines
10 KiB
Swift

import Foundation
/// VL (Qwen3-VL) / prompt
/// : JSON,markdown
/// CaptureService 退(§3.2 退线)
enum VLPrompts {
/// JSON ( prompt ):
/// ```
/// {
/// "title": "", // , ""
/// "type": "checkup|lab|imaging|prescription|other",
/// "report_date": "YYYY-MM-DD", // ()
/// "institution": "XX ", //
/// "page_count": 1,
/// "summary": "", //
/// "indicators": [
/// {
/// "name": "",
/// "value": "3.84",
/// "unit": "mmol/L",
/// "range": "< 3.40",
/// "status": "high|low|normal",
/// "source_page": 1,
/// "source_box": [0.18, 0.42, 0.68, 0.49]
/// }
/// ]
/// }
/// ```
/// `kind` UI indicators A2() B3()
/// VL "", few-shot ,
/// prompt,退
static func reportExtraction(today: Date = .now) -> String {
let f = DateFormatter()
f.locale = Locale(identifier: "en_US_POSIX")
f.dateFormat = "yyyy-MM-dd"
let todayStr = f.string(from: today)
return reportExtractionTemplate.replacingOccurrences(of: "{{TODAY}}", with: todayStr)
}
private static let reportExtractionTemplate: String = #"""
你是一个医学体检报告识别助手。请只输出一段合法 JSON,不要解释、不要 markdown 围栏、不要任何前后缀文字。
今天的日期是 {{TODAY}}。
JSON schema(严格):
{
"title": string,
"type": "checkup" | "lab" | "imaging" | "prescription" | "other",
"report_date": "YYYY-MM-DD",
"institution": string,
"page_count": number,
"summary": string,
"indicators": [
{
"name": string,
"value": string,
"unit": string,
"range": string,
"status": "high" | "low" | "normal",
"source_page": number,
"source_box": [number, number, number, number]
}
]
}
规则:
- status 根据 value 与 range 自己判断:value > range 上限 → "high",< 下限 → "low",否则 → "normal"
- range 字段保留原文(如 "< 3.40""3.9 - 6.1""0 - 5"),不要解析成区间对象。
- 无法识别的字段填空字符串(institution / summary)。
- report_date 必须从图片中识别;实在看不清就填上面给出的「今天」({{TODAY}})。下面示例里的日期只是格式参考,不要直接抄。
- 不要发明指标。数值看不清的整行跳过;但**没有参考范围不是跳过的理由**,结论页叙述式文字(如「总胆红素: 23.0(μmol/L)↑」)同样要提取,range 填 "",status 按箭头/「偏高」等标记判断。
- 化验单一般 type = "lab",体检套餐 = "checkup"
- source_page 是该指标所在图片页码,从 1 开始。
- source_box 是该指标整行在该页图片里的归一化矩形 [x,y,width,height],左上角为 (0,0),右下角为 (1,1)。尽量框住指标名、数值、单位、参考范围和异常标记所在整行;不确定位置时填 [0,0,0,0]。
示例 1(化验单 · 单项):
输入: 一张化验单照片,只能看清「低密度脂蛋白 3.84 mmol/L 参考 <3.40」
输出:
{"title":"","type":"lab","report_date":"2026-05-25","institution":"","page_count":1,"summary":"","indicators":[{"name":"","value":"3.84","unit":"mmol/L","range":"< 3.40","status":"high","source_page":1,"source_box":[0.18,0.42,0.68,0.08]}]}
示例 2(体检 · 多项):
输入: 一份春季体检,3 项可读
输出:
{"title":"","type":"checkup","report_date":"2026-04-12","institution":"","page_count":1,"summary":"","indicators":[{"name":"","value":"3.84","unit":"mmol/L","range":"< 3.40","status":"high","source_page":1,"source_box":[0.12,0.31,0.76,0.07]},{"name":"","value":"32","unit":"U/L","range":"9 - 50","status":"normal","source_page":1,"source_box":[0.12,0.39,0.76,0.07]},{"name":"","value":"5.2","unit":"mmol/L","range":"3.9 - 6.1","status":"normal","source_page":1,"source_box":[0.12,0.47,0.76,0.07]}]}
现在请识别图片并输出 JSON:
"""#
// MARK: - ()
/// :/****()
/// indicators ,// , Report
static func regionExtraction(today: Date = .now) -> String {
let f = DateFormatter()
f.locale = Locale(identifier: "en_US_POSIX")
f.dateFormat = "yyyy-MM-dd"
let todayStr = f.string(from: today)
return regionExtractionTemplate.replacingOccurrences(of: "{{TODAY}}", with: todayStr)
}
private static let regionExtractionTemplate: String = #"""
你是一个医学化验单识别助手。下面给你的是一张化验单/体检报告的**局部照片**,通常只框住了一两行指标。
照片内容可能是表格行,也可能是**结论页的叙述式文字**(如「九、检验:(1)总胆红素(TB): 23.0(μmol/L)↑」),两种都要提取。
请只输出一段合法 JSON,不要解释、不要 markdown 围栏、不要任何前后缀文字。
今天的日期是 {{TODAY}}。
JSON schema(严格):
{
"indicators": [
{
"name": string,
"value": string,
"unit": string,
"range": string,
"status": "high" | "low" | "normal"
}
]
}
规则:
- 凡是「指标名 + 数值」清楚可读的,都要提取——**没有参考范围不是跳过的理由**。只有数值本身看不清才跳过,绝不发明指标。
- status 判断优先级:① 文字旁的箭头或标记(↑/H/偏高 → "high",↓/L/偏低 → "low")最优先;② 没有标记时再用 value 与 range 比较;③ 都没有 → "normal"
- range 字段保留原文(如 "< 3.40""3.9 - 6.1""0 - 5"),不要解析成区间对象;照片里没有参考范围就填 ""
- 识别不出单位/范围就填空字符串,不要编造。
- name 用规范指标名;如果同一行重复出现指标名(如「总胆红素(TB): 总胆红素: 23.0」),只取一次。
- 不要输出 title / institution / date / summary 等任何报告级字段,只输出 indicators 数组。
示例 1(表格单行):
输入: 局部照片,清楚可读「低密度脂蛋白 3.84 mmol/L 参考 <3.40 ↑」
输出:
{"indicators":[{"name":"","value":"3.84","unit":"mmol/L","range":"< 3.40","status":"high"}]}
示例 2(表格两行):
输入: 局部照片,清楚可读「尿酸 486 μmol/L 208-428」与「空腹血糖 5.2 mmol/L 3.9-6.1」
输出:
{"indicators":[{"name":"尿","value":"486","unit":"μmol/L","range":"208 - 428","status":"high"},{"name":"","value":"5.2","unit":"mmol/L","range":"3.9 - 6.1","status":"normal"}]}
示例 3(结论页叙述式 · 无参考范围,只有箭头):
输入: 局部照片,体检结论文字「九、检验: (1)总胆红素(TB): 总胆红素: 23.0(μmol/L)↑」,周围还有其他结论文字
输出:
{"indicators":[{"name":"","value":"23.0","unit":"μmol/L","range":"","status":"high"}]}
现在请识别这张局部照片并输出 JSON:
"""#
// MARK: - OCR (LLM , VL)
/// : Vision OCR , Qwen3-1.7B
/// 3B VL OCR (//)
static func indicatorsFromText(_ ocrText: String, today: Date = .now) -> String {
let f = DateFormatter()
f.locale = Locale(identifier: "en_US_POSIX")
f.dateFormat = "yyyy-MM-dd"
let todayStr = f.string(from: today)
return indicatorsFromTextTemplate
.replacingOccurrences(of: "{{TODAY}}", with: todayStr)
.replacingOccurrences(of: "{{OCR_TEXT}}", with: ocrText)
}
private static let indicatorsFromTextTemplate: String = #"""
你是医学化验单/体检报告的结构化助手。下面是对一张报告做 OCR 得到的纯文本,可能有错字、错位、多余符号或换行混乱。
请从中提取所有「指标名 + 数值」,只输出一段合法 JSON,不要解释、不要 markdown 围栏、不要任何前后缀文字。
今天的日期是 {{TODAY}}。
JSON schema(严格):
{
"indicators": [
{
"name": string,
"value": string,
"unit": string,
"range": string,
"status": "high" | "low" | "normal"
}
]
}
规则:
- 只提取「有明确数值」的检验/体检指标;页眉、医院名、医生签名、采样时间、栏目标题、OCR 噪声一律忽略。
- status 判断优先级:① 文本里的箭头/标记(↑/H/偏高 → "high",↓/L/偏低 → "low")最优先;② 没有标记时用 value 与 range 比较;③ 都没有 → "normal"
- range 保留原文(如 "3.9 - 6.1""< 3.40""208 - 428");OCR 把破折号写成 "--" / "~" 都归一成 " - ";没有参考范围就填 ""
- 单位识别不出就填 "",不要编造;不要发明指标;同一指标只输出一次。
- name 用规范中文指标名(行内重复的去掉,英文缩写括注可保留)。
- 数值明显是 OCR 乱码(字母混入数字)且无法判断的,跳过该行。
示例 OCR 文本:
淋巴细胞数 3.0 1.8 -- 6.3 X10^9/L
尿酸 486 208-428 μmol/L
总胆红素(TB): 23.0 (μmol/L) ↑
对应输出:
{"indicators":[{"name":"","value":"3.0","unit":"X10^9/L","range":"1.8 - 6.3","status":"normal"},{"name":"尿","value":"486","unit":"μmol/L","range":"208 - 428","status":"high"},{"name":"","value":"23.0","unit":"μmol/L","range":"","status":"high"}]}
现在请解析下面这段 OCR 文本,只输出 JSON。/no_think
OCR 文本:
{{OCR_TEXT}}
"""#
}