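12. Diagnostic accuracy analysis: characteristics
• Formulating the research question
P What is the population of interest? What condition do the patients have?
I What is the index test of interest?
R What reference standard is used to evaluate the index test? What is
currently the best available test?
A What are the measures of diagnostic accuracy (sensitivity, specificity, likelihood ratios)?
T How are test results classified? How is the cut-off
determined?
E What is the intended use of the index test?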
“Chapter 4: Planning a systematic review of diagnostic test accuracy evidence”,
Synthesizing Evidence of Diagnostic Accuracy, Lippincott Williams & Wilkins, 2011
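13. Diagnostic accuracy analysis: characteristics
• Formulating the research question
P Women with suspected pregnancy
I Double decidual sac sign (DDSS) at days 32-34
R Transvaginal ultrasonography (TVS) at 7 weeks' gestation
A Sensitivity, specificity, positive and negative likelihood ratios, positive and negative predictive values
T Index test: whether or not the DDSS is visible on ultrasound.
Reference standard: expert judgement based on ultrasound
E Triage: an intrauterine pregnancy can be diagnosed accurately before the
confirmatory test, so ectopic pregnancies can be ruled out efficiently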
Richardson, A., Hopkisson, J., Campbell, B., & Raine‐Fenning, N. (2016). Use of the double decidual sac sign to
confirm intra‐uterine pregnancy location prior to ultrasonographic visualisation of embryonic contents: a diagnostic
accuracy study. Ultrasound in Obstetrics & Gynecology.
14. Characteristics of diagnostic accuracy studies
• Study design
– Essentially cross-sectional; classified as single-gate or
two-gate according to how participants are recruited
Kohn, M. A., Carpenter, C. R., & Newman, T. B. (2013). Understanding the direction of bias in
studies of diagnostic test accuracy. Academic Emergency Medicine, 20(11), 1194-1206.
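Case-control study (two-gate): diseased and non-diseased participants are drawn as separate samples
                    Disease positive (+D)   Disease negative (-D)
Test positive (+)   TP                      FP
Test negative (-)   FN                      TN

Cross-sectional study (single-gate)
                    Disease positive (+D)   Disease negative (-D)
Test positive (+)   TP                      FP
Test negative (-)   FN                      TN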
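In case-control studies, the positive and negative predictive values (PPV, NPV) and the
prevalence (apparent or true) cannot be estimated correctly, so results from studies that report them must be interpreted with caution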
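The caution about predictive values can be illustrated numerically. The sketch below applies Bayes' theorem to show how PPV and NPV shift with the proportion of diseased participants in the sample. The sensitivity (0.90), specificity (0.70), target prevalence (0.10) and 1:1 case-control ratio are hypothetical values chosen for illustration, not figures from any cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by Bayes' theorem for a given prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical test characteristics (assumptions for illustration only)
SENS, SPEC = 0.90, 0.70

# Two-gate design with equal numbers of cases and controls:
# the "prevalence" in the sample is fixed at 0.50 by the design.
ppv_two_gate, npv_two_gate = predictive_values(SENS, SPEC, 0.50)

# Single-gate design in a population whose true prevalence is 0.10.
ppv_single_gate, npv_single_gate = predictive_values(SENS, SPEC, 0.10)

print(f"two-gate    PPV={ppv_two_gate:.3f}  NPV={npv_two_gate:.3f}")
print(f"single-gate PPV={ppv_single_gate:.3f}  NPV={npv_single_gate:.3f}")
```

Sensitivity and specificity are identical in both scenarios; only the case mix differs, which is why predictive values taken from a two-gate sample do not transfer to the clinical population.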
35. How to write up a diagnostic accuracy analysis
• Survey of reporting quality
Korevaar, D. A., van Enst, W. A., Spijker, R., Bossuyt, P. M., & Hooft, L. (2014).
Reporting quality of diagnostic accuracy studies: a systematic review and meta-analysis
of investigations on adherence to STARD. Evidence Based Medicine, 19(2), 47-54.
• Reporting guideline: STARD 2015
Bossuyt, P. M., Reitsma, J. B.,
Bruns, D. E., Gatsonis, C. A., Glasziou, P. P., Irwig, L., ... & Kressel, H. Y. (2015).
STARD 2015: an updated list of essential items for reporting diagnostic accuracy
studies. Radiology, 277(3), 826-832.
Guidelines for diagnostic accuracy studies
It is important to identify the points where bias can arise and to explain and describe each of them
41. Introduction (STARD: item 3)
• Example text) Intended use (triage)
• Example text) Hypothesis
A gestation sac is the first ultrasonographic sign of an intrauterine pregnancy
(IUP). It appears as a uniformly round, hypoechoic structure with an
echogenic rim. Initially it does not contain any internal echoes and can
therefore be difficult to differentiate from a ‘pseudosac’, that is, an
endometrial fluid collection that occurs in up to 15% of ectopic pregnancies
(EPs) (1). It is clinically important not to confuse these two structures and
hence several different ultrasonographic signs have been proposed to help
differentiate between them prior to visualisation of any embryonic contents.
The double decidual sac sign (DDSS) is one such sign.
our hypothesis being that all intrauterine fluid collections that
exhibit the DDSS represent a true gestation sac.
Richardson, A., Hopkisson, J., Campbell, B., & Raine‐Fenning, N. (2016). Use of the double decidual sac sign to
confirm intra‐uterine pregnancy location prior to ultrasonographic visualisation of embryonic contents: a diagnostic
accuracy study. Ultrasound in Obstetrics & Gynecology.
45. Participants
• Results section
– Participant flow diagram
– Example
Richardson, A., Hopkisson, J., Campbell, B., & Raine‐Fenning, N. (2016). Use of the double decidual sac sign to confirm
intra‐uterine pregnancy location prior to ultrasonographic visualisation of embryonic contents: a diagnostic accuracy
study. Ultrasound in Obstetrics & Gynecology.
46. Participants
• Results section
– Participant flow diagram
• Example text
Between 1st January and 31st October 2015, 620 IVF/ICSI cycles were undertaken
within the unit. Of these, 124 (20%) women agreed to participate in the study. In
addition to these, a further six women were approached by one of the authors at
the time of embryo transfer and declined to participate in the study due to
various reasons, namely work commitments (n=3), reluctance to have a TVS
(n=2) and distance to travel to the clinic (n=1). 45 (36.3%) of the 124 women
were subsequently excluded as they had a negative urinary pregnancy test. Of
the 79 women who had a positive pregnancy test, two (2.53%) did not attend
for the index test and nine (11.39%) of those that did attend did not have an
intrauterine fluid collection present on TVS and were therefore excluded. 77
intrauterine fluid collections were observed in the remaining 68 women (nine of
the women had two intrauterine fluid collections detected).
Richardson, A., Hopkisson, J., Campbell, B., & Raine‐Fenning, N. (2016). Use of the double decidual sac sign to confirm
intra‐uterine pregnancy location prior to ultrasonographic visualisation of embryonic contents: a diagnostic accuracy
study. Ultrasound in Obstetrics & Gynecology.
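When drafting a flow description like the one quoted above, it helps to confirm that the counts and percentages are internally consistent. The following is a minimal sketch that re-derives the figures in the quoted passage; the variable names are illustrative, and the counts are taken directly from the quoted text.

```python
# Counts as stated in the quoted flow description
cycles = 620
consented = 124
negative_pregnancy_test = 45
did_not_attend = 2
no_fluid_collection = 9
women_with_two_collections = 9


def pct(part, whole):
    """Percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


positive_pregnancy_test = consented - negative_pregnancy_test
women_analysed = positive_pregnancy_test - did_not_attend - no_fluid_collection
fluid_collections = women_analysed + women_with_two_collections

print("consented:", pct(consented, cycles), "% of cycles")
print("positive test:", positive_pregnancy_test)
print("women analysed:", women_analysed)
print("fluid collections:", fluid_collections)
```

Each stage's denominator is the number remaining from the previous stage, which is the same bookkeeping a flow diagram makes visible.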
48. Participants
• Methods section
– Study design
– Criteria for identifying eligible participants
– When and where were eligible participants identified?
• Example text
Participants were recruited prospectively from Nurture Fertility,
Nottingham, United Kingdom between 1st January and 31st October
2015. Women were aged between 18 and 45 years of age and had
undergone IVF/ICSI treatment using a standard long agonist or antagonist
protocol depending on ovarian reserve tests as previously described (13).
The study was well advertised within the IVF unit using posters and
patient information leaflets. Whenever possible, one of the authors (AR)
was also present to discuss the study with women following their embryo
transfer procedure. All women were invited to participate in the study.
49. Participants
• Methods section
– Details of the inclusion (or exclusion) criteria
• Example text
Women were excluded from the study if they had a negative urinary
pregnancy test (performed 18 days after oocyte retrieval in a fresh cycle or
13-16 days after embryo transfer in a frozen embryo replacement cycle
depending on the stage of embryo development at the time of transfer) or if,
at the time of the index test, there was either no ultrasonographic evidence
of an intrauterine fluid collection, or a yolk sac and/or fetal pole was clearly
visible within the intrauterine fluid collection. Women were also excluded if
no outcome data were available or if, following the reference standard, the
final diagnosis was not known (for example resolving or persistent
pregnancies of unknown location).
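50. Participants
• Results section
– State the interval between the index test and the reference standard, and any
clinical interventions carried out in between
• Example text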
If the urinary pregnancy test was positive, an early ultrasound scan was
scheduled for either 19 or 20 days after oocyte retrieval corresponding to a
gestational age of 33 or 34 days. This range was specifically chosen to
optimize the chances of a gestation sac being present but a yolk sac or fetal
pole being absent (14, 15).
All women were scheduled to have a routine viability ultrasound scan at
between 6 and 7 weeks gestation (between 8 and 16 days after the index
test) as per the fertility unit’s standard practice.
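The first paragraph describes the index test; the second describes the reference standard.
Check for verification bias (differential verification bias) ②
* No clinical interventions took place in this study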
51. Participants
• Results section
– Baseline demographics and clinical characteristics
• Example text (Table 1)
The baseline characteristics of
study participants are illustrated in
Table 1 (values refer to mean
±standard deviation). These were
not significantly different from
the baseline characteristics of the
general population attending the
IVF unit during the same time
period.
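54. Test methods
• Methods section
Blinding of tests: example text
Reference standard:
Interpretation of the reference standard was performed by
an experienced gynaecologist without knowledge of the
findings from the index test.
Index test:
The findings from the early scan were interpreted
immediately and recorded separate to the main clinical
notes.
Because the index test is performed before the reference standard, the reader of the index test
cannot know the reference standard result
Risk-of-bias assessment item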
55. Test methods
• Results section
– Cross-tabulation
• Example text
Of the six intrauterine fluid collections that did not display
the DDSS, four were subsequently proven to have an IUP and
two were found to have an EP (Table 2).
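Although not reported in this study, the number of participants for whom the reference standard or the index test
could not establish a diagnosis is also important for understanding test performance and should be reported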
56. Test methods
• Results section
– Diagnostic accuracy measures and their precision (confidence intervals)
• Example text (binary test)
The DDSS therefore has a sensitivity of 93.9% (95% CI 85.0%-98.3%), specificity of 100% (95% CI
15.8%-100%) and overall diagnostic accuracy of 94.0% (95% CI 88.3%-99.7%) for predicting an
IUP. The positive and negative predictive values are 100% (95% CI 94.1%-100%) and 33.3% (95%
CI 4.3%-77.7%) respectively whilst the positive likelihood ratio was infinite and the negative
likelihood ratio was 0.06 (95% CI 0.02-0.16).
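The measures in the quoted passage all derive from a single 2x2 table. The sketch below computes them together with exact (Clopper-Pearson) binomial confidence intervals, found by bisection so that only the standard library is needed. The cell counts (TP = 62, FN = 4, FP = 0, TN = 2) are an assumption: they were back-calculated from the reported percentages and the six DDSS-negative collections described in the text, since Table 2 is not reproduced here. Whether the authors used exact intervals is also an assumption.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n, solved by bisection."""

    def bisect(p_is_too_small):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if p_is_too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: p at which P(X >= x) equals alpha/2
    lower = 0.0 if x == 0 else bisect(
        lambda p: 1 - binom_cdf(x - 1, n, p) < alpha / 2)
    # Upper limit: p at which P(X <= x) equals alpha/2
    upper = 1.0 if x == n else bisect(
        lambda p: binom_cdf(x, n, p) > alpha / 2)
    return lower, upper


# Assumed cell counts, back-calculated from the reported percentages
tp, fn, fp, tn = 62, 4, 0, 2

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
negative_lr = (1 - sensitivity) / specificity
# The positive likelihood ratio divides by (1 - specificity), which is zero here,
# so it is reported as infinite rather than computed.

for name, x, n in [("sensitivity", tp, tp + fn), ("specificity", tn, tn + fp),
                   ("PPV", tp, tp + fp), ("NPV", tn, tn + fn)]:
    lo, hi = exact_ci(x, n)
    print(f"{name}: {x / n:.3f} (95% CI {lo:.3f} to {hi:.3f})")
print(f"negative LR: {negative_lr:.2f}")
```

The very wide interval for specificity reflects that it rests on only two non-IUP cases, which is exactly why the slide asks for confidence intervals alongside point estimates.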
57. Test methods
• Results section
– Adverse events
• Report any adverse events caused by performing the (index or reference) test
• Example text
No adverse events from performing the index
test or reference standard were reported.
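60. Analysis
• Sources of variability
1. Patient covariates
Demographics, type of symptoms, comorbidities, study site, etc.
2. Factors related to the target condition
Severity, region, etc.
3. Factors related to the test device or modality
Changes in accuracy as equipment ages, etc.
4. Reader (rater) factors
Level of expertise, etc.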
Obuchowski, N. A., & McClish, D. K. (2011). Statistical methods in diagnostic medicine. Wiley.
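61. Analysis
• Sample size planning
– State the rationale for the sample size explicitly.
• Among diagnostic accuracy studies of depression, only 3% describe how the
sample size was determined
• Among diagnostic accuracy studies of depression, only 8% have a confidence interval for sensitivity
no wider than 10%, and in 62% the 95% confidence interval is 21% or wider
Sample size planning needs to consider not only the point estimate of accuracy
but also its precision (confidence interval width)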
Thombs, B. D., & Rice, D. B. (2016). Sample sizes and precision of estimates of sensitivity and specificity from primary studies on the
diagnostic accuracy of depression screening tools: a survey of recently published studies. International journal of methods in psychiatric
research, 25(2), 145-152.
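62. Analysis
• Sample size methods
– Diagnostic accuracy of a single test
• Sensitivity and specificity of a binary test
• ROC of a continuous test
– Comparing the diagnostic accuracy of two tests
• Sensitivity and specificity of a binary test
• ROC of a continuous test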
Hajian-Tilaki, K. (2014). Sample size estimation in diagnostic test studies of biomedical
informatics. Journal of biomedical informatics, 48, 193-204.
63. Analysis
• Sample size: sensitivity and specificity of a binary test
Hajian-Tilaki, K. (2014). Sample size estimation in diagnostic test studies of biomedical
informatics. Journal of biomedical informatics, 48, 193-204.
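$$n_{Se} = \frac{Z_{\alpha/2}^{2}\, P(1-P)}{d^{2} \times Prev}, \qquad n_{Sp} = \frac{Z_{\alpha/2}^{2}\, P(1-P)}{d^{2} \times (1-Prev)}$$

P = sensitivity (in $n_{Se}$) or specificity (in $n_{Sp}$)
$Z_{\alpha/2}$ = 1.96 (α = 0.05)
d = precision (margin of error)
Prev = prevalence

With significance level 0.05, sensitivity 0.90, specificity 0.70, precision 0.07 and
prevalence 0.10, the required sample sizes are:

Sensitivity:
$$n_{Se} = \frac{1.96^{2} \times 0.90 \times 0.10}{0.07^{2} \times 0.10} \approx 706$$

Specificity:
$$n_{Sp} = \frac{1.96^{2} \times 0.70 \times 0.30}{0.07^{2} \times (1-0.10)} \approx 183$$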
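The calculation on this slide can be sketched in a few lines. The functions below divide the basic binomial sample size by Prev for sensitivity and by 1 − Prev for specificity, which is the form that reproduces the figures quoted from Gao et al. (2016) on the next slide. Function names are illustrative, and rounding up to the next whole participant is an assumption about how the result would be used.

```python
from math import ceil

Z_ALPHA_2 = 1.96  # two-sided alpha = 0.05, as on the slide


def n_for_sensitivity(sens, d, prev, z=Z_ALPHA_2):
    """Total sample size so that sensitivity is estimated within +/- d."""
    return z**2 * sens * (1 - sens) / (d**2 * prev)


def n_for_specificity(spec, d, prev, z=Z_ALPHA_2):
    """Total sample size so that specificity is estimated within +/- d."""
    return z**2 * spec * (1 - spec) / (d**2 * (1 - prev))


# Slide example: sensitivity 0.90, specificity 0.70, d = 0.07, prevalence 0.10
n_se = n_for_sensitivity(0.90, 0.07, 0.10)
n_sp = n_for_specificity(0.70, 0.07, 0.10)
print(ceil(n_se), ceil(n_sp))

# Cross-check against the Gao et al. example: 99% / 98%, d = 0.05, prevalence 0.40
gao_se = n_for_sensitivity(0.99, 0.05, 0.40)
gao_sp = n_for_specificity(0.98, 0.05, 0.40)
print(round(gao_se, 1), round(gao_sp, 1), ceil(max(gao_se, gao_sp)))
```

The study should recruit the larger of the two totals so that both estimates meet the precision target.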
64. Analysis
• Sample size: sensitivity and specificity of a binary test
– Example text
Our sample size calculation was based on the following formula as described by
Karimollah (16). [...] As for our study, the predetermined values of sensitivity
and specificity were 99% and 98%, respectively, Zα/2=1.96, and the margin of
error (d) was set as ±5%, which yielded results that would be accurate to
within ±5 percentage points. Based on the formula, the sample sizes for
sensitivity and specificity were 15 and 30, respectively. Subsequently, the overall
sample sizes for sensitivity and specificity were calculated using the following
formulae, respectively: [...] Prev denotes the prevalence of disease in the
population. The prevalence of disease in the population was 40% in our present
study, and thus the overall sample sizes calculated based on sensitivity and
specificity were 38.0 and 50.2, respectively. The maximum total number of
participants based on sensitivity and specificity was 50.2, and thus a sample size
of 51 was finally selected in our study.
Gao, J., Wu, H., Wang, L., Zhang, H., Duan, H., Lu, J., & Liang, Z. (2016). Validation of targeted next-generation sequencing
for RAS mutation detection in FFPE colorectal cancer tissues: comparison with Sanger sequencing and ARMS-Scorpion
real-time PCR. BMJ open, 6(1), e009532.
65. Analysis
• Sample size: continuous test, ROC AUC
Hajian-Tilaki, K. (2014). Sample size estimation in diagnostic test studies of biomedical
informatics. Journal of biomedical informatics, 48, 193-204.
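$$n = \frac{Z_{\alpha/2}^{2}\, V(AUC)}{d^{2}}$$

$$V(AUC) = \left(0.0099 \times e^{-a^{2}/2}\right) \times \left(6a^{2} + 16\right)$$

$$a = \varphi^{-1}(AUC) \times 1.414$$

where $\varphi^{-1}$ is the inverse of the standard normal cumulative distribution function.

For AUC = 0.70 and precision d = 0.07, the required sample size is:

$$a = \varphi^{-1}(0.70) \times 1.414 = 0.741502$$

$$V(AUC) = \left(0.0099 \times e^{-0.741502^{2}/2}\right) \times \left(6 \times 0.741502^{2} + 16\right) = 0.145136$$

$$n = \frac{1.96^{2} \times 0.145136}{0.07^{2}} \approx 114$$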
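The worked example for a single AUC can be reproduced with the standard library. This is a minimal sketch of the formula on this slide; the function names are illustrative, and it uses sqrt(2) where the slide uses the rounded constant 1.414, which does not change the result after rounding up.

```python
from math import ceil, exp, sqrt
from statistics import NormalDist


def auc_variance(auc):
    """Variance term V(AUC) from the slide's binormal approximation."""
    a = NormalDist().inv_cdf(auc) * sqrt(2)  # slide uses 1.414
    return 0.0099 * exp(-a**2 / 2) * (6 * a**2 + 16)


def n_for_auc(auc, d, z=1.96):
    """Sample size needed to estimate an AUC within +/- d."""
    return z**2 * auc_variance(auc) / d**2


v = auc_variance(0.70)
n = n_for_auc(0.70, 0.07)
print(f"V(AUC) = {v:.4f}, n = {ceil(n)}")
```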
66. Analysis
• Sample size: binary test, comparison of two tests
Hajian-Tilaki, K. (2014). Sample size estimation in diagnostic test studies of biomedical
informatics. Journal of biomedical informatics, 48, 193-204.
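$$n = \frac{\left[ Z_{\alpha/2}\sqrt{2\bar{P}(1-\bar{P})} + Z_{\beta}\sqrt{P_1(1-P_1) + P_2(1-P_2)} \right]^{2}}{(P_1 - P_2)^{2}}$$

$\bar{P}$ = mean of the sensitivities (or specificities) of the two tests
$P_1$ = sensitivity (or specificity) of one test
$P_2$ = sensitivity (or specificity) of the other test
$Z_{\alpha/2}$ = 1.96 (α = 0.05), $Z_{\beta}$ = 0.84 (power 1 − β = 0.80)

For $P_1$ = 0.70 and $P_2$ = 0.80 (so $\bar{P}$ = 0.75):

$$n = \frac{\left[ 1.96\sqrt{2 \times 0.75 \times (1-0.75)} + 0.84\sqrt{0.70(1-0.70) + 0.80(1-0.80)} \right]^{2}}{(0.10)^{2}} \approx 293$$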
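The comparison formula above can be checked with a short sketch. The function name is illustrative; the Z values of 1.96 and 0.84 are those given on the slide, and the inputs 0.70 and 0.80 are the values that appear in the slide's worked example.

```python
from math import ceil, sqrt


def n_compare_proportions(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Sample size for detecting a difference between two sensitivities
    (or two specificities), following the formula on the slide."""
    p_bar = (p1 + p2) / 2
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (null_term + alt_term) ** 2 / (p1 - p2) ** 2


n_compare = n_compare_proportions(0.70, 0.80)
print(ceil(n_compare))
```

Because the denominator is the squared difference, halving the detectable difference roughly quadruples the required sample size.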
67. Analysis
• Sample size: continuous test, ROC AUC, comparison
Hajian-Tilaki, K. (2014). Sample size estimation in diagnostic test studies of biomedical
informatics. Journal of biomedical informatics, 48, 193-204.
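$$n = \frac{\left[ Z_{\alpha/2}\sqrt{2V_{H_0}(AUC)} + Z_{\beta}\sqrt{V(AUC_1) + V(AUC_2)} \right]^{2}}{\left[ AUC_1 - AUC_2 \right]^{2}}$$

In $V_{H_0}(AUC)$, AUC is the mean of the AUCs of the two tests being compared.
$V(AUC_1)$, $V(AUC_2)$ and $V_{H_0}(AUC)$ are each obtained in the same way as for a single test.

For $AUC_1$ = 0.70, to detect a difference from the comparison test of $AUC_2 - AUC_1$ = 0.10
with power 0.80 at the 95% confidence level, the required sample size is:

$$n = \frac{\left[ 1.96\sqrt{2 \times 0.1348} + 0.84\sqrt{0.14514 + 0.11946} \right]^{2}}{\left[ 0.80 - 0.70 \right]^{2}} \approx 211$$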
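The AUC comparison can be sketched by reusing the single-test variance approximation for each of the three variance terms. Function names are illustrative; sqrt(2) stands in for the slide's rounded 1.414, and the Z values are the ones given on the slide.

```python
from math import ceil, exp, sqrt
from statistics import NormalDist


def auc_variance(auc):
    """Variance term V(AUC), as defined for the single-test case."""
    a = NormalDist().inv_cdf(auc) * sqrt(2)  # slide uses 1.414
    return 0.0099 * exp(-a**2 / 2) * (6 * a**2 + 16)


def n_compare_auc(auc1, auc2, z_alpha=1.96, z_beta=0.84):
    """Sample size for detecting a difference between two AUCs."""
    v_null = auc_variance((auc1 + auc2) / 2)  # variance at the mean AUC
    null_term = z_alpha * sqrt(2 * v_null)
    alt_term = z_beta * sqrt(auc_variance(auc1) + auc_variance(auc2))
    return (null_term + alt_term) ** 2 / (auc1 - auc2) ** 2


n_auc_compare = n_compare_auc(0.70, 0.80)
print(ceil(n_auc_compare))
```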
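68. Analysis
• Missing data
– It is important to report the reasons for and the proportion of missing data
Handling missing data
– Verification bias
• Begg-Greenes (BG) correction
• Multiple imputation
– Differential verification bias
• Bayesian methods
Methods have been developed for each type of missingness,
but they are not yet widely used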
de Groot, J. A., Bossuyt, P. M., Reitsma, J. B., Rutjes, A. W., Dendukuri, N., Janssen, K. J., & Moons, K. G. (2011).
Verification problems in diagnostic accuracy studies: consequences and solutions. BMJ, 343, d4770.
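As one illustration of the correction methods listed on this slide, the sketch below applies the Begg-Greenes correction for partial verification. It assumes that the decision to verify depends only on the index test result. All counts are hypothetical: 100 test-positives, all verified, and 900 test-negatives, of whom only 90 were verified.

```python
def begg_greenes(n_pos, n_neg, a, b, c, d):
    """Begg-Greenes corrected sensitivity and specificity.

    n_pos, n_neg: all participants with a positive / negative index test
    a, b: verified test-positives with / without the disease
    c, d: verified test-negatives with / without the disease
    """
    # Disease probabilities estimated from the verified subsets
    p_disease_given_pos = a / (a + b)
    p_disease_given_neg = c / (c + d)
    # Expected diseased counts if everyone had been verified
    diseased_pos = n_pos * p_disease_given_pos
    diseased_neg = n_neg * p_disease_given_neg
    healthy_pos = n_pos - diseased_pos
    healthy_neg = n_neg - diseased_neg
    sensitivity = diseased_pos / (diseased_pos + diseased_neg)
    specificity = healthy_neg / (healthy_neg + healthy_pos)
    return sensitivity, specificity


# Hypothetical counts (assumptions for illustration only)
a, b, c, d = 80, 20, 9, 81
naive_sens = a / (a + c)
naive_spec = d / (b + d)
corrected_sens, corrected_spec = begg_greenes(100, 900, a, b, c, d)

print(f"naive:     sens={naive_sens:.3f} spec={naive_spec:.3f}")
print(f"corrected: sens={corrected_sens:.3f} spec={corrected_spec:.3f}")
```

In this example the naive analysis of verified participants overstates sensitivity and understates specificity, because test-negatives were verified far less often than test-positives.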
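69. Prospective registration and protocol publication
• Only about 15% of diagnostic accuracy studies are prospectively registered [1].
• Diagnostic accuracy studies with favourable results are published sooner [2].
Prospective study registration and protocol publication are essential
for diagnostic accuracy studies as well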
[1] Korevaar DA, Bossuyt PM, Hooft L. Infrequent and incomplete registration of test accuracy studies: analysis of
recent study reports. BMJ Open. 2014;4(1):e004596.
[2] Korevaar, D. A., van Es, N., Zwinderman, A. H., Cohen, J. F., & Bossuyt, P. M. (2016). Time to publication among
completed diagnostic accuracy studies: associated with reported accuracy estimates. BMC medical research methodology,
16(1), 1.
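70. Important additions in STARD 2015
• Registration number and name of registry (item 28)
• Where the study protocol can be accessed (item 29)
• Sources of funding (item 30)
Example text) Methods section, end of article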
The study was registered with www.clinicaltrials.gov
(NCT02700789) and conducted following STARD guidelines
(12). The full study protocol can be accessed by contacting
the corresponding author.
FUNDING
University of Nottingham and Nurture Fertility
Richardson, A., Hopkisson, J., Campbell, B., & Raine‐Fenning, N. (2016). Use of the double decidual sac sign to confirm
intra‐uterine pregnancy location prior to ultrasonographic visualisation of embryonic contents: a diagnostic accuracy
study. Ultrasound in Obstetrics & Gynecology.
71. Publishing the protocol
• Publish the study protocol as a paper
• State this in the article
Ethical approval was given by the
National Research Ethics Service
Committee North-West (Cheshire)
on February 19, 2013
(13/NW/0010; 118638), and the
study protocol was published
(Macey et al. 2013).
Macey, R., Glenny, A., Walsh, T., Tickle, M., Worthington, H., Ashley, J., & Brocklehurst, P. (2015). The efficacy of screening
for common dental diseases by hygiene-therapists: a diagnostic test accuracy study. Journal of dental research,
0022034514567335.
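72. Take Home Message
• Use STARD 2015 as a guide to plan a transparent study
• Measures of diagnostic accuracy are descriptive, so they are strongly affected
by who is being tested; study design is therefore
critically important
• Design the study with the risk of bias in mind
• Prospective registration and sample size planning are essential not only for RCTs
but also for diagnostic accuracy studies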
80. Further reading
• Includes a Japanese translation of the STARD 2003 explanatory paper.
中山健夫, & 津谷喜一郎. (2008). 臨床研究と疫学研究のための国際ルール集.
• Explains the design of, and biases in, diagnostic accuracy studies in Japanese.
Hulley, S. B., et al. 木原雅子・木原正博 (trans.): 医学的研究のデザイン. 2004.
• Introductory text on diagnostic accuracy research
– Knottnerus, J. A., & Buntinx, F. (2009). The evidence base of clinical diagnosis.
Theory and methods of diagnostic research.
• Overview of diagnostic accuracy analysis: study planning and statistical methods (strong on statistical methods)
– Zhou, X. H., McClish, D. K., & Obuchowski, N. A. (2009). Statistical methods in
diagnostic medicine (Vol. 569). John Wiley & Sons.