A modern view of propensity score analysis


Slide overview

Special lecture, 15th Biostatistics Network meeting (online), September 5, 2020


Junior associate professor at Tokyo University of Science. PhD in health sciences and MPH from the University of Tokyo. Research focus: causal inference in epidemiology and biostatistics.


Text of each slide
1.

A modern view of propensity score analysis. Tomohiro Shinozaki, Department of Information and Computer Technology, Faculty of Engineering, Tokyo University of Science. shinozaki@rs.tus.ac.jp. 15th Biostatistics Network, September 5, 2020.

2.

About me: Tomohiro Shinozaki (age 36). Education: graduated from Fujishima High School, Fukui Prefecture; B.S. in Health Sciences and Nursing, Faculty of Medicine, University of Tokyo; completed the M.P.H. program, Department of Public Health, Graduate School of Medicine, University of Tokyo; withdrew from the doctoral program in Health Sciences and Nursing at the same graduate school. (Photo: at SCT 2018, Portland, OR.) Employment: 2012- Assistant Professor, Department of Biostatistics, Graduate School of Medicine, University of Tokyo; 2019- Lecturer, Department of Information and Computer Technology, Faculty of Engineering, Tokyo University of Science. 2

3.

Me and the Biostatistics Network 3

4.

Today's talk. Propensity score analysis: adjustment (stratification or regression adjustment), weighting, and matching, with many ways to implement each. For example, the class of covariate-balancing weights: Li et al. (JASA 2018) Balancing covariates via propensity score weighting. The approaches are also combined in various ways: regression adjustment for the propensity score × standardization, Franklin et al. (SIM 2017) Comparing the performance of propensity score…; fine stratification on the propensity score × weighting, Desai et al. (Epi 2017) A propensity-score-based fine stratification approach…; propensity score matching × weighting, Austin & Stuart (SMMR 2017) Estimating the effect of treatment on…; propensity score matching × regression adjustment, Austin (SMMR 2017) Double propensity-score adjustment: A solution to…. Double-robust estimation combines a propensity score model with an outcome regression model, again in many different ways. 4

5.

Start from the estimand, not from the computational recipe. The field has been shifting away from the tradition of "confounding adjustment = estimating stratum-specific effects" toward directly estimating marginal effects for a specified target population. Kaufman (Epi 2010) Commentary: Marginalia. For better or worse, this is agnostic about (requires no assumptions on) the effects within strata of the confounders. Vansteelandt & Keiding (Epi 2010) Invited commentary: G-computation: lost in translation? Here I attempt a unified explanation of the methods from the viewpoint of g-estimation of conditional effects, based on my talk at WNAR 2019 in Portland, OR. Knowledge of generalized linear models is assumed. 5

6.

Invited Session: Recent Advances on Causal Inference in Observational Studies G-estimation of risk difference and ratio via propensity score weighting and matching Department of Information and Computer Technology Faculty of Engineering, Tokyo University of Science Tomohiro Shinozaki 2019 WNAR/IMS/JR Annual Meeting June 25, 2019 at Embassy Suites by Hilton, Portland, OR

7.

Potential outcomes
Y^a: the potential outcome that would be observed under A = a
Y^1: observed as Y if treated (A = 1); Y^0: observed as Y if untreated (A = 0)
Causal estimand: E[Y^1] − E[Y^0]
We must rely on the conditional exchangeability assumption, also known as the "no-unmeasured-confounders" assumption:
Y^a ∐ A | X (a = 0, 1)
This says "A is assigned independently of Y^a within strata of the confounders X" 7

8.

Propensity score
p_a(x) = P(A = a | X = x), where A is the treatment (0 or 1); we call p1(x) the propensity score
Balancing property of p1(X): A ∐ X | p1(X)
This says "A is independent of X among units with the same propensity score value"
Together with the conditional exchangeability assumption, the balancing property implies Y^a ∐ A | X ⇒ Y^a ∐ A | p1(X) (a = 0, 1) 8
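As a concrete illustration of estimating p1(X) and checking the balancing property, here is a minimal Python sketch (not from the slides; the arrays `A`, `X` and the helper names are hypothetical), using a main-effects logistic model and weighted standardized mean differences:

```python
import numpy as np
import statsmodels.api as sm

def estimate_ps(A, X):
    """Fit P(A = 1 | X) with a main-effects logistic regression."""
    design = sm.add_constant(X)
    fit = sm.GLM(A, design, family=sm.families.Binomial()).fit()
    return fit.predict(design)  # estimated p1(X) for every unit

def smd(x, A, w=None):
    """(Weighted) standardized mean difference of a single covariate x."""
    w = np.ones_like(x, dtype=float) if w is None else w
    m1 = np.average(x[A == 1], weights=w[A == 1])
    m0 = np.average(x[A == 0], weights=w[A == 0])
    v1 = np.average((x[A == 1] - m1) ** 2, weights=w[A == 1])
    v0 = np.average((x[A == 0] - m0) ** 2, weights=w[A == 0])
    return (m1 - m0) / np.sqrt((v1 + v0) / 2)
```

In practice one would inspect the standardized mean differences of every covariate before and after weighting or matching.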

9.

Propensity score methods
1. Stratification (or regression adjustment): estimate E[Y|A = 1, p1(X)] − E[Y|A = 0, p1(X)]; optionally, standardize the estimates over the distribution of p1(X)
2. Matching: match those with A = 1 to those with A = 0 with the same p1(X), and estimate E_matched[Y|A = 1] − E_matched[Y|A = 0]
3. Inverse probability weighting: estimate E_W[Y|A = 1] − E_W[Y|A = 0] in the weighted population with W_IP = 1/p_A(X) (see the sketch after this list)
4. G-estimation 9
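The IPW contrast in item 3 can be computed directly once p1(X) has been estimated; a minimal sketch using Hájek-type weighted means (variable names hypothetical):

```python
import numpy as np

def ipw_difference(Y, A, ps):
    """E_W[Y | A = 1] - E_W[Y | A = 0] with W_IP = 1 / p_A(X) (Hajek form)."""
    w = A / ps + (1 - A) / (1 - ps)          # 1 / p_A(X)
    mu1 = np.sum(w * A * Y) / np.sum(w * A)
    mu0 = np.sum(w * (1 - A) * Y) / np.sum(w * (1 - A))
    return mu1 - mu0
```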

10.

G-estimation
A semiparametric estimation method for the confounder-conditional effect E[Y^1|X = x] − E[Y^0|X = x]
Structural mean model (SMM): E[Y^a|A = a, X = x] − E[Y^0|A = a, X = x] = a(ψ1 + ψ2x)
(ψ1, ψ2): causal parameters; ψ2 captures effect modification by X
Semiparametric: no need to model E[Y^a|X = x] nor E[Y^0|X = x] 10

11.
G-estimation: estimating equation approach
Solve the g-estimating equations (Robins et al. 1992; Vansteelandt & Joffe 2014):
0 = ∑i (1, Xi)ᵀ {Ai − p1(Xi)}{Yi − ψ̂1Ai − ψ̂2AiXi − m0(Xi)}
Under a correctly specified SMM, Yi − ψ1Ai − ψ2AiXi equals E[Y^0|A, X] in conditional expectation
{Ai − p1(Xi)} encodes the conditional exchangeability Y^a ∐ A | X
Double-robust against misspecification of the two working models p1(X; α) = P(A = 1|X) and m0(X; β) = E[Y^0|X]
(Approximately) locally semiparametric efficient
Vansteelandt & Joffe (Statistical Science 2014). Structural nested models and g-estimation: the partially realized promise
Robins, Mark & Newey (Biometrics 1992). Estimating exposure effects by modelling the expectation of exposure conditional on confounders

11
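For the linear SMM above, the g-estimating equation is linear in (ψ1, ψ2) and can be solved in closed form. A minimal sketch assuming a single covariate X and externally supplied working-model values `ps` = p1(X) and `m0` ≈ E[Y^0|X] (for instance from an outcome regression among the untreated); all names are illustrative, and the double-robustness means a crude m0 still gives consistency when the propensity model is correct:

```python
import numpy as np

def g_estimate_linear(Y, A, X, ps, m0):
    """Closed-form solution of
    0 = sum_i (1, X_i)' {A_i - p1(X_i)} {Y_i - psi1*A_i - psi2*A_i*X_i - m0(X_i)}."""
    d = np.column_stack([np.ones_like(X), X])   # d(X) = (1, X)'
    r = A - ps                                  # A - p1(X)
    design = np.column_stack([A, A * X])        # terms multiplying (psi1, psi2)
    lhs = d.T @ (r[:, None] * design)           # 2 x 2 system matrix
    rhs = d.T @ (r * (Y - m0))                  # length-2 right-hand side
    return np.linalg.solve(lhs, rhs)            # (psi1_hat, psi2_hat)
```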

12.

Causal risk ratio
Log-linear SMM: log{E[Y^a|A = a, X = x] / E[Y^0|A = a, X = x]} = a(ψ1 + ψ2x)
G-estimating equations:
0 = ∑i (1, Xi)ᵀ {Ai − p1(Xi)}{Yi exp(−ψ̂1Ai − ψ̂2AiXi) − m0(Xi)}
Under a correctly specified SMM, Yi exp(−ψ1Ai − ψ2AiXi) equals E[Y^0|A, X] in conditional expectation 12
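The log-linear g-estimating equations are no longer linear in (ψ1, ψ2), but they can be solved numerically; a minimal sketch with SciPy's root finder, using the same hypothetical inputs as the previous sketch:

```python
import numpy as np
from scipy.optimize import root

def g_estimate_loglinear(Y, A, X, ps, m0, start=(0.0, 0.0)):
    """Numerically solve
    0 = sum_i (1, X_i)' {A_i - p1(X_i)} {Y_i exp(-psi1*A_i - psi2*A_i*X_i) - m0(X_i)}."""
    d = np.column_stack([np.ones_like(X), X])
    r = A - ps
    def ee(psi):
        h = Y * np.exp(-psi[0] * A - psi[1] * A * X) - m0
        return d.T @ (r * h)
    sol = root(ee, x0=np.asarray(start, dtype=float))
    return sol.x                                 # (psi1_hat, psi2_hat)
```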

13.

Novel approach for g-estimation: regression adjustment for the propensity score p1(X)
Fit a working linear-normal model (Vansteelandt & Daniel, 2014): E[Y|A, X] = β0 + β1A + β2X + β3AX + β4p1(X) + β5p1(X)X
Fit a working log-linear gamma model (Dukes & Vansteelandt, 2018): log E[Y|A, X] = β0 + β1A + β2X + β3AX + β4p1(X) + β5p1(X)X
Then (β̂1, β̂3) = (ψ̂1, ψ̂2) of the linear or log-linear SMM
No need for correct model specification of E[Y|A, X, p1(X)]; the working model is just a device to obtain the g-estimating equations through the GLM score equations
Vansteelandt & Daniel (Statistics in Medicine 2014). On regression adjustment for the propensity score
Dukes & Vansteelandt (American Journal of Epidemiology 2018). A note on g-estimation of causal risk ratios 13
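A minimal sketch of the propensity-score-adjusted working regression for the linear (identity-link) case: fit OLS of Y on (A, X, AX, p1(X), p1(X)·X) and read off the coefficients of A and AX. This assumes a scalar covariate X and an already-estimated `ps`, and the robust standard errors shown ignore the first-stage estimation of the propensity score:

```python
import numpy as np
import statsmodels.api as sm

def ps_adjusted_linear(Y, A, X, ps):
    """OLS fit of the working model
    E[Y|A, X] = b0 + b1*A + b2*X + b3*A*X + b4*p1(X) + b5*p1(X)*X;
    (b1, b3) serve as the linear-SMM g-estimates (psi1, psi2)."""
    design = np.column_stack([np.ones_like(X), A, X, A * X, ps, ps * X])
    fit = sm.OLS(Y, design).fit(cov_type="HC0")  # robust SEs, PS treated as fixed
    return fit.params[1], fit.params[3]
```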

14.

Weighting approach
Weighted GLM using the overlap (OL) weight: W_OL = var[A|X]/p_A(X) = p1(X)p0(X)/p_A(X) = p_{1−A}(X)
That is, W_OL = p0(X) for A = 1 and W_OL = p1(X) for A = 0
Within the class of balancing weights = IPW × reweighting (Li et al., 2017)
Li, Morgan & Zaslavsky (Journal of the American Statistical Association 2017). Balancing covariates via propensity score weighting
I show the equivalence between the OL-weighted GLM and g-estimation for the linear-normal (canonical) and log-linear Poisson (canonical) cases 14

15.

Overlap (OL) weighted GLM
Working GLM: g{E[Y|A = a, X = x; β]} = β0 + β1a + β2x + β3ax, with g(⋅) the identity or log link
Weighted score equations (where W_OL = p_{1−A}(X)):
∑i W_OL,i {Yi − Ŷ(Ai, Xi)} = 0
∑i W_OL,i Ai {Yi − Ŷ(Ai, Xi)} = 0
∑i W_OL,i Xi {Yi − Ŷ(Ai, Xi)} = 0
∑i W_OL,i Ai Xi {Yi − Ŷ(Ai, Xi)} = 0
where Ŷ(Ai, Xi) denotes the fitted mean E[Y|A = Ai, X = Xi; β̂] = g⁻¹(β̂0 + β̂1Ai + β̂2Xi + β̂3AiXi) 15
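A minimal sketch of fitting this weighted working GLM with a generic balancing weight `w` (the overlap weight is shown in the trailing comment). Only the point estimates are taken from the fit; sandwich variance estimation accounting for the estimated weights is not shown, and all names are illustrative:

```python
import numpy as np
import statsmodels.api as sm

def balancing_weighted_glm(Y, A, X, w, link="identity"):
    """Fit the working GLM g{E[Y|A, X]} = b0 + b1*A + b2*X + b3*A*X with
    weights w; (b1, b3) play the role of (psi1, psi2) on the chosen scale."""
    design = np.column_stack([np.ones_like(X), A, X, A * X])
    if link == "identity":
        fit = sm.WLS(Y, design, weights=w).fit()
    else:  # log link via a weighted Poisson working model (quasi-likelihood)
        fit = sm.GLM(Y, design, family=sm.families.Poisson(),
                     freq_weights=w).fit()
    return fit.params[1], fit.params[3]

# Overlap weight W_OL = p_{1-A}(X), with ps = estimated p1(X):
# w_ol = A * (1 - ps) + (1 - A) * ps
# psi1_hat, psi2_hat = balancing_weighted_glm(Y, A, X, w_ol, link="identity")
```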

16.

OL weighted estimating equations
Because W_OL = p_{1−A}(X) = (2A − 1){A − p1(X)}, the weighted score equations imply
∑i {Ai − p1(Xi)}{Yi − g⁻¹(β̂0 + β̂1Ai + β̂2Xi + β̂3AiXi)} = 0
If g(⋅) is the identity link: ∑i {Ai − p1(Xi)}{Yi − β̂1Ai − β̂3AiXi − (β̂0 + β̂2Xi)} = 0
If g(⋅) is the log link: ∑i {Ai − p1(Xi)}{Yi exp(−β̂1Ai − β̂3AiXi) − exp(β̂0 + β̂2Xi)} = 0
These lie within the class of g-estimating equations 0 = ∑i d(Xi){Ai − p1(Xi)}{Ŷ0,i − m0(Xi)}, where Ŷ0,i is the outcome with the estimated treatment effect removed 16

17.

Revisit propensity score matching
Randomly sample from A = 1 and from A = 0 within the same p1(X) strata
Asymptotically equivalent to the matching-weighted population (Li & Greene, 2013): W_M = min{p1(X), p0(X)}/p_A(X)
Li & Greene (International Journal of Biostatistics 2013). A weighting analogue to pair matching in propensity score analysis
Also in the class of balancing weights = IPW × reweighting (Li et al., 2017)
Li, Morgan & Zaslavsky (Journal of the American Statistical Association 2017). Balancing covariates via propensity score weighting 17
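The matching weight itself is a one-liner once the propensity score is estimated; a minimal sketch (names illustrative):

```python
import numpy as np

def matching_weight(A, ps):
    """W_M = min{p1(X), p0(X)} / p_A(X) (Li & Greene 2013)."""
    p_a = A * ps + (1 - A) * (1 - ps)            # p_A(X)
    return np.minimum(ps, 1 - ps) / p_a
```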

18.

Distributions of propensity scores in weighted/matched populations Yoshida et al. (Epidemiology 2017). Matching weights to simultaneously compare three treatment groups: comparison to three-way matching 18

19.

Matching weight: W_M = min{p1(X), p0(X)}/p_A(X)
[Figure: W_M plotted against p = p1(X) for the A = 1 and A = 0 groups. For A = 1, W_M = 1 when p ≤ 0.5 and (1 − p)/p when p > 0.5; for A = 0, W_M = p/(1 − p) when p ≤ 0.5 and 1 when p > 0.5.] 19

20.

Mirror histogram of the propensity score
[Figure: mirror histogram of p = p1(X) for A = 1 and A = 0, matching A = 1 to A = 0. Where p < 0.5, all A = 1 units are matched because the A = 1 frequency (∝ p) is smaller than the A = 0 frequency (∝ 1 − p); where p > 0.5, A = 1 units are matched to A = 0 units with matching probability (1 − p)/p < 1.] 20

21.

Matching weight: W_M = min{p1(X), p0(X)}/p_A(X)
[Figure: the same plot of W_M against p = p1(X) for A = 1 and A = 0 as on slide 19, with annotations:] Actual PS matching samples the data with the matching probability (weights of 1 if sampled, 0 if not sampled); the matching weight instead weights the data by the matching probability. 21

22.

Matching weighted GLM
Working GLM: g{E[Y|A = a, X = x; β]} = β0 + β1a + β2x + β3ax, with g(⋅) the identity or log link
Weighted score equations with W_M:
∑i W_M,i {Yi − Ŷ(Ai, Xi)} = 0
∑i W_M,i Ai {Yi − Ŷ(Ai, Xi)} = 0
∑i W_M,i Xi {Yi − Ŷ(Ai, Xi)} = 0
∑i W_M,i Ai Xi {Yi − Ŷ(Ai, Xi)} = 0
where Ŷ(Ai, Xi) denotes the fitted mean E[Y|A = Ai, X = Xi; β̂] 22
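As a usage sketch, the matching-weighted GLM can reuse the hypothetical helpers defined in the earlier sketches (`matching_weight` and `balancing_weighted_glm`), simply swapping the overlap weight for W_M:

```python
# Reusing matching_weight() and balancing_weighted_glm() from the earlier sketches;
# Y, A, X, ps are the same illustrative inputs as before.
w_m = matching_weight(A, ps)                                    # W_M
psi_rd = balancing_weighted_glm(Y, A, X, w_m, link="identity")  # difference scale
psi_rr = balancing_weighted_glm(Y, A, X, w_m, link="log")       # log-ratio scale
```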

23.

Matching weighted estimating equations
Because W_M = min{p1(X), p0(X)}/p_A(X) = (2A − 1){A − p1(X)} max{p1(X), p0(X)}⁻¹, the weighted score equations imply
∑i max{p1(Xi), p0(Xi)}⁻¹ {Ai − p1(Xi)}{Yi − g⁻¹(β̂0 + β̂1Ai + β̂2Xi)} = 0
If g(⋅) is the identity link: ∑i max{p1(Xi), p0(Xi)}⁻¹ {Ai − p1(Xi)}{Yi − β̂1Ai − (β̂0 + β̂2Xi)} = 0
If g(⋅) is the log link: ∑i max{p1(Xi), p0(Xi)}⁻¹ {Ai − p1(Xi)}{Yi exp(−β̂1Ai) − exp(β̂0 + β̂2Xi)} = 0
Again within the class of g-estimating equations: 0 = ∑i d(Xi){Ai − p1(Xi)}{Ŷ0,i − m0(Xi)} 23

24.

Summary of the results
We can obtain g-estimators on the difference or ratio scale through:
Propensity score adjusted normal/gamma GLM
Overlap weighted normal/Poisson GLM
Propensity score matched (or matching weighted) normal/Poisson GLM
In all cases, asymptotic variances can be estimated with a robust sandwich estimator 24

25.

Simulations
Covariates: X1 ~ N(0, 1), X2 ~ N(X1, 1)
Treatment: P(A = 1|X1, X2) = {1 + exp(−2X1 − 4X2)}⁻¹
Outcome: P(Y = 1|A, X) = 0.1A + 0.9{1 + exp(−0.4X1 − 0.25X2 + 0.5X1X2)}⁻¹ (linear SMM: RD = 0.1)
or P(Y = 1|A, X) = exp(A − 1){1 + exp(−0.4X1 − 0.25X2 + 0.5X1X2)}⁻¹ (log-linear SMM: log RR = 1)
Fit the misspecified regression models for E(Y|A, X) with main terms of X1 and X2:
covariate-adjusted OLS/Poisson models; PS-adjusted OLS/gamma model (g-estimator); IPW, OL weighted (g-estimator), and matching weighted (g-estimator) OLS/Poisson models
Repeat in 500 samples of size n = 1000; all estimates are accompanied by robust variance estimators 25
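A sketch of one replicate of the linear-SMM scenario of this simulation design (the `expit` helper and seed handling are illustrative, not from the slides):

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate_once(n=1000, seed=0):
    """One replicate of the linear-SMM scenario (true risk difference 0.1)."""
    rng = np.random.default_rng(seed)
    X1 = rng.normal(0.0, 1.0, n)
    X2 = rng.normal(X1, 1.0)                       # X2 ~ N(X1, 1)
    A = rng.binomial(1, expit(2 * X1 + 4 * X2))    # P(A = 1 | X1, X2)
    base = 0.9 * expit(0.4 * X1 + 0.25 * X2 - 0.5 * X1 * X2)  # E[Y^0 | X]
    Y = rng.binomial(1, 0.1 * A + base)            # linear SMM, RD = 0.1
    return Y, A, X1, X2
```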

26.

Simulation results (n = 1000)
Linear SMM (true ψ = 0.1):
Method   Ave.Est  Bias (%)  SE     95% CP
OLS      0.171    70.61     0.050  0.71
PS adj   0.100    0.36      0.068  0.95
IPW      0.109    9.05      0.174  0.73
OLW      0.100    0.37      0.068  0.94
MW       0.101    1.01      0.068  0.94
Log-linear SMM (true ψ = 1.0):
Method   Ave.Est  Bias (%)  SE     95% CP
Poi reg  1.50     49.71     0.18   0.19
PS adj   1.04     3.57      0.25   0.97
IPW      1.20     20.42     0.60   0.72
OLW      1.02     2.30      0.24   0.95
MW       1.02     2.49      0.24   0.95
Methods: covariate-adjusted OLS/Poisson models; PS-adjusted OLS/gamma model (g-estimator); IPW, OL weighted (g-estimator), and matching weighted (g-estimator) OLS/Poisson models 26

27.

Simulation results (n = 4000)
Linear SMM (true ψ = 0.1):
Method   Ave.Est  Bias (%)  SE     95% CP
OLS      0.170    70.11     0.024  0.18
PS adj   0.099    -1.49     0.033  0.96
IPW      0.094    -5.67     0.135  0.76
OLW      0.098    -1.52     0.033  0.95
MW       0.099    -1.07     0.033  0.95
Log-linear SMM (true ψ = 1.0):
Method   Ave.Est  Bias (%)  SE     95% CP
Poi reg  1.49     49.08     0.09   0.00
PS adj   1.00     0.36      0.11   0.98
IPW      1.09     8.77      0.49   0.74
OLW      1.00     0.18      0.11   0.95
MW       1.00     0.36      0.11   0.95
Methods: covariate-adjusted OLS/Poisson models; PS-adjusted OLS/gamma model (g-estimator); IPW, OL weighted (g-estimator), and matching weighted (g-estimator) OLS/Poisson models 27

28.

Simulation results (n = 10000)
Linear SMM (true ψ = 0.1):
Method   Ave.Est  Bias (%)  SE     95% CP
OLS      0.171    71.12     0.015  0.00
PS adj   0.099    -1.49     0.021  0.97
IPW      0.096    -4.45     0.117  0.76
OLW      0.098    -1.51     0.021  0.96
MW       0.099    -1.46     0.021  0.96
Log-linear SMM (true ψ = 1.0):
Method   Ave.Est  Bias (%)  SE     95% CP
Poi reg  1.49     48.94     0.06   0.00
PS adj   1.00     -0.06     0.07   0.99
IPW      1.05     4.98      0.43   0.76
OLW      1.00     -0.17     0.07   0.96
MW       1.00     -0.08     0.07   0.96
Methods: covariate-adjusted OLS/Poisson models; PS-adjusted OLS/gamma model (g-estimator); IPW, OL weighted (g-estimator), and matching weighted (g-estimator) OLS/Poisson models 28

29.

Conclusion
Distinct propensity score-based approaches (adjustment, matching, weighting) can be unified in the semiparametric g-estimation framework:
Propensity score adjusted normal/gamma GLM
Overlap weighted normal/Poisson GLM
Propensity score matched (or matching weighted) normal/Poisson GLM
Thank you for your attention! 29

31.

Overlap weight: W_OL = var(A|X)/p_A(X) = p_{1−A}(X) = (2A − 1){A − p1(X)}
[Figure: W_OL plotted against p1(X) for the A = 1 and A = 0 groups; W_OL = 1 − p1(X) for A = 1 and W_OL = p1(X) for A = 0.] 31

32.

Matching weight: W_M = min{p1(X), p0(X)}/p_A(X) = (2A − 1){A − p1(X)} max{p1(X), p0(X)}⁻¹
[Figure: W_M plotted against p = p1(X) for A = 1 and A = 0, as on slide 19; W_M = (1 − p)/p for A = 1 when p > 0.5, p/(1 − p) for A = 0 when p < 0.5, and 1 otherwise.] 32

33.

Under model misspecification
Correct SMM: E[Y^a|A = a, X = x] − E[Y^0|A = a, X = x] = a(ψ1 + ψ2x)
Fit with ψ1 only: 0 = ∑i {Ai − p1(Xi)}{Yi − ψ̂1Ai − m0(Xi)}
The estimator still converges in probability to an interpretable effect estimand 33

34.

Optimally weighted effect
E{var[A|X]Δ(X)}/E{var[A|X]} = E{p1(X)p0(X)Δ(X)}/E{p1(X)p0(X)}, where Δ(x) = E[Y^1 − Y^0 | X = x]
Optimality: the estimand that minimizes the semiparametric efficiency bound
Standardizes Δ(x) by var[A|X] = p1(X)p0(X)
If Δ(x) = Δ (constant) ⇒ average treatment effect E[Y^1] − E[Y^0]
Large weight around p1(X) = 0.5, the strata with "clinical equipoise" in the treatment decision
Crump et al. (2006). Moving the goalposts: addressing limited overlap in the estimation of average treatment effects by changing the estimand 34

35.

Matching estimand
E[min{p0(X), p1(X)}Δ(X)] / E[min{p0(X), p1(X)}]
Standardizes Δ(x) by min{p0(x), p1(x)}
If Δ(x) = Δ (constant) ⇒ average treatment effect E[Y^1] − E[Y^0]
Large weight around p1(X) = 0.5, the strata with "clinical equipoise" in the treatment decision 35

36.

Weights in the matching estimand vs. the "optimal" estimand
[Figure: the normalized weight functions of the matching estimand, min{p0(X), p1(X)}, and of the "optimal" estimand, var(A|X) = p1(X)p0(X), plotted against p = p1(X); both peak at p = 0.5.] 36

37.

Matching weighted estimator (of difference)
Model specification: P(A = 1 | X = x) = p(x; α); E[Y | A = a, X = x] = m_a(x; β)
W_M,i = min{p(Xi; α̂), 1 − p(Xi; α̂)} / [Ai p(Xi; α̂) + (1 − Ai){1 − p(Xi; α̂)}]
MW estimator: ∑ W_M,i Ai Yi / ∑ W_M,i Ai − ∑ W_M,i (1 − Ai) Yi / ∑ W_M,i (1 − Ai)
Augmented MW estimator (locally semiparametric efficient, doubly robust):
[∑ W_M,i Ai Yi / ∑ W_M,i Ai − ∑ W_M,i Ai m1(Xi; β̂) / ∑ W_M,i Ai + ∑ W_M,i m1(Xi; β̂) / ∑ W_M,i]
− [∑ W_M,i (1 − Ai) Yi / ∑ W_M,i (1 − Ai) − ∑ W_M,i (1 − Ai) m0(Xi; β̂) / ∑ W_M,i (1 − Ai) + ∑ W_M,i m0(Xi; β̂) / ∑ W_M,i] 37

38.

OL weighted estimator (of difference)
Model specification: P(A = 1 | X = x) = p(x; α); E[Y | A = a, X = x] = m_a(x; β)
W_OL,i = p(Xi; α̂){1 − p(Xi; α̂)} / [Ai p(Xi; α̂) + (1 − Ai){1 − p(Xi; α̂)}]
OL weighted estimator: ∑ W_OL,i Ai Yi / ∑ W_OL,i Ai − ∑ W_OL,i (1 − Ai) Yi / ∑ W_OL,i (1 − Ai)
Augmented OL weighted estimator (locally semiparametric efficient, doubly robust):
[∑ W_OL,i Ai Yi / ∑ W_OL,i Ai − ∑ W_OL,i Ai m1(Xi; β̂) / ∑ W_OL,i Ai + ∑ W_OL,i m1(Xi; β̂) / ∑ W_OL,i]
− [∑ W_OL,i (1 − Ai) Yi / ∑ W_OL,i (1 − Ai) − ∑ W_OL,i (1 − Ai) m0(Xi; β̂) / ∑ W_OL,i (1 − Ai) + ∑ W_OL,i m0(Xi; β̂) / ∑ W_OL,i] 38
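A sketch of the augmented (doubly robust) balancing-weight estimator of the difference as reconstructed above, applicable with either W_M or W_OL; the fitted outcome-model values `m1`, `m0` and all other names are illustrative:

```python
import numpy as np

def augmented_weighted_difference(Y, A, w, m1, m0):
    """Augmented balancing-weight estimator of the difference: each arm's
    weighted outcome mean is corrected using weighted means of the fitted
    outcome model m_a(X)."""
    def arm(ind, m):
        obs = np.sum(w * ind * Y) / np.sum(w * ind)   # weighted mean of Y in the arm
        reg = np.sum(w * ind * m) / np.sum(w * ind)   # weighted mean of m in the arm
        std = np.sum(w * m) / np.sum(w)               # weighted mean of m overall
        return obs - reg + std
    return arm(A, m1) - arm(1 - A, m0)
```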

39.

OL weighted estimator
Solution of the following g-estimating equation: 0 = ∑i {Ai − p1(Xi)}{Yi − ψ̂Ai − m0(Xi)}
(Vansteelandt & Joffe, Stat Sci 2014; Robins, Mark & Newey, Biometrics 1992)
Corresponding SMM: E[Y − Y^0 | A = a, X] = ψ⋅a, i.e., without effect modification by X
Even under misspecification of the SMM, ψ̂ →p the "optimal" effect; if correctly specified, the "optimal" effect = ψ 39

40.

1:k PS matching
Matching probability in 1:2 PS matching: W_M = min{2p1(X), p0(X)}(1/2)^A / p_A(X)
[Figure: W_M plotted against p = p1(X) for A = 1 and A = 0; for A = 1 the weight is 1 when p ≤ 1/3 and (1 − p)/2p when p > 1/3, and for A = 0 it is 2p/(1 − p) when p ≤ 1/3 and 1 when p > 1/3.] 40

41.

1:k PS matching
Generally, the matching probability in 1:k PS matching equals W_M,k = min{kp1(X), p0(X)}(1/k)^A / p_A(X) = (2A − 1){A − p1(X)} max{kp1(X), p0(X)}⁻¹ k^{1−A}
Again, the matching weighted GLM reduces to g-estimating equations: 0 = ∑i d(Xi){Ai − p1(Xi)}{Ŷ0,i − m0(Xi)} 41
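For completeness, a sketch of the 1:k matching weight (reducing to the pair-matching weight at k = 1); `ps` is the estimated p1(X) and the names are illustrative:

```python
import numpy as np

def matching_weight_k(A, ps, k=1):
    """1:k matching weight W_{M,k} = min{k*p1(X), p0(X)} * (1/k)^A / p_A(X)."""
    p_a = A * ps + (1 - A) * (1 - ps)              # p_A(X)
    return np.minimum(k * ps, 1 - ps) * (1.0 / k) ** A / p_a
```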