Abstract
Record linkage databases are increasingly available and widely used in pharmacoepidemiological, pharmacoeconomic and outcome studies, where the main concern is the relationship between a drug exposure or intervention and an outcome. Sometimes the linkage between outcome data and exposure data is missing, so that only a proportion of patients in the outcome database can be linked to other databases. This paper proposes maximum likelihood (ML) and generalized estimating equation (GEE) procedures to obtain consistent estimates of the parameters in the model relating the outcome to risk factors. Asymptotic variances of the estimates are derived for the situation where the missing rate is estimated from the same dataset. We show that using the estimated missing rate, rather than the known missing rate, may result in more accurate estimates of the parameters. A confidence interval for the predicted occurrence rate when the missing rate is estimated is also derived. Simulations for different scenarios were performed to explore the small-sample behaviour of the ML procedure using the estimated missing rate. The results confirmed that, for large sample sizes, using the estimated missing rate is more efficient than using the true one; however, this may not hold for small samples. The ML procedure was applied to an analysis of coronary artery bypass operations in patients with acute coronary syndrome.
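The abstract does not state the model, so the following is only an illustrative sketch under assumptions that are not taken from the paper. It assumes a logistic model for the true outcome with one covariate, a known missing rate, and that an event with a missing link is recorded as a non-event, so that P(observed event | x) = (1 - missing_rate) * expit(b0 + b1 * x). The function name `fit_thinned_logistic`, the simulated data and all parameter values are hypothetical. The fit uses Fisher scoring, and a naive fit that ignores missing linkage (missing rate set to 0) is included for comparison.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_thinned_logistic(xs, ys, missing_rate, n_iter=50, tol=1e-10):
    """ML fit of the ASSUMED model (not taken from the paper):
    P(y_obs = 1 | x) = (1 - missing_rate) * expit(b0 + b1 * x),
    using Fisher scoring on (b0, b1) with missing_rate treated as known.
    """
    p = 1.0 - missing_rate  # probability that an event is linked
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, y in zip(xs, ys):
            mu = expit(b0 + b1 * x)   # true event probability
            q = p * mu                # observed (linked) event probability
            u = (y - q) * (1.0 - mu) / (1.0 - q)      # score contribution
            w = p * mu * (1.0 - mu) ** 2 / (1.0 - q)  # expected information
            s0 += u
            s1 += u * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        # Solve the 2x2 system: information * delta = score
        det = i00 * i11 - i01 * i01
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical simulation: true coefficients and missing rate are made up.
random.seed(0)
TRUE_B0, TRUE_B1, MISSING_RATE = -1.0, 0.5, 0.3
xs, ys = [], []
for _ in range(10000):
    x = random.gauss(0.0, 1.0)
    event = random.random() < expit(TRUE_B0 + TRUE_B1 * x)
    linked = random.random() >= MISSING_RATE
    xs.append(x)
    ys.append(1 if (event and linked) else 0)

# Fit that accounts for missing linkage, and a naive fit that ignores it.
b0_hat, b1_hat = fit_thinned_logistic(xs, ys, MISSING_RATE)
naive_b0, naive_b1 = fit_thinned_logistic(xs, ys, 0.0)
print("adjusted:", b0_hat, b1_hat)
print("naive:   ", naive_b0, naive_b1)
```

In this toy setting the naive intercept should fall below the adjusted one, because unlinked events are counted as non-events. The sketch treats the missing rate as known; estimating it from the same dataset, as the abstract describes, would also change the variance calculation and is not attempted here.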
Original language | English |
---|---|
Pages (from-to) | 873-884 |
Number of pages | 12 |
Journal | Journal of Applied Statistics |
Volume | 29 |
Issue number | 6 |
Publication status | Published - 2002 |
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty