Reflection on modern methods: Risk Ratio regression - simple concept yet
complex computation
Murthy N Mittinty1,2* John Lynch1,2,3
1School of Public Health, The University of Adelaide, Adelaide, Australia.
2Robinson Research Institute, University of Adelaide, Adelaide, Australia.
3Population Health Sciences, University of Bristol, Bristol, UK.
*Corresponding author. School of Public Health, University of Adelaide, Adelaide 5005,
South Australia. E-mail: murthy.mittinty@adelaide.edu.au
Word count:
Unstructured abstract: 150 words
Main body: 2469 words, excluding abstract, keywords, figure captions and references.
Abstract
The Risk Ratio (RR) is the ratio of the risk of the outcome among the exposed to the risk of the
outcome among the unexposed. This is a simple concept, which makes one wonder why it has not
gained the same popularity as the odds ratio. Using logistic regression to estimate the odds
ratio is quite common in epidemiology and interpreting the odds ratio as a risk ratio, under
the assumption that the outcome is rare, is also common. On one hand, estimating the odds
ratio is simple but interpreting it is hard. On the other, estimating the risk ratio is challenging
but its interpretation is straightforward. Issues with estimating the risk ratio still remain after
four decades. These issues include convergence of the algorithm, the choice of regression
specification (e.g. log-binomial, Poisson) and many more. Various new computational
methods are available that help overcome the issue of convergence and provide doubly robust
estimates of RR.
Keywords: Relative risk, regression, generalized linear models, epidemiology.
Key Message
• Estimating RR using a simple cross tabulation is easy. However, when it comes to estimating RR using regression, there is no one particular model.
• Use of log-binomial models with continuous covariates may lead to convergence issues.
• Computational methods such as combinatorial expectation maximisation allow convergence of generalised linear models using the binomial family and log link function. However, specification of starting values can be difficult.
• The binary regression method, which allows direct modelling of risk ratios, may be a better choice.
Introduction
Relative risk is a common term used in epidemiology to refer to risk and rate ratios.1 The
concept of the risk ratio (RR), when first introduced to students, is taught using a simple
table and a hand calculator. The table is created using a simple cross tabulation of a
binary exposure and a binary outcome. Using the information from this cross tabulation, RR
is estimated as the ratio of the risk of the outcome among the exposed to the risk of the
outcome among the unexposed. For example, let’s say the outcome is low birthweight (Yes =
1 or No = 0) and the exposure is maternal smoking during pregnancy (Yes = 1 or No = 0).
The risk ratio, in this example, is the ratio of the proportion of low birthweight children among
smokers to the proportion of low birthweight children among non-smokers. In this form it is
simple and easy to calculate.
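The hand calculation described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `risk_ratio`, its argument names and the counts are hypothetical and are not taken from any study or from the original text.

```python
def risk_ratio(exposed_cases: int, exposed_noncases: int,
               unexposed_cases: int, unexposed_noncases: int) -> float:
    """Risk ratio from the four cells of a 2x2 cross tabulation."""
    # Risk = proportion with the outcome within each exposure group.
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


# Hypothetical counts, for illustration only:
# 30 of 200 smokers and 40 of 800 non-smokers have a low birthweight child.
rr = risk_ratio(exposed_cases=30, exposed_noncases=170,
                unexposed_cases=40, unexposed_noncases=760)
print(f"RR = {rr:.2f}")
```

With these made-up counts the risks are 0.15 and 0.05, so the risk among the exposed is three times that among the unexposed.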
Let’s consider adjusting for one binary confounder, such as gender; in this case, RR can
be estimated within each stratum of gender. Now suppose there is a long list of confounders
which includes age, education, income, pregnancy related factors and others. To estimate RR
in this case, one may need to use regression. Use of regression methods for estimating RR
gained popularity when they became available in regular commercial and non-commercial
statistical software. Even with this availability, RR regression is still not free from problems,
which have concerned researchers since the 1980s.2 Other methods such as logistic regression gained
immense popularity and have become essential tools in epidemiology due to their
computational ease, and because the odds ratio (OR) can approximate the RR in the case of rare
events. Evidence suggests that logistic regression is used to estimate the OR, which is then commonly
interpreted as RR.3 However, OR overestimates RR whenever RR is greater than 1, and
hence should not be interpreted as RR.4,5,6
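The direction of this bias follows directly from the definitions of the two measures. As a short sketch, write $p_1$ and $p_0$ for the risk of the outcome among the exposed and the unexposed respectively (this notation is introduced here for illustration and is not from the original), so that $RR = p_1/p_0$:

$$
OR = \frac{p_1/(1-p_1)}{p_0/(1-p_0)} = \frac{p_1}{p_0}\cdot\frac{1-p_0}{1-p_1} = RR\cdot\frac{1-p_0}{1-p_1}.
$$

When $RR > 1$ we have $p_1 > p_0$, so the factor $(1-p_0)/(1-p_1)$ exceeds 1 and $OR > RR$. The factor is close to 1 only when both risks are small.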
If logistic regression is used to estimate RR under the rare disease assumption, then one must
note that this assumes that the conditional probability of having the outcome given the
unexposed state (baseline prevalence, $p_0 = P(Y = 1 \mid X = 0)$) approaches zero (as shown in
web supplement S1). Moreover, as suggested by the reviewer, the relation between OR and RR
can be derived as shown in S1. Using this derivation, if we assume $RR \le RR_{\max}$
and $RR_{\max}\,p_0 < 1$, we have
$OR/RR \le (1 - p_0)/(1 - RR_{\max}\,p_0)$. Thus, if $RR_{\max}$ is less than or
equal to 10 and the baseline prevalence is 1 in 1000, then the relative error of the OR with respect to the RR is about 1%.
With a prevalence of 1 in 10 000 it is about 0.1%. When the prevalence is very small but not zero,
the approximation errors are small enough to be practically negligible. Here we assume that RR > 1
but does not exceed some maximum plausible value $RR_{\max}$.
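The size of this approximation error can be checked numerically. The sketch below assumes the standard identity OR/RR = (1 - p0)/(1 - RR·p0), which follows from the definitions of the two measures. The function name `or_rr_ratio` and the symbol `p0` (the baseline risk among the unexposed) are introduced here for illustration and are not from the original.

```python
def or_rr_ratio(rr: float, p0: float) -> float:
    """Return OR/RR for a given risk ratio and baseline risk p0.

    Assumes p1 = rr * p0 is the risk among the exposed, so that
    OR/RR = (1 - p0) / (1 - p1).
    """
    p1 = rr * p0
    if not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("both p0 and rr * p0 must lie strictly between 0 and 1")
    return (1 - p0) / (1 - p1)


RR_MAX = 10.0  # maximum plausible risk ratio, as in the text
for p0 in (1e-3, 1e-4):
    relative_error = or_rr_ratio(RR_MAX, p0) - 1
    print(f"baseline risk {p0:g}: OR exceeds RR by {100 * relative_error:.2f}%")
```

Because the ratio grows with RR for a fixed baseline risk, evaluating it at the largest plausible RR gives an upper bound on the error for any smaller RR above 1.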
Alternatively, let’s examine this using a simple table with four cells. Let these cells be
labelled a, b, c and d, where a is the count when the outcome is 1 and the exposure is 1,
b is the count when the exposure is 1 and the outcome is 0, c is the count when the outcome is
1 and the exposure is 0, and d is the count when both outcome and exposure are 0. Now, to estimate
RR, we use the formula $RR = \frac{a/(a+b)}{c/(c+d)}$. If we rearrange the terms, we estimate the RR as
$\frac{a(c+d)}{c(a+b)}$, whereas the OR is estimated as $\frac{ad}{bc}$. Again, from these formulae, one may note that RR
does not equal (or even approximate) OR without some assumptions. One common
assumption can be that the outcome is rare in both groups of the exposure (if exposure is
the only variable; otherwise, the outcome of interest must be rare for all levels of the
covariates). Furthermore, let’s put some numbers in place of a, b, c, d, say a = 1, b = 5, c = 1
and d = 11. In this case, the estimate of RR using the above formula equates to 12/6 = 2,
which is the ratio of the marginal totals of the exposure when $a = 1$ and $c = 1$.
Now, if we estimate the OR (= 2.2), as shown in supplement S2, the OR equates to
the ratio of not having the outcome when the exposure is absent versus not having the