
Reflection on modern methods: Risk Ratio regression - simple concept yet complex computation

Murthy N Mittinty1,2* John Lynch1,2,3

1School of Public Health, The University of Adelaide, Adelaide, Australia.

2Robinson Research Institute, University of Adelaide, Adelaide, Australia.

3Population Health Sciences, University of Bristol, Bristol, UK.

*Corresponding author. School of Public Health, University of Adelaide, Adelaide 5005, South Australia. E-mail: murthy.mittinty@adelaide.edu.au

Word count:

Unstructured abstract: 150 words

Main body: 2469 words, excluding abstract, keywords, figure captions and references.

Abstract

The risk ratio (RR) is the ratio of the risk of the outcome among the exposed to the risk of the outcome among the unexposed. This is a simple concept, which makes one wonder why it has not gained the same popularity as the odds ratio. Using logistic regression to estimate the odds ratio is common in epidemiology, as is interpreting the odds ratio as a risk ratio under the assumption that the outcome is rare. On the one hand, estimating the odds ratio is simple but interpreting it is hard. On the other, estimating the risk ratio is challenging but its interpretation is straightforward. Issues with estimating the risk ratio remain after four decades. These include convergence of the fitting algorithm and the choice of regression specification (e.g. log-binomial, Poisson), among others. Various new computational methods are available that help overcome the convergence problem and provide doubly robust estimates of the RR.

Keywords: Relative risk, regression, generalized linear models, epidemiology.

Key Message

• Estimating RR using a simple cross tabulation is easy. However, when it comes to estimating RR using regression, there is no single agreed model.

• Use of log-binomial models with continuous covariates may lead to convergence issues.

• Computational methods such as combinatorial expectation maximisation allow convergence of generalised linear models using the binomial family and log link function. However, specification of starting values can be difficult.

• The binary regression method, which allows direct modelling of risk ratios, may be a better choice.

Introduction

Relative risk is a common term used in epidemiology to refer to risk and rate ratios.1 The concept of the risk ratio (RR), when first introduced to students, is taught using a simple $2 \times 2$ table and a hand calculator. The $2 \times 2$ table is created by cross-tabulating a binary exposure and a binary outcome. Using the information from this cross tabulation, RR is estimated as the ratio of the risk of the outcome among the exposed to the risk of the outcome among the unexposed. For example, suppose the outcome is low birthweight (Yes = 1, No = 0) and the exposure is maternal smoking during pregnancy (Yes = 1, No = 0). The risk ratio in this example is the ratio of the proportion of low birthweight children among smokers to the proportion of low birthweight children among non-smokers. In this form it is simple and easy to calculate.
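
As a minimal sketch of the hand calculation described above, the following Python snippet computes RR from the counts of a cross tabulation. The function name `risk_ratio` and the counts are hypothetical, chosen only for illustration; they are not data from any study.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of the outcome among the exposed divided by risk among the unexposed."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group totals must be positive")
    if unexposed_cases == 0:
        raise ValueError("RR is undefined when the risk among the unexposed is zero")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical counts (illustration only):
# 30 low-birthweight births among 200 smokers, 40 among 800 non-smokers.
rr = risk_ratio(30, 200, 40, 800)
print(f"RR = {rr:.2f}")
```

With these made-up counts the risks are 0.15 and 0.05, so the estimated RR is about 3.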

Let’s consider adjusting for one binary confounder, such as gender; in this case, RR can be estimated within each stratum of gender. Now suppose there is a long list of confounders, which includes age, education, income, pregnancy-related factors and others. To estimate RR in this case, one may need to use regression. Regression methods for estimating RR gained popularity when they became available in standard commercial and non-commercial statistical software. Even with this availability, estimation is still not free from problems, which have concerned researchers since the 1980s.2 Other methods, such as logistic regression, gained immense popularity and have become essential tools in epidemiology because of their computational ease and because the odds ratio (OR) can approximate the RR when events are rare. Evidence suggests that logistic regression is used to estimate the OR, which is then commonly interpreted as an RR.3 However, the OR overestimates the RR whenever the RR is greater than 1, and hence should not be interpreted as an RR.4,5,6
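
To illustrate the last point, the sketch below holds RR fixed at 2 and computes the corresponding OR for increasingly common outcomes. The helper names and the baseline risks are illustrative assumptions; the calculation uses only the definitions of risk and odds.

```python
def odds(p):
    """Convert a probability (risk) into odds."""
    return p / (1 - p)


def odds_ratio_for(rr, baseline_risk):
    """OR implied by a given RR and a given risk among the unexposed."""
    risk_exposed = rr * baseline_risk
    if not (0 < baseline_risk < 1 and 0 < risk_exposed < 1):
        raise ValueError("both risks must lie strictly between 0 and 1")
    return odds(risk_exposed) / odds(baseline_risk)


# RR is fixed at 2; only the baseline risk changes (values chosen for illustration).
for baseline_risk in (0.001, 0.01, 0.1, 0.3):
    implied_or = odds_ratio_for(2.0, baseline_risk)
    print(f"baseline risk {baseline_risk}: RR = 2.0, OR = {implied_or:.3f}")
```

In this sketch the OR stays close to 2 while the outcome is rare and moves further above 2 as the baseline risk grows, reaching 3.5 at a baseline risk of 0.3.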

If logistic regression is used to estimate RR under the rare disease assumption, then one must note that this assumes that the conditional probability of the outcome given the unexposed state (the baseline prevalence, $p_0 = P(\text{outcome} = 1 \mid \text{exposure} = 0)$) approaches zero (as shown in web supplement S1). Moreover, as suggested by the reviewer, the relation between OR and RR can be derived as shown in S1. Using this derivation, if we assume $1 < RR \le RR_{\max}$ and $RR_{\max}\, p_0 < 1$, we have $1 < OR/RR = (1 - p_0)/(1 - RR\, p_0) \le (1 - p_0)/(1 - RR_{\max}\, p_0)$. Thus, if $RR_{\max}$ is less than or equal to 10 and the baseline prevalence is 1 in 1000, then the relative error of OR as an approximation to RR is at most about 1%. With a prevalence of 1 in 10000 it is at most about 0.1%. When the prevalence is very small but not zero, the approximation errors are small enough to be practically negligible. We assume that RR is greater than 1 but no larger than some maximum plausible value $RR_{\max}$.
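
A minimal sketch of how this approximation error can be checked numerically, assuming the standard identity OR/RR = (1 - p0) / (1 - RR * p0), where p0 is the baseline prevalence. The function name `max_relative_error` and the prevalences looped over are illustrative choices, not taken from supplement S1.

```python
def max_relative_error(rr_max, baseline_prevalence):
    """Largest value of OR/RR - 1 when 1 < RR <= rr_max.

    Uses OR/RR = (1 - p0) / (1 - RR * p0), which increases with RR,
    so the worst case is reached at RR = rr_max.
    """
    if not 0 < baseline_prevalence < 1:
        raise ValueError("baseline prevalence must lie strictly between 0 and 1")
    if rr_max * baseline_prevalence >= 1:
        raise ValueError("rr_max * baseline_prevalence must be below 1")
    return (1 - baseline_prevalence) / (1 - rr_max * baseline_prevalence) - 1


# Illustrative prevalences with a maximum plausible RR of 10.
for p0 in (1 / 100, 1 / 1000, 1 / 10000):
    error = max_relative_error(10, p0)
    print(f"p0 = {p0:.4f}: OR exceeds RR by at most {100 * error:.2f}%")
```

Because the bound is roughly RR_max times p0 when p0 is small, dividing the baseline prevalence by ten divides the worst-case error by about ten.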

Alternatively, let’s examine this using a simple $2 \times 2$ table with four cells. Let these cells be labelled a, b, c and d, where ‘a’ is the count when the outcome is 1 and the exposure is 1, ‘b’ is the count when the exposure is 1 and the outcome is 0, ‘c’ is the count when the outcome is 1 and the exposure is 0, and ‘d’ is the count when both outcome and exposure are 0. To estimate RR, we use the formula $RR = \frac{a/(a+b)}{c/(c+d)}$. If we rearrange the terms, we estimate the RR as $RR = \frac{a(c+d)}{c(a+b)}$, whereas the OR is estimated as $OR = \frac{ad}{bc}$. Again, from these formulae, one may note that RR does not equal (or even approximate) OR without some assumptions. One common assumption is that the outcome is rare in both exposure groups (if exposure is the only variable; otherwise, the outcome of interest must be rare at all levels of the covariates). Furthermore, let’s substitute some numbers for a, b, c and d, say a = 1, b = 5, c = 1 and d = 11. In this case, the estimate of RR using the above formula is 12/6 = 2, which is the ratio of the marginal totals of the exposure when $a = 1$ and $c = 1$. Now, if we estimate the OR (= 2.2), as shown in the supplement S2 the OR equates to the ratio of not having the outcome when the exposure is absent versus not having the
