A maybe not so short summary …

### Prevalence

Prevalence is the proportion of individuals in the population who have the disease or attribute of interest at a specific time point.

\[\text{Prevalence} = \frac{\text{Number of people with the disease}}{\text{Total number of individuals in the population}}\]

**Prevalence** is very useful in epidemiology, but it is less helpful when studying diseases of *short duration* and is *of limited use for causal inference*.

##### Example:

During 1980 the Framingham Heart Study examined 2,477 subjects for cataracts and found that 310 had them.

\[\mathsf{ \small \text{Prevalence} = \frac{310}{2477} = 0.125, \text{ or } 12.5\%. }\]
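The same figure can be reproduced in base R. The sketch below is only an illustration: the variable names are made up, and the exact binomial confidence interval from `binom.test()` is an addition that is not part of the original example.

```r
# Framingham cataract example: 310 cases among 2,477 subjects examined
cases <- 310
examined <- 2477

# Point prevalence as a proportion
prevalence <- cases / examined
round(prevalence, 3)
## [1] 0.125

# Exact (Clopper-Pearson) 95% confidence interval, expressed in percent.
# This interval is an extra step, not part of the original example.
prev_ci <- as.numeric(binom.test(cases, examined, conf.level = 0.95)$conf.int)
round(100 * prev_ci, 1)
```

The interval gives a sense of how precisely a sample of this size pins down the prevalence.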

### Cumulative Incidence (CI)

Cumulative incidence is the proportion of the population with a new event during a given time period.

\[ {\textstyle \text{CI} = \frac{\text{Number of new cases during the period of interest}}{\text{Number of disease-free individuals at the start of this time period}} } \]

Cumulative incidence, like prevalence, has no units and takes values from 0 to 1; it can also be expressed as a percentage. It can be calculated only if the participants in a study are followed up over time, so it cannot be obtained from a survey, which has no follow-up period. For cumulative incidence, the follow-up period must be the same for all participants, and no new participants can enter the study during the follow-up.

##### Example:

In a study of diabetics, 100 of the 189 diabetic men died during the 13-year follow-up period. Calculate the cumulative incidence (risk).

\[\mathsf{ \small \text{Risk} = \frac{100}{189} \times 100 = 52.9\% \text{ during 13 years of follow-up.} }\]

### Incidence rate (IR)

The incidence rate relates the number of new cases to person-time, which measures the time participants spend in the study while still at risk of the disease.

\[{\textstyle \text{IR} = \frac{\text{Number of new cases during the follow-up period}}{\text{Total person-time by disease-free individuals}}}\]

**Rates** can only be expressed as new cases per unit of person-time.

Incidence rate is a powerful tool to describe the occurrence of a disease in the population. It can be used when cumulative incidence is problematic or cannot be properly defined. Use IR if:

- subjects become **lost to follow-up**, or subjects enter or leave the study population during follow-up
- there are **competing risks**. For example, in a study where the outcome is a cancer diagnosis, someone could be killed in an accident before the end of the follow-up period. This individual would no longer be at risk of cancer, but we don’t know whether they would have developed cancer had they not been killed in the accident.

##### Example:

During a six-month time period, a total of 53 nosocomial infections were recorded by an infection control nurse at a community hospital. During this time, there were 832 patients with a total of 1,290 patient-days. What is the rate of nosocomial infection per 100 patient-days?

\[\mathsf{ \small \text{IR} = \frac{53}{1290} \times 100 = 4.1 \text{ infections per 100 patient-days.} }\]
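To make the idea of person-time more concrete, the sketch below first recomputes the hospital example and then builds a tiny cohort. The cohort data are entirely hypothetical (invented for illustration), as are the variable names. Each participant contributes only the time during which they were observed and still at risk.

```r
# Hospital example: 53 infections over 1,290 patient-days
infections <- 53
patient_days <- 1290
ir_per_100 <- infections / patient_days * 100
round(ir_per_100, 1)
## [1] 4.1

# Hypothetical cohort (made-up data): years of follow-up until the event,
# loss to follow-up, or end of study, whichever came first
cohort <- data.frame(
  id    = 1:5,
  years = c(2.0, 0.5, 3.0, 1.5, 3.0),
  event = c(1, 0, 0, 1, 0)  # 1 = new case, 0 = no event while observed
)

person_years <- sum(cohort$years)   # total time at risk: 10 person-years
new_cases <- sum(cohort$event)      # 2 new cases
cohort_ir <- new_cases / person_years
cohort_ir                           # 0.2 new cases per person-year
```

Participant 2 in this made-up cohort is lost after half a year and still contributes 0.5 person-years to the denominator, which is why the incidence rate remains usable when follow-up times differ.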

### Measures of Association

For causal inference and for associations between variables we use a different set of measures, called **measures of association**. They can be divided into two broad categories, **relative** and **absolute** measures.

#### Relative measures

\[\mathsf{ \small { \text{Risk Ratio (RR)} = \frac{\text{Risk in the exposed}}{\text{Risk in the unexposed}} \\ \text{Incidence rate ratio (IRR)} = \frac{\text{Incidence rate in the exposed}}{\text{Incidence rate in the unexposed}} \\ \text{Odds Ratio (OR)} = \frac{\text{Odds in the exposed}}{\text{Odds in the unexposed}} }}\]

These ratios are dimensionless, so you only need to report the numerical value together with the time point or study period.
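The odds ratio formula relies on the odds, which are not defined above. As a reminder in standard 2×2 notation (the symbols below are introduced here for illustration only): if an event has probability *p*, its odds are *p* / (1 − *p*). Writing *a* and *b* for the exposed with and without the outcome, and *c* and *d* for the unexposed with and without the outcome, the risk ratio and odds ratio become:

\[\mathsf{ \small \text{Odds} = \frac{p}{1 - p}, \qquad \text{RR} = \frac{a / (a + b)}{c / (c + d)}, \qquad \text{OR} = \frac{a / b}{c / d} = \frac{a \times d}{b \times c} }\]

When the outcome is rare, *a* is small relative to *b* and *c* is small relative to *d*, so the odds ratio is close to the risk ratio.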

##### Example for RR calculation:

Of 600 people who had high blood pressure, 35 experienced a stroke within 10 years of follow-up. Among 3,250 people who had low blood pressure, 40 experienced a stroke within the same follow-up period. Calculate the risk ratio of having a stroke among people with high blood pressure compared with those with low blood pressure.

To calculate this, we are going to use the epiR package.

```
# Load the epiR package, which provides epi.2by2()
library(epiR)
# Our data: rows are exposure groups, columns are outcome status
dat <- matrix(c(35, 600 - 35,
                40, 3250 - 40),
              nrow = 2,
              byrow = TRUE)
rownames(dat) <- c("High blood pressure (E+)", "Low blood pressure (E-)")
colnames(dat) <- c("Stroke (D+)", "No stroke (D-)")
# Use method = "cohort.count" for a cohort study with count data
epi.2by2(dat = as.table(dat),     # the epiR documentation describes dat as a table object
         method = "cohort.count", # indicates the study design
         conf.level = 0.95,       # confidence level for the intervals
         units = 100,             # report risks per 100 people
         outcome = "as.columns"   # the outcome is in the columns of the 2x2 table
)
##              Outcome +    Outcome -      Total        Inc risk *
## Exposed +           35          565        600              5.83
## Exposed -           40         3210       3250              1.23
## Total               75         3775       3850              1.95
##                  Odds
## Exposed +      0.0619
## Exposed -      0.0125
## Total          0.0199
##
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Inc risk ratio                               4.74 (3.04, 7.40)
## Odds ratio                                   4.97 (3.13, 7.89)
## Attrib risk *                                4.60 (2.69, 6.52)
## Attrib risk in population *                  0.72 (0.14, 1.30)
## Attrib fraction in exposed (%)               78.90 (67.07, 86.48)
## Attrib fraction in population (%)            36.82 (22.09, 48.76)
## -------------------------------------------------------------------
##  Test that odds ratio = 1: chi2(1) = 56.172 Pr>chi2 = < 0.001
##  Wald confidence limits
##  CI: confidence interval
##  * Outcomes per 100 population units
# According to the output the Risk Ratio = 4.74
```
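As a cross-check, the point estimates can be reproduced by hand in base R. The sketch below uses illustrative variable names and also recomputes the risk-ratio interval with the usual log-scale Wald formula. That this is the exact formula behind the "Wald confidence limits" line in the output is an assumption, although the numbers agree to two decimals.

```r
# Stroke counts from the example above
cases_exp   <- 35; n_exp   <- 600    # high blood pressure (exposed)
cases_unexp <- 40; n_unexp <- 3250   # low blood pressure (unexposed)

# Risk ratio: risk in the exposed divided by risk in the unexposed
risk_exp   <- cases_exp / n_exp
risk_unexp <- cases_unexp / n_unexp
rr <- risk_exp / risk_unexp
round(rr, 2)
## [1] 4.74

# Odds ratio: odds in the exposed divided by odds in the unexposed
odds_exp   <- cases_exp / (n_exp - cases_exp)
odds_unexp <- cases_unexp / (n_unexp - cases_unexp)
odds_ratio <- odds_exp / odds_unexp
round(odds_ratio, 2)
## [1] 4.97

# Wald 95% confidence interval for the risk ratio, built on the log scale
se_log_rr <- sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
rr_ci <- exp(log(rr) + c(-1, 1) * qnorm(0.975) * se_log_rr)
round(rr_ci, 2)
## [1] 3.04 7.40
```

Because stroke is fairly rare in both groups, the odds ratio (4.97) is close to the risk ratio (4.74).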

##### Examples for IRR calculation:

A cohort study is conducted to determine whether hormone replacement therapy is associated with an increased risk of Coronary Artery Disease (CAD) in adults over the age of 40. The study found that the frequency of CAD amongst those using hormone replacement therapy was 27 per 1,000 person-years. The study also found that the frequency of CAD amongst those not using hormone replacement therapy was 3 per 1,000 person-years. What is the incidence rate ratio?

```
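library(epiR)  # provides epi.2by2()
# The coronary artery disease data: cases and person-years per exposure group
dat <- matrix(c(27, 1000,
                 3, 1000),
              nrow = 2,
              byrow = TRUE)
rownames(dat) <- c("Hormone therapy (E+)", "No hormone therapy (E-)")
colnames(dat) <- c("CAD+", "Person-years")
dat
##                         CAD+ Person-years
## Hormone therapy (E+)      27         1000
## No hormone therapy (E-)    3         1000
# Choose method = "cohort.time" for a cohort study with person-time data
epi.2by2(dat = as.table(dat),   # the epiR documentation describes dat as a table object
         method = "cohort.time",
         conf.level = 0.95,
         units = 100,           # report rates per 100 person-years
         outcome = "as.columns"
)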
##              Outcome +    Time at risk        Inc rate *
## Exposed +           27            1000               2.7
## Exposed -            3            1000               0.3
## Total               30            2000               1.5
##
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Inc rate ratio                               9.00 (2.77, 46.35)
## Attrib rate *                                2.40 (1.33, 3.47)
## Attrib rate in population *                  1.20 (0.56, 1.84)
## Attrib fraction in exposed (%)               88.89 (63.89, 97.84)
## Attrib fraction in population (%)            80.00 (59.06, 93.89)
## -------------------------------------------------------------------
##  Wald confidence limits
##  CI: confidence interval
##  * Outcomes per 100 units of population time at risk
# According to the output the incidence rate ratio = 9
```
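Because both rates are expressed per 1,000 person-years (27 with hormone therapy, 3 without), the point estimate can also be checked by hand. The person-time denominators cancel:

\[\mathsf{ \small \text{IRR} = \frac{27 / 1000}{3 / 1000} = \frac{27}{3} = 9. }\]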

##### Example for OR calculation:

In the study [1], 186 of the 263 adolescents previously judged as having experienced a suicidal behaviour requiring immediate psychiatric consultation did not exhibit suicidal behaviour (non-suicidal, NS) at six months follow-up. Of this group, 86 young people had been assessed as having depression at baseline. Of the 77 young people with persistent suicidal behaviour at follow-up (suicidal behaviour, SB), 45 had been assessed as having depression at baseline. Calculate the odds ratio of persistent suicidal behaviour among those with depression at baseline compared with those without.

```
library(epiR)  # provides epi.2by2()
# Suicidal behaviour data: rows are the exposure (depression at baseline),
# columns are the outcome (suicidal behaviour at follow-up)
dat <- matrix(c(45,      86,       45 + 86,
                77 - 45, 186 - 86, 77 - 45 + 186 - 86,
                77,      186,      77 + 186),
              nrow = 3,
              byrow = TRUE)
colnames(dat) <- c("Suicidal behaviour (D+)", "Non-suicidal (D-)", "Total")
rownames(dat) <- c("Depression (E+)", "No depression (E-)", "Total")
dat
##                    Suicidal behaviour (D+) Non-suicidal (D-) Total
## Depression (E+)                         45                86   131
## No depression (E-)                      32               100   132
## Total                                   77               186   263
# use method = "case.control"
epi.2by2(dat = as.table(dat[1:2, 1:2]), # drop the Total row and column
         method = "case.control",        # indicates the study design
         conf.level = 0.95,              # confidence level for the intervals
         units = 100,
         outcome = "as.columns"          # the outcome is in the columns of the 2x2 table
)
##              Outcome +    Outcome -      Total        Prevalence *
## Exposed +           45           86        131                34.4
## Exposed -           32          100        132                24.2
## Total               77          186        263                29.3
##                  Odds
## Exposed +       0.523
## Exposed -       0.320
## Total           0.414
##
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Odds ratio (W)                               1.64 (0.96, 2.80)
## Attrib prevalence *                          10.11 (-0.83, 21.04)
## Attrib prevalence in population *            5.04 (-4.11, 14.18)
## Attrib fraction (est) in exposed (%)         38.73 (-8.22, 65.60)
## Attrib fraction (est) in population (%)      22.70 (-3.98, 42.54)
## -------------------------------------------------------------------
##  Test that odds ratio = 1: chi2(1) = 3.245 Pr>chi2 = 0.072
##  Wald confidence limits
##  CI: confidence interval
##  * Outcomes per 100 population units
# According to the output the Odds Ratio = 1.64
```
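The odds ratio can be verified by hand from the four cell counts, using the odds reported in the output (45/86 for those with depression, 32/100 for those without):

\[\mathsf{ \small \text{OR} = \frac{45 / 86}{32 / 100} = \frac{45 \times 100}{86 \times 32} = \frac{4500}{2752} \approx 1.64. }\]

The 95% confidence interval in the output (0.96, 2.80) includes 1, so these data are also compatible with no association.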

#### Absolute measures

Here we have:

- Risk difference
- Incidence rate difference

\[\mathsf{ \small { \text{Risk difference} = \text{Risk among the exposed} - \text{Risk among the unexposed} \\ \text{Incidence rate difference} = \text{Incidence rate among the exposed} - \text{Incidence rate among the unexposed} }}\]

For instance, RD = 0.2 means

- there was a 0.2 or 20% excess risk in those exposed compared to those unexposed over the study period.
- there were 20 more cases per 100 people in those exposed compared to those unexposed over the study period.
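Both absolute measures can be computed directly from the numbers in the stroke and CAD examples. The following is a minimal base R sketch with illustrative variable names; the results are scaled to "per 100" so they can be compared with the `Attrib risk *` and `Attrib rate *` lines of the epiR output, which appear to report the same quantities.

```r
# Risk difference: stroke example (10 years of follow-up)
risk_exp   <- 35 / 600     # high blood pressure
risk_unexp <- 40 / 3250    # low blood pressure
rd <- risk_exp - risk_unexp
round(100 * rd, 2)         # excess strokes per 100 people over 10 years
## [1] 4.6

# Incidence rate difference: CAD example
rate_exp   <- 27 / 1000    # hormone therapy, cases per person-year
rate_unexp <- 3 / 1000     # no hormone therapy, cases per person-year
ird <- rate_exp - rate_unexp
round(100 * ird, 2)        # excess cases per 100 person-years
## [1] 2.4
```

Unlike the ratios, these differences keep their units: the risk difference refers to a stated follow-up period, and the rate difference is expressed per unit of person-time.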

### Reproducibility

```
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 3.5.3 (2019-03-11)
## os macOS Mojave 10.14.6
## system x86_64, darwin15.6.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Europe/Stockholm
## date 2019-11-10
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date lib source
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.5.2)
## backports 1.1.5 2019-10-02 [1] CRAN (R 3.5.2)
## BiasedUrn 1.07 2015-12-28 [1] CRAN (R 3.5.0)
## bibtex 0.4.2 2017-06-30 [1] CRAN (R 3.5.0)
## blogdown 0.16 2019-10-01 [1] CRAN (R 3.5.2)
## bookdown 0.14 2019-10-01 [1] CRAN (R 3.5.2)
## callr 3.3.2 2019-09-22 [1] CRAN (R 3.5.2)
## cli 1.1.0 2019-03-19 [1] CRAN (R 3.5.2)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 3.5.0)
## desc 1.2.0 2018-05-01 [1] CRAN (R 3.5.0)
## devtools * 2.2.1 2019-09-24 [1] CRAN (R 3.5.2)
## digest 0.6.21 2019-09-20 [1] CRAN (R 3.5.2)
## ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.5.2)
## epiR * 1.0-4 2019-08-23 [1] CRAN (R 3.5.2)
## evaluate 0.14 2019-05-28 [1] CRAN (R 3.5.2)
## fs 1.3.1 2019-05-06 [1] CRAN (R 3.5.2)
## glue 1.3.1.9000 2019-10-12 [1] Github (tidyverse/glue@71eeddf)
## htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.5.2)
## httr 1.4.1 2019-08-05 [1] CRAN (R 3.5.2)
## jsonlite 1.6 2018-12-07 [1] CRAN (R 3.5.0)
## knitcitations * 1.0.10 2019-09-15 [1] CRAN (R 3.5.2)
## knitr 1.25 2019-09-18 [1] CRAN (R 3.5.2)
## lattice 0.20-38 2018-11-04 [1] CRAN (R 3.5.3)
## lubridate 1.7.4 2018-04-11 [1] CRAN (R 3.5.0)
## magrittr 1.5 2014-11-22 [1] CRAN (R 3.5.0)
## Matrix 1.2-17 2019-03-22 [1] CRAN (R 3.5.2)
## memoise 1.1.0 2017-04-21 [1] CRAN (R 3.5.0)
## pkgbuild 1.0.6 2019-10-09 [1] CRAN (R 3.5.2)
## pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.5.0)
## plyr 1.8.4 2016-06-08 [1] CRAN (R 3.5.0)
## prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.5.0)
## processx 3.4.1 2019-07-18 [1] CRAN (R 3.5.2)
## ps 1.3.0 2018-12-21 [1] CRAN (R 3.5.0)
## R6 2.4.0 2019-02-14 [1] CRAN (R 3.5.2)
## Rcpp 1.0.2 2019-07-25 [1] CRAN (R 3.5.2)
## RefManageR 1.2.12 2019-04-03 [1] CRAN (R 3.5.2)
## remotes 2.1.0 2019-06-24 [1] CRAN (R 3.5.2)
## rlang 0.4.0 2019-06-25 [1] CRAN (R 3.5.2)
## rmarkdown 1.16 2019-10-01 [1] CRAN (R 3.5.2)
## rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.5.0)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.5.0)
## stringi 1.4.3 2019-03-12 [1] CRAN (R 3.5.2)
## stringr 1.4.0 2019-02-10 [1] CRAN (R 3.5.2)
## survival * 2.44-1.1 2019-04-01 [1] CRAN (R 3.5.2)
## testthat 2.2.1 2019-07-25 [1] CRAN (R 3.5.2)
## usethis * 1.5.1 2019-07-04 [1] CRAN (R 3.5.2)
## withr 2.1.2 2018-03-15 [1] CRAN (R 3.5.0)
## xfun 0.10 2019-10-01 [1] CRAN (R 3.5.2)
## xml2 1.2.2 2019-08-09 [1] CRAN (R 3.5.2)
## yaml 2.2.0 2018-07-25 [1] CRAN (R 3.5.0)
##
## [1] /Library/Frameworks/R.framework/Versions/3.5/Resources/library
```

### References

[1] This example is taken from the article "Explaining Odds Ratios".