Package 'EpiStats' reference manual

Title:	Tools for Epidemiologists
Description:	Provides a set of functions aimed at epidemiologists. The package includes commands for measures of association and impact for case control studies and cohort studies. It may be particularly useful for outbreak investigations, including univariable analysis and stratified analysis. The functions for cohort studies include the CS(), CSTable() and CSInter() commands. The functions for case control studies include the CC(), CCTable() and CCInter() commands. References: Cornfield, J. 1956. A statistical problem arising from retrospective studies. In Vol. 4 of Proceedings of the Third Berkeley Symposium, ed. J. Neyman, 135-148. Berkeley, CA: University of California Press. Woolf, B. 1955. On estimating the relation between blood group and disease. Annals of Human Genetics 19: 251-253. Reprinted in Evolution of Epidemiologic Ideas: Annotated Readings on Concepts and Methods, ed. S. Greenland, pp. 108-110. Newton Lower Falls, MA: Epidemiology Resources. Gilles Desve & Peter Makary, 2007. 'CSTABLE: Stata module to calculate summary table for cohort study', Statistical Software Components S456879, Boston College Department of Economics. Gilles Desve & Peter Makary, 2007. 'CCTABLE: Stata module to calculate summary table for case-control study', Statistical Software Components S456878, Boston College Department of Economics.
Authors:	Jean Pierre Decorps [aut], Esther Kissling [ctb], Lore Merdrignac [cre]
Maintainer:	Lore Merdrignac <[email protected]>
License:	LGPL-3
Version:	1.6-2
Built:	2025-02-17 06:17:38 UTC
Source:	https://github.com/cran/EpiStats

Univariate analysis of case control studies

Description

CC is used with case-control studies to determine the association between an exposure and an outcome. Note that all variables need to be numeric and binary and coded as "0" and "1". Point estimates and confidence intervals for the odds ratio are calculated, along with attributable or prevented fractions for the exposed and total population.

To also display the p-value of Fisher's exact test, specify exact = TRUE.

If you specify full = TRUE you can easily access useful statistics from the output tables.

Usage

CC(data, cases, exposure, exact = FALSE, full = FALSE, title = "CC")

Arguments

`data`	data.frame
`cases`	character - Case variable
`exposure`	character - Exposure variable
`exact`	boolean - TRUE if you would like to display Fisher's exact p-value
`full`	boolean - TRUE if you need to display useful statistics and values for formatting
`title`	character - title of tables

Value

list:

`df1`	data.frame - two by two table
`df2`	data.frame - statistics
`df1.align`	character - alignment for kable/xtable
`df2.align`	character - alignment for kable/xtable
`df1.digits`	integer vector - digit number displayed for kable/xtable
`df2.digits`	integer vector - digit number displayed for kable/xtable
`st`	list - individual statistics

The item st returns the odds ratio and its 95 percent confidence intervals, the attributable fraction among the exposed and its 95 percent confidence intervals, the attributable fraction among the population and its 95 percent confidence intervals, the Chi square value, the Chi square p-value and the Fisher's exact test p-value.
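The quantities listed above can be illustrated with a small base R calculation. This is only a sketch under stated assumptions, not the code used by CC: the counts are hypothetical, the confidence interval uses the Woolf (log-based) approximation, and the attributable fractions use the usual case-control formulas, so the figures CC reports for real data may be computed differently.

```r
# Hypothetical case-control counts (not taken from the Tiramisu dataset).
tab <- matrix(c(40, 20,
                10, 30),
              nrow = 2, byrow = TRUE,
              dimnames = list(exposure = c("exposed", "unexposed"),
                              status   = c("cases", "controls")))

exp_cases   <- tab["exposed", "cases"]
exp_ctrls   <- tab["exposed", "controls"]
unexp_cases <- tab["unexposed", "cases"]
unexp_ctrls <- tab["unexposed", "controls"]

# Odds ratio and a Woolf (log-based) 95 percent confidence interval.
odds_ratio <- (exp_cases * unexp_ctrls) / (exp_ctrls * unexp_cases)
se_log_or  <- sqrt(1 / exp_cases + 1 / exp_ctrls +
                   1 / unexp_cases + 1 / unexp_ctrls)
or_ci      <- exp(log(odds_ratio) + c(-1, 1) * qnorm(0.975) * se_log_or)

# Attributable fraction among the exposed, and among the population
# (the latter weights by the proportion of cases that were exposed).
af_exposed    <- (odds_ratio - 1) / odds_ratio
af_population <- af_exposed * exp_cases / (exp_cases + unexp_cases)

# Chi square test (no continuity correction) and Fisher's exact test.
chi2_test   <- chisq.test(tab, correct = FALSE)
fisher_test <- fisher.test(tab)

print(c(OR = odds_ratio, lower = or_ci[1], upper = or_ci[2]))
print(c(AF_exposed = af_exposed, AF_population = af_population))
print(c(chi2 = unname(chi2_test$statistic),
        p_chi2 = chi2_test$p.value,
        p_fisher = fisher_test$p.value))
```

The attributable fractions are only meaningful when the odds ratio is above 1; for a protective exposure the prevented fraction would be reported instead.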

Note

You can use the lowercase command "cc" in place of "CC"

Please note also that when the outcome is frequent the odds ratio will overestimate the risk ratio (if OR>1) or underestimate the risk ratio (if OR<1). If the outcome is rare, the risk ratio and the odds ratio are similar.

In a case control study, the attributable fraction among the exposed and among the population assume that the OR approximates the risk ratio.

Please interpret all measures with caution.
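The relationship between the odds ratio and the risk ratio described in this note can be checked numerically. The sketch below uses hypothetical counts and a helper function, rr_and_or, that is not part of the package.

```r
# Hypothetical cohort counts (not from the Tiramisu dataset).
# exp_cases / exp_noncases: exposed subjects who did / did not become cases
# unexp_cases / unexp_noncases: the same for unexposed subjects
rr_and_or <- function(exp_cases, exp_noncases, unexp_cases, unexp_noncases) {
  risk_exposed   <- exp_cases / (exp_cases + exp_noncases)
  risk_unexposed <- unexp_cases / (unexp_cases + unexp_noncases)
  c(RR = risk_exposed / risk_unexposed,
    OR = (exp_cases * unexp_noncases) / (exp_noncases * unexp_cases))
}

# Frequent outcome: 60% of the exposed and 30% of the unexposed become cases.
frequent <- rr_and_or(60, 40, 30, 70)

# Rare outcome: 0.6% of the exposed and 0.3% of the unexposed become cases.
rare <- rr_and_or(6, 994, 3, 997)

print(round(frequent, 2))
print(round(rare, 2))
```

In both scenarios the risk ratio is 2, but the odds ratio is well above 2 when the outcome is frequent and very close to 2 when it is rare.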

Author(s)

[email protected]

References

Stata 13: cc https://www.stata.com/manuals13/stepitab.pdf

Examples

library(EpiStats)

# Dataset by Anja Hauri, RKI.
data(Tiramisu)
DF <- Tiramisu

# The CC command looks at the association between the outcome variable "ill"
# and an exposure "mousse"

CC(DF, "ill", "mousse")

# The option exact = TRUE provides Fisher's exact test p-values
CC(DF, "ill", "mousse", exact = TRUE)

# With the option full = TRUE you can easily use individual elements of the results:
result <- CC(DF, "ill", "mousse", full = TRUE)
result$st$odds_ratio$point_estimate

Stratified analysis for case control studies

Description

Note that the outcome and exposure variables need to be numeric and binary and coded as "0" and "1". The third variable needs to be numeric, but may have more categories, such as "0", "1" and "2".

Usage

CCInter(x, cases, exposure, by, table = FALSE, full = FALSE)

Arguments

`x`	data.frame
`cases`	string: case binary variable (0 / 1)
`exposure`	string: exposure binary variable (0 / 1)
`by`	string: stratifying variable (a factor)
`table`	boolean - TRUE if you need to display interaction table
`full`	boolean - TRUE if you need to display useful values for formatting

Details

CCInter is useful to determine the effects of a third variable on the association between an exposure and an outcome. CCInter produces 2 by 2 tables with stratum specific odds ratios, attributable risk among exposed and population attributable risk. Note that the outcome and exposure variables need to be numeric and binary and coded as "0" and "1". The third variable needs to be numeric, but may have more categories, such as "0", "1" and "2". CCInter displays a summary with the crude OR, the Mantel Haenszel adjusted OR and the result of a Woolf test for homogeneity of stratum-specific OR.

The option "full = TRUE" provides you with useful formatting information, which can be handy if you're using "markdown".
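To make the summary statistics concrete, the following base R sketch computes stratum-specific, crude and Mantel-Haenszel odds ratios, plus a Woolf-type homogeneity statistic, for hypothetical counts. It is an illustration of the standard formulas and not the code used by CCInter; the Mantel-Haenszel estimate is cross-checked against mantelhaen.test from the stats package.

```r
# Hypothetical 2 x 2 x 2 counts: exposure x status x stratum.
# Each stratum is filled column-wise: exposed cases, unexposed cases,
# exposed controls, unexposed controls.
strata <- array(c(20, 10, 10, 20,
                  15,  5, 15, 15),
                dim = c(2, 2, 2),
                dimnames = list(exposure = c("exposed", "unexposed"),
                                status   = c("case", "control"),
                                stratum  = c("s1", "s2")))

exp_cases   <- strata["exposed",   "case",    ]
exp_ctrls   <- strata["exposed",   "control", ]
unexp_cases <- strata["unexposed", "case",    ]
unexp_ctrls <- strata["unexposed", "control", ]
n_stratum   <- exp_cases + exp_ctrls + unexp_cases + unexp_ctrls

# Stratum-specific and crude (collapsed over strata) odds ratios.
or_stratum <- (exp_cases * unexp_ctrls) / (exp_ctrls * unexp_cases)
or_crude   <- (sum(exp_cases) * sum(unexp_ctrls)) /
              (sum(exp_ctrls) * sum(unexp_cases))

# Mantel-Haenszel adjusted odds ratio.
or_mh <- sum(exp_cases * unexp_ctrls / n_stratum) /
         sum(exp_ctrls * unexp_cases / n_stratum)

# Woolf-type test of homogeneity: inverse-variance weighted squared
# deviations of the stratum log odds ratios from their pooled value.
w             <- 1 / (1 / exp_cases + 1 / exp_ctrls +
                      1 / unexp_cases + 1 / unexp_ctrls)
log_or_pooled <- sum(w * log(or_stratum)) / sum(w)
woolf_x2      <- sum(w * (log(or_stratum) - log_or_pooled)^2)
woolf_p       <- pchisq(woolf_x2, df = length(or_stratum) - 1,
                        lower.tail = FALSE)

# Cross-check of the Mantel-Haenszel estimate with the stats package.
mh_check <- unname(mantelhaen.test(strata)$estimate)

print(or_stratum)
print(c(crude = or_crude, MH = or_mh, MH_check = mh_check))
print(c(woolf_x2 = woolf_x2, woolf_p = woolf_p))
```

A large homogeneity p-value, as here, gives no evidence that the odds ratio differs between strata, in which case the adjusted estimate is a reasonable summary.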

Value

list:

`df1`	data.frame - cross-table
`df2`	data.frame - statistics
`df1.digits`	integer vector - digit number displayed for kable/xtable
`df1.align`	character - alignment for kable/xtable
`df2.digits`	integer vector - digit number displayed for kable/xtable
`df2.align`	character - alignment for kable/xtable

Note

- You can use the lowercase command "ccinter" instead of "CCInter"
- The "by" variable (the stratifying variable) can have more than 2 levels

Author(s)

[email protected]

References

ccinter for Stata by *Gilles Desve*

Examples

library(EpiStats)

data(Tiramisu)
DF <- Tiramisu

# Here you can see the association between wmousse and ill for each stratum of tira:
CCInter(DF, "ill", "wmousse", by = "tira")

# By storing the results in the object "res", you can use individual elements of the results.
# For example if you would like to view just the Mantel-Haenszel odds ratio for beer adjusted
# for tportion, you can view it by typing:

res <- CCInter(DF, "ill", "beer", "tportion", full = TRUE)
res$df2$Stats[3]

Summary table for univariate analysis of case control studies

Description

CCTable is used for univariate analysis of case control studies with several exposures. The results are summarised in one table with one row per exposure making comparisons between exposures easier and providing a useful table for integrating into reports. Note that all variables need to be numeric and binary and coded as "0" and "1".

The results of this function contain: the name of the exposure variables, the total number of cases, the number of exposed cases, the percentage of exposed among cases, the number of controls, the number of exposed controls, the percentage of exposed among controls, odds ratios, 95% confidence intervals and p-values.

You can optionally choose to display the Fisher's exact p-value instead of the Chi squared p-value, with the option exact = TRUE.

You can specify the sort order, with the option sort = "or" to order by odds ratios. The default sort order is by p-values.

The option full = TRUE provides you with useful formatting information, which can be handy if you're using "markdown".
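The idea of one row per exposure, sorted by p-value, can be sketched in base R as follows. The data frame, the exposure names and the helper summarise_exposure are hypothetical and only illustrate the layout; they are not the implementation of CCTable, whose column names and rounding may differ.

```r
# Hypothetical data, coded 1 = yes and 0 = no.
dat <- data.frame(
  ill   = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
  salad = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0),
  soup  = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
)

# Build one summary row for a single exposure variable.
summarise_exposure <- function(data, cases, exposure, exact = FALSE) {
  # Rows: exposed (1) then unexposed (0); columns: cases (1) then controls (0).
  tab <- table(factor(data[[exposure]], levels = c(1, 0)),
               factor(data[[cases]],    levels = c(1, 0)))
  p_value <- if (exact) {
    fisher.test(tab)$p.value
  } else {
    # Small expected counts trigger a warning that is irrelevant here.
    suppressWarnings(chisq.test(tab, correct = FALSE)$p.value)
  }
  data.frame(exposure         = exposure,
             cases_total      = sum(tab[, 1]),
             cases_exposed    = tab[1, 1],
             controls_total   = sum(tab[, 2]),
             controls_exposed = tab[1, 2],
             OR               = (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1]),
             p_value          = p_value,
             stringsAsFactors = FALSE)
}

rows <- lapply(c("soup", "salad"),
               function(e) summarise_exposure(dat, "ill", e))
res  <- do.call(rbind, rows)

# Default ordering: smallest p-value first.
res <- res[order(res$p_value), ]
print(res)
```

Sorting by odds ratio instead would only change the last step, for example res[order(-res$OR), ].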

Usage

CCTable(x, cases, exposure = c(), exact = FALSE, sort = "pvalue", full = FALSE)

Arguments

`x`	data.frame
`cases`	character - cases binary variable (0 / 1)
`exposure`	character vector - exposure variables
`exact`	boolean - TRUE if you want the Fisher's exact p-value instead of CHI2
`sort`	character - [pvalue, or, pe] sort by pvalue (default) or by odds ratio, or by percent exposed
`full`	boolean - TRUE if you need to display useful values for formatting

Details

You can optionally choose to display the Fisher's exact p-value instead of the Chi squared p-value, with the option exact = TRUE.

You can specify the sort order, with the option sort = "or" to order by odds ratios. The default sort order is by p-values.

The option "full = TRUE" provides you with useful formatting information, which can be handy if you're using "markdown".

Value

list:

`df`	data.frame - results table
`digits`	integer vector - digit number displayed for kable/xtable
`align`	character - alignment for kable/xtable

Note

- You can use the lowercase command "cctable" instead of "CCTable"

Author(s)

[email protected]

References

cctable for Stata by *Gilles Desve* and *Peter Makary*.

Examples

library(EpiStats)

data(Tiramisu)
df <- Tiramisu

# You can see the association between several exposures and being ill.
cctable(df, "ill", exposure=c("wmousse", "tira", "beer", "mousse"))

# By storing results in res, you can also use individual elements of the results.
# For example if you would like to view a particular odds ratio,
# you can view it by typing (for example):

res = CCTable(df, "ill", exposure = c("wmousse", "tira", "beer", "mousse"), exact=TRUE)
res$df$OR[1]

Contingency table of 2 variables

Description

Creates a contingency table of 2 variables. Percentages are optional and can be computed by row, by column or both. An optional test statistic (Fisher or Chi square) can also be provided.

Usage

crossTable(data, var1, var2, percent="none", statistic="none")

Arguments

`data`	data.frame
`var1`	character - first varname - can be unquoted
`var2`	character - second varname - can be unquoted
`percent`	character - "none" (default) or ("row", "col", "both") - can be unquoted
`statistic`	character - "none" (default) or ("fisher", "chi2") - can be unquoted

Value

data.frame - contingency table
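For readers who want to see what such a table contains, the same ingredients are available in base R. The sketch below uses a hypothetical data frame and standard functions (table, prop.table, chisq.test, fisher.test); it does not reproduce the layout that crossTable returns.

```r
# Hypothetical data (not the Tiramisu dataset).
dat <- data.frame(
  ill = c(1, 1, 1, 0, 0, 0, 0, 1),
  sex = c("males", "males", "females", "females",
          "females", "males", "females", "females"),
  stringsAsFactors = FALSE
)

# Counts of ill by sex.
counts <- table(ill = dat$ill, sex = dat$sex)

# Row and column percentages, corresponding to percent = "row" and "col".
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
col_pct <- round(100 * prop.table(counts, margin = 2), 1)

# Optional statistics, corresponding to statistic = "chi2" and "fisher".
# Small expected counts trigger a warning that is irrelevant here.
chi2_p   <- suppressWarnings(chisq.test(counts, correct = FALSE)$p.value)
fisher_p <- fisher.test(counts)$p.value

print(counts)
print(row_pct)
print(col_pct)
print(c(p_chi2 = chi2_p, p_fisher = fisher_p))
```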

Author(s)

[email protected]

Examples

library(EpiStats)
library(magrittr)  # provides the %<>% and %>% operators used below

# Dataset by Anja Hauri, RKI.
data(Tiramisu)
DF <- Tiramisu

# Table with percentages and statistic on ordered factors
DF %<>%
  orderFactors(ill , values = c(1,0), labels = c("YES", "NO")) %>%
  orderFactors(sex, values = c("males", "females"), labels = c("Males", "Females"))

crossTable(DF, "ill", "sex", "both", "chi2")

Univariate analysis of cohort study measuring risk

Description

CS analyses cohort studies with equal follow-up time per subject. The risk (the proportion of individuals who become cases) is calculated overall and among the exposed and unexposed. Note that all variables need to be numeric and binary and coded as "0" and "1".

Point estimates and confidence intervals for the risk ratio and risk difference are calculated, along with attributable or prevented fractions for the exposed and the total population.

To also display the p-value of Fisher's exact test, specify exact = TRUE.

If you specify full = TRUE you can easily access useful statistics from the output tables.

Usage

CS(x, cases, exposure, exact = F, full = FALSE, title = "CS")

Arguments

`x`	data.frame
`cases`	character - Case variable
`exposure`	character - Exposure variable
`exact`	boolean - TRUE if you would like to display Fisher's exact p-value
`full`	boolean - TRUE if you need to display useful statistics and values for formatting
`title`	character - title of tables

Value

list:

`df1`	data.frame - two by two table
`df2`	data.frame - statistics
`st`	list - individual statistics
`df1.digits`	integer vector - digit number displayed for kable/xtable
`df2.digits`	integer vector - digit number displayed for kable/xtable
`df2.align`	character - alignment for kable/xtable

The item st returns the risk difference and its 95 percent confidence intervals, the risk ratio and its 95 percent confidence intervals, the attributable fraction among the exposed and its 95 percent confidence intervals, the attributable fraction among the population and its 95 percent confidence intervals, the Chi square value, the Chi square p-value and the Fisher's exact test p-value.
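The cohort measures listed above follow from four counts. The base R sketch below uses hypothetical counts and the common normal-approximation confidence intervals; it is meant as an illustration only, and the intervals that CS reports may be computed with a different method.

```r
# Hypothetical cohort counts (not taken from the Tiramisu dataset).
exp_cases   <- 30;  exp_total   <- 100   # exposed: cases and total
unexp_cases <- 10;  unexp_total <- 100   # unexposed: cases and total

# Risk (attack rate) among the exposed, the unexposed and overall.
risk_exp   <- exp_cases / exp_total
risk_unexp <- unexp_cases / unexp_total
risk_total <- (exp_cases + unexp_cases) / (exp_total + unexp_total)

z <- qnorm(0.975)  # multiplier for a 95 percent confidence interval

# Risk difference with a normal-approximation confidence interval.
rd    <- risk_exp - risk_unexp
rd_se <- sqrt(risk_exp * (1 - risk_exp) / exp_total +
              risk_unexp * (1 - risk_unexp) / unexp_total)
rd_ci <- rd + c(-1, 1) * z * rd_se

# Risk ratio with a log-based confidence interval.
rr    <- risk_exp / risk_unexp
rr_se <- sqrt(1 / exp_cases - 1 / exp_total +
              1 / unexp_cases - 1 / unexp_total)
rr_ci <- exp(log(rr) + c(-1, 1) * z * rr_se)

# Attributable fraction among the exposed and among the population.
af_exposed    <- (rr - 1) / rr
af_population <- (risk_total - risk_unexp) / risk_total

print(c(RD = rd, lower = rd_ci[1], upper = rd_ci[2]))
print(c(RR = rr, lower = rr_ci[1], upper = rr_ci[2]))
print(c(AF_exposed = af_exposed, AF_population = af_population))
```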

Note

You can use the lowercase command "cs" in place of "CS"

Author(s)

[email protected]

References

Stata 13: cs. https://www.stata.com/manuals13/stepitab.pdf

Examples

library(EpiStats)

# Dataset by Anja Hauri, RKI.
# Dataset provided with package.
data(Tiramisu)
DF <- Tiramisu

# The CS command looks at the association between the outcome variable "ill"
# and an exposure "mousse"
CS(DF, "ill", "mousse")

# The option exact = TRUE provides Fisher's exact test p-values
CS(DF, "ill", "mousse", exact = TRUE)

# With the option full = TRUE you can easily use individual elements of the results:
result <- CS(DF, "ill", "mousse", full = TRUE)
result$st$risk_ratio$point_estimate

Stratified analysis for cohort studies measuring risk

Description

CSInter is useful to determine the effects of a third variable on the association between an exposure and an outcome. CSInter produces 2 by 2 tables with stratum specific risk ratios, attributable risk among exposed and population attributable risk. Note that the outcome and exposure variables need to be numeric and binary and coded as "0" and "1". The third variable needs to be numeric, but may have more categories, such as "0", "1" and "2".

Usage

CSInter(x, cases, exposure, by, table = FALSE, full = FALSE)

Arguments

`x`	data.frame
`cases`	string: illness binary variable (0 / 1)
`exposure`	string: exposure binary variable (0 / 1)
`by`	string: stratifying variable (a factor)
`table`	boolean - TRUE if you need to display interaction table
`full`	boolean - TRUE if you need to display useful values for formatting

Details

CSInter displays a summary with the crude RR, the Mantel Haenszel adjusted RR and the result of a "Woolf" test for homogeneity of stratum-specific RR.

The option full = TRUE provides you with useful formatting information, which can be handy if you're using "markdown".
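The difference between the crude and the Mantel-Haenszel adjusted risk ratio can be seen in a short base R calculation. The counts are hypothetical and the code is a sketch of the standard formula, not the implementation of CSInter.

```r
# Hypothetical cohort counts for two strata of a third variable.
exp_cases   <- c(s1 = 20, s2 = 30)   # cases among the exposed
exp_total   <- c(s1 = 50, s2 = 60)   # all exposed
unexp_cases <- c(s1 = 10, s2 = 10)   # cases among the unexposed
unexp_total <- c(s1 = 50, s2 = 40)   # all unexposed
n_stratum   <- exp_total + unexp_total

# Stratum-specific risk ratios.
rr_stratum <- (exp_cases / exp_total) / (unexp_cases / unexp_total)

# Crude risk ratio, ignoring the stratifying variable.
rr_crude <- (sum(exp_cases) / sum(exp_total)) /
            (sum(unexp_cases) / sum(unexp_total))

# Mantel-Haenszel adjusted risk ratio.
rr_mh <- sum(exp_cases * unexp_total / n_stratum) /
         sum(unexp_cases * exp_total / n_stratum)

print(rr_stratum)
print(c(crude = rr_crude, MH = rr_mh))
```

Here both strata have a risk ratio of 2, so the adjusted estimate is also 2, while the crude estimate is slightly different because exposure and baseline risk are distributed unevenly across the strata.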

Value

list:

`df1`	data.frame - cross-table
`df2`	data.frame - statistics
`df1.digits`	integer vector - digit number displayed for kable/xtable
`df2.digits`	integer vector - digit number displayed for kable/xtable

Note

- You can use the lowercase command "csinter" instead of "CSInter"
- The "by" variable (the stratifying variable) can have more than 2 levels

Author(s)

[email protected]

References

csinter for Stata by *Gilles Desve*

Examples

library(EpiStats)

data(Tiramisu)
DF <- Tiramisu

# Here you can see the association between wmousse and ill for each stratum of tira:
csinter(DF, "ill", "wmousse", by = "tira")

# By storing the results in the object "res", you can use individual elements
# of the results. For example if you would like to view just the Mantel-Haenszel
# risk ratio for beer adjusted for tportion, you can view it by typing:
res <- CSInter(DF, "ill", "beer", "tportion", full = TRUE)
res$df2$Stats[3]

Summary table for univariate analysis of cohort studies measuring risk

Description

CSTable is used for univariate analysis of cohort studies with several exposures. The results are summarised in one table with one row per exposure making comparisons between exposures easier and providing a useful table for integrating into reports. Note that all variables need to be numeric and binary and coded as "0" and "1".

The results of this function contain: the name of the exposure variables, the total number of exposed, the number of exposed cases, the attack rate among the exposed, the total number of unexposed, the number of unexposed cases, the attack rate among the unexposed, risk ratios, 95% confidence intervals and p-values.

You can optionally choose to display the Fisher's exact p-value instead of the Chi squared p-value, with the option exact = TRUE.

You can specify the sort order, with the option sort="rr" to order by risk ratios. The default sort order is by p-values.

The option full = TRUE provides you with useful formatting information, which can be handy if you're using "markdown".
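As an illustration of the table layout described above, the base R sketch below builds one row of attack rates and risk ratio per exposure and sorts by risk ratio. The data, the exposure names and the helper attack_row are hypothetical; this is not the implementation of CSTable, and confidence intervals and p-values are left out to keep it short.

```r
# Hypothetical data, coded 1 = yes and 0 = no.
dat <- data.frame(
  ill  = c(1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0),
  cake = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
  tea  = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
)

# Build one summary row for a single exposure variable.
attack_row <- function(data, cases, exposure) {
  exposed <- data[[exposure]] == 1
  is_case <- data[[cases]] == 1
  ar_exposed   <- mean(is_case[exposed])    # attack rate among exposed
  ar_unexposed <- mean(is_case[!exposed])   # attack rate among unexposed
  data.frame(exposure         = exposure,
             total_exposed    = sum(exposed),
             cases_exposed    = sum(is_case & exposed),
             ar_exposed_pct   = 100 * ar_exposed,
             total_unexposed  = sum(!exposed),
             cases_unexposed  = sum(is_case & !exposed),
             ar_unexposed_pct = 100 * ar_unexposed,
             RR               = ar_exposed / ar_unexposed,
             stringsAsFactors = FALSE)
}

rows <- lapply(c("tea", "cake"), function(e) attack_row(dat, "ill", e))
res  <- do.call(rbind, rows)

# Ordering by risk ratio, largest first (the idea behind sort = "rr").
res <- res[order(-res$RR), ]
print(res)
```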

Usage

CSTable(x, cases, exposure = c(), exact = FALSE, sort = "pvalue", full = FALSE)

Arguments

`x`	data.frame
`cases`	string - variable containing cases (binary 0 / 1)
`exposure`	string vector - names of variables containing exposure (binary 0 / 1)
`exact`	boolean - TRUE if you want the Fisher's exact p-value instead of CHI2
`sort`	character - [pvalue, rr, ar] sort by pvalue (default) or by risk ratio, or by percent of attributable risk
`full`	boolean - TRUE if you need to display useful values for formatting

Details

You can optionally choose to display the Fisher's exact p-value instead of the Chi squared p-value, with the option exact = TRUE.

You can specify the sort order, with the option sort="rr" to order by risk ratios. The default sort order is by p-values.

The option full = TRUE provides you with useful formatting information, which can be handy if you're using "markdown".

Value

list:

`df`	data.frame - results table
`digits`	integer vector - digit number displayed for kable/xtable
`align`	character - alignment for kable/xtable

Note

- You can use the lowercase command "cstable" instead of "CSTable"

Author(s)

[email protected]

References

cstable for Stata by *Gilles Desve* and *Peter Makary*

Examples

library(EpiStats)

data(Tiramisu)
df <- Tiramisu

# You can see the association between several exposures and being ill.
CSTable(df, "ill", exposure=c("wmousse", "tira", "beer", "mousse"))

# By storing results in res, you can also use individual elements of the results.
# For example if you would like to view a particular risk ratio,
# you can view it by typing (for example):
res <- CSTable(df, "ill", exposure = c("wmousse", "tira", "beer", "mousse"), exact=TRUE)
res$df$RR[1]

Generates ordered factors.

Description

Generates ordered factors for a list of columns by name or by index or range.

Usage

orderFactors(data, ..., values, labels=NULL)

Arguments

`data`	data.frame
`...`	columns to convert, given by name, by index or as a range - names can be unquoted
`values`	vector - the values found in the column(s), in the desired order of the factor levels
`labels`	character vector - NULL (default) or the labels to apply to the values, in the same order

Value

data.frame - the input data with the selected columns converted to factors
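The effect of choosing the order of values and their labels can be reproduced with factor() from base R. The sketch below uses a hypothetical data frame and only fixes the order of the levels; whether orderFactors additionally marks the result as an ordered factor is not stated here, so that is left as an open assumption.

```r
# Hypothetical data (not the Tiramisu dataset).
dat <- data.frame(
  ill = c(1, 0, 1, 1, 0),
  sex = c("males", "females", "females", "males", "females"),
  stringsAsFactors = FALSE
)

# levels gives the order of the values, labels the text shown for each one.
dat$ill <- factor(dat$ill, levels = c(1, 0), labels = c("YES", "NO"))
dat$sex <- factor(dat$sex, levels = c("males", "females"),
                  labels = c("Males", "Females"))

# Tables now list "YES" before "NO" and "Males" before "Females".
print(levels(dat$ill))
print(table(ill = dat$ill, sex = dat$sex))
```

Fixing the level order this way is what makes the exposed or ill category appear first in a cross-table, rather than in the default alphabetical or numeric order.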

Author(s)

[email protected]

Examples

library(EpiStats)
library(magrittr)  # provides the %<>% and %>% operators used below

# Dataset by Anja Hauri, RKI.
data(Tiramisu)
DF <- Tiramisu

# Table with percentages and statistic on ordered factors
DF %<>%
  orderFactors(ill , values = c(1,0), labels = c("YES", "NO")) %>%
  orderFactors(sex, values = c("males", "females"), labels = c("Males", "Females"))
crossTable(DF, "ill", "sex", "both", "chi2")

A foodborne disease outbreak dataset

Description

The dataset available with the EpiStats package is from an outbreak investigation carried out in Germany in 1998 by Anja Hauri, Robert Koch Institute.

Usage

data(Tiramisu)

Format

A data frame with 291 observations with the following 21 variables.

ill: a numeric vector
dateonset: a date
sex: a factor with levels females males
age: a numeric vector
tira: a numeric vector
tportion: a numeric vector
wmousse: a numeric vector
dmousse: a numeric vector
mousse: a numeric vector
mportion: a numeric vector
beer: a numeric vector
uniquekey: a numeric vector
redjelly: a numeric vector
fruitsalad: a numeric vector
tomato: a numeric vector
mince: a numeric vector
salmon: a numeric vector
horseradish: a numeric vector
chickenwin: a numeric vector
roastbeef: a numeric vector
pork: a numeric vector

References

The dataset available with the EpiStats package is from an outbreak investigation carried out in Germany in 1998 by Anja Hauri, Robert Koch Institute. It is used in case studies by organisations including EPIET, ECDC and EpiConcept. It is provided with this package with Anja's permission.

Examples

data(Tiramisu)
# Inspect the structure of the dataset
str(Tiramisu)
