Title: | Tools for Epidemiologists |
---|---|
Description: | Provides a set of functions aimed at epidemiologists. The package includes commands for measures of association and impact for case control studies and cohort studies. It may be particularly useful for outbreak investigations including univariable analysis and stratified analysis. The functions for cohort studies include the CS(), CSTable() and CSInter() commands. The functions for case control studies include the CC(), CCTable() and CCInter() commands. References - Cornfield, J. 1956. A statistical problem arising from retrospective studies. In Vol. 4 of Proceedings of the Third Berkeley Symposium, ed. J. Neyman, 135-148. Berkeley, CA - University of California Press. Woolf, B. 1955. On estimating the relation between blood group and disease. Annals of Human Genetics 19 251-253. Reprinted in Evolution of Epidemiologic Ideas Annotated Readings on Concepts and Methods, ed. S. Greenland, pp. 108-110. Newton Lower Falls, MA Epidemiology Resources. Gilles Desve & Peter Makary, 2007. 'CSTABLE Stata module to calculate summary table for cohort study' Statistical Software Components S456879, Boston College Department of Economics. Gilles Desve & Peter Makary, 2007. 'CCTABLE Stata module to calculate summary table for case-control study' Statistical Software Components S456878, Boston College Department of Economics. |
Authors: | Jean Pierre Decorps [aut], Esther Kissling [ctb], Lore Merdrignac [cre] |
Maintainer: | Lore Merdrignac <[email protected]> |
License: | LGPL-3 |
Version: | 1.6-2 |
Built: | 2024-10-30 09:16:02 UTC |
Source: | https://github.com/cran/EpiStats |
CC is used with case-control studies to determine the association between an exposure and an outcome. Note that all variables need to be numeric and binary and coded as "0" and "1". Point estimates and confidence intervals for the odds ratio are calculated, along with attributable or prevented fractions for the exposed and total population.
You can additionally display the Fisher's exact test p-value by specifying exact = TRUE.
If you specify full = TRUE you can easily access useful statistics from the output tables.
CC(data, cases, exposure, exact = FALSE, full = FALSE, title = "CC")
data |
data.frame |
cases |
character - Case variable |
exposure |
character - Exposure variable |
exact |
boolean - TRUE if you would like to display Fisher's exact p-value |
full |
boolean - TRUE if you need to display useful statistics and values for formatting |
title |
character - title of tables |
list:
df1 |
data.frame - two by two table |
df2 |
data.frame - statistics |
df1.align |
character - alignment for kable/xtable |
df2.align |
character - alignment for kable/xtable |
df1.digits |
integer vector - digit number displayed for kable/xtable |
df2.digits |
integer vector - digit number displayed for kable/xtable |
st |
list - individual statistics |
The item st returns the odds ratio and its 95 percent confidence intervals, the attributable fraction among the exposed and its 95 percent confidence intervals, the attributable fraction among the population and its 95 percent confidence intervals, the Chi square value, the Chi square p-value and the Fisher's exact test p-value.
You can use the lowercase command "cc" in place of "CC"
Please note also that when the outcome is frequent, the odds ratio will overestimate the risk ratio (if OR > 1) or underestimate it (if OR < 1). If the outcome is rare, the risk ratio and the odds ratio are similar.
In a case control study, the attributable fractions among the exposed and among the population assume that the OR approximates the risk ratio.
Please interpret all measures with caution.
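The gap between the odds ratio and the risk ratio can be checked by hand. The sketch below uses base R only and two made-up 2 by 2 tables (the counts are illustrative assumptions, not taken from the Tiramisu data). The helper `or_rr` is hypothetical and is not part of EpiStats.

```r
# Hypothetical helper: odds ratio and risk ratio from the four cells
# of a 2 by 2 table (exposed/unexposed by case/non-case).
or_rr <- function(exp_cases, exp_noncases, unexp_cases, unexp_noncases) {
  odds_ratio <- (exp_cases * unexp_noncases) / (exp_noncases * unexp_cases)
  risk_exposed <- exp_cases / (exp_cases + exp_noncases)
  risk_unexposed <- unexp_cases / (unexp_cases + unexp_noncases)
  c(OR = odds_ratio, RR = risk_exposed / risk_unexposed)
}

# Frequent outcome (made-up counts): risk 60% in exposed, 30% in unexposed.
frequent <- or_rr(60, 40, 30, 70)

# Rare outcome (made-up counts): risk 0.6% in exposed, 0.3% in unexposed.
rare <- or_rr(6, 994, 3, 997)

print(round(frequent, 2))
print(round(rare, 2))
```

With these counts the risk ratio is 2 in both tables, but the odds ratio is 3.5 when the outcome is frequent and about 2.01 when it is rare.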
Stata 13: cc https://www.stata.com/manuals13/stepitab.pdf
CCTable, CCInter, CS, CSTable, CSInter
library(EpiStats)
# Dataset by Anja Hauri, RKI.
data(Tiramisu)
DF <- Tiramisu
# The CC command looks at the association between the outcome variable "ill"
# and an exposure "mousse"
CC(DF, "ill", "mousse")
# The option exact = TRUE provides Fisher's exact test p-values
CC(DF, "ill", "mousse", exact = TRUE)
# With the option full = TRUE you can easily use individual elements of the results:
result <- CC(DF, "ill", "mousse", full = TRUE)
result$st$odds_ratio$point_estimate
CCInter is useful to determine the effects of a third variable on the association between an exposure and an outcome. CCInter produces 2 by 2 tables with stratum specific odds ratios, attributable risk among exposed and population attributable risk.
Note that the outcome and exposure variables need to be numeric and binary and coded as "0" and "1". The third variable needs to be numeric, but may have more categories, such as "0", "1" and "2".
CCInter(x, cases, exposure, by, table = FALSE, full = FALSE)
x |
data.frame |
cases |
string: case binary variable (0 / 1) |
exposure |
string: exposure binary variable (0 / 1) |
by |
string: stratifying variable (a factor) |
table |
boolean - TRUE if you need to display interaction table |
full |
boolean - TRUE if you need to display useful values for formatting |
CCInter displays a summary with the crude OR, the Mantel-Haenszel adjusted OR and the result of a Woolf test for homogeneity of the stratum-specific ORs.
The option "full = TRUE" provides you with useful formatting information, which can be handy if you're using "markdown".
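To see how a crude OR, stratum-specific ORs and a Mantel-Haenszel adjusted OR relate to each other, here is a minimal base R sketch. It uses made-up counts for two strata and the standard Mantel-Haenszel estimator; the helper `mh_or` is hypothetical, and this documentation does not state how CCInter computes the value internally.

```r
# Made-up counts for two strata of a third variable.
# a = exposed cases,   b = exposed controls,
# c = unexposed cases, d = unexposed controls.
strata <- data.frame(
  a = c(20, 10),
  b = c(10, 20),
  c = c(5, 15),
  d = c(15, 30)
)

# Hypothetical helper: Mantel-Haenszel pooled odds ratio,
# sum(a * d / n) / sum(b * c / n) over the strata.
mh_or <- function(tab) {
  n <- tab$a + tab$b + tab$c + tab$d
  sum(tab$a * tab$d / n) / sum(tab$b * tab$c / n)
}

# Odds ratio within each stratum.
stratum_or <- with(strata, (a * d) / (b * c))

# Crude odds ratio: collapse the strata into one table first.
crude_or <- with(strata, (sum(a) * sum(d)) / (sum(b) * sum(c)))

print(stratum_or)
print(crude_or)
print(mh_or(strata))
```

In this toy example the stratum-specific ORs differ widely (6 and 1), which is the kind of pattern a homogeneity test is meant to flag. A single adjusted OR is a poor summary in that situation.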
list:
df1 |
data.frame - cross-table |
df2 |
data.frame - statistics |
df1.digits |
integer vector - digit number displayed for kable/xtable |
df1.align |
character - alignment for kable/xtable |
df2.digits |
integer vector - digit number displayed for kable/xtable |
df2.align |
character - alignment for kable/xtable |
- You can use the lowercase command "ccinter" instead of "CCInter"
- The "by" variable (the stratifying variable) can have more than 2 levels
ccinter for Stata by *Gilles Desve*
CC, CCTable
library(EpiStats)
data(Tiramisu)
DF <- Tiramisu
# Here you can see the association between wmousse and ill for each stratum of tira:
CCInter(DF, "ill", "wmousse", by = "tira")
# By storing the results in the object "res", you can use individual elements of the results.
# For example if you would like to view just the Mantel-Haenszel odds ratio for beer adjusted
# for tportion, you can view it by typing:
res <- CCInter(DF, "ill", "beer", "tportion", full = TRUE)
res$df2$Stats[3]
CCTable is used for univariate analysis of case control studies with several exposures. The results are summarised in one table with one row per exposure making comparisons between exposures easier and providing a useful table for integrating into reports. Note that all variables need to be numeric and binary and coded as "0" and "1".
The results of this function contain: the names of the exposure variables, the total number of cases, the number of exposed cases, the percentage of exposed among cases, the number of controls, the number of exposed controls, the percentage of exposed among controls, odds ratios, 95% confidence intervals and p-values.
You can optionally choose to display the Fisher's exact p-value instead of the Chi squared p-value, with the option exact = TRUE.
You can specify the sort order, with the option sort = "or" to order by odds ratios. The default sort order is by p-values.
The option full = TRUE provides you with useful formatting information, which can be handy if you're using "markdown".
CCTable(x, cases, exposure = c(), exact = FALSE, sort = "pvalue", full = FALSE)
x |
data.frame |
cases |
character - cases binary variable (0 / 1) |
exposure |
character vector - exposure variables |
exact |
boolean - TRUE if you want the Fisher's exact p-value instead of CHI2 |
sort |
character - [pvalue, or, pe] sort by pvalue (default) or by odds ratio, or by percent exposed |
full |
boolean - TRUE if you need to display useful values for formatting |
list:
df |
data.frame - results table |
digits |
integer vector - digit number displayed for kable/xtable |
align |
character - alignment for kable/xtable |
- You can use the lowercase command "cctable" instead of "CCTable"
cctable for Stata by *Gilles Desve* and *Peter Makary*.
CC, CCInter
library(EpiStats)
data(Tiramisu)
df <- Tiramisu
# You can see the association between several exposures and being ill.
cctable(df, "ill", exposure = c("wmousse", "tira", "beer", "mousse"))
# By storing results in res, you can also use individual elements of the results.
# For example if you would like to view a particular odds ratio,
# you can view it by typing (for example):
res <- CCTable(df, "ill", exposure = c("wmousse", "tira", "beer", "mousse"), exact = TRUE)
res$df$OR[1]
Creates a contingency table of 2 variables. Percentages are optional and can be given by row, by column or both. An optional test statistic (Fisher or Chi-square) can also be provided.
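For readers who want to see what such a table involves, here is a rough base R equivalent. It is only a sketch on a made-up data frame (`toy` is an assumption, not package data) and does not reproduce the layout that crossTable returns.

```r
# Made-up data: 8 people, illness status and sex.
toy <- data.frame(
  ill = c(1, 1, 1, 1, 0, 0, 0, 0),
  sex = c("males", "males", "males", "females",
          "females", "females", "females", "males")
)

# Contingency table of the two variables (rows: ill, columns: sex).
counts <- table(toy$ill, toy$sex, dnn = c("ill", "sex"))

# Row and column percentages.
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
col_pct <- round(100 * prop.table(counts, margin = 2), 1)

# Fisher's exact test on the 2 by 2 table.
fisher_p <- fisher.test(counts)$p.value

print(counts)
print(row_pct)
print(col_pct)
print(fisher_p)
```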
crossTable(data, var1, var2, percent="none", statistic="none")
data |
data.frame |
var1 |
character - first varname - can be unquoted |
var2 |
character - second varname - can be unquoted |
percent |
character - "none" (default) or ("row", "col", "both") - can be unquoted |
statistic |
character - "none" (default) or ("fisher", "chi2") - can be unquoted |
data.frame - contingency table
orderFactors, CC, CS
library(EpiStats)
library(magrittr) # assumption: supplies %<>% and %>% if they are not already available
# Dataset by Anja Hauri, RKI.
data(Tiramisu)
DF <- Tiramisu
# Table with percentages and statistic on ordered factors
DF %<>%
  orderFactors(ill, values = c(1, 0), labels = c("YES", "NO")) %>%
  orderFactors(sex, values = c("males", "females"), labels = c("Males", "Females"))
crossTable(DF, "ill", "sex", "both", "chi2")
CS analyses cohort studies with equal follow-up time per subject. The risk (the proportion of individuals who become cases) is calculated overall and among the exposed and unexposed. Note that all variables need to be numeric and binary and coded as "0" and "1".
Point estimates and confidence intervals for the risk ratio and risk difference are calculated, along with attributable or prevented fractions for the exposed and the total population.
You can additionally display the Fisher's exact test p-value by specifying exact = TRUE.
If you specify full = TRUE you can easily access useful statistics from the output tables.
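The measures named above follow from the risks in the exposed and unexposed groups. The sketch below works through them in base R with made-up cohort counts, using the usual textbook definitions. It is an illustration only and makes no claim about how CS computes them internally or about its confidence intervals.

```r
# Made-up cohort counts (not from the Tiramisu data).
exposed_cases <- 30
exposed_total <- 100
unexposed_cases <- 10
unexposed_total <- 100

# Risk = proportion of individuals who become cases.
risk_exposed <- exposed_cases / exposed_total
risk_unexposed <- unexposed_cases / unexposed_total
risk_total <- (exposed_cases + unexposed_cases) /
  (exposed_total + unexposed_total)

# Measures of association.
risk_ratio <- risk_exposed / risk_unexposed
risk_difference <- risk_exposed - risk_unexposed

# Measures of impact (attributable fractions).
af_exposed <- (risk_exposed - risk_unexposed) / risk_exposed
af_population <- (risk_total - risk_unexposed) / risk_total

print(round(c(RR = risk_ratio, RD = risk_difference,
              AFe = af_exposed, AFp = af_population), 3))
```

Here the risk ratio is 3 and the risk difference 0.2, which gives an attributable fraction of about 0.667 among the exposed and 0.5 in the whole cohort.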
CS(x, cases, exposure, exact = F, full = FALSE, title = "CS")
x |
data.frame |
cases |
character - Case variable |
exposure |
character - Exposure variable |
exact |
boolean - TRUE if you would like to display Fisher's exact p-value |
full |
boolean - TRUE if you need to display useful statistics and values for formatting |
title |
character - title of tables |
list:
df1 |
data.frame - two by two table |
df2 |
data.frame - statistics |
st |
list - individual statistics |
df1.digits |
integer vector - digit number displayed for kable/xtable |
df2.digits |
integer vector - digit number displayed for kable/xtable |
df2.align |
character - alignment for kable/xtable |
The item st returns the risk difference and its 95 percent confidence intervals, the risk ratio and its 95 percent confidence intervals, the attributable fraction among the exposed and its 95 percent confidence intervals, the attributable fraction among the population and its 95 percent confidence intervals, the Chi square value, the Chi square p-value and the Fisher's exact test p-value.
You can use the lowercase command "cs" in place of "CS"
Stata 13: cs. https://www.stata.com/manuals13/stepitab.pdf
CSTable, CSInter, CC, CCTable, CCInter
library(EpiStats)
# Dataset by Anja Hauri, RKI.
# Dataset provided with package.
data(Tiramisu)
DF <- Tiramisu
# The CS command looks at the association between the outcome variable "ill"
# and an exposure "mousse"
CS(DF, "ill", "mousse")
# The option exact = TRUE provides Fisher's exact test p-values
CS(DF, "ill", "mousse", exact = TRUE)
# With the option full = TRUE you can easily use individual elements of the results:
result <- CS(DF, "ill", "mousse", full = TRUE)
result$st$risk_ratio$point_estimate
CSInter is useful to determine the effects of a third variable on the association between an exposure and an outcome. CSInter produces 2 by 2 tables with stratum specific risk ratios, attributable risk among exposed and population attributable risk. Note that the outcome and exposure variables need to be numeric and binary and coded as "0" and "1". The third variable needs to be numeric, but may have more categories, such as "0", "1" and "2".
CSInter(x, cases, exposure, by, table = FALSE, full = FALSE)
x |
data.frame |
cases |
string: illness binary variable (0 / 1) |
exposure |
string: exposure binary variable (0 / 1) |
by |
string: stratifying variable (a factor) |
table |
boolean - TRUE if you need to display interaction table |
full |
boolean - TRUE if you need to display useful values for formatting |
CSInter displays a summary with the crude RR, the Mantel Haenszel adjusted RR and the result of a "Woolf" test for homogeneity of stratum-specific RR.
The option full = TRUE provides you with useful formatting information, which can be handy if you're using "markdown".
list:
df1 |
data.frame - cross-table |
df2 |
data.frame - statistics |
df1.digits |
integer vector - digit number displayed for kable/xtable |
df2.digits |
integer vector - digit number displayed for kable/xtable |
- You can use the lowercase command "csinter" instead of "CSInter"
- The "by" variable (the stratifying variable) can have more than 2 levels
csinter for Stata by *Gilles Desve*
CS, CSTable
library(EpiStats)
data(Tiramisu)
DF <- Tiramisu
# Here you can see the association between wmousse and ill for each stratum of tira:
csinter(DF, "ill", "wmousse", by = "tira")
# By storing the results in the object "res", you can use individual elements
# of the results. For example if you would like to view just the Mantel-Haenszel
# risk ratio for beer adjusted for tportion, you can view it by typing:
res <- CSInter(DF, "ill", "beer", "tportion", full = TRUE)
res$df2$Stats[3]
CSTable is used for univariate analysis of cohort studies with several exposures. The results are summarised in one table with one row per exposure making comparisons between exposures easier and providing a useful table for integrating into reports. Note that all variables need to be numeric and binary and coded as "0" and "1".
The results of this function contain: the names of the exposure variables, the total number of exposed, the number of exposed cases, the attack rate among the exposed, the total number of unexposed, the number of unexposed cases, the attack rate among the unexposed, risk ratios, 95% confidence intervals and p-values.
You can optionally choose to display the Fisher's exact p-value instead of the Chi squared p-value, with the option exact = TRUE.
You can specify the sort order, with the option sort="rr" to order by risk ratios. The default sort order is by p-values.
The option full = TRUE provides you with useful formatting information, which can be handy if you're using "markdown".
CSTable(x, cases, exposure = c(), exact = FALSE, sort = "pvalue", full = FALSE)
x |
data.frame |
cases |
string - variable containing cases (binary 0 / 1) |
exposure |
string vector - names of variables containing exposure (binary 0 / 1) |
exact |
boolean - TRUE if you want the Fisher's exact p-value instead of CHI2 |
sort |
character - [pvalue, rr, ar] sort by pvalue (default) or by risk ratio, or by percent of attributable risk |
full |
boolean - TRUE if you need to display useful values for formatting |
list:
df |
data.frame - results table |
digits |
integer vector - digit number displayed for kable/xtable |
align |
character - alignment for kable/xtable |
- You can use the lowercase command "cstable" instead of "CSTable"
cstable for Stata by *Gilles Desve* and *Peter Makary*
CS, CSInter
library(EpiStats)
data(Tiramisu)
df <- Tiramisu
# You can see the association between several exposures and being ill.
CSTable(df, "ill", exposure = c("wmousse", "tira", "beer", "mousse"))
# By storing results in res, you can also use individual elements of the results.
# For example if you would like to view a particular risk ratio,
# you can view it by typing (for example):
res <- CSTable(df, "ill", exposure = c("wmousse", "tira", "beer", "mousse"), exact = TRUE)
res$df$RR[1]
Generates ordered factors for a list of columns by name or by index or range.
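The idea of an ordered factor with chosen level order and labels can be illustrated with base R's `factor()`. This is a sketch on a made-up vector. Whether orderFactors relies on `factor()` internally is an assumption and is not stated in this documentation.

```r
# Made-up illness indicator coded 0 / 1.
ill <- c(0, 1, 1, 0, 1)

# Put level 1 first and attach readable labels, as an ordered factor.
ill_factor <- factor(ill, levels = c(1, 0), labels = c("YES", "NO"),
                     ordered = TRUE)

# The level order drives how rows and columns appear in tables.
print(levels(ill_factor))
print(table(ill_factor))
```

Putting the "YES" level first means that tables built from the factor list cases before non-cases, which is the usual layout for 2 by 2 tables in epidemiology.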
orderFactors(data, ..., values, labels=NULL)
data |
data.frame |
... |
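columns to convert, given by name, index or range - names can be unquoted |
values |
vector - values of the column(s), in the order wanted for the factor levels |
labels |
character vector - NULL (default) or labels for the levels, in the same order as values |
data.frame - the input data with the selected columns converted to ordered factors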
crossTable
library(EpiStats)
library(magrittr) # assumption: supplies %<>% and %>% if they are not already available
# Dataset by Anja Hauri, RKI.
data(Tiramisu)
DF <- Tiramisu
# Table with percentages and statistic on ordered factors
DF %<>%
  orderFactors(ill, values = c(1, 0), labels = c("YES", "NO")) %>%
  orderFactors(sex, values = c("males", "females"), labels = c("Males", "Females"))
crossTable(DF, "ill", "sex", "both", "chi2")
The dataset available with the EpiStats package is from an outbreak investigation carried out in Germany in 1998 by Anja Hauri, Robert Koch Institute.
data(Tiramisu)
A data frame with 291 observations with the following 21 variables.
ill
a numeric vector
dateonset
a date
sex
a factor with levels females and males
age
a numeric vector
tira
a numeric vector
tportion
a numeric vector
wmousse
a numeric vector
dmousse
a numeric vector
mousse
a numeric vector
mportion
a numeric vector
beer
a numeric vector
uniquekey
a numeric vector
redjelly
a numeric vector
fruitsalad
a numeric vector
tomato
a numeric vector
mince
a numeric vector
salmon
a numeric vector
horseradish
a numeric vector
chickenwin
a numeric vector
roastbeef
a numeric vector
pork
a numeric vector
The dataset is used in case studies by organisations including EPIET, ECDC and EpiConcept. It is provided with this package with Anja Hauri's permission.
data(Tiramisu) ## maybe str(Tiramisu) ; plot(Tiramisu) ...