--- title: "Introduction to the covid19swiss Dataset" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to the covid19italy Dataset} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, message=FALSE, warning=FALSE, comment = "#>" ) ``` The covid19swiss R package provides a tidy format dataset of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) pandemic outbreak in Switzerland cantons and Principality of Liechtenstein (FL). ### Data structure The `covid19swiss` dataset includes the following fields: * `date` - the timestamp of the case, a `Date` object * `location` - the **Cantons of Switzerland** and **the Principality of Liechtenstein** (FL) abbreviation code * `location_type` - description of the location, either **Canton of Switzerland** or **the Principality of echtenstein** * `location_code` - a canton index code for merging geometry data from the rnaturalearth package, ailable only for Switzerland cantons * `location_code_type` - the name of code in the **rnaturalearth** package for Switzerland map * `data_type` - the type of case * `value` - the number of cases corresponding to the `date` and `data_type` fields Where the available `data_type` field includes the following cases: * `tested_total` - cumulative number of tests performed as of the date * `cases_total` - cumulative confirmed Covid-19 cases as of the current date * `hosp_new` - new hospitalizations on the current date * `hosp_current` - current number of hospitalized patients as of the current date * `icu_current` - number of hospitalized patients in ICUs as of the current date * `vent_current` - number of hospitalized patients requiring ventilation as of the current date * `recovered_total` - cumulative number of patients recovered as of the current date * `deaths_total` - cumulative deaths due to Covid-19 as of the current date The data organized in a long format: ```{r} library(covid19swiss) head(covid19swiss) ``` It is straightforward to transform the data into a wide format with the `pivot_wider` function from the **tidyr** package: ```{r} library(tidyr) covid19swiss_wide <- covid19swiss %>% pivot_wider(names_from = data_type, values_from = value) head(covid19swiss_wide) ``` ### Query and summarise the data The following examples demonstrate simple methods for query and summarise the data with the dplyr and tidyr packages. #### Cases summary by canton The first example demonstrates how to query the total confirmed, recovered, and death cases by canton as of April 8th: ```{r } library(dplyr) covid19swiss %>% filter(date == as.Date("2020-09-08"), data_type %in% c("cases_total", "recovered_total", "death_total")) %>% select(location, value, data_type) %>% pivot_wider(names_from = data_type, values_from = value) %>% arrange(-cases_total) ``` **Note:** some fields, such as `total_recovered` or `total_tested`, are not available for some cantons and marked as missing values (i.e., `NA`) ### Calculating rates for Canton of Geneva In the next example, we will filter the dataset for the Canton of Geneva and calculate the following metrics: * Positive rate - $\frac{Total ~confirmed}{Total ~tested}$ * Recovery rate - $\frac{Total ~recovered}{Total ~confirmed}$ * Death rate - $\frac{Total ~death}{Total ~confirmed}$ ```{r} covid19swiss %>% dplyr::filter(location == "GE", date == as.Date("2020-04-10")) %>% dplyr::select(data_type, value) %>% tidyr::pivot_wider(names_from = data_type, values_from = value) %>% dplyr::mutate(positive_tested = round(100 * cases_total / tested_total, 2), death_rate = round(100 * deaths_total / cases_total, 2), recovery_rate = round(100 * recovered_total / cases_total, 2)) %>% dplyr::select(positive_tested, recovery_rate, death_rate) ``` Values are in precentage ### Separating between Switzerland and Principality of Liechtenstein The raw data include both Switzerland and the Principality of Liechtenstein. Separating the data by country can be done by using the `location` field: ```{r} switzerland <- covid19swiss %>% filter(location != "FL") head(switzerland) liechtenstein <- covid19swiss %>% filter(location == "FL") head(liechtenstein) ```