The covid19swiss R package provides a tidy format dataset of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) pandemic outbreak in the cantons of Switzerland and the Principality of Liechtenstein (FL).
The covid19swiss dataset includes the following fields:
date - the timestamp of the case, a Date object
location - the abbreviation code of the Swiss canton or of the Principality of Liechtenstein (FL)
location_type - description of the location, either Canton of Switzerland or the Principality of Liechtenstein
location_code - a canton index code for merging geometry data from the rnaturalearth package, available only for Switzerland cantons
location_code_type - the name of the code in the rnaturalearth package for the Switzerland map
data_type - the type of case
value - the number of cases corresponding to the date and data_type fields
The data_type field takes the following values:
tested_total - cumulative number of tests performed as of the date
cases_total - cumulative confirmed Covid-19 cases as of the current date
hosp_new - new hospitalizations on the current date
hosp_current - current number of hospitalized patients as of the current date
icu_current - number of hospitalized patients in ICUs as of the current date
vent_current - number of hospitalized patients requiring ventilation as of the current date
recovered_total - cumulative number of patients recovered as of the current date
deaths_total - cumulative deaths due to Covid-19 as of the current date
The data is organized in a long format:
library(covid19swiss)
head(covid19swiss)
#> date location location_type location_code location_code_type
#> 1 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 2 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 3 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 4 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 5 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 6 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> data_type value
#> 1 tested_total 4
#> 2 cases_total NA
#> 3 hosp_new NA
#> 4 hosp_current NA
#> 5 icu_current NA
#> 6 vent_current NA
It is straightforward to transform the data into a wide format with the pivot_wider function from the tidyr package:
library(tidyr)
covid19swiss_wide <- covid19swiss %>%
  pivot_wider(names_from = data_type, values_from = value)
head(covid19swiss_wide)
#> # A tibble: 6 × 13
#> date location location_type location_code location_code_type
#> <date> <chr> <chr> <chr> <chr>
#> 1 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 2 2020-01-25 GE Canton of Switzerland CH.GE gn_a1_code
#> 3 2020-01-26 GE Canton of Switzerland CH.GE gn_a1_code
#> 4 2020-01-27 GE Canton of Switzerland CH.GE gn_a1_code
#> 5 2020-01-28 GE Canton of Switzerland CH.GE gn_a1_code
#> 6 2020-01-29 GE Canton of Switzerland CH.GE gn_a1_code
#> # ℹ 8 more variables: tested_total <int>, cases_total <int>, hosp_new <int>,
#> # hosp_current <int>, icu_current <int>, vent_current <int>,
#> # recovered_total <int>, deaths_total <int>
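Because fields such as cases_total are cumulative, daily counts can be derived from consecutive rows of the wide table. The following is a minimal sketch on a toy data frame with made-up values, not the package data; the cases_new column name is an assumption introduced here for illustration.

```r
library(dplyr)

# Toy data in the wide layout shown above (values are made up for illustration)
toy_wide <- tibble(
  date = as.Date("2020-03-01") + 0:3,
  location = "GE",
  cases_total = c(10L, 15L, NA, 30L)
)

toy_daily <- toy_wide %>%
  group_by(location) %>%
  arrange(date, .by_group = TRUE) %>%
  # Difference between consecutive cumulative counts;
  # the result is NA whenever either of the two days is missing
  mutate(cases_new = cases_total - lag(cases_total)) %>%
  ungroup()

toy_daily$cases_new
```

Grouping by location before calling lag keeps the difference from crossing canton boundaries when the table holds more than one location.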
The following examples demonstrate simple methods for querying and summarising the data with the dplyr and tidyr packages.
The first example demonstrates how to query the total confirmed, recovered, and death cases by canton as of September 8th, 2020:
library(dplyr)
covid19swiss %>%
  filter(date == as.Date("2020-09-08"),
         # names must match the data_type values exactly
         data_type %in% c("cases_total", "recovered_total", "deaths_total")) %>%
  select(location, value, data_type) %>%
  pivot_wider(names_from = data_type, values_from = value) %>%
  arrange(desc(cases_total))
#> # A tibble: 26 × 3
#> location cases_total recovered_total
#> <chr> <int> <int>
#> 1 VD 8109 NA
#> 2 GE 7409 NA
#> 3 ZH 6643 NA
#> 4 TI 3565 929
#> 5 BE 2698 NA
#> 6 VS 2416 320
#> 7 AG 2260 1495
#> 8 FR 1925 164
#> 9 SG 1353 NA
#> 10 BS 1254 1154
#> # ℹ 16 more rows
Note: some fields, such as recovered_total or tested_total, are not available for some cantons and are marked as missing values (i.e., NA).
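To see how much of each field is missing, the NA values can be counted per data type. This is a sketch on a toy long-format data frame with made-up values that mimics the covid19swiss columns; the n_missing and n_rows names are assumptions chosen for illustration.

```r
library(dplyr)

# Toy long-format data mimicking the covid19swiss columns (made-up values)
toy_long <- tibble(
  location = c("GE", "GE", "VD", "VD"),
  data_type = c("cases_total", "recovered_total",
                "cases_total", "recovered_total"),
  value = c(100L, NA, 120L, 40L)
)

# Count missing values and total rows for each data type
missing_summary <- toy_long %>%
  group_by(data_type) %>%
  summarise(n_missing = sum(is.na(value)),
            n_rows = n(),
            .groups = "drop")

missing_summary
```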
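In the next example, we will filter the dataset for the Canton of Geneva as of April 10th, 2020 and calculate the following metrics:
positive_tested - confirmed cases as a percentage of the total tests performed
recovery_rate - recovered patients as a percentage of the confirmed cases
death_rate - deaths as a percentage of the confirmed cases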
covid19swiss %>% dplyr::filter(location == "GE",
date == as.Date("2020-04-10")) %>%
dplyr::select(data_type, value) %>%
tidyr::pivot_wider(names_from = data_type, values_from = value) %>%
dplyr::mutate(positive_tested = round(100 * cases_total / tested_total, 2),
death_rate = round(100 * deaths_total / cases_total, 2),
recovery_rate = round(100 * recovered_total / cases_total, 2)) %>%
dplyr::select(positive_tested, recovery_rate, death_rate)
#> # A tibble: 1 × 3
#> positive_tested recovery_rate death_rate
#> <dbl> <dbl> <dbl>
#> 1 24.5 9.79 3.58
All values are percentages.
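Since some cantons do not report every field, a rate calculation like the one above may hit a missing or zero denominator. One possible way to guard against that is a small helper; pct is a hypothetical function written for this sketch and is not part of the covid19swiss package.

```r
# Hypothetical helper: `part` as a percentage of `whole`, rounded to 2 digits.
# Returns NA when the denominator is missing or zero instead of Inf or NaN.
pct <- function(part, whole) {
  ifelse(is.na(whole) | whole == 0,
         NA_real_,
         round(100 * part / whole, 2))
}

pct(25, 200)  # 12.5
pct(5, 0)     # NA
pct(NA, 100)  # NA
```

The helper is vectorised through ifelse, so it could be used inside a mutate call in place of the inline round expressions.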
The raw data include both Switzerland and the Principality of Liechtenstein. Separating the data by country can be done by using the location field:
switzerland <- covid19swiss %>% filter(location != "FL")
head(switzerland)
#> date location location_type location_code location_code_type
#> 1 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 2 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 3 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 4 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 5 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> 6 2020-01-24 GE Canton of Switzerland CH.GE gn_a1_code
#> data_type value
#> 1 tested_total 4
#> 2 cases_total NA
#> 3 hosp_new NA
#> 4 hosp_current NA
#> 5 icu_current NA
#> 6 vent_current NA
liechtenstein <- covid19swiss %>% filter(location == "FL")
head(liechtenstein)
#> date location location_type location_code
#> 1 2020-02-27 FL Principality of Liechtenstein <NA>
#> 2 2020-02-27 FL Principality of Liechtenstein <NA>
#> 3 2020-02-27 FL Principality of Liechtenstein <NA>
#> 4 2020-02-27 FL Principality of Liechtenstein <NA>
#> 5 2020-02-27 FL Principality of Liechtenstein <NA>
#> 6 2020-02-27 FL Principality of Liechtenstein <NA>
#> location_code_type data_type value
#> 1 gn_a1_code tested_total 3
#> 2 gn_a1_code cases_total NA
#> 3 gn_a1_code hosp_new NA
#> 4 gn_a1_code hosp_current NA
#> 5 gn_a1_code icu_current NA
#> 6 gn_a1_code vent_current NA
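Once Liechtenstein is filtered out, canton-level values can be summed into a national figure per date. The sketch below uses a toy data frame with made-up values; the cantons_reporting column is an assumption added here so that dates on which some cantons are missing remain visible, since dropping NA values understates the total.

```r
library(dplyr)

# Toy long-format data with two cantons and Liechtenstein (made-up values)
toy <- tibble(
  date = as.Date(c("2020-03-01", "2020-03-01", "2020-03-01",
                   "2020-03-02", "2020-03-02", "2020-03-02")),
  location = c("GE", "VD", "FL", "GE", "VD", "FL"),
  data_type = "cases_total",
  value = c(10L, 20L, 1L, 15L, NA, 2L)
)

swiss_totals <- toy %>%
  filter(location != "FL", data_type == "cases_total") %>%
  group_by(date) %>%
  summarise(cases_total = sum(value, na.rm = TRUE),   # sum over reporting cantons
            cantons_reporting = sum(!is.na(value)),   # how many cantons contributed
            .groups = "drop")

swiss_totals
```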