Downloads Johns Hopkins University CSSE data on the spread of the SARS-CoV-2 virus and the Covid-19 pandemic (https://github.com/CSSEGISandData/COVID-19). The data for confirmed cases, reported deaths and recoveries are merged into one data frame, converted to long format and joined with ISO3c (ISO 3166-1 alpha-3) country codes based on the countrycode package. Please note: JHU stopped updating the data on March 10, 2023.

download_jhu_csse_covid19_data(
  type = "country",
  silent = FALSE,
  cached = FALSE
)

Arguments

type

The type of data that you want to retrieve. Can be any subset of:

  • "country": Data at the country level (the default).

  • "country_region": Data at the country region level (only available for Australia, Canada, China and some overseas areas).

  • "us_county": Data at the U.S. county level.

silent

Whether you want the function to suppress its status messages on the console. The messages can be informative because downloading takes some time, so the argument defaults to FALSE (messages are shown).

cached

Whether you want to download the cached version of the data from the tidycovid19 GitHub repository instead of retrieving the data from the authoritative source. Downloading the cached version is faster and the cache is updated daily. Defaults to FALSE.

Value

If only one type was selected, a data frame containing the data. Otherwise, a list containing the desired data frames ordered as in type.
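
A minimal sketch of handling the list return value when more than one type is requested. The documentation above does not say whether the list elements are named, so this sketch assumes nothing about names and picks the data frames by position, relying only on the stated ordering "as in type". The variable names are illustrative.

```r
library(tidycovid19)

# Request two types at once; the result is then a list, not a data frame.
types <- c("country", "us_county")
dfs <- download_jhu_csse_covid19_data(
  type = types,
  silent = TRUE,
  cached = TRUE
)

# The data frames are ordered as in `type`, so positional indexing
# recovers each one (element names are not assumed here).
country_df <- dfs[[1]]
us_county_df <- dfs[[2]]
```

Indexing by position keeps the code valid regardless of how the list is labeled, as long as the order of `types` is not changed between the call and the extraction.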

Examples

# Ten countries with the highest confirmed case counts
df <- download_jhu_csse_covid19_data(silent = TRUE, cached = TRUE)
df %>%
  dplyr::group_by(country) %>%
  dplyr::summarise(confirmed_cases = max(confirmed, na.rm = TRUE)) %>%
  dplyr::arrange(-confirmed_cases) %>%
  dplyr::top_n(10)
#> Selecting by confirmed_cases
#> # A tibble: 10 × 2
#>    country        confirmed_cases
#>    <chr>                    <dbl>
#>  1 US                   103802702
#>  2 India                 44690738
#>  3 France                39866718
#>  4 Germany               38249060
#>  5 Brazil                37081209
#>  6 Japan                 33320438
#>  7 Korea, South          30615522
#>  8 Italy                 25603510
#>  9 United Kingdom        24658705
#> 10 Russia                22075858

# Ten states ranked by the largest death count reported for any single county
df <- download_jhu_csse_covid19_data(
  type = "us_county",
  silent = TRUE,
  cached = TRUE
)
df %>%
  dplyr::filter(!is.na(state)) %>%
  dplyr::group_by(state) %>%
  dplyr::summarise(deaths = max(deaths, na.rm = TRUE)) %>%
  dplyr::arrange(-deaths) %>%
  dplyr::top_n(10)
#> Selecting by deaths
#> # A tibble: 10 × 2
#>    state        deaths
#>    <chr>         <dbl>
#>  1 California    35545
#>  2 Florida       25840
#>  3 Arizona       18846
#>  4 Illinois      15289
#>  5 New York      14219
#>  6 Texas         11623
#>  7 Nevada         9313
#>  8 Michigan       9107
#>  9 Puerto Rico    5823
#> 10 Pennsylvania   5549