Download current country-level World Bank data — download_wbank_data

Uses the wbstats package to download recent country-level data from the World Bank (https://data.worldbank.org).

download_wbank_data(
  vars = c("SP.POP.TOTL", "AG.LND.TOTL.K2", "EN.POP.DNST", "EN.URB.LCTY",
    "SP.DYN.LE00.IN", "NY.GDP.PCAP.KD"),
  labels = c("population", "land_area_skm", "pop_density", "pop_largest_city",
    "life_expectancy", "gdp_capita"),
  var_def = FALSE,
  silent = FALSE,
  cached = FALSE
)

Arguments

vars	World Bank indicator codes of the data items that you want to retrieve.
labels	More informative variable names for the output data frame. Must have the same length as `vars` and contain only valid variable names.
var_def	Do you want to retrieve a data frame containing the World Bank data definitions along with the actual data? Defaults to `FALSE`.
silent	Whether you want to suppress the status messages that the function sends to the console. The messages can be informative because downloading takes some time, so this defaults to `FALSE`.
cached	Whether you want to download the cached version of the data from the tidycovid19 GitHub repository instead of retrieving the data from the authoritative source. Downloading the cached version is faster and the cache is updated daily. Defaults to `FALSE`.
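
The two constraints on `labels` (same length as `vars`, valid variable names) can be checked before starting a download. The following is a minimal sketch in base R; `check_labels()` is a hypothetical helper written for illustration and is not part of tidycovid19, and the package may perform its own checks differently.

```r
# Hypothetical helper, not part of tidycovid19: checks a labels vector
# against the two constraints stated for the `labels` argument.
check_labels <- function(vars, labels) {
  if (length(vars) != length(labels)) {
    stop("`labels` must have the same length as `vars`.")
  }
  # make.names() returns its input unchanged only for syntactically valid names
  invalid <- labels[make.names(labels) != labels]
  if (length(invalid) > 0) {
    stop("Invalid variable names: ", paste(invalid, collapse = ", "))
  }
  invisible(TRUE)
}

vars <- c("SP.POP.TOTL", "SP.DYN.LE00.IN")
check_labels(vars, c("population", "life_expectancy"))  # passes silently
# check_labels(vars, c("population", "life expectancy"))  # would stop: space in name
```

This sketch does not check for duplicated labels, which would also lead to ambiguous column names.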

Value

If `var_def = FALSE`, a data frame containing the data and a timestamp variable indicating the time of data retrieval. Otherwise, a list containing the data frame with the data, followed by a data frame containing the variable definitions.
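
Because the shape of the return value depends on `var_def`, downstream code may want to handle both cases uniformly. A minimal sketch, assuming only the structure described above: `split_wbank_result()` is a hypothetical helper, and the mock data frames (including their column names and values) are invented stand-ins for real downloads.

```r
# Hypothetical helper, not part of tidycovid19: returns the data and the
# definitions under fixed names regardless of how `var_def` was set.
split_wbank_result <- function(res) {
  # Test for a data frame first, since a data frame is itself a list
  if (is.data.frame(res)) {
    # var_def = FALSE: only the data frame was returned
    list(data = res, definitions = NULL)
  } else {
    # var_def = TRUE: data first, definitions second
    list(data = res[[1]], definitions = res[[2]])
  }
}

# Mock objects standing in for real downloads (made-up columns and values)
mock_data <- data.frame(country = c("A", "B"), population = c(10, 20))
mock_defs <- data.frame(name = "population", definition = "Total population")

str(split_wbank_result(mock_data))
str(split_wbank_result(list(mock_data, mock_defs)))
```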

Examples

df <- download_wbank_data(silent = TRUE, cached = TRUE)
df %>%
  dplyr::select(country, population) %>%
  dplyr::arrange(-population)
#> # A tibble: 217 × 2
#>    country            population
#>    <chr>                   <dbl>
#>  1 India              1417173173
#>  2 China              1412175000
#>  3 United States       333287557
#>  4 Indonesia           275501339
#>  5 Pakistan            235824862
#>  6 Nigeria             218541212
#>  7 Brazil              215313498
#>  8 Bangladesh          171186372
#>  9 Russian Federation  144236933
#> 10 Mexico              127504125
#> # ℹ 207 more rows

lst <- download_wbank_data(silent = TRUE, cached = TRUE, var_def = TRUE)
lst[[1]] %>%
  tidyr::pivot_longer(5:10, names_to = "wbank_variable", values_to = "values") %>%
  dplyr::group_by(wbank_variable) %>%
  dplyr::summarise(non_na = sum(!is.na(values)))
#> # A tibble: 6 × 2
#>   wbank_variable   non_na
#>   <chr>             <int>
#> 1 gdp_capita          210
#> 2 land_area_skm       216
#> 3 life_expectancy     210
#> 4 pop_density         216
#> 5 pop_largest_city    153
#> 6 population          217
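
The positional selection `5:10` in the example above depends on the column order of the returned data frame. As an alternative sketch, the same non-missing counts can be computed in base R by selecting the columns by label. The data frame below is a made-up stand-in; only the column names `population` and `gdp_capita` are taken from the default `labels`.

```r
# Made-up stand-in for the downloaded data frame
mock <- data.frame(
  country = c("A", "B", "C"),
  population = c(10, 20, NA),
  gdp_capita = c(NA, NA, 3)
)
labels <- c("population", "gdp_capita")

# Count non-missing values per labelled column, selecting by name
non_na <- colSums(!is.na(mock[labels]))
non_na  # population: 2, gdp_capita: 1
```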