Building on a current working paper by David Veenman and myself, and using ‘fancy’ animations, we discuss the issues that non-random outliers raise in empirical archival research and whether robust regression methods can be viewed as a panacea (spoiler: they can’t). Building on these insights, we suggest a workflow for archival work that helps us take outlier treatment to the next level.
I recently added the new Our World in Data dataset on Covid-19 vaccination progress around the world to the {tidycovid19} package. What was meant to be a short info post for package users turned into a mini case study on “outliers”. See for yourself.