Missing values, indicated by (or coerced to) NA in R, are common in environmental data due to equipment malfunction, survey non-response, human error, resource limitations, and any number of other unforeseen hiccups that can occur during data collection. Despite their ubiquity, NAs are rarely considered in exploratory data analysis and are commonly “dealt with” (read: disappeared) by listwise deletion. Listwise deletion (in which any row with an NA is removed) may be the most convenient method for handling missing values, but it also omits valuable existing observations, reduces statistical power, and, depending on the mechanism of missingness, can increase bias in parameter estimates. Exploring and thinking critically about missing data is an important and often overlooked part of exploratory data analysis that can help us understand what data are missing and why, so that we can choose an appropriate method for handling them.
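
To make the cost of listwise deletion concrete, here is a minimal base R sketch. It assumes the built-in airquality dataset as a stand-in for environmental data (it is not part of the original text), and the variable names are illustrative only. It counts NAs per column, applies listwise deletion with na.omit(), and reports how many rows were discarded.

```r
# Built-in example data: daily air quality measurements, some of them NA.
# (An assumed stand-in dataset, chosen only for illustration.)
data(airquality)

# Count the missing values in each column
na_per_column <- colSums(is.na(airquality))

# Listwise deletion: keep only the rows with no NA in any column
complete_rows <- na.omit(airquality)

# How many rows did listwise deletion discard, and what share of the data?
n_dropped <- nrow(airquality) - nrow(complete_rows)
prop_dropped <- n_dropped / nrow(airquality)

print(na_per_column)
cat("Rows dropped:", n_dropped, "of", nrow(airquality), "\n")
cat("Proportion dropped:", round(prop_dropped, 2), "\n")
```

Comparing the per-column counts with the number of rows dropped shows how a few incomplete variables can remove otherwise complete observations in every other column.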