3.6 Common arithmetic functions - sum(), mean(), median(), etc.
Statistics is an R strength, so if there is an arithmetic function you can think of, it probably exists in R.
The most common ones are:
sum()
mean()
median()
min(), max()
sd() - standard deviation
IQR() - interquartile range
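As a quick illustration of the functions listed above, here is a minimal sketch using a small made-up vector. The name x and its values are assumptions chosen for easy arithmetic, not data from this book.

```r
# Hypothetical example values, chosen so the results are easy to check by hand
x <- c(2, 4, 4, 5, 10)

sum(x)     # total of all values: 25
mean(x)    # arithmetic mean: 5
median(x)  # middle value once sorted: 4
min(x)     # smallest value: 2
max(x)     # largest value: 10
sd(x)      # sample standard deviation (n - 1 denominator): 3
IQR(x)     # 75th percentile minus 25th percentile: 1
```

Note that sd() uses the sample (n - 1) formula, and IQR() relies on R's default quantile method, so other software may give slightly different quartiles for the same data.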
An important thing to remember relates to missing data: if any of your values is NA (not available; missing), these functions will return NA.
Either deal with your missing values beforehand (recommended) or add the na.rm = TRUE argument to any of these functions to ask R to ignore missing values.
More discussion and examples around missing data can be found in Chapters 2 and 11.
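To see this behaviour, consider a minimal sketch with a hypothetical vector, mynumbers, holding the values 1 and 2 plus one missing value. The vector is an assumption made for illustration. Calling sum() on it with and without na.rm = TRUE should print output like the two lines that follow:

```r
# Hypothetical vector with one missing value (illustrative, not from a dataset)
mynumbers <- c(1, 2, NA)

sum(mynumbers)                # the NA propagates to the result
sum(mynumbers, na.rm = TRUE)  # the NA is dropped before summing
```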
## [1] NA
## [1] 3
Overall, R’s unwillingness to implicitly average over observations with missing values should be considered helpful, not an unnecessary pain.
If you don’t know exactly where your missing values are, you might end up comparing averages that were calculated over different groups of observations.
So na.rm = TRUE is fine to use if quickly exploring and cleaning data, or if you’ve already investigated missing values and are convinced the existing ones are representative.
But it is rightfully not a default, so get used to typing na.rm = TRUE when using these functions.
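One way to investigate missing values before relying on na.rm = TRUE is to count them within each group. The sketch below uses made-up vectors, group and value, which are assumptions for illustration only, and base R's tapply() to do the counting and the averaging.

```r
# Hypothetical data: six measurements from two groups, with the missing
# values concentrated in group "b"
group <- c("a", "a", "a", "b", "b", "b")
value <- c(10, 12, 14, 11, NA, NA)

# Count missing values per group before summarising
missing_per_group <- tapply(is.na(value), group, sum)
missing_per_group

# Group means with missing values ignored
group_means <- tapply(value, group, mean, na.rm = TRUE)
group_means
```

Here the mean for group "a" uses three observations while the mean for group "b" rests on just one, which is the kind of imbalance that na.rm = TRUE hides unless you look for it first.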