In this document, I will introduce the fct_infreq() function and show what it’s for.

# Load the tidyverse
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.6     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.4     ✓ stringr 1.4.0
## ✓ readr   2.1.1     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
# Example dataset
library(palmerpenguins)

What is it for?

This function reorders the levels of a factor by the number of observations in each level, starting with the level that has the most observations.

Arguments

  • f
    • A factor
  • ordered
    • A logical which determines the “ordered” status of the output factor. NA preserves the existing status of the factor.
f <- factor(c("red", "yellow", "yellow", "blue", "blue", "yellow", "red", "yellow", "blue", "red", "yellow", "rainbow", "yellow", "blue"))
summary(f) # By frequency, we should see yellow first, then blue, then red, then rainbow
##    blue rainbow     red  yellow 
##       4       1       3       6
change_f <- fct_infreq(f)
summary(change_f)
##  yellow    blue     red rainbow 
##       6       4       3       1
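
To see the reordering more directly, here is a minimal sketch that compares the levels before and after fct_infreq() and tries the ordered argument. The object name colours is just for illustration, and the sketch assumes only that the forcats package is installed.

```r
library(forcats)

# Same colours as in the example above, stored as a factor
colours <- factor(c("red", "yellow", "yellow", "blue", "blue", "yellow", "red",
                    "yellow", "blue", "red", "yellow", "rainbow", "yellow", "blue"))

# Default level order of a factor built from character data: alphabetical
levels(colours)

# After fct_infreq(): most frequent level first
levels(fct_infreq(colours))

# ordered = NA (the default) keeps the existing "ordered" status
is.ordered(fct_infreq(colours))

# ordered = TRUE asks for an ordered factor instead
is.ordered(fct_infreq(colours, ordered = TRUE))
```

Only the level order changes; the values stored in the factor stay the same.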

Is it helpful?

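I don’t think this is super relevant for my own work. Note that R does not order factor levels by frequency on its own: by default the levels are sorted alphabetically, as the summary(f) output above shows. In the penguins data below, the islands just happen to be in frequency order already, because the alphabetical order (Biscoe, Dream, Torgersen) matches the order of the counts. So the function is mainly helpful when the alphabetical order and the frequency order differ, as with vectors or datasets we create ourselves, which I have rarely done.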


summary(penguins$island)
##    Biscoe     Dream Torgersen 
##       168       124        52

It may be more helpful if you have purposefully re-leveled a factor and then want its levels ordered by frequency again (fct_infreq() starts with the most frequent level).

library(janitor)
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
##     chisq.test, fisher.test
penguins$island_relevel <- relevel(penguins$island, "Torgersen")

penguins %>%
  tabyl(island_relevel)
##  island_relevel   n   percent
##       Torgersen  52 0.1511628
##          Biscoe 168 0.4883721
##           Dream 124 0.3604651

Now use fct_infreq() to put the levels back in frequency order, which here matches the original order.

penguins$change_f <- fct_infreq(penguins$island_relevel)

penguins %>%
  tabyl(change_f)
##   change_f   n   percent
##     Biscoe 168 0.4883721
##      Dream 124 0.3604651
##  Torgersen  52 0.1511628
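
The same round trip can be sketched without the penguins data. The island vector below is made up for illustration (the counts are hypothetical, chosen only so that the islands rank in the same order as above), and the sketch assumes only that forcats is installed.

```r
library(forcats)

# Hypothetical island labels: Biscoe most frequent, Torgersen least
island <- factor(rep(c("Biscoe", "Dream", "Torgersen"), times = c(5, 3, 2)))

# Purposefully move "Torgersen" to the front with base R's relevel()
island_relevel <- relevel(island, "Torgersen")
levels(island_relevel)

# fct_infreq() puts the levels back in frequency order
island_infreq <- fct_infreq(island_relevel)
levels(island_infreq)

# The counts per level are unchanged; only their order differs
table(island_infreq)
```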