dplyr::slice_min/max()
Function of the Week: slice_min() & slice_max()
Isabel English
2022-03-16
slice_min() & slice_max()
In this document, I will introduce the slice_min() and slice_max() functions and show what they’re used for.
Loading my data
I always enjoy looking at the datasets FiveThirtyEight uses, because they display data so nicely. For this presentation, I chose to look at Club Soccer Predictions, specifically the English (Barclays) Premier League. To simplify this example, I grabbed a subset of the data that includes only the Premier League teams and their current ranking data.
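library(readxl) # provides read_excel()
library(dplyr) # provides slice_min() and slice_max()
spi_global_rankings <- read_excel("20220316_FiveThirtyEight_soccer-spi_data/spi_global_rankings.xlsx",
  sheet = 1,
  skip = 0,
  na = "NA")
# Keep only the Premier League teams
barclays_premier <- subset(spi_global_rankings, league == "Barclays Premier League")
print(barclays_premier)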
## # A tibble: 20 × 7
## rank prev_rank name league off def spi
## <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 1 1 Manchester City Barclays Premier … 2.9 0.2 93.7
## 2 3 3 Liverpool Barclays Premier … 2.95 0.29 92.8
## 3 4 4 Chelsea Barclays Premier … 2.4 0.26 89.3
## 4 11 12 Arsenal Barclays Premier … 2.22 0.52 82.2
## 5 14 15 Tottenham Hotspur Barclays Premier … 2.34 0.67 80.3
## 6 17 17 Manchester United Barclays Premier … 2.19 0.67 78.3
## 7 22 28 Aston Villa Barclays Premier … 2.03 0.64 76.7
## 8 30 29 West Ham United Barclays Premier … 2.01 0.73 74.3
## 9 32 27 Brighton and Hove Albion Barclays Premier … 1.85 0.62 74.2
## 10 34 45 Wolverhampton Barclays Premier … 1.72 0.55 73.8
## 11 38 41 Crystal Palace Barclays Premier … 1.89 0.69 73.1
## 12 50 47 Leicester City Barclays Premier … 2.07 0.96 69.8
## 13 55 48 Southampton Barclays Premier … 1.94 0.89 69.2
## 14 61 66 Brentford Barclays Premier … 1.78 0.85 67.1
## 15 65 68 Newcastle Barclays Premier … 1.76 0.85 66.7
## 16 74 70 Burnley Barclays Premier … 1.62 0.83 64.4
## 17 78 74 Everton Barclays Premier … 1.73 0.95 63.8
## 18 91 78 Leeds United Barclays Premier … 1.85 1.14 61.6
## 19 103 100 Watford Barclays Premier … 1.65 1.08 59
## 20 151 146 Norwich City Barclays Premier … 1.51 1.19 53.0
The rankings appear to skip numbers, but remember that this is a subset of the full dataset, which ranks all international club teams (of which there are 640).
What is it for?
The data is still in rank order by Soccer Power Index (SPI), so if I pull just the top 5 teams, I’ll know which Premier League teams to watch. I can do this using slice_min() for the best (lowest-numbered) ranking:
slice_min(barclays_premier, order_by = rank, n = 5, with_ties = TRUE)
## # A tibble: 5 × 7
## rank prev_rank name league off def spi
## <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 1 1 Manchester City Barclays Premier League 2.9 0.2 93.7
## 2 3 3 Liverpool Barclays Premier League 2.95 0.29 92.8
## 3 4 4 Chelsea Barclays Premier League 2.4 0.26 89.3
## 4 11 12 Arsenal Barclays Premier League 2.22 0.52 82.2
## 5 14 15 Tottenham Hotspur Barclays Premier League 2.34 0.67 80.3
I can also do this using slice_max() for the highest SPI:
slice_max(barclays_premier, order_by = spi, n = 5, with_ties = TRUE)
## # A tibble: 5 × 7
## rank prev_rank name league off def spi
## <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 1 1 Manchester City Barclays Premier League 2.9 0.2 93.7
## 2 3 3 Liverpool Barclays Premier League 2.95 0.29 92.8
## 3 4 4 Chelsea Barclays Premier League 2.4 0.26 89.3
## 4 11 12 Arsenal Barclays Premier League 2.22 0.52 82.2
## 5 14 15 Tottenham Hotspur Barclays Premier League 2.34 0.67 80.3
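Both calls above pass with_ties = TRUE, which is also the default. Neither result contains a tie, so the argument has no visible effect there. The sketch below is one way to see what it does: it rebuilds a few rows of the printed table by hand (only the name and def columns, copied from the output above) and slices on def, where Tottenham Hotspur and Manchester United share the value 0.67. The object names teams, with_ties_kept, and with_ties_dropped are my own, chosen for illustration.

```r
library(dplyr)

# A few rows copied by hand from the printed table above (name and def only).
teams <- tibble(
  name = c("Manchester City", "Liverpool", "Chelsea", "Arsenal",
           "Tottenham Hotspur", "Manchester United"),
  def  = c(0.20, 0.29, 0.26, 0.52, 0.67, 0.67)
)

# with_ties = TRUE keeps every row tied with the n-th value,
# so asking for 5 rows can return more than 5.
with_ties_kept <- slice_min(teams, order_by = def, n = 5, with_ties = TRUE)

# with_ties = FALSE returns exactly n rows.
with_ties_dropped <- slice_min(teams, order_by = def, n = 5, with_ties = FALSE)

print(nrow(with_ties_kept))
print(nrow(with_ties_dropped))
```

With ties kept, both teams at 0.67 come back, so the result has six rows instead of five. If you need a fixed number of rows, with_ties = FALSE is the safer choice.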
Is it helpful?
These functions are helpful when you want a quick look at the top or bottom of a list, for example to get a sense of your data or how it is organized while you’re working. I imagine that combining them with some more filtering would give them even more power: for example, picking out these top 5 Premier League teams directly from the original data of 640 teams, or the top students from each of 6 classes.
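The "top students from each class" idea can be sketched by grouping before slicing. The data below is entirely made up for illustration (the scores tibble, its column names, and the student names are hypothetical, not from the FiveThirtyEight data), and it uses two classes rather than six to keep the example short.

```r
library(dplyr)

# Hypothetical scores, for illustration only.
scores <- tibble(
  class   = c("A", "A", "A", "B", "B", "B"),
  student = c("Ana", "Ben", "Cai", "Dee", "Eli", "Fay"),
  score   = c(91, 85, 78, 88, 95, 70)
)

# group_by() makes slice_max() pick the top rows within each class
# instead of across the whole table.
top_students <- scores %>%
  group_by(class) %>%
  slice_max(order_by = score, n = 2, with_ties = FALSE) %>%
  ungroup()

print(top_students)
```

The same pattern should carry over to the soccer data: grouping spi_global_rankings by league and then calling slice_max() on spi would give the top teams for every league at once, and filtering on league first would limit the result to the Premier League.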