stringr::str_detect()

Function of the Week:

stringr::str_detect()

In this document, I will introduce the str_detect() function and show what it’s for.

# Load the tidyverse, which attaches stringr and dplyr
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.7
## v tidyr   1.1.4     v stringr 1.4.0
## v readr   2.1.1     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

What is it for?

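Given a character vector, str_detect() checks each element for a pattern and returns a logical vector of the same length: TRUE where the pattern is present and FALSE where it is absent. The function takes two main arguments. The first is the vector to search, and the second is the pattern to look for, which is interpreted as a regular expression by default. An optional third argument, negate, reverses the result.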

flowers <- c("tulip", "iris", "hibiscus", "poppy", "daisy")

str_detect(flowers, "i")
## [1]  TRUE  TRUE  TRUE FALSE  TRUE
str_detect(flowers, "^i")
## [1] FALSE  TRUE FALSE FALSE FALSE
str_detect(flowers, "y$")
## [1] FALSE FALSE FALSE  TRUE  TRUE
str_detect(flowers, "[aeiou]")
## [1] TRUE TRUE TRUE TRUE TRUE
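
Because the result is an ordinary logical vector, it can be counted, averaged, or used to subset the original vector. The following is a minimal sketch that rebuilds a flowers vector like the one above so that it runs on its own; the name has_i is only illustrative.

```r
library(stringr)

flowers <- c("tulip", "iris", "hibiscus", "poppy", "daisy")

# One TRUE/FALSE per element of flowers
has_i <- str_detect(flowers, "i")

# TRUE counts as 1, so sum() gives the number of matching elements
sum(has_i)
## [1] 4

# mean() gives the proportion of matching elements
mean(has_i)
## [1] 0.8

# Use the logical vector to keep only the elements ending in "y"
flowers[str_detect(flowers, "y$")]
## [1] "poppy" "daisy"
```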

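str_detect() can also be combined with dplyr's filter() to subset a data set. filter() keeps the rows where str_detect() returns TRUE, and setting negate = TRUE keeps the non-matching rows instead.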

df <- data.frame(Country = c("Australia", "Canada", "Japan", "India", "Spain"))
df
##     Country
## 1 Australia
## 2    Canada
## 3     Japan
## 4     India
## 5     Spain
df %>% filter(str_detect(Country, "Australia|Canada"))
##     Country
## 1 Australia
## 2    Canada
df %>% filter(str_detect(Country, "Australia|Canada", negate = TRUE))
##   Country
## 1   Japan
## 2   India
## 3   Spain
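
As a small sketch of what negate = TRUE does, the example below compares it with putting ! in front of str_detect(). It rebuilds the same data frame so that it runs on its own; the names with_negate and with_bang are only illustrative.

```r
library(dplyr)
library(stringr)

df <- data.frame(Country = c("Australia", "Canada", "Japan", "India", "Spain"))

# Keep rows that do NOT match, using the negate argument
with_negate <- df %>% filter(str_detect(Country, "Australia|Canada", negate = TRUE))

# Keep rows that do NOT match, using logical negation instead
with_bang <- df %>% filter(!str_detect(Country, "Australia|Canada"))

# Both approaches select the same countries
identical(with_negate$Country, with_bang$Country)
## [1] TRUE
```

Either form works here; negate = TRUE can be easier to read inside a long pipeline, while ! is the more general R idiom.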

Is it helpful?

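It is really helpful for finding patterns in a large dataset. However, it is not unique: str_detect(x, pattern) is equivalent to base R's grepl(pattern, x), with the argument order reversed, so it can be replaced by grepl() or other functions.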
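
The comparison with grepl() can be checked directly. This is a minimal sketch using simple patterns like the ones earlier in this document; the two functions use different regular expression engines, so more complex patterns are not guaranteed to behave identically.

```r
library(stringr)

flowers <- c("tulip", "iris", "hibiscus", "poppy", "daisy")

# stringr puts the vector first; base R puts the pattern first
identical(str_detect(flowers, "i"), grepl("i", flowers))
## [1] TRUE

identical(str_detect(flowers, "^i"), grepl("^i", flowers))
## [1] TRUE
```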