Before you start…
Jump back into the project you created for the previous session’s exercises, then create a new R script. Save it as soon as you make it and give it a good name like ex_2_day_2.R or spencer.R, and you’ll be ready to go!
Cruising right through!
This is the life! Smell that sea-air! Feel that wind in your antenna. While a research vessel is no cruise ship, it’s pretty nice when you’re used to hot kitchens and stuffy backrooms that smell oddly like stale yogurt.
The only problem is…what’s a baking bot turned analytic bot supposed to do while we wait to get more data? We’ve already looked at the data…what if we get bored?
Exercise 1: filtering data
The goal of this exercise is to pare down a dataset using the filter function, with as many conditions as you need, until only a single penguin is left. This game is a throwback to those days on long road trips when you were down to one last game… the game of Guess Who.
You were BORED because you had already played all the games, done all the puzzles, no one would stop for snacks, and you were still miles from home. So you began, Am I bigger than a bread box? Am I the color green? In this exercise you are writing code to winnow down the list of possibilities. FUN STUFF!
Using the penguins dataset in the palmerpenguins package, use the filter function (with multiple conditions if necessary) to find the penguin that meets one of the following sets of criteria. What is its secret ID number?
- I am a female penguin living on Biscoe Island. I was observed in the latest year and my friends tell me I have a very handsome bill depth of 17.7 mm. Who am I?
- I am a male Chinstrap penguin weighing exactly 4000 g or maybe more. Probably more. My bill is a bit over 50 mm long, but I am most proud of my short flippers, which are less than 200 mm long. Who am I?
- Hi! I am a female Gentoo penguin that likes to wear colorful hats. My bill length is a smidgeon under 45 mm long. Last time I checked, I weighed over 5000 g but that was quite a while ago. Who am I?
Load the penguin data
Add a top secret ID column with mutate
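library(tidyverse)
library(palmerpenguins)
# Copy the packaged dataset into your environment
penguins <- penguins
# Number the rows so each penguin gets a secret ID
penguins <- mutate(penguins,
                   secret_id = 1:n())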
Ack! Typing is hard. To help avoid typing errors, it can help to copy column names. You can use names(penguins) or glimpse(penguins) to print the columns to the console and copy the column name that you need.
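Before hunting for the first penguin, it may help to see how several conditions combine inside filter(). The sketch below is only an illustration: the object names dream_adelie_commas and dream_adelie_and are made up here, and the conditions are chosen so they do not give away any answers. Conditions separated by commas must all be true, which is the same as joining them with &.

```r
library(dplyr)
library(palmerpenguins)

# Two conditions separated by a comma: both must be TRUE for a row to be kept
dream_adelie_commas <- penguins %>%
  filter(species == "Adelie", island == "Dream")

# The same filter written with an explicit "and"
dream_adelie_and <- penguins %>%
  filter(species == "Adelie" & island == "Dream")

# Both versions keep the same rows
nrow(dream_adelie_commas) == nrow(dream_adelie_and)
```

Each extra condition can only keep the list the same size or shrink it, so you can keep adding clues until one penguin remains.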
🐧 Penguin 1
I am a female penguin living on Biscoe Island. I was observed in the latest year and my friends tell me I have a very handsome bill depth of 17.7 mm.
p1 <- penguins %>% filter(sex == "female", island == _______, _______ _______ )
p1 <- penguins %>% filter(sex == "female", island == "Biscoe", year == 2009, bill_depth_mm == 17.7)
🐧 Penguin 2
I am a male Chinstrap penguin weighing exactly 4000 g or maybe more. Probably more. My bill is a bit over 50 mm long, but I am most proud of my short flippers, which are less than 200 mm long.
p2 <- penguins %>% filter(sex == _______ , species == _________ , body_mass_g >= ________ , _______ _______ , _______ _______ )
p2 <- penguins %>% filter(sex == "male" , species == "Chinstrap" , body_mass_g >= 4000, bill_length_mm > 50, flipper_length_mm < 200)
🐧 Penguin 3
I am a female Gentoo penguin that likes to wear colorful hats. My bill length is a smidgeon under 45 mm long. Last time I checked, I weighed over 5000 g but that was quite a while ago.
p3 <- penguins %>% filter(sex == _______ , species == _________ , _______ _______ , _______ _______ )
p3 <- penguins %>% filter(sex == "female" , species == "Gentoo" , bill_length_mm < 45, body_mass_g > 5000)
For a more advanced challenge, try to isolate these penguins.
- I forgot my name and I really do not know many details about myself. I know I am an Adelie penguin that lives on Torgersen, but all my physical measurements are missing.
- I have the lowest weight of the male penguins. Who am I?
- I am the heaviest female Adelie penguin. Who am I?
- I have the longest flippers of the Adelie penguins on Dream Island. Who am I?
🐧 Penguin 4
I forgot my name and I really do not know many details about myself. I know I am an Adelie penguin that lives on Torgersen, but all my physical measurements are missing.
# is.na() checks if a column's value is missing
p4 <- penguins %>% filter(species == _______, island == _______ , is.na( __________ ))
p4 <- penguins %>% filter(species == "Adelie", island == "Torgersen" , is.na(bill_length_mm))
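The riddle says that all of the physical measurements are missing, while the solution above only checks bill_length_mm. As a minimal sketch, assuming your dplyr version provides if_all() (version 1.0.4 or later), you could check every measurement column at once. The names penguins_id and p4_all are made up for this illustration.

```r
library(dplyr)
library(palmerpenguins)

# Recreate the secret ID column so this block runs on its own
penguins_id <- mutate(penguins, secret_id = 1:n())

# Keep Adelie penguins on Torgersen where every measurement column is NA
p4_all <- penguins_id %>%
  filter(species == "Adelie",
         island == "Torgersen",
         if_all(c(bill_length_mm, bill_depth_mm,
                  flipper_length_mm, body_mass_g), is.na))

p4_all
```

If p4_all and p4 contain the same rows, the single is.na() check was enough for this dataset.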
🐧 Penguin 5
I have the lowest weight of the male penguins.
# Filter by sex first
p5 <- penguins %>% filter(sex == _______ )
# And then pipe it to filter by body_mass_g
# Remember to use `na.rm = TRUE` to account for NA values
p5 <- penguins %>% filter(sex == ________ ) %>% filter(body_mass_g == _________ )
p5 <- penguins %>% filter(sex == "male" ) %>% filter(body_mass_g == min(body_mass_g, na.rm = TRUE))
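The last two challenge penguins (the heaviest female Adelie and the Adelie with the longest flippers on Dream Island) can probably be found with the same two-step pattern, swapping min() for max(). This is one possible approach rather than an official solution, and the names penguins_id, p6, and p7 are made up for this illustration.

```r
library(dplyr)
library(palmerpenguins)

# Recreate the secret ID column so this block runs on its own
penguins_id <- mutate(penguins, secret_id = 1:n())

# Heaviest female Adelie: narrow the group first, then keep the maximum
p6 <- penguins_id %>%
  filter(sex == "female", species == "Adelie") %>%
  filter(body_mass_g == max(body_mass_g, na.rm = TRUE))

# Longest flippers among Adelie penguins on Dream Island
p7 <- penguins_id %>%
  filter(species == "Adelie", island == "Dream") %>%
  filter(flipper_length_mm == max(flipper_length_mm, na.rm = TRUE))
```

The order matters: max() is computed on whatever rows reach the second filter(), so the group has to be narrowed first. If several penguins tie for the maximum, the result will contain more than one row.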
Exercise 2: Sorting with case_when()
I love grouping things, which was part of the reason I loved being a baking bot. I could spend all day sorting measuring spoons and measuring cups into groups like “tiny”, “super tiny”, and “the tiny-est of the tiny”. Maybe I can put my grouping skills to use on the penguin data?
Let’s use case_when() to add some sorting columns to the data. Pick one of the options below to split the penguins into various groups based on their measurements.
- Assign the penguins a value for the new column flipper_group based on the following criteria:
  - First check if flipper length is greater than 210 mm; if it is, they are "big flips"
  - Otherwise they are "small flips"
- Assign the penguins a value for weight_class based on the following criteria:
  - First check if body mass is less than or equal to 3500 g; if so, they are "littles"
  - Then check if body mass is less than 4300 g; if so, they are "middles"
  - Otherwise they are "biggles"
- Assign the penguins a value for bill_class based on the following criteria:
  - First check if bill depth is missing; if so, they are "unknown"
  - Then check if bill depth is more than or equal to 18; if so, they are "deep"
  - Then check if bill depth is more than 16; if so, they are "middling"
  - Otherwise they are "shallow"
flipper_group
Assign the penguins a value for the new column flipper_group based on the following criteria:
1.) First check if flipper length is greater than 210 mm; if it is, they are "big flips"
2.) Otherwise they are "small flips"
penguins %>% mutate(flipper_group = case_when(flipper_length_mm > _______ ~ ________, TRUE ~ _________ ))
penguins %>% mutate(flipper_group = case_when(flipper_length_mm > 210 ~ "big flips", TRUE ~ "small flips" ))
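The weight_class and bill_class options follow the same pattern with more branches. Here is a minimal sketch of how they might look; the name penguins_classed is made up for this illustration. case_when() checks the conditions from top to bottom and uses the first one that is TRUE, so the order of the lines matters.

```r
library(dplyr)
library(palmerpenguins)

penguins_classed <- penguins %>%
  mutate(
    # Checked in order: a 3400 g penguin matches the first line and stops there
    weight_class = case_when(
      body_mass_g <= 3500 ~ "littles",
      body_mass_g < 4300  ~ "middles",
      TRUE                ~ "biggles"
    ),
    # The missing-value check goes first so NA rows are labeled "unknown"
    bill_class = case_when(
      is.na(bill_depth_mm) ~ "unknown",
      bill_depth_mm >= 18  ~ "deep",
      bill_depth_mm > 16   ~ "middling",
      TRUE                 ~ "shallow"
    )
  )

# Count how many penguins landed in each group
count(penguins_classed, weight_class)
count(penguins_classed, bill_class)
```

One thing to watch for: a comparison with a missing value is never TRUE, so a penguin with a missing body mass falls through to the final TRUE line and is labeled "biggles". The bill_class criteria avoid this by checking is.na() first, and the same trick could be added to weight_class or flipper_group if you want missing values kept separate.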
BON VOYAGE!