if_else()
Sometimes the way data comes to you isn’t the way you’d like to use it. For example, maybe you get a dataset with months numbered as “1, 2, 3…” but you’d prefer to have them labeled “January, February, March…”. We can use commands that act like an “if else” statement, which changes data depending on whether it meets a certain condition. We select certain values to change IF they match the condition and set them equal to something new. If they don’t meet the condition (ELSE), we set them equal to something different or leave them alone. First we’ll learn how to use if_else(), then we’ll expand on it with case_when(). We’ll use both of these functions in conjunction with the mutate() function to change our column values.
if_else()
if_else() works by determining TRUE or FALSE on whether values meet a condition. Then you can tell R how you’d like to change the value based on that determination. We’ll illustrate with this simple dataset of observations of a traffic light:
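The dataset itself isn’t reproduced here, so below is a minimal stand-in to follow along with. The name `traffic` and the specific rows are assumptions made for illustration; only the column names obs and light come from the text of this lesson.

```r
library(dplyr)

# Hypothetical stand-in for the traffic light data:
# `obs` numbers each observation, `light` records the color seen.
traffic <- tibble(
  obs   = 1:6,
  light = c("green", "red", "yellow", "green", "red", "green")
)

traffic
```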
Now let’s say we want to change these values to indicate whether you should “go” or “stop”. So every value of “green” will become “go” and values of “red” or “yellow” will become “stop”. In terms of if else, we would read this as IF the color is “green” it becomes “go”, ELSE it becomes “stop”. The syntax for if_else() is to put the conditional statement you are evaluating, then the value you want it changed to if it is TRUE, then the value you want it changed to if it is FALSE.
This would be the code:
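Here is a sketch of what that call could look like. The `traffic` tibble and the result name `signal` are hypothetical, and the rows are made up for illustration; the if_else() arguments follow the order described above (condition, value if TRUE, value if FALSE).

```r
library(dplyr)

# Hypothetical traffic light observations (assumed data)
traffic <- tibble(
  obs   = 1:6,
  light = c("green", "red", "yellow", "green", "red", "green")
)

# IF light is "green" it becomes "go", ELSE it becomes "stop"
signal <- traffic %>%
  mutate(light = if_else(light == "green", "go", "stop"))

signal
```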
Notice that for the conditional statement we use == to evaluate whether values in the specified column, light, match our condition of green. Since we’re matching a value, “green” should go in quotes. “go” and “stop” should also be in quotes because those are the character values we want to fill our column.
If you remember how mutate() works you’ll know we also could have made this conditional evaluation into its own column. Let’s alter the code to have the columns obs and light and create a new column called move for our “go”/“stop” values.
Alter the code below to do this:
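One possible way to do it, sketched with the same assumed `traffic` data (the tibble, its rows, and the result name `moves` are hypothetical): give mutate() a new column name, move, instead of reusing light, so the original column is kept.

```r
library(dplyr)

# Hypothetical traffic light observations (assumed data)
traffic <- tibble(
  obs   = 1:6,
  light = c("green", "red", "yellow", "green", "red", "green")
)

# Naming a new column (move) leaves obs and light untouched
moves <- traffic %>%
  mutate(move = if_else(light == "green", "go", "stop"))

moves
```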
Sometimes you don’t want to change every value in a column. For example, let’s say you have incorrectly entered data that you need to fix. You can use if_else() to change the things that evaluate TRUE and leave alone the things that evaluate FALSE.
We’ll use a common example of changing misspelled data:
Here two rows have a misspelling of the word “weird”, so we want to change those but leave everything else the same. We can leave things alone by having FALSE conditions be equal to their original column value. So, we just state the column name in the FALSE position. Note that it is not in quotes because we don’t want the word “word”, we want the value from the column.
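A minimal sketch of this pattern follows. The tibble `words`, its rows, and the misspelling “wierd” are assumptions made for illustration; the column name word comes from the text above.

```r
library(dplyr)

# Hypothetical data: two rows contain the misspelling "wierd"
words <- tibble(
  word = c("weird", "wierd", "strange", "wierd", "odd")
)

# TRUE  -> replace with the corrected spelling (in quotes: a new value)
# FALSE -> keep the existing value (no quotes: the column itself)
fixed <- words %>%
  mutate(word = if_else(word == "wierd", "weird", word))

fixed
```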
Now you should see “weird” spelled correctly each time and the other values, including the correctly spelled “weird”, are left as is.
Non == Conditionals
For the conditional statement, we don’t always have to match a value; it can be any condition. So here we’ll change values based on whether they are greater or less than zero. The data:
The code:
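As a rough sketch, assume a numeric column called `change` and the labels “positive” and “negative”; the data, the names `changes`, `labeled`, and `direction`, and the labels are all hypothetical. The condition uses > instead of ==.

```r
library(dplyr)

# Hypothetical numeric data (assumed for illustration)
changes <- tibble(change = c(-3, 5, -1, 8, 2))

# Any expression that returns TRUE/FALSE can be the condition.
# Note: a value of exactly 0 would fail `change > 0` and get the FALSE label.
labeled <- changes %>%
  mutate(direction = if_else(change > 0, "positive", "negative"))

labeled
```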
We can also use if_else() to change NA values to 0 if that is useful in our data.
The data:
See if you can use the conditional is.na() to change the missing values to 0:
When using a numeric value as the replacement, it doesn’t need to be in quotes. So leave 0 out of quotes. If it is in quotes, R will treat it as a character, but we want our column to stay numeric.
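One way this could look, using an assumed numeric column `count` with two missing values (the data and the names `counts` and `filled` are hypothetical):

```r
library(dplyr)

# Hypothetical numeric data with missing values (assumed for illustration)
counts <- tibble(count = c(4, NA, 7, NA, 1))

# is.na(count) is TRUE for missing values; those become the number 0
# (no quotes), everything else keeps its original value.
filled <- counts %>%
  mutate(count = if_else(is.na(count), 0, count))

filled
```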
Using Logical Operators
Let’s go back to our traffic light example and do the reverse of what we initially did. Let’s say this time we want to change “red” or “yellow” to be “stop” and have the ELSE condition be “go”. (Obviously, it’s easier to do the other way, but this is just for show.)
We can use the logical operator | (OR) to help us do this. We just fully write out both conditionals separated by |.
> When your lines start getting long or complicated, it’s best practice to go to a new line so your code is easier to read.
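A sketch of the reversed version, again with hypothetical `traffic` data and a hypothetical result name `signal`. Each side of | is a complete comparison, and the call is spread over several lines to keep it readable.

```r
library(dplyr)

# Hypothetical traffic light observations (assumed data)
traffic <- tibble(
  obs   = 1:6,
  light = c("green", "red", "yellow", "green", "red", "green")
)

# IF light is "red" OR light is "yellow" it becomes "stop", ELSE "go"
signal <- traffic %>%
  mutate(
    light = if_else(
      light == "red" | light == "yellow",
      "stop",
      "go"
    )
  )

signal
```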
if_else() works well if you only have two values you want to change things to. When you start adding on more things to change, it is easier to use case_when().