Other Joins

There are a couple other useful types of joins for combining tables: full_join() and inner_join(). Whereas with left and right joins one table stays exactly the same and just has information added, these joins will create a new table with aspects of both inputs.

full_join()

A full join is for combining all the data in both tables. Let’s go back to our fruit tables.

Both cost and color contain some information that the other table does not. If we want to retain all information we can perform a full join to keep all rows from each table, making matches where possible.

The code for a full join works the same as left or right just with the command full_join(). See if you can write the code:

You’ll see that now we have eight rows, which is longer than either of the starting tables, and we have NAs in both the cost and color columns since the two tables don’t have complete overlap.

inner_join()

If we want to not have any NAs in our final table and only keep the variables for which a key exists in both datasets, we use an inner join.

You’ll see now that our resulting dataset is only four rows because it only includes fruits that are in both datasets.

anti_join()

The last of our join functions is anti_join(). An anti join will find all the rows in the first table listed that don’t have a match in the second table listed. This doesn’t combine any data from our tables but it’s useful for finding out where datasets do and don’t have overlap.

Here’s how to find all the rows in cost that don’t have a match in color:

You’ll see that only peach appears because it is the only key found in cost but not in color. Note that anti_join() is only giving you information about the first dataset as output. It doesn’t show all the rows from each dataset that don’t match, just from whichever dataset is listed first.

Rewrite the code to find the values of color that don’t appear in cost.

One good use is running an anti join to check that both your datasets contain the same information; here you’d want the result of the anti join to be null.