Trend Lines

If you make a scatterplot, you might want to add a trend line to it. Write the code to make a scatterplot showing bill depth on the x-axis and bill length on the y-axis.
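Here is one way that scatterplot might look. This is a sketch that assumes the data are the penguins data frame from the palmerpenguins package, with columns named bill_depth_mm and bill_length_mm. If your data frame or column names differ, swap them in.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: penguins comes from this package

# Bill depth on the x-axis, bill length on the y-axis
p_scatter <- ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point()

p_scatter
```

R may warn that a few rows were removed. Those are penguins with missing bill measurements.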

To add a trend line, we add one more layer to the plot with geom_smooth(). You need to tell geom_smooth() what type of line to fit. Often we just want the straight line that best fits the data. That line is technically a linear model of our data, so we set method = “lm”. (Other options include “loess” for a locally smoothed curve or “glm” for a generalized linear model, such as a Poisson fit.)
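As a sketch, the scatterplot with a straight trend line could be written like this. It again assumes the penguins data frame and column names from the palmerpenguins package.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: penguins comes from this package

p_trend <- ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
  geom_smooth(method = "lm")  # "lm" = fit a straight line (linear model)

p_trend
```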

You’ll notice that the blue trend line also has a gray zone around it. This is the 95% confidence interval around the line. It is shown by default, but you can turn it off by adding “se = FALSE” inside geom_smooth(), where se stands for standard error.
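For example, a version of the plot without the gray band might look like this (same assumed penguins data and column names as before):

```r
library(ggplot2)
library(palmerpenguins)  # assumption: penguins comes from this package

p_no_band <- ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)  # se = FALSE hides the confidence band

p_no_band
```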

But again, we have the problem of all our penguin species being mushed together, so we should probably separate them by color. Add “color = species” inside aes() to make the graph more informative.
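One way to write this is sketched below. It assumes the species column from the palmerpenguins data. Because color = species sits in the aes() of the main ggplot() call, both the points and the trend lines pick it up.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: penguins comes from this package

p_species <- ggplot(penguins,
                    aes(x = bill_depth_mm, y = bill_length_mm, color = species)) +
  geom_point() +
  geom_smooth(method = "lm")  # one line per species, since color splits the data into groups

p_species
```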

Remember, use color for lines and points and fill for shapes you want filled in.

Note that R automatically knew to make separate trend lines for each species. Nice. Also, note that each species now shows the opposite trend from the one we saw when the data were analyzed all together. This shows the importance of including meaningful groupings when you plot your data.