<!-- https://evamaerey.github.io/flipbooks/flipbook_recipes#49 --> class: title-slide, center, bottom # Introduction to ggplot2 <figure> <img src="img/sn_logo.png" alt="Logo of Sentinel North. Shows a stars in the middle." width = "450"/> </figure> May 19, 2022 (updated: 2022-05-19) ## Philippe Massicotte --- name: hello class: middle, center, inverse <img src="img/myname.png" width="300px"/> <b>Research assistant at Takuvik (Laval University)</b><br> <center> <figure> <img src="img/takuvik.svg" alt="Logo of Takuvik." width = "300"/> </figure> </center> <small>*Remote sensing, modelling, data science, data visualization*</small><br> [
@pmassicotte](https://github.com/PMassicotte) [
@philmassicotte](https://twitter.com/apreshill) [
www.pmassicotte.com](https://www.pmassicotte.com) --- # Outline - Brief introduction to data visualization - General advice to make graphics - `ggplot2` basic plots - Histograms and bar plots - Points and lines plots - Boxplots - `ggplot2` aesthetics and appearance - Color, size - Axes and titles - Faceting (small multiples) - Overview of the `theme()` function --- # Not covered today -- - Data importation, manipulation and transformation - `data.table` - https://rdatatable.gitlab.io/data.table/ - `readr` - https://r4ds.had.co.nz/data-import.html - `readxl` - https://readxl.tidyverse.org/ - `dplyr`, `tidyr` - https://r4ds.had.co.nz/tidy-data.html -- - Data visualization theory - [Fundamentals of Data Visualization](https://clauswilke.com/dataviz/) - [What to consider when choosing colors for data visualization](https://blog.datawrapper.de/colors/) - [What to consider when visualizing data for colorblind readers](https://blog.datawrapper.de/colorblindness-part2/) - [Choosing Color Palettes for Data Visualization That Are Accessible for Most Audiences](https://www.youtube.com/watch?v=PstHyodalWg) --- # Data visualization <center> <figure> <img src="https://blog.rstudio.com/2019/11/18/artist-in-residence/horst-eco-r4ds.png" width="800"> <figcaption>Artwork by <a href="https://twitter.com/allison_horst?s=20">@allison_horst</a></figcaption> </figure> </center> --- # Data visualization -- - An important aspect of data sciences: - **Communicate information clearly and efficiently to the community.** -- - Powerful tool to discover patterns in the data: - It makes complex data more accessible: **reveal the data**. -- - Bad graphics can be a reason for paper rejection: - **Readers should rapidly understand the message you are trying to convey.** -- - A picture is worth a thousand words: - **Always, always, always plot your data!** - When possible, replace tables with figures. 
--- # Datasaurus Dozen <img src="index_files/figure-html/unnamed-chunk-1-1.svg" style="display: block; margin: auto;" /> -- **These 13 datasets have the same statistical properties (*mean*, *variance*, *correlation*).** However, they look quite different! --- # Visualization to convey your message It can be difficult to grasp the information contained in a table. .pull-left[ <table class="table" style="font-size: 18px; margin-left: auto; margin-right: auto;"> <caption style="font-size: initial !important;">Sea ice extent in the Arctic between 1978 and 2021.</caption> <thead> <tr> <th style="text-align:left;"> date </th> <th style="text-align:right;"> extent </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 3in; "> 1978-10-26 </td> <td style="text-align:right;width: 3in; "> 10.231 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-10-28 </td> <td style="text-align:right;width: 3in; "> 10.420 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-10-30 </td> <td style="text-align:right;width: 3in; "> 10.557 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-11-01 </td> <td style="text-align:right;width: 3in; "> 10.670 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-11-03 </td> <td style="text-align:right;width: 3in; "> 10.777 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-11-05 </td> <td style="text-align:right;width: 3in; "> 10.968 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-11-07 </td> <td style="text-align:right;width: 3in; "> 11.080 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-11-09 </td> <td style="text-align:right;width: 3in; "> 11.189 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-11-11 </td> <td style="text-align:right;width: 3in; "> 11.314 </td> </tr> <tr> <td style="text-align:left;width: 3in; "> 1978-11-13 </td> <td style="text-align:right;width: 3in; "> 11.460 </td> </tr> </tbody> </table> ] .pull-right[ <center> 
<img src="img/arctic_seaice_extent.png" alt="Map showing the Arctic sea ice extent in October 2021." height="400"/> </center> ] <small> [Data from NSIDC](ftp://sidads.colorado.edu/DATASETS/NOAA/G02135/north/daily/data/N_seaice_extent_daily_v3.0.csv) </small> --- # Visualization to convey your message Graphics, on the other hand, can help to convey your message. <img src="index_files/figure-html/unnamed-chunk-3-1.svg" width="70%" style="display: block; margin: auto;" /> --- # Visualization to convey your message The same data but presented differently. <img src="index_files/figure-html/unnamed-chunk-4-1.svg" style="display: block; margin: auto;" /> Interesting reading: [Arctic Sea Ice Extent Second Highest in 18 Years at the end of 2021](https://www.severe-weather.eu/global-weather/arctic-sea-ice-second-highest-18-years-end-2021-rrc/) --- # Visualization to convey your message The same data but presented differently. <img src="index_files/figure-html/sie-rect-plot-1.svg" style="display: block; margin: auto;" /> --- class: inverse, center, middle # The anatomy of misleading graphs <center> <figure> <img style="margin:0px auto;display:block" src="img/unsplash/isaac-smith-AT77Q0Njnt0-unsplash.jpg" alt="A handmade graph on a piece of paper." width = "450"/> </figure> <figcaption>Photo by <a href="https://unsplash.com/@isaacmsmith?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Isaac Smith</a> on <a href="https://unsplash.com/s/photos/graph?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a> </figcaption> </center> --- # The anatomy of misleading graphs > In statistics, a misleading graph, also known as a distorted graph, is a graph that misrepresents data, constituting a misuse of statistics and with the result that an incorrect conclusion may be derived from it (Wikipedia). -- - Defining what makes a good graph can be subjective to a certain extent. 
--

- Nevertheless, a few elements characterize misleading or bad graphs:
  - Improper axis scaling
  - Truncated graphs
  - Unnecessary use of 3D
  - Bad choice of colours

---

## The good, the bad and the ugly of data viz

There are many resources online that provide advice for making good graphs and avoiding pitfalls:

- [Data is ugly](https://www.reddit.com/r/dataisugly/)

- [Data Visualization Examples: Good, Bad and Misleading](https://www.syntaxtechs.com/blog/data-visualization-examples)

- [10 good and bad examples of data visualization](https://www.polymersearch.com/blog/10-good-and-bad-examples-of-data-visualization)

- [15 Misleading Data Visualization Examples](https://rigorousthemes.com/blog/misleading-data-visualization-examples/)

---

## Did Apple present a misleading graph?

Back in 2013, Tim Cook presented this chart showing cumulative iPhone sales.

<center>
<figure>
<img src="img/iphone-sales1.png" width="600">
</figure>
<figcaption>Credit: <a href="https://qz.com/122921/the-chart-tim-cook-doesnt-want-you-to-see/">https://qz.com/</a></figcaption>
</center>

---

## Did Apple present a misleading graph?

> Using data from Apple’s own quarterly reports filed with the Securities and Exchange Commission, I made a better chart.

<center>
<figure>
<img src="img/iphone-sales2.png" width="600">
</figure>
<figcaption>Credit: <a href="https://qz.com/122921/the-chart-tim-cook-doesnt-want-you-to-see/">https://qz.com/</a></figcaption>
</center>

---

# Exaggerating variations

- It is easy to exaggerate variation in the data by manipulating the range of an axis.

- In this example, changing the maximum of the **y-axis** in panel **B** suggests that the increase is less steep than in **A**.

<img src="index_files/figure-html/unnamed-chunk-5-1.svg" style="display: block; margin: auto;" />

---

# Aspect ratio

These three graphs show the same data with different aspect ratios, which considerably influences visual perception.
<img src="index_files/figure-html/unnamed-chunk-6-1.svg" style="display: block; margin: auto;" />

---

# Aspect ratio

.pull-left[

- Choosing an appropriate aspect ratio for the data can help to distinguish certain features.

- To illustrate this, we will use **yearly sunspot numbers from 1700 to 1988** (data: [WDC-SILSO, Royal Observatory of Belgium](http://www.sidc.be/silso/datafiles)).

- Examples and advice borrowed from https://graphworkflow.com/enhancement/aspect/.

]

.pull-right[
<center>
<figure>
<img style="margin:0px auto;display:block" src="img/unsplash/nasa-JHyiw_dpALk-unsplash.jpg" alt="Close-up image of the sun with visible sunspots." width = "350"/>
</figure>
<figcaption>Photo by <a href="https://unsplash.com/@nasa?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">NASA</a> on <a href="https://unsplash.com/s/photos/sun?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></figcaption>
</center>
]

---

# Aspect ratio

On the following graph, we can observe what seems to be a solar cycle of sunspot activity repeating every 11-13 years.

<img src="index_files/figure-html/sunspot1-1.svg" style="display: block; margin: auto;" />

---

# Aspect ratio

Using a very wide aspect ratio reveals an interesting pattern.

<img src="index_files/figure-html/sunspot2-1.png" style="display: block; margin: auto;" />

--

Sunspot cycles <b><font color="#E15554">increase more rapidly</font></b> <b><font color="#4D9DE0">than they decrease</font></b> when the number of sunspots is higher.

<img src="index_files/figure-html/sunspot3-1.png" style="display: block; margin: auto;" />

--

Although we can see the different types of cycles better, the graph looks too stretched.

---

# Aspect ratio

**A better option is to split the data into subsets.** Here, each panel presents the data for a specific century.
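
The code behind the figure is not shown on the slide. As a rough sketch of the idea, and assuming the `sunspot.year` series that ships with base R (yearly numbers for 1700-1988) is an acceptable stand-in for the WDC-SILSO download, one panel per century could be drawn like this. The names `sunspots`, `n` and `century` are illustrative only.

```r
library(ggplot2)

# `sunspot.year` is a time series from the base R `datasets` package.
sunspots <- data.frame(
  year = as.numeric(time(sunspot.year)),
  n = as.numeric(sunspot.year)
)

# Label each observation with the century it belongs to (1700, 1800 or 1900).
sunspots$century <- factor(floor(sunspots$year / 100) * 100)

p <- ggplot(sunspots, aes(x = year, y = n)) +
  geom_line() +
  # One panel per century, stacked vertically, each with its own x range.
  facet_wrap(~century, ncol = 1, scales = "free_x")

p
```

Stacking the panels keeps each line segment wide relative to its height without stretching the whole figure.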
<img src="index_files/figure-html/sunspot4-1.png" width="50%" style="display: block; margin: auto;" />

---

# Truncated graphs

- Truncated graphs (**especially bar plots**) can be very misleading because the *y-axis* does not start at 0.

- The graph in **panel A** suggests that Olivier's salary **is 10 times higher** than Benjamin's.

<img src="index_files/figure-html/unnamed-chunk-7-1.svg" style="display: block; margin: auto;" />

---

# Unnecessary use of 3D

- 3D graphics are very rarely useful:
  - First, they break the data-ink ratio rule (*more about that in a moment*).
  - Second, they can distort reality.

- On the left graph, the visual perspective suggests that **Item C** is as large as **Item A**, whereas in fact **Item A** is twice as large as **Item C**.

.pull-left[
<center>
<figure>
<img src="https://upload.wikimedia.org/wikipedia/commons/8/88/Misleading_Pie_Chart.png" width="300">
</figure>
<figcaption>Credit: Wikipedia</figcaption>
</center>
]

.pull-right[
<center>
<figure>
<img src="https://upload.wikimedia.org/wikipedia/commons/8/87/Sample_Pie_Chart.png" width="300">
</figure>
<figcaption>Credit: Wikipedia</figcaption>
</center>
]

---

# Improper axis scale labeling

- There is nothing wrong with using a logarithmic scale (I do it very often), but be careful to annotate the axis adequately so the viewer is aware of it.

- **If not clearly specified, the viewer may assume that the data is presented on a linear scale.**

- On a `log10` scale, a difference of 1 unit represents 1 order of magnitude.

<img src="index_files/figure-html/unnamed-chunk-8-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Data-ink ratio

The data-ink ratio is the proportion of ink that is used to present actual data compared to the total amount of ink used in the entire display.

<br>

$$
\verb|data-ink ratio| = \frac{\verb|Data-ink|}{\verb|Total ink used to print the graphic|}
$$

<br>

The data-ink ratio should be kept as high as possible.
--- # Data-ink ratio Here, both graphs contain the same data. However, the grey background in **panel A** is unnecessary. <img src="index_files/figure-html/unnamed-chunk-9-1.svg" style="display: block; margin: auto;" /> --- class: partial # Sometimes rules are made to be broken <img src="index_files/figure-html/unnamed-chunk-10-1.svg" style="display: block; margin: auto;" /> --- # Further resources - There are plenty of resources online about **the do's and don'ts** of data visualization. - Recommended reading: [Fundamentals of Data Visualization](https://clauswilke.com/dataviz/index.html) (freely available online). <center> <img src="img/fundamentals_data_visualization_cover.png" width="300"> </center> --- class: inverse, center, middle <center> <figure> <img src="https://github.com/allisonhorst/stats-illustrations/blob/master/rstats-artwork/ggplot2_masterpiece.png?raw=true" width="700"> <figcaption>Artwork by <a href="https://twitter.com/allison_horst?s=20">@allison_horst</a></figcaption> </figure> </center> --- # ggplot2 `ggplot2` is a system for declaratively creating graphics, based on [The Grammar of Graphics](https://amzn.to/2ef1eWp). > Presents a unique foundation for producing almost every quantitative graphic found in scientific journals, newspapers, statistical packages, and data visualization systems. <center> <figure> <img src="img/grammar_of_graphics.png" width="600"> <figcaption>Source: <a href="https://bit.ly/3JwHavs">Thomas de Beus</a></figcaption> </figure> </center> --- # ggplot2 `ggplot2` is not part of base R, so it needs to be installed. .left-column[ <center> <figure> <img src="https://raw.githubusercontent.com/tidyverse/ggplot2/master/man/figures/logo.png" width="300"> </figure> </center> ] .right-column[ ```r install.packages("ggplot2") ``` After the installation, you will have to load it. ```r library(ggplot2) ``` ] --- # ggplot2 .column-left[ - `ggplot2` is a very powerful tool. 
- The learning curve can be difficult, but **the time investment will eventually pay off.** ] .column-right[ <center> <figure> <img src="https://github.com/allisonhorst/stats-illustrations/blob/master/rstats-artwork/r_first_then.png?raw=true" height="500"> <figcaption> Artwork by <a href="https://twitter.com/allison_horst?s=20">@allison_horst</a> </figcaption> </figure> </center> ] --- # TidyTuesday TidyTuesday is a weekly challenge where people use (mostly) `ggplot2` to explore a new dataset. > A weekly data project aimed at the R ecosystem. As this project was borne out of the R4DS Online Learning Community and the R for Data Science textbook, an emphasis was placed on understanding how to summarize and arrange data to make meaningful charts with `ggplot2`, `tidyr`, `dplyr`, and other tools in the `tidyverse` ecosystem. <center> <img src="https://github.com/rfordatascience/tidytuesday/blob/master/static/tt_logo.png?raw=true" alt="drawing" width="450"/> <figcaption> <a href="https://twitter.com/search?q=%23TidyTuesday&src=hashtag_click">#TidyTuesday on <i class="fab fa-twitter"></i></a> </figcaption> </center> --- # TidyTuesday As already said, `ggplot2` can be intimidating at first. However, with some practice, you will be able to make stunning graphics. <center> <div class="row"> <div class="column"> <img src="https://github.com/loreabad6/TidyTuesday/blob/master/plot/2022_week_03.png?raw=true" alt="Who makes the best chocolate with Ecuadorian beans?" 
style="width:120%">
<figcaption>Graphic by <a href="https://twitter.com/loreabad6">@loreabad6</a></figcaption>
</div>
<div class="column">
<img src="https://github.com/jkaupp/tidytuesdays/blob/437b47ad14bd3d6d66dca41e0a4a84c9d96716fb/2020/week44/tw44_plot.png?raw=true" alt="Wind power generation in the Maritimes" style="width:80%">
<figcaption>Graphic by <a href="https://twitter.com/jakekaupp">@jakekaupp</a></figcaption>
</div>
<div class="column">
<img src="https://github.com/gkaramanis/aRtist/blob/main/genuary/2021/2021-3/2021-3.png?raw=true" alt="Person's face plotted using ggplot2" style="width:80%">
<figcaption>Graphic by <a href="https://twitter.com/geokaramanis?lang=en">@geokaramanis</a></figcaption>
</div>
</div>
</center>

**Most people participating in the TidyTuesday challenges share their code on GitHub. It is a great way to learn more advanced techniques!**

---

class: inverse, center, middle

# Let's start with ggplot2!

---

# Different types of visualizations

There are many types of visualization to choose from to present data. The decision depends on the data itself and how you want to present it to your audience.

<center>
<img src="img/chart_types.png" alt="Chart types" height="375"/>
<figcaption><a href="https://www.kdnuggets.com/2019/03/how-choose-right-chart-type.html">How to Choose the Right Chart Type</a></figcaption>
</center>

---

# The Palmer Penguins: the new iris

Data were collected and made available by [Dr. Kristen Gorman](https://www.uaf.edu/cfos/people/faculty/detail/kristen-gorman.php) and the [Palmer Station, Antarctica LTER](https://pal.lternet.edu/), a member of the [Long Term Ecological Research Network](https://lternet.edu/).
<center>
<figure>
<img src="https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/man/figures/lter_penguins.png" width="600">
</figure>
<figcaption>Artwork by <a href="https://twitter.com/allison_horst?s=20">@allison_horst</a></figcaption>
</center>

---

# The Palmer Penguins

The dataset contains data for **344 penguins** from **3 different species** (<pu><b>Chinstrap</b></pu>, <cy><b>Gentoo</b></cy> and <or><b>Adélie</b></or>) collected from **3 islands** in the Palmer Archipelago, Antarctica.

<small>
.pull-left[

- `species`: penguin species (*Adélie*, *Chinstrap* and *Gentoo*)

- `island`: island in Palmer Archipelago, Antarctica (*Biscoe*, *Dream* and *Torgersen*)

- `bill_length_mm`: bill length (millimeters)

- `bill_depth_mm`: bill depth (millimeters)

- `flipper_length_mm`: flipper length (millimeters)

- `body_mass_g`: body mass (grams)

- `sex`: penguin sex (*female*, *male*)

- `year`: year of the study (2007, 2008 and 2009)

]
</small>

.pull-right[
<center>
<figure>
<img src="https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/man/figures/culmen_depth.png" width="700">
</figure>
</center>
]

---

# Installing the data

.left-column[
<center>
<figure>
<img src="https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/man/figures/logo.png" width="300">
</figure>
</center>
]

.right-column[

You can install the data as follows:

```r
# install.packages("remotes")
remotes::install_github("allisonhorst/palmerpenguins")
```

]

---

# The Palmer Penguins data

Let's load the data into R.
<small>

```r
library(palmerpenguins)

penguins
```

```
## # A tibble: 344 × 8
##    species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
##    <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
##  1 Adelie  Torgersen           39.1          18.7               181        3750
##  2 Adelie  Torgersen           39.5          17.4               186        3800
##  3 Adelie  Torgersen           40.3          18                 195        3250
##  4 Adelie  Torgersen           NA            NA                  NA          NA
##  5 Adelie  Torgersen           36.7          19.3               193        3450
##  6 Adelie  Torgersen           39.3          20.6               190        3650
##  7 Adelie  Torgersen           38.9          17.8               181        3625
##  8 Adelie  Torgersen           39.2          19.6               195        4675
##  9 Adelie  Torgersen           34.1          18.1               193        3475
## 10 Adelie  Torgersen           42            20.2               190        4250
## # … with 334 more rows, and 2 more variables: sex <fct>, year <int>
```

</small>

**Note that `ggplot2` works best with data frames.**

---

# Understanding the ggplot2 syntax

A `ggplot2` plot is built layer by layer using the `+` operator.

<center>
<img src="img/ggplot2_structure.png" alt="drawing" width="1000"/>
</center>
<figcaption>The basic structure of a ggplot2 plot.</figcaption>

---

# The geoms

`geoms` is short for *geometric objects*, which specify the type of graphic you want to produce (**boxplot**, **barplot**, **scatterplot**, **histogram**, ...).

All `ggplot2` geoms start with the `geom_` prefix.
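
The slide does not show the call that produced the listing of geoms that follows. One way to obtain a comparable list, offered as a sketch and not necessarily the command used here, is to search the attached package for objects whose names start with `geom_`. The exact result depends on the installed `ggplot2` version.

```r
library(ggplot2)

# List every exported object in ggplot2 whose name starts with "geom_".
geoms <- ls("package:ggplot2", pattern = "^geom_")

geoms
```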
<small>

```
##  [1] "geom_abline"            "geom_area"              "geom_bar"
##  [4] "geom_bin_2d"            "geom_bin2d"             "geom_blank"
##  [7] "geom_boxplot"           "geom_col"               "geom_contour"
## [10] "geom_contour_filled"    "geom_count"             "geom_crossbar"
## [13] "geom_curve"             "geom_density"           "geom_density_2d"
## [16] "geom_density_2d_filled" "geom_density2d"         "geom_density2d_filled"
## [19] "geom_dotplot"           "geom_errorbar"          "geom_errorbarh"
## [22] "geom_freqpoly"          "geom_function"          "geom_hex"
## [25] "geom_histogram"         "geom_hline"             "geom_jitter"
## [28] "geom_label"             "geom_line"              "geom_linerange"
## [31] "geom_map"               "geom_path"              "geom_point"
## [34] "geom_pointrange"        "geom_polygon"           "geom_qq"
## [37] "geom_qq_line"           "geom_quantile"          "geom_raster"
## [40] "geom_rect"              "geom_ribbon"            "geom_rug"
## [43] "geom_segment"           "geom_sf"                "geom_sf_label"
## [46] "geom_sf_text"           "geom_smooth"            "geom_spoke"
## [49] "geom_step"              "geom_text"              "geom_tile"
## [52] "geom_violin"            "geom_vline"
```

</small>

---

class: inverse, center, middle

# One variable graphics

---

# One variable graphics

In this section, we are going to see the two main types of one variable graphics:

| **Graphic type** | **Geom**           | **Description**                          |
| ---------------- | ------------------ | ---------------------------------------- |
| Histogram        | `geom_histogram()` | Produces histograms for continuous data. |
| Barplot          | `geom_bar()`       | Produces bar plots of counts for discrete data.
| --- count: false # Histogram .panel1-histogram1-auto[ ```r *ggplot( * data = penguins, * mapping = aes(x = body_mass_g) *) ``` ] .panel2-histogram1-auto[ <img src="index_files/figure-html/histogram1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Histogram .panel1-histogram1-auto[ ```r ggplot( data = penguins, mapping = aes(x = body_mass_g) ) + * geom_histogram() ``` ] .panel2-histogram1-auto[ <img src="index_files/figure-html/histogram1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-histogram1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-histogram1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-histogram1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false # Barplot .panel1-barplot1-auto[ ```r *ggplot( * data = penguins, * mapping = aes(x = island) *) ``` ] .panel2-barplot1-auto[ <img src="index_files/figure-html/barplot1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Barplot .panel1-barplot1-auto[ ```r ggplot( data = penguins, mapping = aes(x = island) ) + * geom_bar() ``` ] .panel2-barplot1-auto[ <img src="index_files/figure-html/barplot1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-barplot1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-barplot1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-barplot1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle # Two variables graphics --- # Two variables graphics There are several types of charts with two variables. Here are the most used. 
| **Graphic type** | **Geom**         | **Description**                                               |
| ---------------- | ---------------- | ------------------------------------------------------------- |
| Barplot          | `geom_col()`     | Produces bars with heights that represent values in the data. |
| Scatter plot     | `geom_point()`   | Produces a scatter plot of `y` against `x`.                   |
| Line plot        | `geom_line()`    | Produces a line plot of `y` against `x`.                      |
| Boxplot          | `geom_boxplot()` | Produces a boxplot of `y` for each level of `x`.              |

---

# Barplot

We used `geom_bar()` to visualize the number/count of cases for each value of `x`. If we want to represent values in the data (`y`), we can use `geom_col()`.

For this example, I will use the `starwars` data that is included in the `dplyr` package. **This graph shows the 20 tallest characters (height in cm).**

.pull-left[

```r
ggplot(
  data = sw,
  mapping = aes(
*   x = name,
*   y = height
  )
) +
  geom_col()
```

]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-17-1.svg" style="display: block; margin: auto;" />
]

--

**What could be done to improve this graph?**

---

# Barplot

In the previous graph, it was difficult to read the names on the `x-axis`. **A better way to present the data is by swapping the `x` and `y` axes.**

--

.pull-left[

```r
ggplot(
  data = sw,
  mapping = aes(
*   x = height,
*   y = name
  )
) +
  geom_col()
```

]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-18-1.svg" style="display: block; margin: auto;" />
]

--

**Can we improve the graph further?**

---

# Ordering things

We have made some improvements in the previous graph by swapping the axes. However, **it would be even better if the characters were sorted by their height** to make the plot easier to understand.

--

One way to do it is with `forcats::fct_reorder()`. Note that the `forcats` package is not part of base R and must be installed separately.
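
The `sw` data frame used in these barplot examples is never defined on the slides. A minimal sketch, assuming `sw` simply holds the 20 tallest characters from `dplyr::starwars` (the author's actual preparation step may differ), together with the installation of `forcats`:

```r
# install.packages(c("dplyr", "forcats"))
library(dplyr)

# Assumption: `sw` is the 20 tallest characters with a known height.
sw <- starwars %>%
  filter(!is.na(height)) %>%
  slice_max(height, n = 20, with_ties = FALSE)

# fct_reorder() turns `name` into a factor whose levels are sorted by `height`.
name_by_height <- forcats::fct_reorder(sw$name, sw$height)

head(levels(name_by_height))
```

Because `ggplot2` draws discrete axes in level order, reordering the factor levels is what sorts the bars.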
-- .pull-left[ <small> ```r ggplot( data = sw, mapping = aes( x = height, * y = forcats::fct_reorder(name, height) ) ) + geom_col() ``` </small> ] -- .pull-right[ <img src="index_files/figure-html/unnamed-chunk-19-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Scatter plot .panel1-scatterplot1-auto[ ```r *ggplot( * penguins, * aes( * x = bill_length_mm, * y = body_mass_g * ) *) ``` ] .panel2-scatterplot1-auto[ <img src="index_files/figure-html/scatterplot1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Scatter plot .panel1-scatterplot1-auto[ ```r ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + * geom_point() ``` ] .panel2-scatterplot1-auto[ <img src="index_files/figure-html/scatterplot1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-scatterplot1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-scatterplot1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-scatterplot1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- # Adding a regression line Use `geom_smooth(method = "lm")` to add a linear regression line to your data. .pull-left[ <small> ```r ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + geom_point()+ * geom_smooth(method = "lm") ``` </small> ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-20-1.svg" style="display: block; margin: auto;" /> ] --- # Adding a smoothing line If the argument `method` is not specified, a `loess` or a `gam` model will be used (depending on the number of points to be fitted). 
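
According to the `geom_smooth()` documentation, the automatic choice is `loess` for fewer than 1,000 observations and `mgcv::gam()` otherwise. With 344 penguins the default should therefore resolve to `loess`. The sketch below spells that choice out explicitly, which also avoids the informational message printed when `method` is left unset. The object name `p` is illustrative.

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g)) +
  geom_point() +
  # Explicit equivalent of the default for a dataset of this size.
  geom_smooth(method = "loess", formula = y ~ x)

p
```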
.pull-left[ <small> ```r ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + geom_point()+ * geom_smooth() ``` </small> ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-21-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Line plot .panel1-lineplot1-auto[ ```r *ggplot( * penguins, * aes( * x = bill_length_mm, * y = body_mass_g * ) *) ``` ] .panel2-lineplot1-auto[ <img src="index_files/figure-html/lineplot1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Line plot .panel1-lineplot1-auto[ ```r ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + * geom_line() ``` ] .panel2-lineplot1-auto[ <img src="index_files/figure-html/lineplot1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-lineplot1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-lineplot1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-lineplot1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- # Boxplot > In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their quartiles (Wikipedia). To make a boxplot, we need to have a **discrete/categorical** variable on `x` and a **continuous** variable on `y`. 
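
A numeric grouping variable has to be converted first. As a small sketch of that requirement, `year` in `penguins` is stored as an integer, so wrapping it in `factor()` gives one box per year. The object name `p` is illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# `year` is an integer column; factor() makes it discrete for the x axis.
p <- ggplot(penguins, aes(x = factor(year), y = body_mass_g)) +
  geom_boxplot()

p
```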
--- count: false # Boxplot .panel1-boxplot1-auto[ ```r *ggplot( * data = penguins, * mapping = aes( * x = island, * y = body_mass_g * ) *) ``` ] .panel2-boxplot1-auto[ <img src="index_files/figure-html/boxplot1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Boxplot .panel1-boxplot1-auto[ ```r ggplot( data = penguins, mapping = aes( x = island, y = body_mass_g ) ) + * geom_boxplot() ``` ] .panel2-boxplot1-auto[ <img src="index_files/figure-html/boxplot1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-boxplot1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-boxplot1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-boxplot1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- # Boxplot We can reorder the `island` variable based on the `body_mass_g` variable to make the graphic more appealing with `forcats::fct_reorder()`. -- .pull-left[ <small> ```r ggplot( data = penguins, mapping = aes( * x = forcats::fct_reorder( * island, * body_mass_g, * na.rm = TRUE * ), y = body_mass_g ) ) + geom_boxplot() ``` </small> ] -- .pull-right[ <img src="index_files/figure-html/unnamed-chunk-22-1.svg" style="display: block; margin: auto;" /> ] Note the use of `na.rm = TRUE` in the `fct_reorder()` function. --- class: inverse, center, middle # Geom aesthetics --- # Geom aesthetics Aesthetics such as *colour*, *shape*, *size* of the displayed geoms can be controlled inside the `geom_()` functions. For example, we can change the `color` and the `size` of the point in the `geom_point()` function. 
```r # All points will be red with a size of 5 geom_point(color = "red", size = 5) ``` <img src="index_files/figure-html/unnamed-chunk-24-1.svg" style="display: block; margin: auto;" /> --- # Working with colours If we want to set a colour **based on a variable**, we have to use the aesthetic: `aes(colour = variable)`. ```r ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g)) + * geom_point(aes(color = species)) ``` <img src="index_files/figure-html/unnamed-chunk-25-1.svg" style="display: block; margin: auto;" /> --- count: false # Setting colors manually We can also create our own palette of colors using *scale_color_manual()*. .panel1-manual_colors1-auto[ ```r *ggplot( * penguins, * aes( * x = bill_length_mm, * y = body_mass_g * ) *) ``` ] .panel2-manual_colors1-auto[ <img src="index_files/figure-html/manual_colors1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Setting colors manually We can also create our own palette of colors using *scale_color_manual()*. .panel1-manual_colors1-auto[ ```r ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + * geom_point(aes(color = species)) ``` ] .panel2-manual_colors1-auto[ <img src="index_files/figure-html/manual_colors1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Setting colors manually We can also create our own palette of colors using *scale_color_manual()*. 
.panel1-manual_colors1-auto[ ```r ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + geom_point(aes(color = species)) + * scale_color_manual( * breaks = c( * "Adelie", * "Chinstrap", * "Gentoo" * ), * values = c( * "darkorange", * "purple", * "#008b8b" * ) * ) ``` ] .panel2-manual_colors1-auto[ <img src="index_files/figure-html/manual_colors1_auto_03_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-manual_colors1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-manual_colors1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-manual_colors1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- # Setting color manually This can also be done in one step **using a named vector** of colors. `c("Adelie" = "darkorange", "Chinstrap" = "purple", "Gentoo" = "#008b8b")` ```r ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + geom_point(aes(color = species)) + scale_color_manual(values = c( "Adelie" = "darkorange", "Chinstrap" = "purple", "Gentoo" = "#008b8b" )) ``` --- # Setting color manually Some color resources to procrastinate on! - [The super fast color palettes generator!](https://coolors.co/) - [ColorSpace](https://mycolor.space/) - [Gradient Generator](https://colordesigner.io/gradient-generator) - [Color Palettes Color Schemes](https://www.color-hex.com/color-palettes/) - [Color Hunt](https://colorhunt.co/) --- background-image: url(https://github.com/EmilHvitfeldt/paletteer/blob/master/man/figures/logo.png?raw=true) background-size: 90px background-position: 90% 8% # Color palettes for ggplot2 Many different R packages provide colour palettes. `paletteer` is a comprehensive collection of colour palettes in R created by [@Emil_Hvitfeldt](https://twitter.com/Emil_Hvitfeldt). 
I have made a [paletteer gallery](https://github.com/PMassicotte/paletteer_gallery) to help me navigate all the palettes. .pull-left[ <small> ```r *library(paletteer) ggplot( data = sw, mapping = aes( x = height, y = forcats::fct_reorder(name, height), * fill = height ) ) + geom_col() + * scale_fill_paletteer_c("ggthemes::Red-Gold") ``` </small> ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-27-1.svg" style="display: block; margin: auto;" /> ] --- # Types of colour scales **Pay attention to the type of colour scale that is mapped to your data.** -- - Use `scale_color_*()` or `scale_colour_*()` for the **color** of the geom. -- - Use `scale_fill_*()` for the **fill color** of the geom. -- .pull-left[ <br> <small> ```r ggplot( data = sw, mapping = aes( x = height, y = forcats::fct_reorder(name, height), * fill = height, * color = eye_color ) ) + geom_col(size = 1) ``` </small> ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-28-1.svg" height="80%" style="display: block; margin: auto;" /> ] --- # Continuous vs discrete mapping **Pay attention to the type of data you are mapping to a colour or shape scale.** If the mapped variable is **continuous**, the guide generated in the legend will also be **continuous**. ```r ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g)) + * geom_point(aes(color = year)) ``` <img src="index_files/figure-html/continuous-mapping-1.svg" style="display: block; margin: auto;" /> --- # Continuous vs discrete mapping **Pay attention to the type of data you are mapping to a colour or shape scale.** If the mapped variable is **discrete**, the guide generated in the legend will also be **discrete**. You can convert a continuous variable into a factor with the `factor()` function.
```r ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g)) + * geom_point(aes(color = factor(year))) ``` <img src="index_files/figure-html/discrete-mapping-1.svg" style="display: block; margin: auto;" /> --- # Working with size As we did for the colours, the size of the geom (e.g., points) can be based on a particular variable. ```r ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g)) + * geom_point(aes(size = body_mass_g)) ``` <img src="index_files/figure-html/unnamed-chunk-29-1.svg" style="display: block; margin: auto;" /> --- class: inverse, center, middle # Color blindness <center> <figure> <img style="margin:0px auto;display:block" src="img/unsplash/ronald-cuyan-AJgFLjnmSs4-unsplash.jpg" alt="Someone with a basket on his shoulder. The basket contains what looks like plants of various colors." width = "550"/> </figure> <figcaption>Photo by <a href="https://unsplash.com/@ronaldcuyan?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Ronald Cuyan</a> on <a href="https://unsplash.com/s/photos/color-blind?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a> </figcaption> </center> --- # Colours for colourblind readers > Colour (color) blindness (colour vision deficiency, or CVD) affects approximately 1 in 12 men (8%) and 1 in 200 women. Source: https://www.colourblindawareness.org/colour-blindness/ -- .pull-left[ <small>For non-colorblind viewers, this plot looks perfectly fine.</small> <br> <img src="index_files/figure-html/unnamed-chunk-30-1.svg" width="90%" style="display: block; margin: auto;" /> ] -- .pull-right[ <small>However, for some colorblind viewers, the <orange><b>orange</b></orange> and the <green><b>green</b></green> colors are hardly distinguishable.</small> <img src="index_files/figure-html/unnamed-chunk-31-1.svg" style="display: block; margin: auto;" /> ] --- # Colours for colourblind readers Whenever possible, try to carefully choose a colorblind-friendly colour scheme.
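As one possible starting point, `ggplot2` itself ships the viridis scales, which are designed to stay readable under common forms of colour vision deficiency. The sketch below is only an illustration: it assumes the `mpg` dataset bundled with `ggplot2` (not the penguins data used on the slides), and the object name `p_viridis` is hypothetical.

```r
library(ggplot2)

# Assumption: `mpg` (shipped with ggplot2) stands in for the penguins data.
# `drv` is a discrete variable, so we use the discrete viridis scale.
p_viridis <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = drv)) +
  scale_color_viridis_d()

p_viridis
```

For a continuous variable, `scale_color_viridis_c()` plays the same role.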
Here is a scatterplot using `ggplot2` default colour palette. .pull-left[ ```r p <- ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + geom_point(aes(color = species)) ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-33-1.svg" style="display: block; margin: auto;" /> ] --- # Colours for colourblind readers - Color deficiency simulations can be made using the `colorblindr` package. - `ggplot2` defaults are not too bad, except for the desaturated plot. ```r colorblindr::cvd_grid(p) ``` <img src="index_files/figure-html/unnamed-chunk-34-1.svg" width="60%" style="display: block; margin: auto;" /> --- # Colours for colourblind readers There are many palettes to choose from that are colorblind-friendly. .pull-left[ ```r p <- ggplot( penguins, aes( x = bill_length_mm, y = body_mass_g ) ) + geom_point(aes(color = species)) + * paletteer::scale_color_paletteer_d( * "khroma::contrast" * ) ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-36-1.svg" style="display: block; margin: auto;" /> ] --- # Colours for colourblind readers This time, colors are more easily distinguishable. ```r colorblindr::cvd_grid(p) ``` <img src="index_files/figure-html/unnamed-chunk-37-1.svg" width="60%" style="display: block; margin: auto;" /> --- class: inverse, center, middle # Titles <center> <figure> <img style="margin:0px auto;display:block" src="img/unsplash/clem-onojeghuo-0PPKxWtYh0g-unsplash.jpg" alt="Old books placed vertically on a shelf." width = "550"/> </figure> <figcaption>Photo by <a href="https://unsplash.com/@clemono?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Clem Onojeghuo</a> on <a href="https://unsplash.com/s/photos/library-title?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a> </figcaption> </center> --- # Titles There are many ways to change the titles of the graphic and the axes. Here we are going to use the `labs()` function. 
These are the main parameters: - `title`: Main title of the graph - `x`, `y`: Titles for the axes - `subtitle`: Subtitle of the graph (default: under the main title) - `caption`: Caption of the graph (default: bottom right of the graph) --- count: false # Axes and titles .panel1-axes_and_titles1-auto[ ```r *ggplot( * data = penguins, * aes( * x = flipper_length_mm, * y = bill_length_mm, * color = species * ) *) ``` ] .panel2-axes_and_titles1-auto[ <img src="index_files/figure-html/axes_and_titles1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Axes and titles .panel1-axes_and_titles1-auto[ ```r ggplot( data = penguins, aes( x = flipper_length_mm, y = bill_length_mm, color = species ) ) + * geom_point() ``` ] .panel2-axes_and_titles1-auto[ <img src="index_files/figure-html/axes_and_titles1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Axes and titles .panel1-axes_and_titles1-auto[ ```r ggplot( data = penguins, aes( x = flipper_length_mm, y = bill_length_mm, color = species ) ) + geom_point() + * labs( * title = "Flipper and bill length", * subtitle = "Relationship between flipper and bill length", * caption = "Data from: palmerpenguins R package", * x = "Flipper length (mm)", * y = "Bill length (mm)", * color = "Penguin\nspecies" * ) ``` ] .panel2-axes_and_titles1-auto[ <img src="index_files/figure-html/axes_and_titles1_auto_03_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-axes_and_titles1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-axes_and_titles1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-axes_and_titles1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false # Controlling axis labels .panel1-axes_and_titles2-auto[ ```r *ggplot( * data = penguins, * aes( 
* x = flipper_length_mm, * y = bill_length_mm, * color = species * ) *) ``` ] .panel2-axes_and_titles2-auto[ <img src="index_files/figure-html/axes_and_titles2_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Controlling axis labels .panel1-axes_and_titles2-auto[ ```r ggplot( data = penguins, aes( x = flipper_length_mm, y = bill_length_mm, color = species ) ) + * geom_point() ``` ] .panel2-axes_and_titles2-auto[ <img src="index_files/figure-html/axes_and_titles2_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Controlling axis labels .panel1-axes_and_titles2-auto[ ```r ggplot( data = penguins, aes( x = flipper_length_mm, y = bill_length_mm, color = species ) ) + geom_point() + * scale_x_continuous( * breaks = c(180, 220), * labels = c("abc", "def") * ) ``` ] .panel2-axes_and_titles2-auto[ <img src="index_files/figure-html/axes_and_titles2_auto_03_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-axes_and_titles2-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-axes_and_titles2-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-axes_and_titles2-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle # Faceting (small multiples) --- # Faceting (small multiples) Faceting is a technique that displays additional categorical variables by splitting the plot into panels (facets). Within `ggplot2`, there are two types of faceting: `facet_grid()` and `facet_wrap()`.
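A minimal sketch of the difference between the two, assuming the `mpg` dataset that ships with `ggplot2` rather than the penguins data used on the slides. The object names and the faceting variables (`class`, `drv`, `cyl`) are illustrative choices only.

```r
library(ggplot2)

# Assumption: `mpg` (shipped with ggplot2) is used as a stand-in dataset.
base_plot <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# facet_wrap(): one variable, panels wrapped over rows and columns
p_wrap <- base_plot + facet_wrap(~class)

# facet_grid(): two variables laid out as rows ~ columns
p_grid <- base_plot + facet_grid(drv ~ cyl)
```

Printing `p_wrap` or `p_grid` draws the corresponding figure. In short, `facet_wrap()` suits a single variable with many levels, while `facet_grid()` suits the crossing of two variables.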
<img src="index_files/figure-html/unnamed-chunk-38-1.svg" style="display: block; margin: auto;" /> --- count: false # 1D faceting .panel1-facet1-auto[ ```r *ggplot( * data = drop_na(penguins, sex), * aes( * x = body_mass_g, * y = bill_depth_mm, * color = sex * ) *) ``` ] .panel2-facet1-auto[ <img src="index_files/figure-html/facet1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # 1D faceting .panel1-facet1-auto[ ```r ggplot( data = drop_na(penguins, sex), aes( x = body_mass_g, y = bill_depth_mm, color = sex ) ) + * geom_point() ``` ] .panel2-facet1-auto[ <img src="index_files/figure-html/facet1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # 1D faceting .panel1-facet1-auto[ ```r ggplot( data = drop_na(penguins, sex), aes( x = body_mass_g, y = bill_depth_mm, color = sex ) ) + geom_point() + * facet_wrap(~species) ``` ] .panel2-facet1-auto[ <img src="index_files/figure-html/facet1_auto_03_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # 1D faceting .panel1-facet1-auto[ ```r ggplot( data = drop_na(penguins, sex), aes( x = body_mass_g, y = bill_depth_mm, color = sex ) ) + geom_point() + facet_wrap(~species) + * facet_wrap(~species, scales = "free_x") ``` ] .panel2-facet1-auto[ <img src="index_files/figure-html/facet1_auto_04_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # 1D faceting .panel1-facet1-auto[ ```r ggplot( data = drop_na(penguins, sex), aes( x = body_mass_g, y = bill_depth_mm, color = sex ) ) + geom_point() + facet_wrap(~species) + facet_wrap(~species, scales = "free_x") + * facet_wrap(~species, scales = "free_y") ``` ] .panel2-facet1-auto[ <img src="index_files/figure-html/facet1_auto_05_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # 1D faceting .panel1-facet1-auto[ ```r ggplot( data = drop_na(penguins, sex), aes( x = body_mass_g, y = bill_depth_mm, color = sex ) ) + geom_point() + facet_wrap(~species) + 
facet_wrap(~species, scales = "free_x") + facet_wrap(~species, scales = "free_y") + * facet_wrap(~species, scales = "free") ``` ] .panel2-facet1-auto[ <img src="index_files/figure-html/facet1_auto_06_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-facet1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-facet1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-facet1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false # 2D faceting .panel1-facet3-auto[ ```r *ggplot( * data = drop_na(penguins, sex), * aes( * x = body_mass_g, * y = bill_depth_mm, * color = sex * ) *) ``` ] .panel2-facet3-auto[ <img src="index_files/figure-html/facet3_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # 2D faceting .panel1-facet3-auto[ ```r ggplot( data = drop_na(penguins, sex), aes( x = body_mass_g, y = bill_depth_mm, color = sex ) ) + * geom_point() ``` ] .panel2-facet3-auto[ <img src="index_files/figure-html/facet3_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # 2D faceting .panel1-facet3-auto[ ```r ggplot( data = drop_na(penguins, sex), aes( x = body_mass_g, y = bill_depth_mm, color = sex ) ) + geom_point() + * facet_grid(island ~ species) ``` ] .panel2-facet3-auto[ <img src="index_files/figure-html/facet3_auto_03_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-facet3-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-facet3-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-facet3-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle # Using ggplot2 themes --- # 
ggplot2 themes > Themes are a powerful way to customize the non-data components of your plots: i.e. titles, labels, fonts, background, gridlines, and legends. Many components can be changed using the `theme()` function. Today we are going to see just a few of them, but feel free to have a look at `?theme` to have more information. -- `theme()` can be used to modify: -- - The background aesthetics (colour, grid, etc.). -- - Axis titles and ticks aesthetics. -- - Legend titles and positions. -- - Aesthetics of plot titles (title, subtitle, caption, etc.). -- - Plot margins. --- count: false # ggplot2 built-in themes .panel1-theme1-auto[ ```r *p ``` ] .panel2-theme1-auto[ <img src="index_files/figure-html/theme1_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # ggplot2 built-in themes .panel1-theme1-auto[ ```r p + * theme_bw() ``` ] .panel2-theme1-auto[ <img src="index_files/figure-html/theme1_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # ggplot2 built-in themes .panel1-theme1-auto[ ```r p + theme_bw() + * theme_light() ``` ] .panel2-theme1-auto[ <img src="index_files/figure-html/theme1_auto_03_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # ggplot2 built-in themes .panel1-theme1-auto[ ```r p + theme_bw() + theme_light() + * theme_gray() ``` ] .panel2-theme1-auto[ <img src="index_files/figure-html/theme1_auto_04_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # ggplot2 built-in themes .panel1-theme1-auto[ ```r p + theme_bw() + theme_light() + theme_gray() + * theme_dark() ``` ] .panel2-theme1-auto[ <img src="index_files/figure-html/theme1_auto_05_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # ggplot2 built-in themes .panel1-theme1-auto[ ```r p + theme_bw() + theme_light() + theme_gray() + theme_dark() + * theme_void() ``` ] .panel2-theme1-auto[ <img src="index_files/figure-html/theme1_auto_06_output-1.svg" 
style="display: block; margin: auto;" /> ] <style> .panel1-theme1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-theme1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-theme1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- # External themes - You are not limited to using `ggplot2` built-in themes. - Here are some themes provided by the `ggthemes` and `ggpubr` packages. ```r install.packages("ggthemes") install.packages("ggpubr") ``` --- count: false # External themes .panel1-theme2-auto[ ```r *p ``` ] .panel2-theme2-auto[ <img src="index_files/figure-html/theme2_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # External themes .panel1-theme2-auto[ ```r p + * ggthemes::theme_solarized() ``` ] .panel2-theme2-auto[ <img src="index_files/figure-html/theme2_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # External themes .panel1-theme2-auto[ ```r p + ggthemes::theme_solarized() + * ggthemes::theme_fivethirtyeight() ``` ] .panel2-theme2-auto[ <img src="index_files/figure-html/theme2_auto_03_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # External themes .panel1-theme2-auto[ ```r p + ggthemes::theme_solarized() + ggthemes::theme_fivethirtyeight() + * ggthemes::theme_calc() ``` ] .panel2-theme2-auto[ <img src="index_files/figure-html/theme2_auto_04_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # External themes .panel1-theme2-auto[ ```r p + ggthemes::theme_solarized() + ggthemes::theme_fivethirtyeight() + ggthemes::theme_calc() + * ggthemes::theme_excel() ``` ] .panel2-theme2-auto[ <img src="index_files/figure-html/theme2_auto_05_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # External themes .panel1-theme2-auto[ ```r p + 
ggthemes::theme_solarized() + ggthemes::theme_fivethirtyeight() + ggthemes::theme_calc() + ggthemes::theme_excel() + * ggpubr::theme_pubr() ``` ] .panel2-theme2-auto[ <img src="index_files/figure-html/theme2_auto_06_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # External themes .panel1-theme2-auto[ ```r p + ggthemes::theme_solarized() + ggthemes::theme_fivethirtyeight() + ggthemes::theme_calc() + ggthemes::theme_excel() + ggpubr::theme_pubr() + * ggpubr::theme_cleveland() ``` ] .panel2-theme2-auto[ <img src="index_files/figure-html/theme2_auto_07_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # External themes .panel1-theme2-auto[ ```r p + ggthemes::theme_solarized() + ggthemes::theme_fivethirtyeight() + ggthemes::theme_calc() + ggthemes::theme_excel() + ggpubr::theme_pubr() + ggpubr::theme_cleveland() + * ggpubr::theme_pubclean() ``` ] .panel2-theme2-auto[ <img src="index_files/figure-html/theme2_auto_08_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-theme2-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-theme2-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-theme2-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- # Modify specific components of a theme ```r ?theme() # Display all the modifiable components of a theme ``` .small[ ``` ## [1] "line" "rect" "text" "title" ## [5] "aspect.ratio" "axis.title" "axis.title.x" "axis.title.x.top" ## [9] "axis.title.x.bottom" "axis.title.y" "axis.title.y.left" "axis.title.y.right" ## [13] "axis.text" "axis.text.x" "axis.text.x.top" "axis.text.x.bottom" ## [17] "axis.text.y" "axis.text.y.left" "axis.text.y.right" "axis.ticks" ## [21] "axis.ticks.x" "axis.ticks.x.top" "axis.ticks.x.bottom" "axis.ticks.y" ## [25] "axis.ticks.y.left" "axis.ticks.y.right" 
"axis.ticks.length" "axis.ticks.length.x" ## [29] "axis.ticks.length.x.top" "axis.ticks.length.x.bottom" "axis.ticks.length.y" "axis.ticks.length.y.left" ## [33] "axis.ticks.length.y.right" "axis.line" "axis.line.x" "axis.line.x.top" ## [37] "axis.line.x.bottom" "axis.line.y" "axis.line.y.left" "axis.line.y.right" ## [41] "legend.background" "legend.margin" "legend.spacing" "legend.spacing.x" ## [45] "legend.spacing.y" "legend.key" "legend.key.size" "legend.key.height" ## [49] "legend.key.width" "legend.text" "legend.text.align" "legend.title" ## [53] "legend.title.align" "legend.position" "legend.direction" "legend.justification" ## [57] "legend.box" "legend.box.just" "legend.box.margin" "legend.box.background" ## [61] "legend.box.spacing" "panel.background" "panel.border" "panel.spacing" ## [65] "panel.spacing.x" "panel.spacing.y" "panel.grid" "panel.grid.major" ## [69] "panel.grid.minor" "panel.grid.major.x" "panel.grid.major.y" "panel.grid.minor.x" ## [73] "panel.grid.minor.y" "panel.ontop" "plot.background" "plot.title" ## [77] "plot.title.position" "plot.subtitle" "plot.caption" "plot.caption.position" ## [81] "plot.tag" "plot.tag.position" "plot.margin" "strip.background" ## [85] "strip.background.x" "strip.background.y" "strip.placement" "strip.text" ## [89] "strip.text.x" "strip.text.y" "strip.switch.pad.grid" "strip.switch.pad.wrap" ``` ] --- # Modify specific components of a theme - Many components can be modified with the `theme()` function. For example: - `axis.title`: Title appearance (font, color, ...) - `axis.ticks`: Axes ticks appearance (length, colour, position, ...) - `plot.background`: Background properties (color, ...) - `plot.margin`: Margins around the plot. -- - Most of these components are modified using the `element_*()` functions. 
- `element_rect()`: for borders and backgrounds - `element_line()`: for line elements - `element_text()`: for text --- count: false # Grid .panel1-grid-auto[ ```r *p ``` ] .panel2-grid-auto[ <img src="index_files/figure-html/grid_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Grid .panel1-grid-auto[ ```r p + * theme( * panel.grid = element_line( * size = 3, * color = "red" * ) * ) ``` ] .panel2-grid-auto[ <img src="index_files/figure-html/grid_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-grid-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-grid-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-grid-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false # Axes ticks .panel1-ticks-auto[ ```r *p ``` ] .panel2-ticks-auto[ <img src="index_files/figure-html/ticks_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Axes ticks .panel1-ticks-auto[ ```r p + * theme( * axis.ticks = element_line( * size = 2, * color = "blue" * ), * axis.ticks.length = unit(1, "cm") * ) ``` ] .panel2-ticks-auto[ <img src="index_files/figure-html/ticks_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-ticks-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-ticks-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-ticks-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false # Legend position .panel1-legend_position-auto[ ```r *p ``` ] .panel2-legend_position-auto[ <img src="index_files/figure-html/legend_position_auto_01_output-1.svg" style="display: block; margin: auto;" /> ] --- 
count: false # Legend position .panel1-legend_position-auto[ ```r p + * theme(legend.position = "top") ``` ] .panel2-legend_position-auto[ <img src="index_files/figure-html/legend_position_auto_02_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Legend position .panel1-legend_position-auto[ ```r p + theme(legend.position = "top") + * theme(legend.position = "left") ``` ] .panel2-legend_position-auto[ <img src="index_files/figure-html/legend_position_auto_03_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Legend position .panel1-legend_position-auto[ ```r p + theme(legend.position = "top") + theme(legend.position = "left") + * theme(legend.position = "bottom") ``` ] .panel2-legend_position-auto[ <img src="index_files/figure-html/legend_position_auto_04_output-1.svg" style="display: block; margin: auto;" /> ] --- count: false # Legend position .panel1-legend_position-auto[ ```r p + theme(legend.position = "top") + theme(legend.position = "left") + theme(legend.position = "bottom") + * theme(legend.position = c(0.5, 0.5)) ``` ] .panel2-legend_position-auto[ <img src="index_files/figure-html/legend_position_auto_05_output-1.svg" style="display: block; margin: auto;" /> ] <style> .panel1-legend_position-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-legend_position-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-legend_position-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle <center> <figure> <img src="https://github.com/allisonhorst/stats-illustrations/blob/master/rstats-artwork/patchwork_1.jpg?raw=true" width="700"> <figcaption> Artwork by <a href="https://twitter.com/allison_horst?s=20">@allison_horst</a> </figcaption> </figure> </center> --- # Combining plots Many R packages can be used to 
combine plots. Here we are going to have a quick overview of the `patchwork` package. The package is not part of base R, so it needs to be installed and loaded before you can use it. .pull-left[ <center> <figure> <img src="https://github.com/thomasp85/patchwork/blob/master/man/figures/logo.png?raw=true" width="200"> <figcaption> <a href="https://github.com/thomasp85/patchwork">The patchwork package</a> </figcaption> </figure> </center> ] .pull-right[ ```r install.packages("patchwork") library(patchwork) ``` ] --- # Plot #1 <img src="index_files/figure-html/unnamed-chunk-43-1.svg" style="display: block; margin: auto;" /> --- # Plot #2 <img src="index_files/figure-html/unnamed-chunk-44-1.svg" style="display: block; margin: auto;" /> --- # Combining horizontally Combining horizontally is done using the `+` operator. ```r p1 + p2 ``` <img src="index_files/figure-html/unnamed-chunk-45-1.svg" style="display: block; margin: auto;" /> --- # Combining vertically Combining vertically is done using the `/` operator. ```r p1 / p2 ``` <img src="index_files/figure-html/unnamed-chunk-46-1.svg" style="display: block; margin: auto;" /> --- # More complex arrangement ```r (p1 + p2) / p1 ``` <img src="index_files/figure-html/unnamed-chunk-47-1.svg" width="100%" style="display: block; margin: auto;" /> --- # Adding labels .pull-left[ It is often useful to add a title and to identify each panel (graph) with a letter. This can be achieved with the `plot_annotation()` function. ```r p1 / p2 + plot_annotation( * tag_levels = "A", * title = "The Palmer penguins" ) ``` ] .pull-right[ <img src="index_files/figure-html/plot-annotation2-1.svg" style="display: block; margin: auto;" /> ] --- class: inverse, center, middle # Saving your graphics <br> .pull-left[
] .pull-right[
] --- # Vector vs raster graphics .pull-left[ > The main difference between vector and raster graphics is that raster graphics are composed of pixels, while vector graphics are composed of paths. .left[ <credit> Source: <a href="https://www.geeksforgeeks.org/vector-vs-raster-graphics/">Vector vs Raster Graphics</a> </credit> ] - Scientific journals often require at least 300 DPI raster or vector graphics. - **Attention: vector graphics can produce very large files for certain types of graphics (e.g., 3D plots).** ] .pull-right[ <center> <figure> <img src="https://upload.wikimedia.org/wikipedia/commons/a/aa/VectorBitmapExample.svg" height="400"> <figcaption> Source: Wikipedia </figcaption> </figure> </center> ] --- # Saving your graphics Saving `ggplot2` graphics is done with the `ggsave()` function. ```r p <- ggplot(mpg, aes(x = displ, y = cty)) + geom_point() ``` **Vector formats** ```r ggsave("path/to/myfile.pdf", p, width = 5.97, height = 4.79) ggsave("path/to/myfile.eps", p, width = 5.97, height = 4.79) ggsave("path/to/myfile.ps", p, width = 5.97, height = 4.79) ``` **Raster formats** ```r ggsave("path/to/myfile.jpg", p, width = 5.97, height = 4.79, dpi = 300) ggsave("path/to/myfile.tiff", p, width = 5.97, height = 4.79, dpi = 300) ggsave("path/to/myfile.png", p, width = 5.97, height = 4.79, dpi = 300) ``` --- class: inverse, center, middle # Geospatial data <center> <figure> <img style="margin:0px auto;display:block" src="img/unsplash/kyle-glenn-nXt5HtLmlgE-unsplash.jpg" alt="Earth globe placed on a table." width = "550"/> </figure> <figcaption>Photo by <a href="https://unsplash.com/@kylejglenn?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Kyle Glenn</a> on <a href="https://unsplash.com/s/photos/geography?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a> </figcaption> </center> --- It is also possible to visualize geospatial data with `ggplot2`.
<center> <img src="https://github.com/PMassicotte/makeovermonday/blob/master/graphs/makeovermonday_2019w50.png?raw=true" alt="drawing" width="700"/> </center> --- # Geospatial data Making maps with `ggplot2` is beyond the scope of this workshop. However, here is a quick overview with a simple example. With **only a few lines of code**, we will make a map of the railways in Germany! <center> <figure> <img style="margin:0px auto;display:block" src="img/unsplash/ankush-minda-7KKQG0eB_TI-unsplash.jpg" alt="Photo showing a train." width = "500"/> </figure> <figcaption>Photo by <a href="https://unsplash.com/@an_ku_sh?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Ankush Minda</a> on <a href="https://unsplash.com/s/photos/train?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a> </figcaption> </center> --- # Geospatial data There are many libraries that offer tools to work with spatial data in R: - `sf`, `terra`, `raster`, `tmap`, `leaflet`, `stars`, `mapview`, ..., and of course `ggplot2`. -- For this example, I will use `rnaturalearth` to download spatial data and `sf` to manipulate it. ```r install.packages("sf") install.packages("rnaturalearth") library(sf) library(rnaturalearth) ``` --- # Geospatial data First, let's download the land data for Germany with `rnaturalearth`. ```r germany <- ne_countries(country = "germany", returnclass = "sf", scale = "large") germany ``` <small> ``` ## Simple feature collection with 1 feature and 1 field ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: 5.85249 ymin: 47.27112 xmax: 15.02206 ymax: 55.06533 ## CRS: +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 ## sovereignt geometry ## 50 Germany MULTIPOLYGON (((13.81572 48... 
``` </small> --- # Geospatial data The object `germany` is a `data.frame` containing a `geometry` column which co <small> ```r class(germany) ``` ``` ## [1] "sf" "data.frame" ``` </small> -- .pull-left[ We can plot the geographic information contained in `germany` using the `st_geometry()` function from the `sf` package. ] .pull-right[ ```r plot(st_geometry(germany)) ``` <img src="index_files/figure-html/unnamed-chunk-54-1.png" width="30%" style="display: block; margin: auto;" /> ] --- # Geospatial data Second, let's download the railroad information. ```r railroads <- ne_download( category = "cultural", type = "railroads", returnclass = "sf", scale = "large" ) ``` -- At this point, we have the railroads for the entire planet. Since we only need information for Germany, we will crop them using the `st_intersection()` function from the `sf` package. ```r # Keep only the railroads that cross Germany railroads <- st_intersection(railroads, germany) ``` --- # Geospatial data <small> ``` ## Simple feature collection with 6 features and 4 fields ## Geometry type: GEOMETRY ## Dimension: XY ## Bounding box: xmin: 8.297778 ymin: 54.755 xmax: 9.453612 ymax: 54.90861 ## Geodetic CRS: WGS 84 ## rwdb_rr_id mult_track electric other_code geometry ## 4811 4812 0 0 1 LINESTRING (8.821666 54.791... ## 4812 4813 0 0 1 MULTILINESTRING ((8.297778 ... ## 4822 4823 0 0 1 LINESTRING (9.417223 54.766... ## 4830 4831 0 0 1 LINESTRING (9.451388 54.768... ## 4832 4833 0 0 1 LINESTRING (8.879167 54.758... ## 4840 4841 0 0 1 LINESTRING (9.383334 54.755... 
``` </small> --- # Geospatial data Spatial data can be plotted with `geom_sf()`: <small> .pull-left[ ```r ggplot() + * geom_sf( * data = germany, * size = 0.25, * fill = "#F7F5FB" * ) + geom_sf( data = railroads, size = 0.15, color = "#483A58" ) ``` ] </small> .pull-right[ <img src="index_files/figure-html/unnamed-chunk-56-1.png" width="70%" style="display: block; margin: auto;" /> ] --- class: inverse, center, middle # Recommended resources --- <!-- # ggplot2 cheat sheet --> <center> <figure> <img src="img/ggplot2_cheat_sheet_1.png" width="780"> <figcaption><a href="https://raw.githubusercontent.com/rstudio/cheatsheets/main/data-visualization.pdf">Download the ggplot2 cheat sheet</a></figcaption> </figure> </center> --- # Free online books .pull-left[ <center> <img src="img/r_graphics_cookbook.jpg" width="300"> <figcaption><a href="https://r-graphics.org/index.html">R Graphics Cookbook, 2nd edition</a></figcaption> </center> ] .pull-right[ <center> <img src="img/fundamentals_data_visualization_cover.png" width="300"> <figcaption><a href="https://clauswilke.com/dataviz/">Fundamentals of Data Visualization</a></figcaption> </center> ] --- # ggplot2 gallery <center> <img src="img/ggplot2_extensions.png" width="1000"> <figcaption><a href="https://exts.ggplot2.tidyverse.org/gallery/">ggplot2 extensions - gallery</a> </figcaption> </center> --- class: inverse, center, middle <center> <figure> <img style="margin:0px auto;display:block" src="img/unsplash/kevin-butz-6hsfmat-t7k-unsplash.jpg" alt="Thank you painted on wood in front of a building." width = "800"/> </figure> <figcaption>Photo by <a href="https://unsplash.com/@kevin_butz?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Kevin Butz</a> on <a href="https://unsplash.com/s/photos/thank-you?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a> </figcaption> </center>