class: center, middle, inverse, title-slide

<style type="text/css">
.remark-slide table{
  border: none
}
.remark-slide-table {
}
tr:first-child {
  border-top: none;
}
tr:last-child {
  border-bottom: none;
}
</style>

<style type="text/css">
/* THIS IS A CSS CHUNK - THIS IS A COMMENT */
/* Size of font in code echo. E.g. 10px or 50% */
.remark-code {
  font-size: 70%;
}
/* Size of font in text */
.medium-text {
  font-size: 75%;
}
/* Size of font in tables */
.small-table table {
  font-size: 6px;
}
.medium-table table {
  font-size: 8px;
}
.medium-large-table table {
  font-size: 10px;
}
</style>

# Introduction to R for Applied Epidemiology

### Introduction to data visualization with {ggplot2}

contact@appliedepi.org

---

# Today: objectives & schedule

**In this module we aim to help you:**

* Understand the {ggplot2} "Grammar of graphics"
* Build simple box/scatter/bar plots and histograms
* Adjust the scales, themes, and labels of the plots

Time|Part|Topic
----|----|-----
20 minutes|Part 1|Slides: ggplot2 'Grammar of Graphics'
10 minutes| |Demo
1hr 30 minutes| |Exercise
20 minutes|Part 2|Slides: Scales, themes & labels
1 hour| |Exercise
10 minutes|-|Debrief

Take breaks as you wish during the exercise

---
class: inverse, center, middle

## Data visualization with {ggplot2}

<img
src="../../images/ggplot_intro/ggplot2_hex.png" width="50%" />

---

# Visualization options in R

Today we focus on {ggplot2} because it:

* is well suited to fast exploration of multi-dimensional data
* produces very **high quality** final outputs
* has a well-structured grammar => **high consistency**
* is accompanied by many packages that expand its functionality

See the [R graph gallery](https://www.r-graph-gallery.com/ggplot2-package.html) for inspiration.

.footnote[Other plotting options include [**base** R](https://towardsdatascience.com/base-plotting-in-r-eb365da06b22), [**lattice**](https://www.statmethods.net/advgraphs/trellis.html), and [**plotly**](https://plotly.com/r/).
]

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/clustering.png" width="100%" />

.footnote[Image sources [here](http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html) and [here](https://jcheshire.com/r-spatial-data-hints/great-maps-ggplot2/)]

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/dumbbell_chart.png" width="50%" />

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/map.png" width="50%" />

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/bike_london.png" width="100%" />

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/swiss_map.png" width="90%" />

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/phylo_tree.png" width="60%" />

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/uk_geography.jpg" width="100%" />

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/art_ggplot.png" width="50%" />

---

# Was it made with ggplot?

<img src="../../images/ggplot_intro/van_gogh.jpg" width="80%" />

---

# gg-what??
--

- The {ggplot2} *package* is the most popular data visualization tool in R

--

- Its `ggplot()` *function* is at the core of the package

--

- This whole approach is colloquially known as “ggplotting”

--

- Resulting figures are sometimes affectionately called “ggplots”

--

{ggplot2} is accompanied by numerous packages that extend its functionality, such as {gganimate}, {ggthemr}, {ggdendro}, {gghighlight}, {ggforce}...

.footnote[
*Bonus question:* What does the “gg” in these names represent?
]

???
- “gg” represents the “grammar of graphics” used to construct the figures

---

# Grammar of Graphics

Build a plot by adding layers of functions that specify data and design elements

--

The order usually looks like this:

1) **"Open" the plot** with the `ggplot()` command and **specify the dataset**

--

2) **"Map" data columns** to "aesthetic" plot features (axes, color, size, shape)

--

3) **Display the data** as “geom” layers

--

4) **Modify "scales"**, such as color scale or y-axis break points

--

5) **Adjust non-data "theme" elements** such as axis labels, title, caption, & fonts

These layers are "added" sequentially with **`+`** symbols.

???
Remember that although the commands may be long, they are much easier to edit and reuse than a plot made in Excel

---

# Open the plot

.pull-left[

```r
ggplot()
```

`ggplot()` creates an empty canvas.

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-17-1.png" width="504" />

]

???
This is only a blank canvas; we have not yet defined what should be on the x and y axes. If several data frames are needed, they can be added in their own geoms. Piping is useful to make one-time changes to a dataset prior to plotting.

---

# Add the data

.pull-left[

```r
ggplot(data = surv)
```

Assign the data frame to use.

Alternatively, use the `%>%` pipe operator to "pipe" a data frame *into* `ggplot()`

```r
surv %>%
  ggplot()
```

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-20-1.png" width="504" />

]

???
This is only a blank canvas; we have not yet defined what should be on the x and y axes. If several data frames are needed, they can be added in their own geoms. Piping is useful to make one-time changes to a dataset prior to plotting.

---

# Add the data

.pull-left[

```r
ggplot(
  data = surv)
```

Newlines and indents will not impact the code execution.

They can make longer commands easier to read...

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-22-1.png" width="504" />

]

???
This is only a blank canvas; we have not yet defined what should be on the x and y axes. If several data frames are needed, they can be added in their own geoms. Piping is useful to make one-time changes to a dataset prior to plotting.

---

# Mappings with `aes()`

.pull-left[

```r
ggplot(
  data = surv,
* mapping = aes())
```

Plot "aesthetics" are features like position, color, shape...

`mapping = aes()` maps "aesthetics" to columns in the data.

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-24-1.png" width="504" />

]

???
ggplot commands tend to get very vertical (long)

---

# Mappings with `aes()`

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(
*   x = age_years
  ))
```

Aesthetic mappings are placed within `aes()`.

Two basic mappings are axes to columns, via:

`x = `

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-26-1.png" width="504" />

]

???
ggplot commands tend to get very vertical (long)

---

# Mappings with `aes()`

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(
*   x = age_years,
*   y = ht_cm))
```

Aesthetic mappings are placed within `aes()`.

Two basic mappings are axes to columns, via:

`x = ` and `y = `

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-28-1.png" width="504" />

]

???
ggplot commands tend to get very vertical (long)

---

# Add geometry

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(
    x = age_years,
    y = ht_cm)) +
*geom_point()
```

Data are visualized using "geom" commands, such as `geom_point()`.
These commands are "added" with a **`+`** to the `ggplot()` command.

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-30-1.png" width="504" />

]

---

# Geometries

.pull-left[

Some typical “geoms” include:

Plot type|Geom
---------|----
Histograms|`geom_histogram()`
Points|`geom_point()`

.footnote[Full list [here](https://ggplot2.tidyverse.org/reference/)
]

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-31-1.png" width="504" />

]

---

# Geometries

.pull-left[

Some typical “geoms” include:

Plot type|Geom
---------|----
Lines|`geom_line()`
Bar plots|`geom_bar()` or<br> `geom_col()`

.footnote[The choice between `geom_bar()` and `geom_col()` depends on the structure of your data. Full list of geoms [here](https://ggplot2.tidyverse.org/reference/)]

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-32-1.png" width="504" />

]

---

# Geometries

.pull-left[

Some typical “geoms” include:

Plot type|Geom
---------|----
Boxplots|`geom_boxplot()`
Violin plots|`geom_violin()`

.footnote[Full list [here](https://ggplot2.tidyverse.org/reference/)]

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-33-1.png" width="504" />

]

---

# Adding geoms

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(
    x = age_years,
    y = ht_cm)) +
*geom_point()
```

With axes now mapped, `geom_point()` displays the data as points.
]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-35-1.png" width="504" />

]

---

# Adding geoms

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(
    x = age_years,
    y = ht_cm)) +
geom_point() +
*geom_smooth()
```

We can add additional geoms to the current plot with `+`.

*Geoms appear in the order they are written*: the smoothed line appears over the points.

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-37-1.png" width="504" />

]

.footnote[`geom_smooth()` gives smoothed conditional means, helping to show trends in the presence of "over-plotting" (see [documentation](https://ggplot2.tidyverse.org/reference/geom_smooth.html))]

???
- Explain why you might use one or the other

---

# A quick note on indentations

Indentations, spaces, and newlines do not impact code execution, and can be varied to improve readability.

```r
ggplot(data = surv, mapping = aes(x = age_years, y = ht_cm)) + geom_point()
```

--

is the same as:

```r
ggplot(data = surv,
       mapping = aes(x = age_years, y = ht_cm)) +
geom_point()
```

--

is the same as:

```r
ggplot(
  data = surv,        # use case linelist
  mapping = aes(      # make aesthetic mappings for all geoms
    x = age_years,    # assign x-axis to age column
    y = ht_cm)) +     # assign y-axis to height
geom_point()          # display data as points
```

.footnote[Which do you prefer?]

???
- Which of the above is easiest for you to read?
- Explain why you might use one or the other
- long style can enable informative comments/annotations
- short style is very dense (harder to read for some). Shorter scripts, but so what? The number of lines in your code is not an informative metric.
- very long lines => people with smaller monitors need to scroll horizontally (not nice)
- long-ish style makes it easier to see which argument belongs to each function
- spaces around "=" or "+" make the code easier for many people to parse
- other?
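
A minimal sketch of the layering shown on the "Adding geoms" slides, for presenters who want something runnable without the course data. It assumes only that {ggplot2} is installed. The data frame `demo_linelist` and its values are simulated stand-ins for `surv`, not course data, and `method = "lm"` is chosen here only to keep the example quiet and fast.

```r
library(ggplot2)

# Hypothetical stand-in for the course linelist `surv` (all values simulated)
set.seed(1)
demo_linelist <- data.frame(age_years = runif(100, min = 0, max = 80))
demo_linelist$ht_cm <- 80 + 1.2 * demo_linelist$age_years + rnorm(100, sd = 15)

# Points are added first and the trend line second,
# so the line is drawn on top of the points
demo_plot <- ggplot(
  data = demo_linelist,
  mapping = aes(
    x = age_years,
    y = ht_cm)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x)

# layer_data() returns the data drawn by one layer, numbered in the order added
points_drawn <- layer_data(demo_plot, 1)   # one row per observation
trend_drawn  <- layer_data(demo_plot, 2)   # fitted line with its confidence band

demo_plot
```

Swapping the order of `geom_point()` and `geom_smooth()` draws the points over the line instead, which is an easy way to demonstrate that layer order matters.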
---
class: large-table

# Other aesthetics

Aside from axes, other common "aesthetics" include:

Argument|Controls
--------|--------
`shape` |Display of point as dot, star, triangle, square...
`fill` |The *interior* color (e.g., of a bar or boxplot)
`color` |The *exterior* line of a bar or boxplot, OR the color of a point
`size` |Line thickness, point size...
`alpha` |Transparency: 0 (invisible) to 1 (opaque)
`width` |Width of "bar plot" bars
`linetype` |Either solid, dashed, dotted, etc.
`binwidth` |Width of histogram bins

???
Note that “aesthetic” in ggplot has a specific meaning, different from what you might associate with the word “aesthetics” in everyday English (the general look of a plot). In ggplot those details are called “themes” and are adjusted within a `theme()` command

Each geom accepts certain aesthetics, like `binwidth=` for `geom_histogram()`

---

# Static aesthetic assignment

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(
    x = age_years,
    y = ht_cm)) +
*geom_point(color = "seagreen")
```

**Static** aesthetic assignments are to a **number or character value**.

The change applies to **all** data points.

Written **outside `aes()`**.
Other static examples you might use:

`size = 3`
`alpha = 0.5`
`fill = "purple"`

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-45-1.png" width="504" />

]

---

# Dynamic aesthetic assignment

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(
    x = age_years,
    y = ht_cm,
*   color = hospital)) +
geom_point()
```

**Dynamic** aesthetic assignments are mapped to a **column name**.

This creates **groups** in the plot and generates a legend.

This is written **inside `aes()`**.

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-47-1.png" width="504" />

]

???

---

# Static and dynamic

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(
    x = age_years,
    y = ht_cm,
*   color = hospital)) +
geom_point(
* size = 7,
* alpha = 0.6)
```

Above, `size = 7` and `alpha = 0.6` are assigned statically, outside `aes()`.

`color=` is assigned to column `hospital`, within `aes()`.

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-49-1.png" width="504" />

]

.footnote[Read more about ggplot aesthetics [here](https://ggplot2.tidyverse.org/articles/ggplot2-specs.html)
]

???
As there is only one geom, the `aes()` mappings can be written in either `ggplot()` or `geom_point()`

---

# Facets

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(x = date_onset)) +
geom_histogram() +
*facet_wrap(~hospital)
```

Groups can also be displayed as facets, also called "small multiples".

`facet_wrap()` produces one plot per unique value in the column.

"~" before the column name is like the word "by" (..."by hospital")

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-51-1.png" width="504" />

]

???
Also called "small multiples"

---

# Facets

.pull-left[

```r
ggplot(
  data = surv,
  mapping = aes(x = date_onset)) +
geom_histogram() +
*facet_wrap(~hospital, scales = "free_y")
```

"Free" auto-scaled axes with `scales=`
- "free_y"
- "free_x"
- "free" (both x and y)

]

.footnote[
Alert your audience if you use free axes!
Also, try `ncol=` and `nrow=`
]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-53-1.png" width="504" />

]

---

# Facets + `gghighlight()`

.pull-left[

ggplot extension packages like {gghighlight} are useful.

`gghighlight()` casts a "shadow" behind each facet.

```r
ggplot(
  data = surv,
  mapping = aes(
    x = date_onset,
*   fill = hospital)) +
geom_histogram() +
facet_wrap(~ hospital) +
*gghighlight()
```

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-58-1.png" width="504" />

]

---

# gghighlight

.pull-left[

`gghighlight()` can also highlight specific values in other plot types

```r
surv %>%
  # get weekly counts by hospital
  group_by(
    hospital,
    week = floor_date(date_onset, "week")) %>%
  count() %>%
  # plot
  ggplot(
    mapping = aes(
      x = week,
      y = n,
      color = hospital)) +
  geom_line() +
* gghighlight(
*   hospital == "Port Hospital") +
  theme(legend.position = "none")
```

]

.pull-right[

<img src="intro05-1_files/figure-html/unnamed-chunk-63-1.png" width="504" />

]

.footnote[The code aggregates cases by week and hospital, and passes counts to ggplot]

???
Here we create a data frame of cases per week per hospital, and plot it with `geom_line()`. The highlight is applied to Port Hospital.

---
class: inverse, center, middle

## Exercise

Go to the course website

Open the first exercise for Module 5, and log in

Follow the instructions to open your "ebola" R project and continue coding

Let an instructor know if you are unsure what to do

<img src="../../images/breakout/Safety Match - COVID artwork.png" width="50%" />
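
???
For instructors who want a runnable recap before the exercise, the sketch below combines the static, dynamic, and facet slides in one plot. It assumes only that {ggplot2} is installed. The data frame `demo_linelist`, its hospital names, and its dates are simulated for illustration and are not the course `surv` data.

```r
library(ggplot2)

# Hypothetical stand-in for the course linelist `surv` (all values simulated)
set.seed(42)
demo_linelist <- data.frame(
  date_onset = as.Date("2014-06-01") + sample(0:120, size = 300, replace = TRUE),
  hospital = sample(
    c("Hospital A", "Hospital B", "Hospital C"),
    size = 300,
    replace = TRUE,
    prob = c(0.6, 0.3, 0.1)))

demo_facets <- ggplot(
  data = demo_linelist,
  mapping = aes(
    x = date_onset,
    fill = hospital)) +   # dynamic: mapped to a column, inside aes()
  geom_histogram(
    bins = 20,            # set explicitly to avoid the default-bins message
    alpha = 0.8) +        # static: one fixed value, outside aes()
  facet_wrap(~ hospital, scales = "free_y", ncol = 1)

# One panel is built per unique value of the faceting column
panels_built <- unique(layer_data(demo_facets, 1)$PANEL)

demo_facets
```

Because `scales = "free_y"` is used, each panel has its own y-axis, so the smallest hospital is still readable; this is the situation in which the audience should be told that the axes differ.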