class: center, middle, inverse, title-slide

.title[
# Data Visualization
]
.subtitle[
## Chapter 2. The Grammar of Graphs in R
]
.author[
### Iñaki Úcar
]
.institute[
### Department of Statistics | uc3m-Santander Big Data Institute
]
.institute[
### Master in Computational Social Science
]
.date[
###
Licensed under Creative Commons Attribution
CC BY 4.0
Last generated: 2023-11-14
]

---
class: toc, base24, middle, clear

---
class: intoc, inverse, center, middle

# Building Graphs Layer by Layer

---
class: base24

# An Object-Oriented Graphics System

.footnote[Wilkinson, L. (2005) _**The grammar of graphics**_. Springer New York.]

--

- Graphics are collections of **objects** that follow a set of rules, a **grammar**, so that they behave consistently and flexibly.

--

- The specification of the formal language is expressed in six statements:
  1. **DATA**: a set of data operations that create variables from datasets,
  2. **TRANS**: variable transformations (e.g., rank),
  3. **SCALE**: scale transformations (e.g., log),
  4. **COORD**: a coordinate system (e.g., polar),
  5. **ELEMENT**: marks (e.g., points) and their aesthetic attributes (e.g., color),
  6. **GUIDE**: one or more guides (axes, legends, etc.).

--

- These components link data to (visual) objects and specify a scene containing them.

---

# An Object-Oriented Graphics System

.center[![:scale 85%](assets/img/ch2/specification-example.png)]

---

# An Object-Oriented Graphics System

.center[![:scale 100%](assets/img/ch2/specification-tree.png)]

---
class: base24

# About ggplot2

- An R package for producing statistical graphics
- Underlying grammar based on the **Grammar of Graphics** (thus GG)

--

- Instead of being limited to sets of pre-defined graphics, it lets you **compose** graphs by combining (adding, `+`) components

--

- Simple set of core principles (+ a few special cases)
- Carefully chosen defaults

--

- Good for **quick prototyping**, designed to work iteratively
- But also **publication-quality graphics**, with a comprehensive theming system

--

- Lots of [extensions](https://exts.ggplot2.tidyverse.org/)!
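---

# About ggplot2

A minimal sketch of composing a graph by adding components with `+`. The `mpg` dataset ships with ggplot2; the particular geoms, scale and theme are illustrative choices rather than a recommended recipe.

```r
library(ggplot2)

# Each `+` adds one component to the plot object.
p <- ggplot(mpg, aes(displ, hwy)) +             # data and general mapping
  geom_point(aes(color = class)) +              # layer: points colored by vehicle class
  geom_smooth(method = "lm", formula = y ~ x) + # layer: linear fit drawn on top
  scale_x_log10() +                             # scale: log-transformed x axis
  theme_minimal()                               # theme: lighter look

p  # printing the object renders the plot
```

Because the plot is an ordinary R object, it can be stored, extended with further `+` calls and printed again, which is what makes the iterative workflow convenient.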
---
class: base24

# ggplot2 Basics

- Requires [**tidy data**](https://vita.had.co.nz/papers/tidy-data.html): 1 observation per row, 1 variable per column:

|country   |continent | year| lifeExp|      pop|  gdpPercap|
|:---------|:---------|----:|-------:|--------:|----------:|
|Mauritius |Africa    | 1962|  60.246|   701016|  2529.0675|
|Indonesia |Asia      | 1957|  39.918| 90124000|   858.9003|
|Italy     |Europe    | 1977|  73.480| 56059245| 14255.9847|

--

- All plots are composed of **data** and a **mapping**, the description of how data attributes are mapped to **aes**thetic attributes (channels).

--

- Basic workflow:

```r
ggplot(data) +                   # create the graphic object with the data
  aes(x=..., y=..., color=...) + # add the general mapping
  ...                            # add more components (geoms, scales, coords, facets, themes...)
```

---
class: base24

# ggplot2 Basics

There are five types of components:

--

- A _layer_ is a collection of **geom**etric elements (points, lines...) and **stat**istical transformations (binning, counting...).

--

- A **scale** controls a channel: it sets or modifies how data attributes are mapped to it (position, color, shape, size...).

--

- A **coord**inate system describes how data coordinates are mapped to the plane of the graphic. It also provides axes and gridlines.

--

- A **facet** specifies how to break up and display subsets of data as small multiples (AKA _conditioning_, _latticing_ or _trellising_).

--

- A **theme** controls the finer points of display to create attractive plots (background, fonts, guide aspect and positioning...).

---
class: inverse, center, middle

# Tutorial 01
## [Building Graphs Layer by Layer](../tutorials/01/)

---

# Aesthetics Specification

.footnote[Read the comprehensive [guide on aesthetics](https://ggplot2.tidyverse.org/articles/ggplot2-specs.html).]

.font120[
- Mastering data **mappings** is an important (the most important?) skill.
]

--

.font120[
- Each **geom** understands **a different set of aesthetics**:
]

.pull-left[
From `?geom_point` (required in bold):

> - **`x`**
> - **`y`**
> - `alpha`
> - `colour`
> - .blue[`fill`]
> - `group`
> - .red[`shape`]
> - `size`
> - .blue[`stroke`]
]

.pull-right[
From `?geom_line` (required in bold):

> - **`x`**
> - **`y`**
> - `alpha`
> - `colour`
> - `group`
> - .red[`linetype`]
> - `size`
]

---

# Individual Geoms

| Geom | Result | Details |
|---|---|---|
| `geom_point()`<br>`geom_text()`<br>`geom_label()` | **scatterplot** | Understands `shape`.<br>Helper for text.<br>Helper for labels. |
| `geom_line()`<br>`geom_path()`<br>`geom_step()`<br>`geom_function()` | **line plot** | Connects points from left to right, understands `linetype`.<br>Connects points in order.<br>Produces a _stairstep_ plot.<br>Connects points of a given function of `x`. |
| `geom_bar()`<br>`geom_col()` | **bar chart** | `stat="count"` by default!<br>Bar heights taken from `y`. In both, multiple bars are stacked by default. |
| `geom_area()` | **area plot** | Line plot filled from 0 to `y`. |
| `geom_polygon()` | | Filled path. |
| `geom_rect()`<br>`geom_tile()`<br>`geom_raster()` | | Rectangle by `xmin`, `xmax`, `ymin`, `ymax`.<br>Rectangle by center (`x`, `y`) and size (`width`, `height`).<br>Faster tiles with constant size. |

---

# Individual Geoms

- Two dimensional: require `x` and `y`, understand `color` and `size`.
- Some of them can be `fill`ed.

<img src="ch2_files/figure-html/individual-1.png" style="display: block; margin: auto;" /><img src="ch2_files/figure-html/individual-2.png" style="display: block; margin: auto;" />

---

# Collective Geoms

- Dealing with point overplotting

| Geom | Result | Details |
|---|---|------|
| `geom_jitter()`<br>`geom_count()`<br>`geom_bin_2d()`<br>`geom_hex()` | | `geom_point()`, but adds some jitter to each point.<br>Maps the count of overlapping points to `size`.<br>Maps the count of points in each rectangle to `fill`.<br>Same, but using hexagons. |

<img src="ch2_files/figure-html/collective-1-1.png" style="display: block; margin: auto;" />

---

# Collective Geoms

- Dealing with uncertainty

| Geom | Result | Details |
|---|---|------|
| `geom_pointrange()`<br>`geom_linerange()`<br>`geom_errorbar()`<br>`geom_crossbar()` | | Various ways of representing a vertical interval defined by `x`, `ymin` and `ymax`. |
| `geom_ribbon()` | | Generalization of `geom_area()`: fills the band between `ymin` and `ymax`. |

<img src="ch2_files/figure-html/collective-2-1.png" style="display: block; margin: auto;" />

---

# Collective Geoms

- Arbitrary segments

| Geom | Result | Details |
|---|---|------|
| `geom_segment()`<br>`geom_curve()`<br>`geom_spoke()` | | Straight line between points (`x`, `y`) and (`xend`, `yend`).<br>Same, but curved line.<br>Polar parametrization of `geom_segment()`. |

<img src="ch2_files/figure-html/collective-3-1.png" style="display: block; margin: auto;" />

---

# Collective Geoms

- Distributions

| Geom | Result | Details |
|---|---|------|
| `geom_histogram()`<br>`geom_freqpoly()`<br>`geom_dotplot()` | **histogram** | Distribution of a continuous variable by bins.<br>Same, but displays the counts with lines instead.<br>Histograms of stacked dots. |
| `geom_density()` | **density plot** | Smoothed version of the histogram. |
| `geom_rug()` | | Draws ticks for marginal distributions. |

<img src="ch2_files/figure-html/collective-4-1.png" style="display: block; margin: auto;" />

---

# Collective Geoms

- Boxplots

| Geom | Result | Details |
|---|---|------|
| `geom_boxplot()`<br>`geom_violin()` | **boxplot** | Compact display of the distribution of a continuous variable.<br>Mirrored density, displayed like a boxplot. |

<img src="ch2_files/figure-html/collective-5-1.png" style="display: block; margin: auto;" />

---

# Collective Geoms

- Smoothing lines

| Geom | Result | Details |
|---|---|------|
| `geom_smooth()`<br>`geom_quantile()` | | Fits a model and draws a smoothing line.<br>Fits a quantile regression and draws the quantiles. |

<img src="ch2_files/figure-html/collective-6-1.png" style="display: block; margin: auto;" />

---

# Collective Geoms

- Contours

| Geom | Result | Details |
|---|---|------|
| `geom_contour()`<br>`geom_contour_filled()`<br>`geom_density_2d()`<br>`geom_density_2d_filled()` | **contour plot** | 2D contours of a 3D surface over a regular grid of `x`, `y`.<br>Filled version.<br>2D contours after computing the density.<br>Filled version. |

<img src="ch2_files/figure-html/collective-7-1.png" style="display: block; margin: auto;" />

---

# Collective Geoms

- Maps

| Geom | Result | Details |
|---|---|------|
| `geom_map()`<br>`geom_sf()`<br>`geom_sf_text()`<br>`geom_sf_label()` | **map** | Old way to plot polygons as a map.<br>Current recommended way via `sf`.<br>Similar to `geom_text()` but for `sf`.<br>Similar to `geom_label()` but for `sf`. |

<img src="ch2_files/figure-html/collective-8-1.png" style="display: block; margin: auto;" />

---

# Geom vs. Stat

.pull-left[
```r
ggplot(mpg, aes(displ, hwy)) +
* geom_point(stat="identity")
```

<img src="ch2_files/figure-html/stat-identity-1-1.png" style="display: block; margin: auto;" />
]

.pull-right[
```r
ggplot(mpg, aes(displ, hwy)) +
* stat_identity(geom="point")
```

<img src="ch2_files/figure-html/stat-identity-2-1.png" style="display: block; margin: auto;" />
]

---

# Geom vs. Stat

.pull-left[
```r
ggplot(mpg, aes(hwy)) +
* geom_bar(stat="count")
```

<img src="ch2_files/figure-html/stat-count-1-1.png" style="display: block; margin: auto;" />
]

.pull-right[
```r
ggplot(mpg, aes(hwy)) +
* stat_count(geom="bar")
```

<img src="ch2_files/figure-html/stat-count-2-1.png" style="display: block; margin: auto;" />
]

---

# Geom vs. Stat

.pull-left[
```r
ggplot(mpg, aes(displ, hwy)) +
* geom_smooth(stat="smooth")
```

<img src="ch2_files/figure-html/stat-smooth-1-1.png" style="display: block; margin: auto;" />
]

.pull-right[
```r
ggplot(mpg, aes(displ, hwy)) +
* stat_smooth(geom="smooth")
```

<img src="ch2_files/figure-html/stat-smooth-2-1.png" style="display: block; margin: auto;" />
]

---
class: intoc, inverse, center, middle

# Scales and Guides

---
class: base24

# Scale Specification

A **scale** is a procedure that performs the mapping of data attributes into channels (position, color, size...):

- sets the **limits**;
- sets an optional **transformation** (without modifying the data);
- sets a **guide**.

--

.pull-left[
A **guide** allows us to reverse the procedure and recover the data:

- an axis or a legend, depending on the channel;
- has a name, breaks, labels...
]

.pull-right[
.center[![:scale 100%](assets/img/ch2/scale-guides.png)]
]

---

# Scale Specification

.font150[
Naming: `scale_<aes>_<type>(<arguments>)`
]

![:vspace 40]()

.font120[
| Element | Argument | Shortcut function |
|--------:|:---------|:------------------|
| Title | `name=...` | `labs(x=..., y=..., color=..., ...)` |
| Limits | `limits=...` | `lims(x=..., y=..., color=..., ...)` |
| Breaks | `breaks=...` | |
| Labels | `labels=...` | |
| Guide | `guide=...` | `guides(x=..., y=..., color=..., ...)` |
| Transformation | `trans=...` | |
]

---
class: inverse, center, middle

# Tutorial 02
## [Scales and Guides](../tutorials/02/)

---
class: intoc, inverse, center, middle

# Coordinate Systems

---
class: base24

# Cartesian Coordinates

![:vspace 30]()

`coord_cartesian()`: default, no need to be specified

- ... although it is useful to set axis limits (via `xlim` and `ylim` arguments).
- Position given by orthogonal distances, `x` and `y`, to an origin.

--

Some helper functions:

- `coord_flip()`: helper to flip the axes.
- `coord_fixed()`: helper to fix the aspect ratio.
- `coord_trans()`: helper to transform the axes.
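---

# Cartesian Coordinates

A short sketch of why setting limits through the coordinate system can be useful, compared with setting them on the scale. The limits `c(2, 4)` and the object names are arbitrary, illustrative choices.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# Limits on the coordinate system: a pure zoom, every observation is kept.
zoomed <- base + coord_cartesian(xlim = c(2, 4))

# Limits on the scale: out-of-range values are turned into NA before drawing.
clipped <- base + scale_x_continuous(limits = c(2, 4))

# Compare how many usable x values each version keeps in its point layer.
n_zoomed  <- sum(!is.na(layer_data(zoomed, 1)$x))
n_clipped <- sum(!is.na(layer_data(clipped, 1)$x))
c(zoomed = n_zoomed, clipped = n_clipped)
```

The difference matters most when a statistical layer such as `geom_smooth()` is involved, since a scale limit changes the data the statistic is computed on, while a coordinate limit only changes the visible window.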
---
class: base24

# Other Coordinates

![:vspace 100]()

- `coord_polar()`: `x` is the angle, `y` is the radius (can be swapped).
- `coord_map()`: projections of the sphere onto a plane.
  - Mercator, sinusoidal, cylindrical, rectangular...
  - Anything supported by the `mapproj` package.
- `coord_sf()`: modern way to deal with maps via **simple features**<br> (from the `sf` package).

---
class: inverse, center, middle

# Tutorial 03
## [Coordinate Systems](../tutorials/03/)

---
class: intoc, inverse, center, middle

# Faceting

---
class: base24

# Facet Specification

![:vspace 50]()

.center[![:scale 80%](assets/img/ch2/position-facets.png)]

---

# Facet Specification

.pull-left[
<img src="ch2_files/figure-html/facet-grid-1.png" style="display: block; margin: auto;" />
]

.pull-right[
<img src="ch2_files/figure-html/facet-wrap-1.png" style="display: block; margin: auto;" />
]

---
class: inverse, center, middle

# Tutorial 04
## [Faceting](../tutorials/04/)

---
class: intoc, inverse, center, middle

# Themes

---
class: base24

# Theme Specification

.footnote[Source: [ggplot2 Theme Elements Demonstration](https://henrywang.nl/ggplot2-theme-elements-demonstration/) by Henry Wang]

.center[![:scale 90%](assets/img/ch2/theme-elements.png)]

---
class: inverse, center, middle

# Tutorial 05
## [Themes](../tutorials/05/)

---
class: intoc, inverse, center, middle

# Annotations

---
class: base24

# Types of Annotations

.pull-left[

]

.pull-right[
<img src="ch2_files/figure-html/annotations0-1.png" style="display: block; margin: auto;" />
]

---
class: base24

# Types of Annotations

.pull-left[
- Guides (axes and legend)
]

.pull-right[
<img src="ch2_files/figure-html/annotations1-1.png" style="display: block; margin: auto;" />
]

---
class: base24

# Types of Annotations

.pull-left[
- Guides (axes and legend)
- Titles (title, subtitle and caption)
]

.pull-right[
<img src="ch2_files/figure-html/annotations2-1.png" style="display: block; margin: auto;" />
]

---
class: base24

# Types of Annotations

.pull-left[
- Guides (axes and legend)
- Titles (title, subtitle and caption)
- Text labels
]

.pull-right[
<img src="ch2_files/figure-html/annotations3-1.png" style="display: block; margin: auto;" />
]

---
class: base24

# Types of Annotations

.pull-left[
- Guides (axes and legend)
- Titles (title, subtitle and caption)
- Text labels
- Reference lines
]

.pull-right[
<img src="ch2_files/figure-html/annotations4-1.png" style="display: block; margin: auto;" />
]

---
class: base24

# Types of Annotations

.pull-left[
- Guides (axes and legend)
- Titles (title, subtitle and caption)
- Text labels
- Reference lines
- Reference areas
]

.pull-right[
<img src="ch2_files/figure-html/annotations5-1.png" style="display: block; margin: auto;" />
]

---
class: base24

# Types of Annotations

.pull-left[
- Guides (axes and legend)
- Titles (title, subtitle and caption)
- Text labels
- Reference lines
- Reference areas
- Direct labeling
]

.pull-right[
<img src="ch2_files/figure-html/annotations6-1.png" style="display: block; margin: auto;" />
]

---
class: inverse, center, middle

# Tutorial 06
## [Annotations](../tutorials/06/)

---
class: intoc, inverse, center, middle

# Arranging Plots

---
class: base24

# Types of Arrangements

.pull-left[
- Compositions

<img src="ch2_files/figure-html/compositions-1.png" style="display: block; margin: auto;" />
]

.pull-right[
- Insets

<img src="ch2_files/figure-html/insets-1.png" style="display: block; margin: auto;" />
]

---
class: base24

# Panel Alignment

.pull-left[
- None

<img src="ch2_files/figure-html/alignment-none-1.png" style="display: block; margin: auto;" /><img src="ch2_files/figure-html/alignment-none-2.png" style="display: block; margin: auto;" />
]

.pull-right[
- With [`patchwork`](https://patchwork.data-imaginist.com/)

<img src="ch2_files/figure-html/alignment-patchwork-1.png" style="display: block; margin: auto;" />
]

---
class: inverse, center, middle

# Tutorial 07
## [Arranging Plots](../tutorials/07/)
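
---

# Types of Arrangements

A rough sketch of the two arrangement types using `patchwork`, assuming the package is installed (it is not part of ggplot2). The plots `p1` and `p2` and the inset position are hypothetical, chosen only for illustration.

```r
library(ggplot2)
library(patchwork)  # assumption: installed separately from ggplot2

p1 <- ggplot(mpg, aes(displ, hwy)) + geom_point()
p2 <- ggplot(mpg, aes(class)) + geom_bar()

# Composition: `|` places plots side by side, `/` stacks them vertically.
composition <- (p1 | p2) / p1

# Inset: p2 drawn inside p1, positioned in relative panel coordinates (0 to 1).
inset <- p1 + inset_element(p2, left = 0.5, bottom = 0.5, right = 1, top = 1)

composition  # printing renders the arrangement with aligned panels
```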