Data Visualization

Chapter 3. Data Visualization in R

Iñaki Úcar

Department of Statistics | uc3m-Santander Big Data Institute

Master in Computational Social Science

Licensed under Creative Commons Attribution CC BY 4.0
Last generated: 2023-01-25

Directory of Visualizations

Based on The R Graph Gallery

Distribution

Violin, Density, Histogram, Boxplot, Ridgeline

  • Visualization of one or multiple univariate distributions
  • Stacked versions are difficult to interpret and should be avoided
  • Some require fine-tuning of their parameters (e.g., number of bins, kernel bandwidth) to avoid being misleading

Distribution Histogram

library(ggplot2)
# Histogram of highway mileage (hwy) with the default 30 bins
ggplot(mpg) +
  aes(hwy) +
  geom_histogram()
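
The call above falls back on the default of 30 bins, and ggplot2 prints a message suggesting a deliberate choice. A minimal sketch of setting the bin width explicitly; the value of 2 mpg per bar and the names `p_binwidth` and `bins` are illustrative assumptions, not taken from the slides:

```r
library(ggplot2)

# Assumption: 2 mpg per bar; pick a width that suits the data at hand
p_binwidth <- ggplot(mpg) +
  aes(hwy) +
  geom_histogram(binwidth=2)

# layer_data() returns the bins computed for the layer
bins <- layer_data(p_binwidth)
range(bins$xmax - bins$xmin)
```

Every bar then spans the same, explicitly chosen interval, which makes the plot easier to describe and reproduce.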

Distribution Histogram

# Plot densities instead of counts: the bar areas sum to 1
ggplot(mpg) +
  aes(hwy, after_stat(density)) +
  geom_histogram()
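
Mapping `after_stat(density)` to the y axis rescales the bars so that the total area of the histogram is 1, which is what makes it comparable with a density curve. A quick check of that property, as a sketch that relies only on ggplot2's `layer_data()` helper (the names `p_density`, `bars` and `area` are illustrative):

```r
library(ggplot2)

p_density <- ggplot(mpg) +
  aes(hwy, after_stat(density)) +
  geom_histogram(bins=30)

bars <- layer_data(p_density)
# Area of each bar = height (density) times bin width
area <- sum(bars$density * (bars$xmax - bars$xmin))
area
```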

Distribution Histogram

# Stack the bars by class; legend inside the top-right corner
ggplot(mpg) +
  aes(hwy, fill=class) +
  geom_histogram() +
  theme(legend.position=c(1, 1),
        legend.justification=c(1, 1))

Distribution Histogram

# One panel (row) per class instead of stacked bars
ggplot(mpg) +
  aes(hwy) +
  geom_histogram() +
  facet_grid(class ~ .)

Distribution Histogram

# Order the panels from highest to lowest median hwy
ggplot(mpg) +
  aes(hwy) +
  geom_histogram() +
  facet_grid(
    reorder(class, -hwy, median) ~ .)
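
To see what `reorder()` does to the factor levels, it can be applied outside of the plot. A small sketch (the names `by_median` and `medians` are illustrative):

```r
library(ggplot2)

# reorder() sorts the levels by increasing value of the summary, so
# negating hwy yields levels in order of decreasing median hwy
by_median <- reorder(mpg$class, -mpg$hwy, median)
levels(by_median)

# Median hwy per class, listed in the new level order
medians <- tapply(mpg$hwy, by_median, median)
medians
```

Since `facet_grid()` lays out rows in level order, the class with the highest median mileage ends up in the top panel.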

Distribution Density

# Kernel density estimate with the default bandwidth
ggplot(mpg) +
  aes(hwy) +
  geom_density()

Distribution Density

# Density curve on top of a density-scaled histogram
ggplot(mpg) +
  aes(hwy, after_stat(density)) +
  geom_histogram(fill="gray") +
  geom_density()

Distribution Density

# adjust=0.2 shrinks the bandwidth to 20% of the default: less smoothing
ggplot(mpg) +
  aes(hwy, after_stat(density)) +
  geom_histogram(fill="gray") +
  geom_density(adjust=0.2)
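
The `adjust` argument is a multiplier on the automatically selected bandwidth. A sketch using `stats::density()`, which ggplot2 calls internally for `geom_density()` (the names `default_fit` and `narrow_fit` are illustrative):

```r
library(ggplot2)  # only needed for the mpg dataset

default_fit <- density(mpg$hwy)
narrow_fit <- density(mpg$hwy, adjust=0.2)

# adjust multiplies the automatically selected bandwidth
c(default=default_fit$bw, narrow=narrow_fit$bw)
```

A smaller bandwidth follows the data more closely and produces a wigglier curve; a larger one smooths out local detail.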

Distribution Density

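# The 30-bin layer is invisible (fill=NA) but still affects the axis limits
# The visible histogram uses only 10 bins
ggplot(mpg) +
  aes(hwy, after_stat(density)) +
  geom_histogram(fill=NA) +
  geom_histogram(fill="gray", bins=10) +
  geom_density()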
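
The number of bins changes the summary, not the data. A sketch that compares the bins computed for the two settings used above (the names `p_base`, `fine` and `coarse` are illustrative):

```r
library(ggplot2)

p_base <- ggplot(mpg) + aes(hwy)
fine <- layer_data(p_base + geom_histogram(bins=30))
coarse <- layer_data(p_base + geom_histogram(bins=10))

# Same observations, different summaries
c(fine=nrow(fine), coarse=nrow(coarse))
c(fine=sum(fine$count), coarse=sum(coarse$count))
c(fine=max(fine$count), coarse=max(coarse$count))
```

Both versions account for every observation, but the coarse one hides local structure that the fine one shows, and the fine one may show noise that the coarse one averages out.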

Distribution Boxplot

# One horizontal boxplot per class, without the y-axis title
ggplot(mpg) +
  aes(hwy, class) +
  geom_boxplot() +
  labs(y=NULL)

Distribution Boxplot

# Sort the classes by median hwy
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_boxplot() +
  labs(y=NULL)

Distribution Boxplot

# Draw the outliers in red
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_boxplot(outlier.color="red") +
  labs(y=NULL)
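
The points drawn as outliers are those lying more than 1.5 times the interquartile range beyond the edges of the box, which is the default `coef` of `geom_boxplot()`. A sketch of that rule; as a simplifying assumption it pools all classes, whereas the plot applies the rule within each class:

```r
library(ggplot2)  # only needed for the mpg dataset

x <- mpg$hwy
q <- unname(quantile(x, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Fences at 1.5 times the IQR beyond the box edges
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
outliers <- x[x < lower | x > upper]
sort(outliers)
```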

Distribution Boxplot

# Overlay the individual observations, jittered to reduce overplotting
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_boxplot(outlier.color="red") +
  geom_jitter(height=0.2, alpha=0.5) +
  labs(y=NULL)

Distribution Boxplot

# varwidth=TRUE: box width proportional to the square root of the group size
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_boxplot(outlier.color="red",
               varwidth=TRUE) +
  geom_jitter(height=0.2, alpha=0.5) +
  labs(y=NULL)
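
With `varwidth=TRUE`, box widths are proportional to the square root of the number of observations in each group. A sketch of the relative widths this implies for the classes in `mpg` (the names `n` and `relative_width` are illustrative):

```r
library(ggplot2)  # only needed for the mpg dataset

n <- table(mpg$class)
# Widths relative to the largest group
relative_width <- sqrt(n) / max(sqrt(n))
round(relative_width, 2)
```

The square root keeps small groups visible while still signalling that their summaries rest on fewer observations.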

Distribution Boxplot

# Split each class by drive type (drv), mapped to the outline color
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_boxplot(aes(color=drv)) +
  labs(y=NULL) +
  theme(legend.position=c(1, 0),
        legend.justification=c(1, 0))

Distribution Boxplot

# Same split, with drv mapped to the fill color
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_boxplot(aes(fill=drv)) +
  labs(y=NULL) +
  theme(legend.position=c(1, 0),
        legend.justification=c(1, 0))

Distribution Boxplot

# Add varwidth=TRUE so that box widths reflect group sizes
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_boxplot(aes(fill=drv),
               varwidth=TRUE) +
  labs(y=NULL) +
  theme(legend.position=c(1, 0),
        legend.justification=c(1, 0))

Distribution Violin

# Violin plot: a mirrored density estimate per class
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_violin() +
  labs(y=NULL)

Distribution Violin

# Fill by class; the legend is redundant with the axis, so hide it
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_violin(aes(fill=class)) +
  labs(y=NULL) +
  theme(legend.position="none")

Distribution Violin

# Use the discrete viridis palette
ggplot(mpg) +
  aes(hwy, reorder(class, hwy, median)) +
  geom_violin(aes(fill=class)) +
  scale_fill_viridis_d() +
  labs(y=NULL) +
  theme(legend.position="none")

Distribution Violin

library(dplyr)
# Reorder class in the data so that the fill colors follow the sorted order
mpg |> mutate(
  class=reorder(class, hwy, median)) |>
  ggplot() +
  aes(hwy, class) +
  geom_violin(aes(fill=class)) +
  scale_fill_viridis_d() +
  labs(y=NULL) +
  theme(legend.position="none")

Distribution Ridgeline

# Ridgeline plot: one density curve per class (requires the ggridges package)
mpg |> mutate(
  class=reorder(class, hwy, median)) |>
  ggplot() +
  aes(hwy, class, fill=class) +
  ggridges::geom_density_ridges() +
  scale_fill_viridis_d() +
  labs(y=NULL) +
  theme(legend.position="none")

Distribution Ridgeline

# Gradient fill that follows the x value along each ridge
mpg |> mutate(
  class=reorder(class, hwy, median)) |>
  ggplot() +
  aes(hwy, class, fill=after_stat(x)) +
  ggridges::geom_density_ridges_gradient() +
  scale_fill_viridis_c() +
  labs(y=NULL) +
  theme(legend.position="none")

Distribution Ridgeline

# scale=1: the tallest ridge just touches the baseline above, without overlap
mpg |> mutate(
  class=reorder(class, hwy, median)) |>
  ggplot() +
  aes(hwy, class, fill=after_stat(x)) +
  ggridges::geom_density_ridges_gradient(
    scale=1) +
  scale_fill_viridis_c() +
  labs(y=NULL) +
  theme(legend.position="none")

Distribution Ridgeline

# Add vertical lines at the quartiles of each class
mpg |> mutate(
  class=reorder(class, hwy, median)) |>
  ggplot() +
  aes(hwy, class, fill=after_stat(x)) +
  ggridges::geom_density_ridges_gradient(
    scale=1, quantile_lines=TRUE) +
  scale_fill_viridis_c() +
  labs(y=NULL) +
  theme(legend.position="none")
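
As far as the ggridges defaults go, `quantile_lines=TRUE` with no further arguments marks the quartiles of each group, computed from the data with `quantile()`. A sketch of the positions one would expect the lines to mark, assuming those defaults (the name `quartiles` is illustrative):

```r
library(ggplot2)  # only needed for the mpg dataset

# One column per class, one row per quartile
quartiles <- sapply(
  split(mpg$hwy, mpg$class),
  quantile, probs=c(0.25, 0.5, 0.75))
quartiles
```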

Distribution Ridgeline

# Fill by tail probability, computed from the ECDF of each ridge; show the legend
mpg |> mutate(
  class=reorder(class, hwy, median)) |>
  ggplot() +
  aes(hwy, class, fill=0.5 - abs(
    0.5 - after_stat(ecdf))) +
  ggridges::geom_density_ridges_gradient(
    scale=1, calc_ecdf=TRUE) +
  scale_fill_viridis_c("Tail prob.") +
  labs(y=NULL) +
  theme(legend.position=c(1, 0),
        legend.justification=c(1, 0))
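
The fill expression `0.5 - abs(0.5 - ecdf)` is the smaller of the two tail areas at each point, so it peaks at 0.5 around the median and falls to 0 in both tails. A sketch that checks this identity on the empirical CDF of `hwy`; note the assumption that, as far as I can tell, ggridges derives its `ecdf` from the density estimate of each group, so the plotted values will differ somewhat from these:

```r
library(ggplot2)  # only needed for the mpg dataset

# Empirical CDF evaluated at the observed values (illustration only)
p <- ecdf(mpg$hwy)(sort(unique(mpg$hwy)))

tail_prob <- 0.5 - abs(0.5 - p)
# Equivalent formulation: the smaller of the lower and upper tail areas
all.equal(tail_prob, pmin(p, 1 - p))
range(tail_prob)
```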
