Programming Interactive Data Visualization with R
1. Learning Outcome
In this hands-on exercise, we will learn to create interactive data visualizations using functions provided by the ggiraph and plotly packages in R.
2. Getting Started
2.1 Installing and loading the required libraries
First, let’s install and load the required packages:
tidyverse: an opinionated collection of R packages designed for data import, data wrangling and data exploration
patchwork: an R package for preparing composite figures created using ggplot2
ggiraph: to make ggplot graphs interactive
plotly: to plot interactive statistical graphs
DT: provides an R interface to the JavaScript library DataTables, which creates interactive tables on an HTML page
pacman::p_load(tidyverse, patchwork, ggiraph, plotly, DT)
2.2 Importing the data
Similar to the previous two hands-on exercises, we’ll use Exam_data again for this exercise. The data file contains the year-end examination grades of a cohort of Primary 3 students from a local school, and it is in CSV format.
Let’s start by importing the data.
exam_data <- read_csv("../../Data/Exam_data.csv")
3. Interactive Data Visualisation - ggiraph methods
In this section, we’ll learn how to make ggplot graphs interactive. Interactivity makes the graphs easier to digest, especially when we use them for storytelling.
This can be achieved by using the interactive ggplot geometries provided by ggiraph, which understand three additional aesthetics:
tooltip: a column of the data set that contains the tooltips to be displayed when we mouse over the elements
onclick: a column of the data set that contains a JavaScript function to be executed when the elements are clicked
data_id: a column of the data set that contains an id to be associated with the elements
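To see how the three aesthetics fit together, here is a minimal sketch on the built-in mtcars data set rather than on Exam_data. The `cars` data frame, its `onclick` column and the `alert()` call are illustrative assumptions, not part of the original exercise.

```r
library(ggplot2)
library(ggiraph)

# Toy data: the built-in mtcars data set, with the car name as a column
cars <- data.frame(model = rownames(mtcars), mtcars, row.names = NULL)

# Assumption: a simple JavaScript alert, used only to illustrate onclick
cars$onclick <- sprintf("alert(\"You clicked %s\")", cars$model)

p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point_interactive(
    aes(tooltip = model,               # text shown on hover
        data_id = as.character(cyl),   # points sharing an id are highlighted together
        onclick = onclick),            # JavaScript executed on click
    size = 2)

# Wrap the ggplot object in an interactive SVG widget
widget <- girafe(ggobj = p, width_svg = 6, height_svg = 6 * 0.618)
widget
```

Hovering over a point should show the car model and highlight all cars with the same number of cylinders, and clicking should trigger the alert.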
3.1 Tooltip effect with tooltip aesthetic
3.1.1 Create a graph with ONE tooltip
It takes two steps to create an interactive ggplot graph:
Step 1: create a graph using standard ggplot syntax. The only difference is that we use the interactive version of a ggplot2 geom. According to the ggiraph website, almost all ggplot2 elements can be made interactive. The code chunk below uses geom_dotplot_interactive() as an example.
p <- ggplot(data = exam_data,
aes(x = MATHS)) +
geom_dotplot_interactive(
aes(tooltip = ID), # to display ID when mouse over the dots in the graph
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
scale_y_continuous(NULL, breaks = NULL)
Step 2: generate an SVG object to be displayed on an HTML page
girafe(
ggobj = p,
width_svg = 6,
height_svg = 6 * 0.618
)
When we hover over a dot in the graph above, the respective student ID is displayed.
3.1.2 Create a graph with MULTIPLE tooltip
The example above displays only one piece of information, the student ID, when we mouse over a dot. We can display more information by building a tooltip column that concatenates several fields, as shown in the code chunk below.
# Step 1: create a column in exam_data that stores the text to display, built by concatenating several columns
exam_data$tooltip <- paste0(
"Name = ", exam_data$ID,
"\n Class = ", exam_data$CLASS)
# Step 2: create a standard ggplot2 object
p <- ggplot(data = exam_data,
aes(x = MATHS)) +
geom_dotplot_interactive(
aes(tooltip = tooltip), # map the newly created tooltip column to the tooltip aesthetic
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
scale_y_continuous(NULL, breaks = NULL)
# Step 3: create an SVG object using girafe() function
girafe(
ggobj = p,
width_svg = 8,
height_svg = 8*0.618
)
When we hover over a dot in the graph above, the respective student ID and class are displayed.
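The tooltip column is just a character vector, so it can be inspected without drawing anything. The sketch below uses a small made-up data frame (`toy`, a hypothetical stand-in for exam_data) to show how paste0() builds one multi-line string per row, with a third field added for illustration.

```r
# Hypothetical toy data standing in for exam_data (values are made up)
toy <- data.frame(
  ID    = c("S001", "S002"),
  CLASS = c("3A", "3B"),
  MATHS = c(78, 91)
)

# paste0() is vectorised: it returns one string per row,
# and "\n" starts a new line inside the tooltip
toy$tooltip <- paste0(
  "Name = ", toy$ID,
  "\n Class = ", toy$CLASS,
  "\n Maths = ", toy$MATHS)

# Print the first tooltip as it would appear on screen
cat(toy$tooltip[1])
```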
3.1.3 Customising Tooltip style
We can also customize the tooltip style by passing CSS declarations to the opts_tooltip() function.
tooltip_css <- "background-color:white; font-weight:bold; color:black;"
p <- ggplot(data=exam_data,
aes(x = MATHS)) +
geom_dotplot_interactive(
aes(tooltip = ID),
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
scale_y_continuous(NULL,
breaks = NULL)
girafe(
ggobj = p,
width_svg = 6,
height_svg = 6*0.618,
options = list(
opts_tooltip(
css = tooltip_css))
)
When we hover over a dot in the graph above, the tooltip is now displayed with a white background and black text.
3.1.4 Displaying statistics on tooltip
We can also customize the tooltip to display summary statistics. For example, we can write a function that formats the mean and the standard error of the mean, and display the result in the tooltip.
tooltip <- function(y, ymax, accuracy = 0.01){
mean <- scales::number(y, accuracy = accuracy)
sem <- scales::number(ymax - y, accuracy = accuracy)
paste("Mean maths scores: ", mean, " +/- ", sem)
}
gg_point <- ggplot(data = exam_data,
aes(x = RACE)) +
stat_summary(aes(y = MATHS,
tooltip = after_stat(
tooltip(y, ymax))),
fun.data = "mean_se",
geom = GeomInteractiveCol,
fill = "lightblue") +
stat_summary(aes(y = MATHS),
fun.data = mean_se,
geom = "errorbar",
width = 0.2,
linewidth = 0.2) # `size` is deprecated for line widths since ggplot2 3.4.0
girafe(
ggobj = gg_point,
width_svg = 8,
height_svg = 8 * 0.618
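)
When we hover over a bar, the desired statistics are displayed. In the example above, the tooltip shows the mean maths score together with its standard error, because mean_se() returns the mean plus or minus one standard error by default rather than a 90% confidence interval.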
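If a 90% confidence interval is what we actually want in the tooltip, one option is to supply our own summary function. The sketch below is an assumption rather than part of the original exercise: `mean_ci` is a hypothetical helper that uses a normal approximation and returns the `y`, `ymin` and `ymax` columns that stat_summary() expects from `fun.data`.

```r
# Hypothetical helper: mean with a normal-approximation confidence interval
mean_ci <- function(x, level = 0.90) {
  x  <- x[!is.na(x)]
  m  <- mean(x)
  se <- sd(x) / sqrt(length(x))       # standard error of the mean
  z  <- qnorm(1 - (1 - level) / 2)    # about 1.645 for a 90% interval
  data.frame(y = m, ymin = m - z * se, ymax = m + z * se)
}

scores <- c(2, 4, 6, 8)   # made-up values, for illustration only
ci <- mean_ci(scores)
ci
```

Passing `fun.data = mean_ci` to both stat_summary() calls would make `ymax - y` in the tooltip function the half-width of the 90% interval instead of one standard error.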
3.2 Hover effect with data_id aesthetic
In addition to displaying a tooltip when we mouse over a data point, we can also highlight a subset of the data points using another aesthetic called data_id.
p <- ggplot(data=exam_data,
aes(x = MATHS)) +
geom_dotplot_interactive(
aes(data_id = CLASS, # highlight the data points from the same class on hover
tooltip = ID), # display the student ID on hover
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
scale_y_continuous(NULL,
breaks = NULL)
girafe(
ggobj = p,
width_svg = 6,
height_svg = 6 * 0.618
)
When we mouse over a data point, the other data points from the same class are highlighted as well.
Note that the default hover style is hover_css = "fill:orange;", so the highlight color is orange unless we change it.
3.2.1 Styling hover effect
Let’s now try changing the highlighting effect in the graph.
p <- ggplot(data = exam_data,
aes(x = MATHS)) +
geom_dotplot_interactive(
aes(data_id = CLASS),
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
scale_y_continuous(NULL,
breaks = NULL)
girafe(
ggobj = p,
width_svg = 6,
height_svg = 6 * 0.618,
options = list(
opts_hover(css = "fill: #202020;"),
opts_hover_inv(css = "opacity:0.2;")
)
)
When we mouse over a data point, only the data points from the same class are highlighted in black and the rest of the points are dimmed.
3.2.2 Combining tooltip and hover effect
Now, let’s combine tooltip and data_id features to make the graph more informative.
p <- ggplot(data=exam_data,
aes(x = MATHS)) +
geom_dotplot_interactive(
aes(tooltip = CLASS, # display the class when hovering over a point
data_id = CLASS), # highlight the points from the same class on hover
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
scale_y_continuous(NULL,
breaks = NULL)
girafe(
ggobj = p,
width_svg = 6,
height_svg = 6 * 0.618,
options = list(
opts_hover(css = "fill: #202020;"),
opts_hover_inv(css = "opacity: 0.2;")
)
)
When we mouse over a data point, only the data points from the same class are highlighted in black and the rest of the points are dimmed. At the same time, the class information is displayed as well.
3.3 Click effect with onclick
ggiraph also provides a feature to allow us to redirect the user to a webpage when they click on the graph.
The code chunk below shows an example.
# Create a column in exam_data to store the JavaScript call that opens the website
exam_data$onclick <- sprintf("window.open(\"%s%s\")",
"https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
as.character(exam_data$ID))
p <- ggplot(data = exam_data,
aes(x = MATHS)) +
geom_dotplot_interactive(
aes(onclick = onclick),
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
scale_y_continuous(NULL,
breaks = NULL)
girafe(
ggobj = p,
width_svg = 6,
height_svg = 6 * 0.618)
When we click on a data point, a new browser window opens the MOE SchoolFinder page. The address is formed by appending the student ID to the base URL.
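It can help to look at the generated onclick strings before plotting. The sketch below uses made-up ids and a hypothetical base address (`https://example.com/students?id=`) to show what sprintf() produces for each row.

```r
ids  <- c("S001", "S002")                     # made-up ids
base <- "https://example.com/students?id="    # hypothetical address

# sprintf() is vectorised over `ids`; each %s is replaced in order,
# giving one JavaScript window.open() call per row
onclick <- sprintf("window.open(\"%s%s\")", base, ids)
onclick[1]
```

Note that a literal `%` inside the format string itself would have to be written as `%%`. In the exercise code the `%20` sits in an argument, not in the format string, so it is passed through unchanged.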
3.4 Coordinated Multiple Views with ggiraph
We can also plot coordinated multiple views to highlight the related information regarding the same data point.
# Step 1: create the first interactive graph
p1 <- ggplot(data = exam_data,
aes(x = MATHS)) +
geom_dotplot_interactive(
aes(data_id = ID),
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
coord_cartesian(xlim = c(0, 100)) +
scale_y_continuous(NULL,
breaks = NULL)
# Step 2: create the second interactive graph
p2 <- ggplot(data = exam_data,
aes(x = ENGLISH)) +
geom_dotplot_interactive(
aes(data_id = ID),
stackgroups = TRUE,
binwidth = 1,
method = "histodot") +
coord_cartesian(xlim = c(0, 100)) +
scale_y_continuous(NULL,
breaks = NULL)
# Step 3: combine the two graphs using the patchwork package and render them with girafe()
girafe(code = print(p1 + p2),
width_svg = 6,
height_svg = 3,
options = list(
opts_hover(css = "fill: #202020;"),
opts_hover_inv(css = "opacity:0.2;")
)
)
When we hover over a data point in either of the two graphs, the corresponding data point is highlighted in the other graph as well.
4. Interactive Data Visualisation - plotly methods!
There is another package available in R for creating interactive graphs: the plotly package. It offers two ways of doing so:
- using plot_ly()
- using ggplotly()
4.1 Creating an interactive scatter plot: plot_ly() method
The code chunk below shows an example.
plot_ly(data = exam_data,
x = ~MATHS,
y = ~ENGLISH)
4.2 Working with visual variable: plot_ly() method
We can add another dimension to the plot by coloring the data points based on a categorical variable, for example, RACE.
plot_ly(data = exam_data,
x = ~ENGLISH,
y = ~MATHS,
color = ~RACE)
The color does not just encode another piece of information; the legend also acts as a filter for the plot. Clicking a category in the legend toggles the visibility of its data points.
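When no trace type is given, plot_ly() infers one from the data and prints a message saying so. A minimal sketch of the same idea with the trace type and mode stated explicitly is shown below. It uses the built-in iris data set instead of Exam_data, so the column names are not those of the exercise.

```r
library(plotly)

# Scatter plot coloured by a categorical variable,
# with the trace type and mode stated explicitly
fig <- plot_ly(data = iris,
               x = ~Sepal.Length,
               y = ~Petal.Length,
               color = ~Species,
               type = "scatter",
               mode = "markers")
fig
```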
4.3 Creating an interactive scatter plot: ggplotly() method
Let’s now create an interactive scatter plot.
p <- ggplot(data = exam_data,
aes(x = MATHS,
y = ENGLISH)) +
geom_point(size = 1) +
coord_cartesian(xlim = c(0, 100),
ylim = c(0, 100))
ggplotly(p)
The coordinates are now displayed when we hover over a data point.
4.4 Coordinated Multiple Views with plotly
Similar to the ggiraph package, we can also create coordinated multiple views using the plotly package.
- highlight_key() of the plotly package is used to create the shared data
- two scatterplots are created using ggplot2 functions
- subplot() of the plotly package is used to place them side by side
d <- highlight_key(exam_data)
p1 <- ggplot(data = d,
aes(x = MATHS,
y = ENGLISH)) +
geom_point(size = 1) +
coord_cartesian(xlim = c(0, 100),
ylim = c(0, 100))
p2 <- ggplot(data = d,
aes(x = MATHS,
y = SCIENCE)) +
geom_point(size = 1) +
coord_cartesian(xlim = c(0, 100),
ylim = c(0, 100))
subplot(ggplotly(p1),
ggplotly(p2))
5. Interactive Data Visualisation - crosstalk methods!
Lastly, let’s add a few more kinds of interaction in R.
5.1 Interactive Data Table: DT package
Data tables can also be made interactive with the DT package, an R wrapper around the JavaScript library DataTables.
DT::datatable(exam_data, class = "compact")
Unlike a static data table, this one has a search box above it: as we type, the table is filtered accordingly.
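Beyond the global search box, datatable() accepts a few arguments that change how the table behaves. The sketch below is an illustration on the built-in iris data set, not part of the original exercise: `filter = "top"` adds a filter under each column header and `pageLength` sets the number of rows shown per page.

```r
library(DT)

tbl <- datatable(iris,
                 class = "compact",
                 filter = "top",                  # per-column filters
                 options = list(pageLength = 5))  # rows per page
tbl
```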
5.2 Linked brushing: crosstalk method
We can even combine the plot and the data table, and make them interlinked.
d <- highlight_key(exam_data)
p <- ggplot(d,
aes(ENGLISH,
MATHS)) +
geom_point(size = 1) +
coord_cartesian(xlim = c(0, 100),
ylim = c(0, 100))
gg <- highlight(ggplotly(p),
"plotly_selected")
crosstalk::bscols(gg,
DT::datatable(d),
widths = 5)
This brings us to the end of this hands-on exercise. I have learned many different methods to create interactive plots in R. Hope you enjoyed it, too!
See you in the next hands-on exercise 🥰