R packages used in this section
library(DT)
library(gapminder)
library(gghighlight)
library(ggrepel)
library(stargazer)
library(tidyverse)
1. Histogram
• A histogram is an approximate representation of the distribution of numerical data.
Difference between a bar chart and a histogram
Variable type          Chart        Bars
Discrete variable      Bar chart    Gaps between bars
Continuous variable    Histogram    No gaps between bars



When the x-axis is the election year
• There are no values between 2017 and 2021 (because no election was held between them)
• Election years look numeric, but they are not treated as numeric (they are discrete categories)
When the x-axis is vote share
• Vote share can take any value from 0% to 100% (0%, 0.1%, ...)
→ Drawing one bar per value would require an effectively unlimited number of bars
→ So we do not draw a bar for each value
• Instead, we group the values into a limited number of bars (bins)
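The idea of grouping a continuous variable into a limited number of bars can be sketched in base R. The vote share values below are made up for illustration, and the bin width of 20 is an arbitrary choice.

```r
# Hypothetical vote shares (%), invented for illustration only
voteshare <- c(2.5, 12.0, 18.3, 25.1, 33.3, 41.7, 48.9, 52.4, 67.8, 90.2)

# Split the 0-100 range into 5 bins, each 20 percentage points wide
breaks <- seq(0, 100, by = 20)
bins <- cut(voteshare, breaks = breaks, include.lowest = TRUE)

# Count how many observations fall into each bin;
# these counts are the bar heights a histogram would draw
counts <- table(bins)
print(counts)
```

Each bar of a histogram corresponds to one of these bins, so ten distinct values are summarized by only five bars.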
2. How to draw a histogram using ggplot2
• You can draw a histogram with geom_histogram()
• Map a continuous variable to the x-axis
• You do not need to map anything to the y-axis
• Let’s draw a histogram of vote share in the lower house elections in Japan between 1996 and 2021
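As a minimal sketch of this mapping, the following uses R's built-in faithful dataset (not the election data) so that it runs without any download. Only x is mapped; geom_histogram() counts the observations in each bin and uses that count for the y-axis. Setting bins = 30 is a choice made here to match ggplot2's default explicitly.

```r
library(ggplot2)

# faithful is a built-in dataset; waiting is a continuous variable (minutes)
p <- ggplot(faithful) +
  geom_histogram(aes(x = waiting), bins = 30)  # no y mapping needed

print(p)
```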
2.1 Draw a simple histogram
• Make a folder named data in your R Project folder
• Download hr9621.csv into the data folder
• Read the election data, hr9621.csv, and name it df
df <- read_csv("data/hr9621.csv",
               na = ".")
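The na = "." argument tells read_csv() to treat a lone period as a missing value. The sketch below illustrates this with a small temporary file whose contents are invented for the example; it does not use hr9621.csv, and the party labels A, B, and C are hypothetical.

```r
library(readr)

# Write a tiny made-up CSV in which "." marks a missing vote share
tmp <- tempfile(fileext = ".csv")
writeLines(c("seito,voteshare",
             "A,35.2",
             "B,.",
             "C,12.8"), tmp)

# With na = ".", the period becomes NA and voteshare is read as a number
toy <- read_csv(tmp, na = ".")
print(toy)
```

Without na = ".", the period would be read as text and the whole voteshare column would become a character column.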
• Using if_else(), make a dummy variable: ldp
#####
mutate(ldp = if_else(seito == "自民", "LDP", "NonLDP"))

This command creates a dummy variable named ldp
• If the value of seito is “自民” (the Liberal Democratic Party), ldp is set to “LDP”; for all other values (that is, the other party names), ldp is set to “NonLDP”. The seito variable itself is not changed.
df <- df %>%
  mutate(ldp = if_else(seito == "自民", "LDP", "NonLDP"))
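To see what if_else() does row by row, here is a small sketch on a hypothetical four-row table (the rows are made up and are not taken from hr9621.csv). It also shows that a missing party name yields a missing value in ldp rather than "NonLDP".

```r
library(dplyr)

# Hypothetical rows for illustration; NA stands for a missing party name
toy <- tibble(seito = c("自民", "立憲", NA, "自民"))

toy <- toy %>%
  mutate(ldp = if_else(seito == "自民", "LDP", "NonLDP"))

print(toy)
```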
• Draw a histogram of vote share
df %>%
  ggplot() +
  geom_histogram(aes(x = voteshare))
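geom_histogram() uses 30 bins by default. A possible refinement, not part of the example above, is to set the bin width explicitly. The sketch below assumes voteshare is measured in percent (0 to 100) and uses made-up values in place of df.

```r
library(ggplot2)

# Made-up vote shares (%) standing in for df$voteshare
toy <- data.frame(
  voteshare = c(3, 8, 12, 17, 22, 22, 31, 38, 44, 47, 51, 58, 63, 76)
)

# Each bar covers 10 percentage points;
# boundary = 0 aligns the bin edges to 0, 10, 20, ...
p <- ggplot(toy) +
  geom_histogram(aes(x = voteshare), binwidth = 10, boundary = 0)

print(p)
```

A narrower binwidth shows more detail but produces a noisier shape; a wider one smooths the shape but hides detail.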