R packages used in this section

library(DT)
library(gapminder)
library(gghighlight)
library(ggrepel)
library(stargazer)
library(tidyverse)

| Variable type | How to visualize | Feature |
|---|---|---|
| Discrete variable | Bar chart | Gaps between bars |
| Continuous variable | Histogram | No gaps between bars |
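
As a minimal sketch of the table above, the following draws a bar chart for a discrete variable and a histogram for a continuous one. The tibble `toy` and its columns `party` and `share` are made up for illustration; they are not part of hr96-21.csv.

```r
library(ggplot2)
library(tibble)

# Hypothetical toy data: `party` is discrete, `share` is continuous
toy <- tibble(
  party = c("A", "A", "B", "B", "B", "C"),
  share = c(12.5, 48.0, 33.1, 51.7, 8.9, 27.4)
)

# Discrete variable -> bar chart (one bar per category)
bar_plot <- ggplot(toy) +
  geom_bar(aes(x = party))

# Continuous variable -> histogram (values are grouped into bins)
hist_plot <- ggplot(toy) +
  geom_histogram(aes(x = share), binwidth = 10, color = "white")

bar_plot
hist_plot
```

geom_bar() counts the rows in each category, while geom_histogram() first cuts the numeric range into bins and then counts the rows in each bin.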
・When the variable on the x-axis is numeric (for example, the year of elections), use geom_histogram()
・The variable goes on the x-axis; the y-axis shows the number of observations in each bin
・Make a folder named data in your R Project folder
・Put hr96-21.csv in the data folder in your R Project
・Read hr96-21.csv and name it df

df <- read_csv("data/hr96-21.csv",
               na = ".")

・Using if_else(), make a dummy variable: ldp

mutate(ldp = if_else(seito == "自民", "LDP", "Non-LDP"))

・This command makes a dummy variable named ldp: if seito is “自民”, the value becomes “LDP”; every other value (that is, every other party name) in seito becomes “Non-LDP”

df <- df %>%
  mutate(ldp = if_else(seito == "自民", "LDP", "Non-LDP"))

df %>%
  ggplot() +
  geom_histogram(aes(x = voteshare))

・Put options such as bins = XX, binwidth = XX, and color = "white" outside aes() and inside geom_histogram()

bins and binwidth
・You can customize either binwidth (the width of bins) or bins (the number of bins)
・You cannot customize both at the same time
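
The two options above can be compared on a small example. This is only a sketch: the tibble `toy` and its values are made up, and layer_data() is used here just to inspect the bins that ggplot2 computes.

```r
library(ggplot2)
library(tibble)

# Hypothetical vote shares (%), made up for illustration
toy <- tibble(voteshare = c(2, 7, 11, 18, 23, 31, 38, 44, 52, 59, 67, 90))

# Option 1: fix the width of each bin (here 10 percentage points)
p_width <- ggplot(toy) +
  geom_histogram(aes(x = voteshare), binwidth = 10, color = "white")

# Option 2: fix the number of bins instead
p_bins <- ggplot(toy) +
  geom_histogram(aes(x = voteshare), bins = 5, color = "white")

# layer_data() returns the computed bins, so their widths can be inspected
width_bins <- layer_data(p_width)
unique(round(width_bins$xmax - width_bins$xmin, 8))

# Number of bins actually drawn for the second plot
nrow(layer_data(p_bins))
```

If both arguments are supplied, ggplot2 uses binwidth and ignores bins, which is why only one of them should be set.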
Customize x-axis and y-axis

| Function | What it does |
|---|---|
| scale_x_continuous() | customize x-axis |
| scale_y_continuous() | customize y-axis |
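
As a small sketch of the two functions in the table above, the example below sets the tick positions on both axes. The data in `toy` and the break values are assumptions chosen for illustration only.

```r
library(ggplot2)
library(tibble)

# Hypothetical vote shares (%), made up for illustration
toy <- tibble(voteshare = c(5, 12, 18, 25, 33, 41, 47, 52, 58, 66, 74, 81))

# Assumed tick positions for the count axis of this toy data
y_breaks <- seq(0, 4, by = 1)

axis_plot <- ggplot(toy) +
  geom_histogram(aes(x = voteshare), binwidth = 20, color = "white") +
  scale_x_continuous(breaks = seq(0, 100, by = 20)) +  # x-axis ticks every 20 points
  scale_y_continuous(breaks = y_breaks)                # integer ticks on the count axis

axis_plot
```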
hist_plot1 <- df %>%
  ggplot() +
  geom_histogram(aes(x = voteshare),
                 color = "white",
                 binwidth = 5)
hist_plot1

・Customize the x-axis on this histogram
・With scale_x_*(), you can customize the scale on the x-axis

hist_plot2 <- hist_plot1 +
  scale_x_continuous(breaks = seq(0, 100, by = 10),
                     labels = seq(0, 100, by = 10))
hist_plot2

facet_wrap(~)
・Add another dimension, ldp, to the histogram above by using facet_wrap(~)

hist_plot2 +
  facet_wrap(~ldp) +
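  theme_bw(base_family = "HiraKakuProN-W3")

fill = ... and position = "identity"
・Add another dimension, ldp, to the histogram above by using fill = ...
・Set position = "identity" so that the distributions overlap instead of being stacked

df %>%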
  mutate(ldp = if_else(seito == "自民", "LDP", "Non-LDP")) %>%
  ggplot() +
  geom_histogram(aes(x = voteshare,
                     fill = ldp),
                 position = "identity",
                 binwidth = 10,
                 color = "white") +
  labs(x = "Vote Share",
       y = "The Number of Candidates",
       fill = "") +
  theme_bw(base_family = "HiraKakuProN-W3")

・With alpha = ..., you can make the bars transparent
・alpha = 0 ・・・ fully transparent
・alpha = 1 ・・・ not transparent
・We use alpha = 0.5 here

df %>%
  mutate(ldp = if_else(seito == "自民", "LDP", "Non-LDP")) %>%
  ggplot() +
  geom_histogram(aes(x = voteshare,
                     fill = ldp),
                 position = "identity",
                 binwidth = 10,
                 color = "white",
                 alpha = 0.5) +
  labs(x = "Vote Share",
       y = "The Number of Candidates",
       fill = "") +
  theme_bw(base_family = "HiraKakuProN-W3")

df %>%
  ggplot() +
  geom_histogram(aes(x = voteshare,
                     fill = seito),
                 position = "identity",
                 binwidth = 10,
                 color = "white",
                 alpha = 0.5) +
  labs(x = "Vote Share",
       y = "The Number of Candidates",
       fill = "") +
  theme_bw(base_family = "HiraKakuProN-W3")

df %>%
  filter(seito %in% c("自民", "共産", "民主")) %>%
  # 民主 is the Democratic Party of Japan (DPJ), not the CDP
  mutate(party = case_when(seito == "自民" ~ "LDP",
                           seito == "共産" ~ "JCP",
                           seito == "民主" ~ "DPJ")) %>%
  ggplot() +
  geom_histogram(aes(x = voteshare,
                     fill = party),
                 position = "identity",
                 binwidth = 10,
                 color = "white",
                 alpha = 0.5) +
  labs(x = "Vote Share",
       y = "The Number of Candidates",
       fill = "") +
  theme_bw(base_family = "HiraKakuProN-W3")

・If you do not specify position = "identity", the distributions are stacked on top of each other in one bar

df %>%
  filter(seito %in% c("自民", "共産", "民主")) %>%
  # 民主 is the Democratic Party of Japan (DPJ), not the CDP
  mutate(party = case_when(seito == "自民" ~ "LDP",
                           seito == "共産" ~ "JCP",
                           seito == "民主" ~ "DPJ")) %>%
  ggplot() +
  # No position argument: the default stacks the bars
  geom_histogram(aes(x = voteshare,
                     fill = party),
                 binwidth = 10,
                 color = "white",
                 alpha = 0.5) +
  labs(x = "Vote Share",
       y = "The Number of Candidates",
       fill = "") +
  theme_bw(base_family = "HiraKakuProN-W3")

・To draw density curves, use geom_density() instead of geom_histogram()

df %>%
  filter(seito %in% c("自民", "共産", "民主")) %>%
  # 民主 is the Democratic Party of Japan (DPJ), not the CDP
  mutate(party = case_when(seito == "自民" ~ "LDP",
                           seito == "共産" ~ "JCP",
                           seito == "民主" ~ "DPJ")) %>%
  ggplot() +
  geom_density(aes(x = voteshare,
                   fill = party),
               color = "white",
               alpha = 0.5,
               adjust = 0.8) +
  labs(x = "Vote Share",
       y = "Density",
       fill = "") +
  theme_bw(base_family = "HiraKakuProN-W3")

Q3.1:
In reference to 2.4 Add another dimension, draw a histogram
of vote share (voteshare) for the LDP (Liberal
Democratic Party) and CDP (Constitutional Democratic Party) candidates
in the 2021 lower house election.
・The party name is stored in the variable seito in
hr96-21.csv
・Use geom_histogram() to draw the
histogram.
・Make sure that no values are hidden behind other bars when you
draw the histogram.
Q3.2:
In reference to 2.4 Add another dimension, draw a histogram
of vote share (voteshare) for the LDP (Liberal
Democratic Party), CDP (Constitutional Democratic Party), and JCP
(Japanese Communist Party) candidates in the 2021 lower house
election.
・The party name is stored in the variable seito in
hr96-21.csv
・Use geom_histogram() to draw the
histogram.
・Make sure that no values are hidden behind other bars when you
draw the histogram.
To answer Q3.1 and Q3.2, use hr96-21.csv
・hr96-21.csv is a collection of Japanese lower house
election data covering 9 national elections (1996, 2000, 2003, 2005,
2009, 2012, 2014, 2017, 2021)
・You need the following three variables, which are included in
hr96-21.csv, to draw the histograms:
| variable | detail |
|---|---|
| year | Election year (1996-2021) |
| voteshare | Vote share (%) |
| seito | Candidate’s affiliated party (in Japanese) |
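
The three variables above are enough to narrow the data down to one election and a few parties before plotting. The following is a sketch on made-up rows: the tibble `toy` only mimics the three columns, and its values are not taken from hr96-21.csv.

```r
library(dplyr)
library(tibble)

# Hypothetical rows that mimic the three columns listed above
toy <- tribble(
  ~year, ~voteshare, ~seito,
  2009,  47.1,       "民主",
  2017,  45.2,       "自民",
  2021,  51.3,       "自民",
  2021,   9.8,       "共産"
)

# Keep one election year and the parties of interest,
# then keep only the columns needed for a histogram
toy_2021 <- toy %>%
  filter(year == 2021, seito %in% c("自民", "共産")) %>%
  select(year, voteshare, seito)

toy_2021
```

The same filter() and select() steps can be placed before ggplot() in a pipeline, so the histogram is drawn only from the selected rows.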
・hr96-21.csv contains the following 23 variables:
| variable | detail |
|---|---|
| year | Election year (1996-2021) |
| pref | Prefecture |
| ku | Electoral district name |
| kun | Number of electoral district |
| rank | Ascending order of votes |
| wl | 0 = loser / 1 = single-member district (smd) winner / 2 = zombie winner |
| nocand | Number of candidates in each district |
| seito | Candidate’s affiliated party (in Japanese) |
| j_name | Candidate’s name (Japanese) |
| name | Candidate’s name (English) |
| previous | Previous wins |
| gender | Candidate’s gender: “male”, “female” |
| age | Candidate’s age |
| exp | Election expenditure (yen) spent by each candidate |
| status | 0 = challenger / 1 = incumbent / 2 = former incumbent |
| vote | Votes each candidate garnered |
| voteshare | Vote share (%) |
| eligible | Eligible voters in each district |
| turnout | Turnout in each district (%) |
| castvote | Total votes cast in each district |
| seshu_dummy | 0 = non-hereditary candidate / 1 = hereditary candidate |
| jiban_seshu | Relationship between candidate and his predecessor |
| nojiban_seshu | Relationship between candidate and his predecessor |
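
Several variables in the table are stored as numeric codes (for example wl). A minimal sketch of turning such codes into readable labels with case_when() is shown below; the tibble `toy` and the new column name `result` are hypothetical, and the labels follow the wl row of the table.

```r
library(dplyr)
library(tibble)

# Hypothetical values of wl, made up for illustration
toy <- tibble(wl = c(0, 1, 2, 1, 0))

# Map each numeric code to the label given in the variable table
toy_labeled <- toy %>%
  mutate(result = case_when(
    wl == 0 ~ "loser",
    wl == 1 ~ "SMD winner",
    wl == 2 ~ "zombie winner"
  ))

# Number of rows per label
count(toy_labeled, result)
```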