This is a brief introduction to data visualization techniques in the R programming language, using the well-known ggplot2 package.
About the ggplot2 package
ggplot2 is a plotting system for R, based on
the grammar of graphics, which tries to take the good parts of base and lattice graphics
and none of the bad parts.
“ggplot2 is an R
package for producing statistical, or data, graphics, but it is unlike most
other graphics packages because it has a deep underlying grammar. This grammar,
based on the Grammar of Graphics (Wilkinson, 2005), is composed of a set of
independent components that can be composed in many different ways. [..] Plots
can be built up iteratively and edited later. A carefully chosen set of defaults
means that most of the time you can produce a publication-quality graphic in
seconds, but if you do have special formatting requirements, a comprehensive
theming system makes it easy to do what you want. [..] ggplot2 is designed to
work in a layered fashion, starting with a layer showing the raw data then
adding layers of annotation and statistical summaries. [..]”
H. Wickham, ggplot2, Use R, DOI 10.1007/978-0-387-98141_1, © Springer Science+Business Media, LLC 2009
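To make the "layered fashion" from the quote concrete, here is a minimal sketch that builds one plot step by step: first the data and mappings, then the raw points, then a statistical summary. The variable name `p` and the choice of a per-species linear fit are my own assumptions for illustration, not something the book excerpt prescribes.

```r
library(ggplot2)

# Step 0: data and aesthetic mappings only; nothing is drawn yet.
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, colour = Species))

# Layer 1: the raw data, drawn as points.
p <- p + geom_point()

# Layer 2: a statistical summary, here one linear fit per species
# (the fit is split by species because colour is mapped to Species).
p <- p + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Annotation: axis labels and a title.
p <- p + labs(x = "Sepal length", y = "Petal length",
              title = "Iris: raw data plus linear fits")

print(p)
```

Because each step only adds to `p`, you can stop after any line and print the plot to see what that layer contributed.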
Tutorials
First, let us install the ggplot2 package and then load it in the RStudio IDE.
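### install & load ggplot2
# Install once per machine; note the function is install.packages(), with an "s".
install.packages("ggplot2")
# Load the package in every new R session.
library("ggplot2")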
# Scatter plot: colour encodes the species, point size encodes the petal width.
qplot(Sepal.Length, Petal.Length, data = iris, color = Species, size = Petal.Width)
# By setting the alpha of each point to 0.7, we reduce the effects of over-plotting.
qplot(Sepal.Length, Petal.Length, data = iris, color = Species, size = Petal.Width, alpha = I(0.7))
# Add custom axis labels and a title.
qplot(Sepal.Length, Petal.Length, data = iris, color = Species,
      xlab = "Sepal Length", ylab = "Petal Length",
      main = "Sepal vs. Petal Length in Fisher's Iris data")
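As far as I know, qplot() is deprecated from ggplot2 3.4.0 onward, so it may be useful to see the same scatter plot written with ggplot(). The following is a sketch of one possible equivalent, not the only way to write it; the name `p_scatter` is just an illustrative choice.

```r
library(ggplot2)

# Variables from the data go inside aes(); constants such as alpha go outside.
# This replaces the I(0.7) trick that qplot() needs for a fixed value.
p_scatter <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                              colour = Species, size = Petal.Width)) +
  geom_point(alpha = 0.7) +
  labs(x = "Sepal Length", y = "Petal Length",
       title = "Sepal vs. Petal Length in Fisher's Iris data")

print(p_scatter)
```

The design difference is that ggplot() separates mapping (inside aes()) from setting (an argument to the geom), whereas qplot() treats every argument as a mapping unless it is wrapped in I().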

# "point" is the default geom when both x and y are supplied,
# so the two calls below produce the same plot.
qplot(Sepal.Length, Petal.Length, data = iris, geom = "point")
qplot(Sepal.Length, Petal.Length, data = iris)
# A small example data set: one row per movie.
movies <- data.frame(
  director = c("spielberg", "spielberg", "spielberg", "jackson", "jackson"),
  movie = c("jaws", "avatar", "schindler's list", "lotr", "king kong"),
  minutes = c(124, 163, 195, 600, 187)
)
# By default, the height of each bar is the number of rows (movies) per director.
qplot(director, data = movies, geom = "bar", ylab = "# movies")
# But we can also supply a different weight.
# Here the height of each bar is the total running time of the director's movies.
qplot(director, weight = minutes, data = movies, geom = "bar", ylab = "total length (min.)")
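To see what the weight actually does, we can compute the per-director totals by hand and compare them with the bar heights. This is a small sketch using the same movies data; the names `totals` and `p_bar` are assumptions of mine, and the ggplot() form is shown as one possible equivalent of the qplot() call.

```r
library(ggplot2)

movies <- data.frame(
  director = c("spielberg", "spielberg", "spielberg", "jackson", "jackson"),
  movie = c("jaws", "avatar", "schindler's list", "lotr", "king kong"),
  minutes = c(124, 163, 195, 600, 187)
)

# Sum the running times per director; these are the expected bar heights.
totals <- tapply(movies$minutes, movies$director, sum)
print(totals)  # jackson: 787, spielberg: 482

# A weighted bar chart: each row contributes its minutes instead of a count of 1.
p_bar <- ggplot(movies, aes(x = director, weight = minutes)) +
  geom_bar() +
  labs(y = "total length (min.)")

print(p_bar)
```

Without the weight, every row counts as 1 and the bars show 2 and 3 movies; with it, each row counts as its running time.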
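# Line plot: one line per tree, coloured by tree.
qplot(age, circumference, data = Orange, geom = "line",
      colour = Tree,
      main = "How does orange tree circumference vary with age?")
# Several geoms can be combined: here points are drawn on top of the lines.
qplot(age, circumference, data = Orange, geom = c("point", "line"), colour = Tree)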
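Combining geoms in qplot() with a vector corresponds to adding one layer per geom in ggplot(). Here is a hedged sketch of the same orange tree plot in that style; `p_orange` is an illustrative name of my own.

```r
library(ggplot2)

# Orange ships with R: repeated circumference measurements for five trees.
# Mapping colour to the discrete Tree variable also groups the lines by tree.
p_orange <- ggplot(Orange, aes(x = age, y = circumference, colour = Tree)) +
  geom_line() +   # first layer: one line per tree
  geom_point() +  # second layer: the individual measurements
  labs(title = "How does orange tree circumference vary with age?")

print(p_orange)
```

Layers are drawn in the order they are added, so the points end up on top of the lines.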