ggplot from ŷhat


How it works

Basic Premise

Making plots is very repetitive: draw this line, add these colored points, then add these, etc. Instead of making you re-use the same code over and over, ggplot wraps these steps in a high-level but very expressive API. The result is less time spent creating your charts and more time interpreting what they mean.

ggplot is not a good fit for people trying to make highly customized data visualizations. While you can make some very intricate, great-looking plots, ggplot sacrifices a high degree of customization in favor of generally doing "what you'd expect".

Data

ggplot has a symbiotic relationship with pandas. If you're planning on using ggplot, it's best to keep your data in DataFrames. Think of a DataFrame as a tabular data object: rows are observations and columns are named variables. For example, let's look at the diamonds dataset, which ships with ggplot.

from ggplot import *
diamonds.head()
   carat      cut color clarity  depth  table  price     x     y     z
0   0.23    Ideal     E     SI2   61.5     55    326  3.95  3.98  2.43
1   0.21  Premium     E     SI1   59.8     61    326  3.89  3.84  2.31
2   0.23     Good     E     VS1   56.9     65    327  4.05  4.07  2.31
3   0.29  Premium     I     VS2   62.4     58    334  4.20  4.23  2.63
4   0.31     Good     J     SI2   63.3     58    335  4.34  4.35  2.75

Aesthetics

Aesthetics describe how your data will relate to your plots. Some common aesthetics are x, y, and color. Aesthetics are specific to the type of plot (or layer) you're adding to your visual. For example, a scatterplot (geom_point) and a line (geom_line) will share x and y, but only a line chart has a linetype aesthetic.

For more information about which geoms have which aesthetics, see the DOCS SECTION.
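
To make the idea of geom-specific aesthetics concrete, here is a toy sketch in plain Python. It is not ggplot's implementation: the SUPPORTED_AESTHETICS table and the usable_aesthetics helper are hypothetical names made up for illustration, and the table lists only the aesthetics mentioned above. The column names come from the diamonds dataset shown earlier.

```python
# Toy model only: these names are hypothetical and are NOT ggplot internals.
# Each geom is listed with just the aesthetics mentioned in the text.
SUPPORTED_AESTHETICS = {
    "geom_point": {"x", "y", "color"},
    "geom_line": {"x", "y", "color", "linetype"},
}


def usable_aesthetics(geom_name, mapping):
    """Split an aesthetic mapping into what a geom can use and what it cannot."""
    supported = SUPPORTED_AESTHETICS[geom_name]
    used = {aes: col for aes, col in mapping.items() if aes in supported}
    unsupported = sorted(aes for aes in mapping if aes not in supported)
    return used, unsupported


# Map aesthetics to columns of the diamonds DataFrame.
mapping = {"x": "carat", "y": "price", "linetype": "cut"}

point_used, point_unsupported = usable_aesthetics("geom_point", mapping)
line_used, line_unsupported = usable_aesthetics("geom_line", mapping)

print(point_unsupported)  # aesthetics the toy point geom has no use for
print(line_unsupported)   # aesthetics the toy line geom has no use for
```

The toy only reports which aesthetics a geom has no use for. How ggplot itself reacts to an unsupported aesthetic is not covered here.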

Layers

ggplot lets you combine or add different types of visualization components (or layers) together. I think this is easiest to understand with an example.

Start with a blank canvas.

p = ggplot(aes(x='date', y='beef'), data=meat)
p

Add some points.

p + geom_point()

Add a line.

p + geom_point() + geom_line()

Add a trendline.

p + geom_point() + geom_line() + stat_smooth(color='blue')

As you can see, you can quite literally add components of your visualization together. For more info on available components, see the DOCS SECTION.
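
If you're curious how "adding" components can work in Python at all, the sketch below shows one way to do it with the __add__ operator hook. This is an assumption-laden toy, not ggplot's source code: Plot and Layer are hypothetical stand-ins for the objects returned by ggplot(...) and by components such as geom_point().

```python
import copy


class Layer:
    """Hypothetical stand-in for a component such as geom_point()."""

    def __init__(self, name):
        self.name = name


class Plot:
    """Hypothetical stand-in for the object returned by ggplot(...)."""

    def __init__(self, mapping):
        self.mapping = mapping
        self.layers = []

    def __add__(self, layer):
        # Return a new Plot so that `p + layer` leaves `p` itself unchanged.
        new_plot = copy.copy(self)
        new_plot.layers = self.layers + [layer]
        return new_plot

    def describe(self):
        """List the layer names in the order they were added."""
        return [layer.name for layer in self.layers]


p = Plot({"x": "date", "y": "beef"})
full = p + Layer("geom_point") + Layer("geom_line") + Layer("stat_smooth")

print(p.describe())     # the base plot still has no layers
print(full.describe())  # layers appear in the order they were added
```

Returning a new object from __add__ is a design choice of this sketch. It matches the way the examples above keep reusing p as a starting point, but ggplot's own implementation may differ.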