This week, I will analyze the Car Fuel Economy dataset from TidyTuesday.
What is TidyTuesday? TidyTuesday is a weekly social data project in R organized by the R for Data Science community.
It is a great way to improve your data wrangling and visualization techniques, share your work, and learn from others.
You can find more information on their GitHub.
The fuel economy data come from the US Environmental Protection Agency.
It is hard to understand your data by looking at the numbers in a CSV file. You need to plot them. Adding statistics to your plots will make them more informative.
To evaluate data, we typically use the mean and median to describe its central tendency, and the range, quartiles, variance and standard deviation to describe how spread out it is.
The mean and standard deviation are a good representation of the data only if there are no extreme values that result in a skewed distribution.
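To see why extreme values matter, here is a minimal sketch in base R. The numbers are made up for illustration and are not taken from the EPA dataset; the variable names mpg_typical and mpg_skewed are my own.

```r
# Hypothetical fuel economy values, invented for illustration only
mpg_typical <- c(22, 24, 25, 26, 28)
mpg_skewed  <- c(mpg_typical, 120)  # same values plus one extreme value

# Central tendency
mean(mpg_typical)    # 25
median(mpg_typical)  # 25
mean(mpg_skewed)     # pulled up to about 40.8 by the single extreme value
median(mpg_skewed)   # 25.5, barely moves

# Spread
range(mpg_skewed)     # minimum and maximum
quantile(mpg_skewed)  # quartiles
var(mpg_skewed)       # variance
sd(mpg_skewed)        # standard deviation, also inflated by the extreme value
```

One extreme value is enough to drag the mean far from where most of the data sit, while the median stays close to the typical values. That is why the median and quartiles are usually the safer summary for skewed data.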
ggplot2 is a powerful data visualization package for R. You can use it to make quick visualizations to explore your data or share your insights.
Learning how aesthetics and attributes are defined in ggplot2 will help you develop your skills quickly.
ggplot2 tips: distinction between aesthetics and attributes
Aesthetics are defined inside aes() in ggplot syntax, and attributes are defined outside aes().
e.g. ggplot(data, aes(x, y, color=var1)) + geom_point(size=6)
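For a version you can run as is, here is a minimal sketch. It assumes the ggplot2 package is installed and uses its built-in mpg dataset as a stand-in, not the TidyTuesday file. The columns displ, hwy and class are my choice for illustration.

```r
library(ggplot2)

# mpg ships with ggplot2; it is used here only as a stand-in dataset
p <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +  # aesthetics: mapped to columns
  geom_point(size = 6)                                      # attribute: one fixed value for all points

print(p)
```

Because color is inside aes(), each value of class gets its own color and a legend is drawn. Because size is outside aes(), every point is drawn at the same size and no size legend appears.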
We typically understand aesthetics as how something looks: color, size, etc.