Posts

An accidental side effect of text mining

Extended version of https://towardsdatascience.com/an-accidental-side-effect-of-text-mining-4b43f8ee1273 “The end of our exploring will …

Speed boosting in R: Writing efficient code & parallel programming

Have more things happen at once: Parallel Programming. Parallel processing is about using multiple cores of your computer’s CPU to run …

Cleaning and visualizing Five-year cancer survival statistics with ggplot2 and animation

Where do we stand in the fight against cancer? The five-year survival rate is a good indicator of improvement in cancer medicine. I am …

An intuitive real-life example of a binomial distribution and how to simulate it in R: Learn it once, use it every day

Learn. It is all about success and failure. What are binomial distributions and why are they so useful? When we repeat a set of events …

Latest trends in automobile industry, Which are the top family cars for your weekend trip?

This week, I will analyze the Car Fuel Economy dataset from TidyTuesday. What is TidyTuesday? TidyTuesday is a weekly social data project …

Add custom summary statistics in ggplot2

It is hard to understand your data by looking at the numbers in a CSV file. You need to plot it. And adding statistics to your plots …

Data Preparation: Web Scraping HTML tables with rvest

Accessing different data sources: Sometimes, the data you need is available on the web. Accessing it will make your life easier as a data …

What are aesthetics and attributes in ggplot's world?

ggplot2 is a powerful data visualization tool for R. Make quick visualizations to explore or share your insights. Learning how …

Why doesn't everyone who smokes develop cancer, or everyone who eats a lot develop fatty liver disease? Predicting diseases with machine learning

We are much better at handling diseases than we were 30 years ago. For example, cancer survival rates are much higher now. The significant …

Data Wrangling for Text mining: Extract individual elements from a Book

My ambitious goal is to write a machine learning algorithm that predicts authors. But let’s start with something simpler. An important …