Data preparation

Data Wrangling for Text Mining: Extract Individual Elements from a Book

My ambitious goal is to write a machine learning algorithm that predicts the author of a text. But let’s start with something simpler. An important part of a data science workflow is data preparation: cleaning the data, reformatting it, and making it usable for further analysis.

Figure 1: Photo by Patrick Tomasso on Unsplash

I will work on a poetry book called “New Poems” by D. H. Lawrence. You can download it from the Project Gutenberg website, a library of over 60,000 free eBooks.
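To make "clean it and reformat it" concrete, here is a minimal Python sketch of one possible first cleaning step. It assumes the book has been saved as plain text and that the file wraps the actual content in `*** START OF ... ***` and `*** END OF ... ***` marker lines, as Project Gutenberg plain-text files commonly do. The `SAMPLE` string, its placeholder lines, and the helper names `strip_gutenberg_boilerplate` and `non_empty_lines` are hypothetical and only stand in for the real download; this is not necessarily the approach used later in this article.

```python
import re

# Hypothetical stand-in for the downloaded plain-text file (placeholder lines,
# not the real poems). In practice you would read the file from disk instead.
SAMPLE = """The Project Gutenberg eBook of New Poems

*** START OF THE PROJECT GUTENBERG EBOOK NEW POEMS ***

NEW POEMS

First line of a poem
Second line of a poem

*** END OF THE PROJECT GUTENBERG EBOOK NEW POEMS ***

License text follows here.
"""

# Assumption: the marker lines start with "***" and use "THE" or "THIS".
START_RE = re.compile(
    r"^\*\*\* ?START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)
END_RE = re.compile(
    r"^\*\*\* ?END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)


def strip_gutenberg_boilerplate(raw: str) -> str:
    """Return the text between the START and END marker lines.

    If a marker is missing, fall back to the start or end of the text.
    """
    start = START_RE.search(raw)
    end = END_RE.search(raw)
    begin = start.end() if start else 0
    finish = end.start() if end else len(raw)
    return raw[begin:finish].strip()


def non_empty_lines(text: str) -> list[str]:
    """Split the text into lines, trimming whitespace and dropping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


body = strip_gutenberg_boilerplate(SAMPLE)
lines = non_empty_lines(body)
print(lines)
```

Working on a list of trimmed, non-empty lines keeps later steps simple, because titles and verses can then be picked out line by line without the license text getting in the way.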