CORONA IS HERE
Corona is a virulent disease that has invaded our lives. First identified in Wuhan, the restless virus has moved from the east of the world, China, to the west, the United States of America. Work has been brought to a halt, free movement has been stymied and the world economy has cratered; the coronavirus has been a force to reckon with since its first reported infections in December 2019. Infections were few in January 2020, but as of April 2020 the virus has proliferated and is affecting every corner of the earth. With the death toll steadily increasing, we as data scientists cannot just watch; we must ask the vital questions that could help alleviate the situation at hand. Better still, we wish to unearth the underlying patterns that could help slow the spread. Consequently, this thread is an exploratory data analysis (EDA) that asks three questions to better understand the COVID-19 data and to help build better predictive models to stop the virus.
Before we ask those questions, let’s first understand our data. We use two separate datasets from Our World in Data (www.ourworldindata.org). The first data frame, named location, consists of five columns: 'countriesAndTerritories', 'population_year', 'location', 'continent' and 'population'. The second data frame, named fdata