
Who is at the top? Covid-19

The COVID-19 outbreak was first reported in Wuhan, China, at the end of December 2019. It spread rapidly within China and then to 209 countries across the Americas, Europe, Australia and Asia. More than 250,000 deaths and more than 3.7 million cases have been reported worldwide, and the figures keep rising every day. Countries around the world have taken a range of measures to control COVID-19.


In this article we draw insights about the virus and its trends by trying to answer a few questions.

1. Coronavirus Spread Across the Globe Over Time.

2. Top 10 Countries With

a. Highest Number of Confirmed Cases and Fraction they cover for the Global Confirmed Cases.

b. Highest Number of Deaths Reported and Fraction they cover for the Global Deaths Reported.

c. Highest Number Of Active Cases and Fraction they cover for the Global Active Cases.

d. Highest Number Of Recovered Cases and Fraction they cover for the Global Recovered Cases.

e. Highest Recovery Rate For Closed Cases.

f. Highest Death Rate For Closed Cases.

3. Global Average of Confirmed Cases & Number of Countries Above Global Average.

4. Global Average of Active Cases & Number of Countries Above Global Average.

5. Global Average of Death & Number of Countries Above Global Average.

6. Global Recovery Rate For Closed Cases & Number of Countries Above Global Rate.

7. Global Death Rate For Closed Cases & Number of Countries Above Global Rate.

To answer these questions we will analyze the following Kaggle dataset. It has been updated daily since 22 January 2020, and this analysis uses the data available up to 7 May 2020.

Importing Datasets From GitHub

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime

# Confirmed cases over time (one column per date), then the latest totals per country
df_confirmed = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
df_confirmed = df_confirmed.drop(['Lat', 'Long'], axis=1)
df_confirmed.head(3)

df_covid19 = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/web-data/data/cases_country.csv")
df_covid19.head(3)

Cleaning Data

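# Drop the columns that this analysis does not use
df_covid19 = df_covid19.drop(["People_Tested", "People_Hospitalized", "UID", "ISO3", "Mortality_Rate", "Lat", "Long_"], axis=1)
# Shorten the country name; assigning the result back avoids a chained in-place replace
df_covid19['Country_Region'] = df_covid19['Country_Region'].replace('United Kingdom', 'UK')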
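
The cleaning step above assumes that every listed column is present in the download. As a minimal sketch, assuming the upstream CSV layout might change, the hypothetical helper `clean_country_table` below drops only the columns that exist and renames countries through a mapping. The helper name and the sample rows are made up for illustration and are not part of the original analysis.

```python
import pandas as pd

def clean_country_table(df, drop_cols, renames):
    """Return a cleaned copy: drop unwanted columns and shorten country names.

    Hypothetical helper for illustration only.
    """
    # errors="ignore" skips columns that are absent instead of raising KeyError
    cleaned = df.drop(columns=drop_cols, errors="ignore")
    # Series.replace with a dict maps exact matches and leaves other values alone
    cleaned["Country_Region"] = cleaned["Country_Region"].replace(renames)
    return cleaned

# Made-up sample rows, not real figures
sample = pd.DataFrame({
    "Country_Region": ["United Kingdom", "Italy"],
    "Confirmed": [100, 80],
    "UID": [826, 380],
})
cleaned = clean_country_table(sample, ["UID", "ISO3"], {"United Kingdom": "UK"})
print(cleaned)
```

Because `drop` returns a new frame, the input table is left untouched, so the step can be re-run safely in a notebook.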

1. Coronavirus Spread Across the Globe Over Time

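# Sum provinces into countries (numeric columns only), then count countries with at least one case per date
case_nums_country = df_confirmed.groupby("Country/Region").sum(numeric_only=True).apply(lambda x: x[x > 0].count(), axis=0)
d = [datetime.strptime(date, '%m/%d/%y').strftime("%d %b") for date in case_nums_country.index]
plt.figure(figsize=(15, 8))
# Plot against positions so that repeated day/month labels cannot collapse onto one x value
plt.plot(range(len(d)), case_nums_country, color='crimson', linestyle='-', marker='o', markersize=6, markerfacecolor='#ffffff')

plt.xlabel("Dates")
plt.ylabel("Number of Countries/Regions")
# Label about five evenly spaced dates plus the last one, keeping ticks and labels the same length
step = max(1, len(d) // 5)
ticks = list(range(0, len(d) - 1, step)) + [len(d) - 1]
plt.xticks(ticks, [d[i] for i in ticks])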

plt.savefig('Growth.png', dpi=500)
plt.show()
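
The counting step is easier to follow on a tiny table. The sketch below uses a made-up miniature of the wide time-series layout (the values are invented, not real data) and shows how provinces are summed into countries before counting how many countries have at least one confirmed case on each date.

```python
import pandas as pd

# Made-up miniature of the wide layout: one row per province, one column per date
toy = pd.DataFrame({
    "Province/State": ["Hubei", "Beijing", None, None],
    "Country/Region": ["China", "China", "Italy", "Spain"],
    "1/22/20": [10, 1, 0, 0],
    "1/23/20": [20, 2, 1, 0],
    "1/24/20": [30, 3, 2, 1],
})

# Add up provinces so each country becomes a single row of daily totals
by_country = toy.groupby("Country/Region").sum(numeric_only=True)

# For each date column, count the countries whose total is above zero
countries_with_cases = (by_country > 0).sum(axis=0)
print(countries_with_cases.to_dict())
```

The boolean comparison followed by `sum` is one way to express the same count that the `apply` call computes column by column.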

2. Top 10 Countries With

a. Highest Number of Confirmed Cases and Fraction they cover for the Global Confirmed Cases

df_covid19.sort_values(by='Confirmed', ascending=False, inplace=True)
top_10_cases = df_covid19.head(10)
plt.figure(figsize=(10, 5))
plt.bar('Country_Region', 'Confirmed', data=top_10_cases, color='darkcyan')

plt.ylabel('Confirmed Cases')
plt.title("Top 10 Countries (Confirmed Cases)")

plt.xticks(rotation=30)
plt.savefig('Confirmed.png', dpi=500)
plt.show()

Fraction Covered

df_covid19['Fraction_Confirmed'] = round((df_covid19['Confirmed']/df_covid19['Confirmed'].sum())*100, 2)
df_covid19.sort_values(by='Fraction_Confirmed', ascending=False, inplace=True)
fraction_confirmed = df_covid19.head(10)

plt.figure(figsize=(10, 10))
plt.pie(fraction_confirmed['Fraction_Confirmed'], labels=fraction_confirmed['Fraction_Confirmed'])
plt.legend(fraction_confirmed['Country_Region'], loc='center')
plt.savefig('Fraction_Confirmed.png', dpi=500)
plt.show()
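
One thing to keep in mind when reading the pie chart: only the top 10 percentages are passed to `plt.pie`, and matplotlib rescales the values it receives to fill the whole circle, so the wedges look larger than each country's share of the global total. A minimal sketch of one possible remedy, assuming an extra "Others" slice is acceptable, is shown below. The helper `share_table` and the sample numbers are hypothetical.

```python
import pandas as pd

def share_table(df, column, top_n=10):
    """Percentage share of `column` for the top_n countries plus an 'Others' entry.

    Hypothetical helper for illustration only.
    """
    total = df[column].sum()
    top = df.nlargest(top_n, column)
    shares = (top.set_index("Country_Region")[column] / total * 100).round(2)
    # Everything outside the top_n is grouped so the shares add up to 100
    shares["Others"] = round(100 - shares.sum(), 2)
    return shares

# Made-up sample, not real figures
sample = pd.DataFrame({
    "Country_Region": ["A", "B", "C", "D"],
    "Confirmed": [50, 30, 15, 5],
})
shares = share_table(sample, "Confirmed", top_n=2)
print(shares)
```

Passing `shares` and `shares.index` to `plt.pie` would then draw wedges whose sizes match the printed percentages.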

b. Highest Number of Deaths Reported and Fraction they cover for the Global Deaths Reported

df_covid19.sort_values(by='Deaths', ascending=False, inplace=True)
top_10_deaths = df_covid19.head(10)

plt.figure(figsize=(10, 5))
plt.bar('Country_Region', 'Deaths', data=top_10_deaths, color='crimson')

plt.ylabel('Deaths')
plt.title("Top 10 Countries (Death Cases)")

plt.xticks(rotation=30)
plt.savefig('Deaths.png', dpi=500)
plt.show()

Fraction Covered

df_covid19['Fraction_Deaths'] = round((df_covid19['Deaths']/df_covid19['Deaths'].sum())*100, 2)
df_covid19.sort_values(by='Fraction_Deaths', ascending=False, inplace=True)
fraction_deaths = df_covid19.head(10)

plt.figure(figsize=(10, 10))
plt.pie(fraction_deaths['Fraction_Deaths'], labels=fraction_deaths['Fraction_Deaths'])
plt.legend(fraction_deaths['Country_Region'], loc='center')
plt.savefig('Fraction_Deaths.png', dpi=500)
plt.show()

c. Highest Number Of Active Cases and Fraction they cover for the Global Active Cases

df_covid19.sort_values(by='Active', ascending=False, inplace=True)
top_10_active = df_covid19.head(10)

plt.figure(figsize=(10, 5))
plt.bar('Country_Region', 'Active', data=top_10_active, color='darkorange')

plt.ylabel('Active')
plt.title("Top 10 Countries (Active Cases)")

plt.xticks(rotation=30)
plt.savefig('Active.png', dpi=500)
plt.show()

Fraction Covered

df_covid19['Fraction_Active'] = round((df_covid19['Active']/df_covid19['Active'].sum())*100, 2)
df_covid19.sort_values(by='Fraction_Active', ascending=False, inplace=True)
fraction_active = df_covid19.head(10)

plt.figure(figsize=(10, 10))
plt.pie(fraction_active['Fraction_Active'], labels=fraction_active['Fraction_Active'])
plt.legend(fraction_active['Country_Region'], loc='center')
plt.savefig('Fraction_Active.png', dpi=500)
plt.show()

d. Highest Number Of Recovered Cases and Fraction they cover for the Global Recovered Cases

df_covid19.sort_values(by='Recovered', ascending=False, inplace=True)
top_10_recovered = df_covid19.head(10)

plt.figure(figsize=(10, 5))
plt.bar('Country_Region', 'Recovered', data=top_10_recovered, color='limegreen')

plt.ylabel('Recovered')
plt.title("Top 10 Countries (Recovered Cases)")

plt.xticks(rotation=30)
plt.savefig('Recovered.png', dpi=500)
plt.show()

Fraction Covered

df_covid19['Fraction_Recovered'] = round((df_covid19['Recovered']/df_covid19['Recovered'].sum())*100, 2)
df_covid19.sort_values(by='Fraction_Recovered', ascending=False, inplace=True)
fraction_recovered = df_covid19.head(10)

plt.figure(figsize=(10, 10))
plt.pie(fraction_recovered['Fraction_Recovered'], labels=fraction_recovered['Fraction_Recovered'])
plt.legend(fraction_recovered['Country_Region'], loc='center')
plt.savefig('Fraction_Recovered.png', dpi=500)
plt.show()

e. Highest Recovery Rate For Closed Cases

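# Copy first so the new column is not written to a slice of df_covid19; closed cases = Confirmed - Active
top_10_recovered = top_10_recovered.copy()
top_10_recovered['Percentage Recovered'] = top_10_recovered['Recovered']/(top_10_recovered['Confirmed'] - top_10_recovered['Active']) * 100

top_10_recovered = top_10_recovered.sort_values(by='Percentage Recovered', ascending=False)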

plt.figure(figsize=(10, 5))
plt.plot('Country_Region', 'Percentage Recovered', data=top_10_recovered, color='limegreen', marker='o')

plt.ylabel('Percentage Recovered')
plt.title("Top 10 Countries (% Recovered Out of Closed Cases)")

plt.xticks(rotation=30)
plt.savefig('Recovery_Rate.png', dpi=500)
plt.show()

f. Highest Death Rate For Closed Cases

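# Copy first so the new column is not written to a slice of df_covid19; closed cases = Confirmed - Active
top_10_deaths = top_10_deaths.copy()
top_10_deaths['Percentage Deaths'] = top_10_deaths['Deaths']/(top_10_deaths['Confirmed'] - top_10_deaths['Active']) * 100

top_10_deaths = top_10_deaths.sort_values(by='Percentage Deaths', ascending=False)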

plt.figure(figsize=(10, 5))
plt.plot('Country_Region', 'Percentage Deaths', data=top_10_deaths, color='crimson', marker='o')

plt.ylabel('Percentage Deaths')
plt.title("Top 10 Countries (% Deaths Out of Closed Cases)")

plt.xticks(rotation=30)
plt.savefig('Death_Rate.png', dpi=500)
plt.show()
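
Sections e and f treat closed cases as Confirmed minus Active, and they rank only the ten countries with the most recoveries or deaths. As a minimal sketch of the same arithmetic applied to a whole table, the hypothetical helper `closed_case_rates` below computes both percentages and leaves the rate undefined for countries with no closed cases. The sample numbers are made up.

```python
import pandas as pd

def closed_case_rates(df):
    """Add recovery and death percentages among closed cases (Confirmed - Active).

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    closed = out["Confirmed"] - out["Active"]
    # Countries with no closed cases get NaN instead of a division by zero
    closed = closed.where(closed > 0)
    out["Percentage Recovered"] = out["Recovered"] / closed * 100
    out["Percentage Deaths"] = out["Deaths"] / closed * 100
    return out

# Made-up sample, not real figures
sample = pd.DataFrame({
    "Country_Region": ["A", "B", "C"],
    "Confirmed": [1000, 500, 20],
    "Deaths": [50, 100, 0],
    "Recovered": [450, 300, 0],
    "Active": [500, 100, 20],
})
rates = closed_case_rates(sample)
print(rates[["Country_Region", "Percentage Recovered", "Percentage Deaths"]])
```

Sorting `rates` by either percentage column and taking `head(10)` would rank all countries by rate rather than only those with the largest counts.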

3. Global Average of Confirmed Cases & Number of Countries Above Global Average

4. Global Average of Active Cases & Number of Countries Above Global Average

5. Global Average of Death & Number of Countries Above Global Average
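
Questions 3 to 5 share the same computation: take the mean of one column across all countries, then count how many countries exceed it. The sketch below is one possible way to do this, assuming a per-country table with the column names used earlier. The helper `average_and_count_above` and the sample numbers are hypothetical.

```python
import pandas as pd

def average_and_count_above(df, column):
    """Return (mean of `column` across countries, number of countries above that mean).

    Hypothetical helper for illustration only.
    """
    average = df[column].mean()
    above = int((df[column] > average).sum())
    return average, above

# Made-up sample, not real figures
sample = pd.DataFrame({
    "Country_Region": ["A", "B", "C", "D"],
    "Confirmed": [1000, 200, 100, 100],
    "Active": [400, 100, 50, 50],
    "Deaths": [60, 10, 5, 5],
})
for column in ["Confirmed", "Active", "Deaths"]:
    average, above = average_and_count_above(sample, column)
    print(f"{column}: global average {average:.1f}, countries above average: {above}")
```

Calling the helper with `df_covid19` in place of `sample` would apply the same calculation to the downloaded data.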

6. Global Recovery Rate For Closed Cases & Number of Countries Above Global Rate

7. Global Death Rate For Closed Cases & Number of Countries Above Global Rate
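
Questions 6 and 7 can be approached in a similar way. The sketch below assumes the global rate is the worldwide total of recoveries (or deaths) divided by the worldwide total of closed cases, with closed cases taken as Confirmed minus Active as in sections e and f. The helper `global_rate_and_count_above` and the sample numbers are hypothetical.

```python
import pandas as pd

def global_rate_and_count_above(df, column):
    """Return (global percentage of `column` among closed cases,
    number of countries whose own percentage is higher).

    Hypothetical helper for illustration only.
    """
    closed = df["Confirmed"] - df["Active"]
    global_rate = df[column].sum() / closed.sum() * 100
    # Countries with no closed cases become NaN and never count as above the global rate
    country_rate = df[column] / closed.where(closed > 0) * 100
    above = int((country_rate > global_rate).sum())
    return global_rate, above

# Made-up sample, not real figures
sample = pd.DataFrame({
    "Country_Region": ["A", "B", "C"],
    "Confirmed": [1000, 500, 200],
    "Deaths": [50, 100, 40],
    "Recovered": [450, 300, 60],
    "Active": [500, 100, 100],
})
for column in ["Recovered", "Deaths"]:
    rate, above = global_rate_and_count_above(sample, column)
    print(f"{column}: global rate {rate:.1f}%, countries above global rate: {above}")
```

Note that the global rate is weighted by case counts, so it generally differs from the plain average of the per-country rates.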

STAY HOME, STAY SAFE
