top of page
learn_data_science.jpg

Data Scientist Program

 

Free Online Data Science Training for Complete Beginners.
 


No prior coding knowledge required!

Pandas Technique: Pivot Table

Writer's picture: Abu Bin FahdAbu Bin Fahd

Pivot table in pandas is an excellent tool to summarize one or more numeric variable based on two other categorical variables. Pivot tables in pandas are popularly seen in MS Excel files. In python, Pivot tables of pandas dataframes can be created using the command: pandas. pivot_table.

import pandas as pd
import numpy as np

Read Dataset

df = pd.read_csv('Srt_dta.csv')
df

Pivot Table

Here we calculated the mean of individual weight(kg) with color values: column to aggregate, optional (If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table index. If an array is passed, it is being used as the same manner as column values.) index: column, Grouper, array, or list of the previous


df.pivot_table(values='Weight(kg)', index='Color')

Different Statistics

df.pivot_table(values='Weight(kg)', index='Color', aggfunc=np.median)

Multiple Statistics

aggfunc : function, list of functions, dict, default numpy.mean

If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves) If dict is passed, the key is column to aggregate and value is function or list of functions.

df.pivot_table(values='Weight(kg)', index='Color', aggfunc=[np.mean, np.median])

Pivot on two variable

columns: column, Grouper, array, or list of the previous

If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table column. If an array is passed, it is being used as the same manner as column values.

df.pivot_table(values='Weight(kg)', index='Color', columns='Breed')

Filling missing values in pivot tables

fill_value: scalar, default None

Value to replace missing values with (in the resulting pivot table, after aggregation).

df.pivot_table(values='Weight(kg)', index='Color', columns='Breed', fill_value=0)

Summing with pivot tables

margins: bool, default False

Add all row / columns (e.g. for subtotal / grand totals)

df.pivot_table(values='Weight(kg)', index='Color', columns='Breed', fill_value=0, margins=True)

0 comments

Recent Posts

See All

Comments


COURSES, PROGRAMS & CERTIFICATIONS

 

Advanced Business Analytics Specialization

Applied Data Science with Python (University of Michigan)

Data Analyst Professional Certificate (IBM)

Data Science Professional Certificate (IBM)

Data Science Specialization (John Hopkins University)

Data Science with Python Certification Training 

Data Scientist Career Path

Data Scientist Nano Degree Program

Data Scientist Program

Deep Learning Specialization

Machine Learning Course (Andrew Ng @ Stanford)

Machine Learning, Data Science and Deep Learning

Machine Learning Specialization (University of Washington)

Master Python for Data Science

Mathematics for Machine Learning (Imperial College London)

Programming with Python

Python for Everybody Specialization (University of Michigan)

Python Machine Learning Certification Training

Reinforcement Learning Specialization (University of Alberta)

Join our mailing list

Data Insight participates in affiliate programs and may sometimes get a commission through purchases made through our links without any additional cost to our visitors.

bottom of page