Cleaning Data with Pandas

Fill NA with Pandas

Fill NAs in the PromoID column of the orders dataframe with "None":

orders['PromoID'] = orders['PromoID'].fillna('None')
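
To see the effect on a concrete case, here is a minimal sketch using a small made-up orders dataframe (the OrderID column and all values are assumptions for illustration; only the PromoID column name comes from the example above). It assigns the filled column back to the dataframe, which behaves the same across pandas versions:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the PromoID column name comes from the tutorial.
orders = pd.DataFrame({
    "OrderID": [1, 2, 3],
    "PromoID": ["SPRING10", np.nan, None],
})

# Both np.nan and None count as NA, so both are replaced with the string "None".
orders["PromoID"] = orders["PromoID"].fillna("None")

print(orders["PromoID"].tolist())
```

Note that the fill value here is the string "None", not Python's None object, so the column no longer contains any NAs afterwards.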

Drop NAs with Pandas

Drop any rows from the orders dataframe where the ProductID column contains an NA:

orders.dropna(subset=['ProductID'], inplace=True)
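
The following sketch shows row-wise dropping on a small made-up orders dataframe (the data and the extra columns are assumptions for illustration). It uses the subset argument of DataFrame.dropna so that only NAs in ProductID cause a row to be removed:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the ProductID and PromoID column names come from the tutorial.
orders = pd.DataFrame({
    "OrderID": [1, 2, 3, 4],
    "ProductID": [101.0, np.nan, 103.0, np.nan],
    "PromoID": ["A", "B", None, None],
})

# subset limits the NA check to ProductID, so order 3 is kept
# even though its PromoID is missing.
cleaned = orders.dropna(subset=["ProductID"])

print(len(orders), len(cleaned))
```

Returning a new dataframe (cleaned) instead of using inplace=True keeps the original around for comparison, which can be handy while exploring.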

Imputing Data

Replace NAs in the Shipping_Cost column of the products dataframe with 0 if the Retail_Price is greater than 30, otherwise replace them with 3. Existing shipping costs are left unchanged.

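import pandas as pd

def impute_shipping_cost(row):
    # Each row holds the two columns passed to apply below
    shipping_cost = row['Shipping_Cost']
    price = row['Retail_Price']
    # Only impute when the shipping cost is missing
    if pd.isnull(shipping_cost):
        if price > 30:
            return 0
        else:
            return 3
    else:
        return shipping_cost

# axis=1 calls the function once per row
products['Shipping_Cost'] = products[['Retail_Price','Shipping_Cost']].apply(impute_shipping_cost, axis=1)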
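
For larger dataframes, a row-wise apply can be slow. As one possible alternative, the sketch below mirrors the logic of the impute_shipping_cost function above (0 when Retail_Price is above 30, otherwise 3) with vectorized operations. The sample data is made up for illustration, and default_cost is a hypothetical helper name:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the tutorial's column names.
products = pd.DataFrame({
    "Retail_Price": [45.0, 12.0, 80.0, 25.0],
    "Shipping_Cost": [np.nan, np.nan, 5.0, 2.5],
})

# Work out the fallback value for every row: 0 above a price of 30, otherwise 3.
default_cost = pd.Series(
    np.where(products["Retail_Price"] > 30, 0, 3),
    index=products.index,
)

# fillna aligns on the index, so only missing shipping costs take the fallback.
products["Shipping_Cost"] = products["Shipping_Cost"].fillna(default_cost)

print(products)
```

Rows that already have a shipping cost keep it, just as in the apply-based version.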