Filter, Sort and Sample Data

Pandas Filter Rows

Basic Filter

Filter the orders data frame to only include rows where the PromoID is 30_OFF:

orders[orders['PromoID'] == '30_OFF']

And Filter

Filter the orders data frame to only include rows where the PromoID is 30_OFF and the ProductID is 228722:

orders[(orders['PromoID'] == '30_OFF') & (orders['ProductID'] == 228722)]

Or Filter

Filter the orders data frame to only include rows where the PromoID is 30_OFF or the ProductID is 228722:

orders[(orders['PromoID'] == '30_OFF') | (orders['ProductID'] == 228722)]
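The And and Or filters can be tried end to end with the sketch below. The sample rows and the OrderID column are made up for illustration; only PromoID and ProductID come from the examples above. Naming each condition first is one way to avoid forgetting the parentheses, which are needed in the one-line form because & and | bind more tightly than ==.

```python
import pandas as pd

# Hypothetical sample data; OrderID and the row values are assumptions.
orders = pd.DataFrame({
    "OrderID": [1, 2, 3, 4],
    "PromoID": ["30_OFF", "NONE", "30_OFF", "NONE"],
    "ProductID": [228722, 228722, 100001, 100002],
})

# Each comparison produces a boolean Series with one value per row.
is_promo = orders["PromoID"] == "30_OFF"
is_product = orders["ProductID"] == 228722

# & keeps rows where both conditions hold; | keeps rows where at least one holds.
both = orders[is_promo & is_product]
either = orders[is_promo | is_product]

print(both)
print(either)
```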

Basic Filter Using Loc

Here is the same filter as the basic filter, written with loc. Both forms return a new dataframe containing the matching rows. loc is the better choice when you also want to select columns, or assign values to the matching rows of orders, in the same step.

orders.loc[orders['PromoID'] == '30_OFF',:]
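As a rough sketch of what loc adds over the bracket form, the example below filters rows and picks columns in one call, then assigns a value to the matching rows of the original frame. The sample rows, the OrderID column and the Discounted column are hypothetical and not part of the tutorial's data.

```python
import pandas as pd

# Hypothetical sample data; OrderID and the row values are assumptions.
orders = pd.DataFrame({
    "OrderID": [1, 2, 3],
    "PromoID": ["30_OFF", "NONE", "30_OFF"],
    "ProductID": [228722, 228722, 100001],
})

mask = orders["PromoID"] == "30_OFF"

# Filter rows and select columns in a single loc call.
promo_products = orders.loc[mask, ["OrderID", "ProductID"]]

# Assign to the matching rows of orders itself.
# "Discounted" is a hypothetical column added for this example.
orders["Discounted"] = False
orders.loc[mask, "Discounted"] = True

print(promo_products)
print(orders)
```

Writing the assignment as orders.loc[mask, "Discounted"] = True, rather than filtering first and assigning to the result, keeps the update on the original frame.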

Pandas Sort Rows

Sort the orders dataframe by ProductID and then Order_Date, both in descending order. Because inplace=True is set, orders itself is reordered and the call returns None:

orders.sort_values(by=['ProductID', 'Order_Date'], ascending=False, inplace=True)
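If the two columns need different directions, ascending also accepts a list with one entry per sort column. The sketch below uses made-up rows and leaves out inplace so that a new sorted frame is returned and orders is left untouched; reset_index(drop=True) is optional and only renumbers the rows.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the tutorial.
orders = pd.DataFrame({
    "ProductID": [2, 1, 2, 1],
    "Order_Date": pd.to_datetime(
        ["2021-01-05", "2021-01-02", "2021-01-09", "2021-01-07"]
    ),
})

# ProductID ascending, then newest Order_Date first within each product.
sorted_orders = orders.sort_values(
    by=["ProductID", "Order_Date"],
    ascending=[True, False],
).reset_index(drop=True)

print(sorted_orders)
```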

Pandas Sample Rows

Sample 100 rows from the orders dataframe. Setting random_state makes the sample reproducible:

orders.sample(n=100, random_state=101)

Sample 10% of rows from the orders dataframe:

orders.sample(frac=0.1, random_state=101)
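The effect of random_state can be checked with a small sketch like the one below, which assumes a made-up orders frame of 1,000 rows with a single hypothetical OrderID column. Two calls with the same seed return the same rows, and sampling is without replacement by default, so no row appears twice.

```python
import pandas as pd

# Hypothetical frame of 1,000 orders; OrderID is an assumption.
orders = pd.DataFrame({"OrderID": range(1, 1001)})

# Same n and same random_state: the two samples contain identical rows.
first = orders.sample(n=100, random_state=101)
second = orders.sample(n=100, random_state=101)
print(first.equals(second))

# frac=0.1 draws 10% of the rows, which is 100 rows for this frame.
tenth = orders.sample(frac=0.1, random_state=101)
print(len(tenth))
```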