GroupBy and Count
Create a new DataFrame that groups the Orders DataFrame by the distinct values in the Order_Date and ProductID columns and counts the Order_No column:
orders_by_day = orders.groupby(['Order_Date',
'ProductID'],as_index=False)
['Order_No'].count()
GroupBy and Sum
Create a new DataFrame that groups the Orders DataFrame by the distinct values in the Order_Date and ProductID columns and sums the units column:
orders_by_day = orders.groupby(['Order_Date',
'ProductID'],as_index=False)
['units'].sum()
GroupBy and Agg
Create a new DataFrame that groups the Orders DataFrame by the distinct values in the Genre column and determines the mean, maximum, minimum and standard deviation of each value:
genre_aggregations = products.groupby(['Genre'],
as_index=False)['Cost'].agg(['mean','max','min','std']).reset_index()
Running Total Column
Creates a column in the orders DataFrame that contains a running total for each ProductID over time:
orders.sort_values(by=['Order_Date'],
inplace=True)
orders['running_total'] = orders.groupby(['ProductID'],as_index=False)
['units'].cumsum()
Pivot Data
Creates a new DataFrame that pivots the orders Dataframe so the rows are product IDs, the columns are order dates and the cell values are units sold. Any NAs are filled with 0s.
units_by_day = orders.groupby(['Order_Date','ProductID'],as_index=False)['units'].sum()
units_by_day_pivot = units_by_day.pivot(index='ProductID', columns='Order_Date', values='units').reset_index()
units_by_day_pivot.fillna(0, inplace=True)