In the exercises section, we imported the promotions data set and now we need to join this to the main data set to see how much (if any) discount to apply to the retail price of the product sold for a given purchase. The problem we have here is that not all products were purchased as part of a promotion and if we perform an inner join between the main data set and the promotions data then we will only get records that were included in a promotion. In this case we want to perform an outer join so that we get all the records from the main data set and we get information from the promotions data about the products that were ordered in a promotion and the discount rate to apply.
Alteryx doesn’t actually come with an obvious out of the box outer join tool but we can build one using the Join tool in combination with the union tool. In our case we are going to use a left outer tool with the main dataset being the left input and promotions data being the right. Let’s do this now.
Drag a Join Tool onto the workspace and connect “J” output from our previous Join tool to the “L” input of our new Join tool and connect the output of Data Cleansing tool from the promotions data stream to the “R” input of the Join tool.
Go into the Join tools configuration window and set the tool to join on the “PromoID” field from both inputs.
If you look down the field names in the Join tool you will see the “PromoID” field from the right input has been renamed “Right_PromoID” and is a duplicate field that we can de-select. Drag a Select Tool from the tool pallet and connect the “J” output of the second Join Tool then de-select the “Right PromoID” field.
Next drag a Union Tool from the Join section of the tool pallet onto the workspace. Connect first the output of the Select Tool to the input of the Union Tool and the also connect the “L” output of the second Join Tool to the same input of the Union Tool (the union tool allows multiple connections to it’s input).
Connect a Browse tool to the Union tool output and run to explore the results. What you should see now is the “Discount” field added to the main dataset. The field contains Null values for products not ordered as part of a promotion and a discount rate of either 0.2 or 0.3 for those products that were.
As we know, when the “Discount” field is NULL it means that the product wasn’t purchased as part of a promotion and the discount is zero so let’s use the Data Cleansing Tool to replace all NULLs with zeros. Drag a Data Cleansing Tool onto the workspace and connect the input to the output of the Union Tool.
Set the configuration as below to replace all NULLs with a 0 in the “Discount” field.
Finally let’s run the workflow. Looking at the output we can see that any NULL values have been with replaced with 0 as expected.
This is how we create left outer joins in Alteryx and we can use similar logic to create right outer joins and full outer joins.