Often when we work with data sources such as excel or csv files we will be working with data spread across multiple files where we may want all the data in those files or data in just a few of them. For this purpose we can use the Alteryx dynamic input tool which allows us to declare which file or files to import at the time the workflow runs.
In this tutorial we are going to be using files containing details of daily orders that are all stored in a folder with each file containing a date identifier as part of the filename. Each file contains a sample of 2000 orders for the respective days.
We will use the Dynamic Input tool along with the Directory tool to first import all the files into a single dataset and then see how we can use the Sort and Sample tools to make Alteryx only import the latest file.
To start with we have to get the metadata of the files in the folder our files are stored in. To do this we use the Alteryx Directory Tool. The Alteryx Directory Tool allows us to read in metadata from files in a particular folder and use it as a data source just like any other.
Let’s start by dragging the Directory tool into our workspace and then in the configuration window set the directory to the location of our files.
If we run the workflow and look at the output of the Directory tool then we can see that we have a row for each of our files in that location and various fields detailing the metadata of each file such as filename, path and dates relating to creation and last modification.
Now that we have the metadata let’s add in a Dynamic Input tool by searching for the tool, dragging into our workspace and connecting it to the Directory tool.
If we click on the tool to open the configuration window we can set the input data source template and the parameters for how Alteryx will know which files to import.
For this to work, all the files we we intend to import need to contain the same schema (in other words, every file contains the same columns) so we need to initially provide Alteryx with the path to a file that contains the same structure of the files we intend to import. In our case we have set the template file to one of the files already in our orders folder.
After this we need to declare which files to import. When we imported the metadata of our files using the directory tool we got the full path for each file in our folder.
We can use this as the dynamic “Field” and set the dynamic tool to “Change Entire File Path”. This means Alteryx will use the file paths from our directory tool as a list of data sources and will iteratively import each one of these files into a single dataset.
Let’s run the the workflow now and check the output.
As you can see there are 4000 rows in this output as we have dynamically imported data from the two files each containing 2000 rows.
Sometimes rather than importing all the files from a folder we will only need the latest file. We can do this by using a Sort tool and Sample tool. We are going to use the same folder and files we used previously to demonstrate this.
Our starting point is the exact same workflow we created above but to start with we are going to drop a Sort tool in-between the Directory tool and the Dynamic Input tool. In the configurations window we set the tool to sort the data by FileName and the order to be descending.
We do this because the date the file was generated is included in the filename and therefore sorting on this field will mean the latest file appears in the first row of our data that comes from the Directory tool. We could have also used the “CreationTime” or “Last Write Time” fields in this case but ultimately which fields are best to use depends on your use case and how your files are created or updated.
Our last step is to add a Sample tool after the Sort tool. In the configuration we set the sample tool to take the “First N Records” and then set N to 1.
This will result in only the metadata from the file in the first row of our data, which we know is the from the latest file and therefore only information about this one file will be passed to the Dynamic Input tool.
So there we have it, two simple ways to work with multiple files stored in a folder using the Alteryx Dynamic Input tool. The steps used here can be modified to suit different use cases but it shows the power of using the Directory tool and Dynamic Input tool for manipulating the import process for files stored in folders.