Of all the tools, the Alteryx Generate Rows tool was one of the last tools I learned to use. I knew it wasn’t hard to use, I just didn’t put in the time because I could usually accomplish the same task with a join. But, it is a super handy tool, so I thought I’d write a quick tutorial on it.
What Does Generate Rows Do?
That might seem like a stupid question, but maybe you don’t know why you would need to generate rows. Generally, I use this tool when merging data sets that are at different levels of granularity. For example, I have daily oil and gas production data with one record for each day of the year. I also have a few other data points that I want to add to this data set, but those other data points only have one record per month.
Now, you might still be asking why I would use Generate Rows rather than a join. Valid point, but what about when data is missing? What if your data set should have a data point for every day of the year, but it doesn’t? This is where the Generate Rows tool is really handy. Use the Generate Rows tool to create one record for each day of the year and then join to your data set. Then, you can see where data is missing.
To demonstrate how this tool works, we’ll work with a small data set that has one record for each month. We’ll generate rows and expand the data set to one record for each day of the year. Here’s a look at the data set.
Generate Rows Configuration
The screenshot below shows the full configuration for the Alteryx Generate Rows configuration. I’ll explain each part one step at a time.
First, decide whether you are going to create a new field or update an existing field. In most of my use cases, I create a new field. In this case, I will create a new field called Date whose values will represent each day of the year.
Then, you must add an Initialization Expression, which gives the tool a starting point. I want it to use the Start Date column.
Next, you must create a Conditional Expression that tells the tool when to stop creating new rows. I want Alteryx to stop adding rows when the new Date column is less than or equal to my End Date column.
Lastly, enter a Loop Expression that tells the tool how to generate new rows. Since we are working with dates, it’s easy to use the DateTimeAdd expression and tell the tool to add one day to the Date column.
Now, just to be clear, my workflow here consists of 2 tools, a Text Input tool and the Generate rows tool with the config above. It’s so simple, that I don’t even need to show a screenshot of it.
After running the workflow, there are now 273 records. My original data set contained 9 records, representing one record for each month from January thru September (31+28+31+30+31+30+31+31+30 =273). Those days add up to 273, so the record count is correct. You can see below the value repeats for each record added. This can now easily be joined to my production data set with one record per day.
But before you close this window….Keep reading puhleez…
How It Works
To be able to apply this to other scenarios, you need to be crystal clear on how the tool works, so I want to explain a bit further using a different data set(s).
It’s important to understand that the tool kicks off with the Initialization Expression and places the value of the Initialization Expression into the first record of the newly created column.
Here you can see the RecordID becomes the first value in New Field. Then, for all subsequent rows (until the Condition Expression is met), it adds 1 per the Loop Expression.
Also know that the Initialization Expression can be a hard-coded value, which further illustrates that the tool puts the Initialization Expression into the first record of the new column.
Lastly, observe that the Condition expression is applied to each row of the original data set. So, keep this in mind when writing the Conditional Expression.
And now you know how to use the Alteryx Generate Rows tool in any situation or circumstance! Check out some of my other Alteryx content if you found this useful.