In our last sprint, the AAET team worked up a series of Alteryx workflows for our annual planning process that combines forecast and actual data. It’s a massive data set that pulls from a new forecasting database, so we are beginners with this data. One of my coworkers, Jack Stewart, showed me an Alteryx hack he uses with Cross Tab tools to ensure the granularity of the data is what he thinks it is. As I took over his workflows, this hack saved me multiple times when I didn’t realize records had been duplicated. Read on to learn a new Alteryx trick!
If you prefer video to text, check out the same post on YouTube, available below or with this link. Also consider subscribing to my YouTube channel.
My Sample Data Set
To demonstrate this hack, I am going to use a fake production data set. It is “tall, skinny”, and I have one record per well, per day, per product. The goal is to transform this data into one record per day, with a column for oil and a column for gas.
The Alteryx Hack
The trick to confirming the granularity is simple.
- Convert numeric columns to string columns before the Cross Tab tool.
- Then, concatenate the data rather than summing.
- Convert the string columns back to numeric after the Cross Tab tool.
Here is my basic workflow with a Text Input tool, two Select Tools, and a Cross Tab tool.
In the first Select tool, change the Amount column from a numeric data type to a string.
In the Cross Tab, define the granularity expected with key columns Date and Well. Concatenate the strings, and if the granularity is correct, there will only be one value in each cell. If incorrect, there will be multiple values separated by a comma.
Finally, in the second Select tool, change the newly transformed columns into numeric data types.
How It Works
This hack works because it makes bad data visible in three ways.
- You will see concatenated values on the output of the Cross Tab.
- Alteryx will warn you about commas.
- If the data set is large enough, you will get truncated values, and Alteryx will issue more warnings.
With that said, I do want to warn you that after the last Select tool, the concatenated values disappear, so make sure to check results after the Cross Tab.
And now you know a trick for testing the granularity of your data. This trick has saved me a ton of time and headache in building out large workflows. I hope it helps you too!
Content created with Alteryx 2020.3.
Pingback: How to Reduce the Size of an Alteryx Data Set » The Analytics Corner