Use Find Replace Tools to Simplify Regex

On Monday, I said I would share two Alteryx tips and tricks learned from coworkers. The first tip was from Alice Yu, who showed me that it’s possible to use the In operator in Alteryx Filter tools. The second tip comes from Jack Stewart, who is replacing Alice on our Analytics team. Jack helped me out on a particularly sticky regex problem. I wanted to showcase his solution because it shows how to use Find and Replace tools to simplify regex. Anytime I can make regex simpler, I am all for it. Read on to learn more.

Use Case

My current project is a data quality improvement project. I use Alteryx to apply business rules to data sets and ferret out bad data. For example, the depths shown in the comments field of our operations report must match the depths shown in the top and bottom depths columns. As you might imagine, the comments field is messy and free form. Usually, the comments use “T/” and “F/” notation to indicate “to” and “from”, but the information appears in a variety of formats. Take a look at these examples. (Note: These depths are prone to typos, so I need to inspect them).

Depths are 7,510 to 1200.
Depths are 7,569 and 101.
Depths are 236 and 420.

Jack’s Regex Solution

Clearly, this problem requires a regex solution. But, my regex skills are in development. And, we were sprinting, which means I only had 2 weeks to complete the project. Jack just came on to our team and wasn’t deep in this sprint. He is also a regex expert, so I asked for his help. In his solution, I was most impressed with his use of Find Replace tools to simplify the regex he would have to write.

Jack started with two Text Input tools to serve as a catch-all for the different ways in which to and from could be specified. See below for a screenshot of the start of the workflow and the two containers he created to hold the Text Input tools and Find Replace tools.

Workflow

Text Input

In the From container, in the Text Input tool, Jack entered all the different ways “from” can be specified. As you can see, “from” often appears lowercase, sometimes capitalized, or misspelled. Sometimes different words are used instead like “slide” or “rotate” are used instead. No matter what goes into the workflow “F/” is what will come out and what his regex will be written for. Note: The column names in the Text Input tool are “From” and “To” because Jack changes the text from something to something else.

Next, the screenshot below shows the corresponding Text Input for the To container.

Find Replace

Finally, the screenshot below shows the first Find Replace tool. Here, he tells Alteryx to look in the “time log com” field for the values entered in the “From” field (in the Text Input). Replace it with what is in the “To” field. Then, the second Find Replace tool just repeats the process for the second Text Input tool.

After the two Find Replace tools, comments come out like the examples shown below. Now, there is less variability in the data that he will apply the regex to. He will be able to use simpler regex or fewer regex tools because he is reducing the number of text strings the expression searches for. I thought this was a creative way to use Find Replace Tools to simplify regex.

In conclusion, welcome to the team Jack! I expect we’ll learn a lot from each other.

Alteryx Version

Content created with Alteryx version 2019.4.8.

If You Enjoyed This, ….

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.