It’s a fact. If you have manual data entry, there will be errors. I found this out the hard way when working with our completions team on a Spotfire KPI project. We built the Spotfire KPIs and were attempting to tie out to spreadsheets. The numbers didn’t match. Discrepancies consistently traced back to bad data entry. We would fix the bad data, but without proper controls to keep it out, we were chasing our tails. So, we addressed bad data with a QAQC or error report. The first version was all Spotfire, but it had flaws. Version 2 performed error reporting with Alteryx. Ultimately, I wound up with a combination of Alteryx and Spotfire. To see what it looks like and how it was implemented, read on.
The Spotfire Version (v1.0)
Version 1.0 of my QAQC report was born in Spotfire because I wasn’t super confident in Alteryx yet, and we needed a solution fast. I used Spotfire desktop to create a project consisting of visualizations showing violations of business rules (think Data Limiting with Expressions). I made it available via Spotfire web player to those in field offices with network connections. For those without, I spend PDF reports with Spotfire Automation Services. Now, here’s what I loved and hated about this solution.
I loved that it was simple, and it worked. Here is a quick sample showing where downtime coding rules were violated. As you can see, I tried to explain the rule in the title and description.
Unfortunately, there were drawbacks to this overall solution. Let me show you.
Here’s a screenshot of my title page. As you can see, the QAQC report was 76 pages long. Whaatt???
I couldn’t exclude pages or visualizations if there were no errors. I could only show blank visualizations. In the case below, I was checking to make sure the rig end date occurred after the rig start date. There are no errors, but I have to show a blank visualization. To make sure the data wasn’t squished together, I set up the report with one visualization per page. Thus, the guys/gals in the field had to flip thru all 76+ pages.
The biggest drawback of this solution was on the business side. Completions data lives in WellView. In WellView, field staff must navigate one well at a time. Unfortunately, the project was organized by table where the error was found. Field staff didn’t have a way to see all the errors for a well and correct accordingly. Thus, they wasted a lot of time switching between wells in WellView. This problem only got worse as the number of rules and rule complexity increased. Thus, I created version 2.0.
The Alteryx Version (v2.0)
The cornerstone of the Alteryx solution was an architecture that would spit out one table with all the errors and allow field staff to look at the errors by well. Here’s how error reporting with Alteryx works.
- Bring in all the tables
- Reduce to necessary columns
- Check for errors
- Rename the columns to fit into a “standard structure”
- Provide additional information about the rules being checked
- Union everything together
Error Reporting With Alteryx
Here’s a sample of 5 rules applied to the wvjobprogram phase table. You’ll notice a formula tool and a select tool toward the end of each rule.
The select tool is there to make sure that I have the same 11 columns of information for every rule and to rename them to fit into the standard structure I created for this purpose.
I went with the smallest structure I could get away with. Sure, some rules would have been helped by more information. But, I thought it prudent to keep it simple.
The formula tool is where I provided additional information field staff would need to fix the error.
Finally, I unioned everything together (always a manual configuration).
Just in case you want to see the whole thing, here’s what the workflow looks like. Over 70 QAQC rules live in the brightly colored containers.
Spotfire Still Plays a Part
Now, I still wanted to use Spotfire and Spotfire Automation Services because I wanted users to be able to interact with the content. Here is what the result looked like in Spotfire.
This was a much better solution than the previous version. This allowed me to put all of the errors into one page, and it was still clear and coherent.
Spotfire or Alteryx?
Now, you may be questioning whether I could have created version 2.0 in Spotfire instead of Alteryx. I could have. I didn’t for several reasons.
- The same data-wrangling done in Spotfire would have been slow to open. This solution is very fast.
- I renamed 2 – 5 columns for every rule. There are over 70 rules. Renaming columns in Spotfire is still ridiculously painful. It’s one of the improvements the application desperately needs. This task is much simpler in Alteryx.
- I used the exact same formula tool for every rule. Copy and paste. You can’t transport formulas between Spotfire tables. It’s labor-intensive.
- Every time the workflow runs, I calculate the number of errors and append a table. This allows me to create a picture of data health over time as shown below. This would not have been possible with Spotfire.
You can even see a mistake or two I made when building the rules. My mistake in the rule logic caused the number of errors to shoot up artificially until I fixed it.
In conclusion, I am very proud of this error reporting with Alteryx workflow and the impact it has had on the business. It allows field staff to fix bad data hours after entry, which keeps KPIs clean and trustworthy.
With that said, I’m clearly also working out when to use Spotfire and when to use Alteryx. They both provide value in different scenarios. I find myself leaning towards Alteryx anytime there is heavy data processing, and then leaning towards Spotfire when there is a lot of user interaction with the data. I really enjoy using both tools.
Content created with Spotfire 10.2 and Alteryx 2019.2.7.
If You Enjoyed This
Check out some of my other work on Alteryx.