The Danger of Hiding Data in Spotfire

Occasionally, I have a really hard time coming up with a good title for a post. This is one of those circumstances. But what I’m going to show you here is important if you want to sharpen your troubleshooting skills. A user reached out to me this week for help with a calculation on a cross table. It seemed like Spotfire was calculating inconsistently, but software doesn’t work that way. So, we had to be missing something. Would you have figured this out? How quickly would you have solved this puzzle? Can you find data hiding in Spotfire? Read on to learn more.

My Use Case

Here’s a look at my cross table. I have a column called “Scenario Comparison” that contains 2 values that represent spending scenarios, Scenario 1 and Scenario 2. They are chosen from a drop-down on the left-hand side of the screen. The value column, also chosen with a drop-down, is called “Pool of Money” (for funsies).

The Value Selector drop-down configures the x-axis, and a custom expression (shown below) calculates values on the y-axis. If you’ve never written calculations with axis names or don’t know how to use axis names on cross tables, check out this post. If the “$esc” function is unfamiliar, check out this post. Or, if you want to know how to use a property control to change a display name, see this post.

Sum($esc(${COLUMNSELECTION})) as [${COLUMNSELECTION}],
Sum($esc(${COLUMNSELECTION})) - Sum($esc(${COLUMNSELECTION})) OVER (Previous([Axis.Columns])) as [Difference]

The first part of the expression takes the string passed thru the COLUMNSELECTION property control and puts square brackets around it so that Spotfire interprets it as a column of data. The second part of the custom expression calculates the difference between scenario 1 and scenario 2. Axis.Columns references Scenario Comparison because Scenario Comparison is a “column” in the cross table.

The Problem

So, what’s the problem? In the first box, notice that Scenario 1 doesn’t have a value and Scenario 2 does have a value. The difference calculation is null. This is what I would expect from a calculation where one value is null. In the second box, Scenario 1 doesn’t have a value. Scenario 2 does have a value, and the calculation produces a result, which is not what I would expect. And just for information purposes, know that when 2 values are present, Spotfire calculates correctly. So, what’s going on here?

You Don’t Know What You Can’t See

It turns out the user was using the Show/Hide feature to hide zeros. The full data set contained several wells where both scenarios were zero. Showing all those zeros cluttered up the chart, so they used the Show/Hide functionality to filter them out as shown below. However, that hid all zeros.

Here’s the Show/Hide rule.

To illustrate, I’ve filtered down to a handful of wells and taken screenshots where Show/Hide is applied and where it is not to compare and contrast. Here’s the result with the rule removed. What do you notice?

  1. With the rule removed, I can see that some wells actually have a zero and some have a null, which wasn’t visible before.
  2. There are more wells in this list than in the list with Show/Hide applied. Show/Hide removed wells with null for Scenario 1.
  3. Previously, we though Spotfire was calculating inconsistently. But really, when the expression was calculating, it was because there is a zero, not a null. The Show/Hide rule removed the zero and made it look like a null, but it wasn’t null.

Now, this is starting to make sense. And, Just to be completely clear, I added in a few dummy wells where one well has 0 in Scenario 1. One well contains null for Scenario 1. And one well doesn’t have a record for Scenario 1 at all. I wanted to see if “null” and “does not exist” were treated differently. They are treated the same.

Here’s the result with the Show/Hide rule in place. Personally, I think this is confusing and hard to understand. It certainly doesn’t tell the whole story.

Conclusion

So, in the end, it came down to the show hide rule “obscuring” data and the user not being aware that some wells didn’t have records for Scenario 1. This was quite the puzzle and took a good bit of digging to find data hiding in Spotfire. If you find yourself in a troubleshooting conundrum in the future, here’s how I would break it down.

  1. Profile the data.
  2. Remove filters.
  3. Review transformations.
  4. Review data limiting with expression and marking.
  5. Check out show hide.

Ultimately, these are all the places data can be hiding in Spotfire. And most importantly, make sure you know and understand the data.

Other Sweet Spotfire Content

1 thought on “The Danger of Hiding Data in Spotfire”

  1. Pingback: How to Modify Individual Values in Spotfire » The Analytics Corner

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.