Extract Numbers from a String in PAD

This week, I am working on a Power Automate Desktop (PAD) flow that extracts data from a PDF, and I need to extract numbers from a string. Rather than using regex, I used the Recognize Entities in Text action, which made this task a breeze. But because it wasn’t without its quirks and challenges, I’m doing a write-up on how to use it. Read on to learn more.

My Use Case

I’ve scraped data out of a PDF and put it into a list called OcrList. I want all of the digits in line 1 and none of the text. The format will always be as shown, with the text “Land Number”, followed by some whitespace and then 9 digits. I want all 9 digits, but I don’t have to specify how many digits there are, just that I want a number.

The screenshot below shows the 4 PAD steps required to extract the numbers. I’ll go through each one in detail, but here are the high-level steps.

  1. Set variable looks at my list, grabs the item at index position 1, and puts it into a variable.
  2. Recognize entities in text pulls out the number and puts it into a data table.
  3. Set variable pulls one value from the data table and puts it into a variable.
  4. Convert number to text converts the number to text so I can use it in a string later on in the flow.

Detailed Steps

Set Variable

The first thing I have to do is get from a list with many rows or values down to just the one row that I want to parse. I use Set Variable and the syntax shown (square brackets inside the variable reference, such as %OcrList[1]%) to grab index position 1. Note that list indexes start at 0, so index position 1 is the second item in the list. I put that into a variable called ItemLandNumber.
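As a rough analogy in Python (this is not PAD syntax), zero-based indexing works like the sketch below. The list contents are made up for illustration; only the names OcrList and ItemLandNumber come from my flow.

```python
# Hypothetical OCR output; the real list comes from the PDF extraction step.
ocr_list = [
    "Property Record",          # index 0
    "Land Number   123456789",  # index 1
    "Owner: Example Name",      # index 2
]

# Indexes start at 0, so index 1 is the second item in the list.
item_land_number = ocr_list[1]
print(item_land_number)
```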

Recognize Entities in Text

The Recognize Entities in Text action lives in the Text menu as shown below.

I select Number from the Entity type drop-down, and it will grab only the number. This action creates a new variable called RecognizedEntities.

Explore the Entity type drop-down, and you’ll see there are a number of things PAD can recognize, such as percentages, number ranges, currency, phone numbers, emails, or even temperatures.

But, this is where it gets a little tricky because the output variable, RecognizedEntities, is a data table. It outputs to a table because PAD provides the original text and an extracted value. I can tell it’s a data table because it appears like this under flow variables.

And if I drill into the view, I see this. That means, I need to extract what I want from the table, which I do in the next step.

Set Variable Again

I use another Set Variable action and similar syntax as last time to reference the first cell in the table. Because all indexes start at 0, I use [0][0] (as in %RecognizedEntities[0][0]%) to reference the first row and first column.
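To picture what [0][0] is doing, here is a small Python sketch that models the data table as a list of rows. The column order (extracted value first, original text second) and the sample values are assumptions for illustration, so check the variable viewer in your own flow.

```python
# Assumed layout: each row holds the extracted value first, then the original text.
recognized_entities = [
    [123456789, "123456789"],  # row 0: [value, original text]
]

# The first index picks the row, the second index picks the column.
first_row = recognized_entities[0]
land_number = recognized_entities[0][0]
print(land_number)
```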

Convert Number to Text

Finally, I need to convert my number to text in one last step because it will become part of another string.
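The same idea in Python, as a minimal sketch: a number has to become text before it can be joined to another string. The value and the message wording here are hypothetical.

```python
land_number = 123456789  # hypothetical value pulled from the data table

# Convert the number to text so it can be concatenated with other text.
land_number_text = str(land_number)
message = "Land Number: " + land_number_text
print(message)
```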

And that is how I use the Recognize Entities in Text action to extract numbers from a string. Maybe regex is more efficient, but I’m not great at regex. This was a better solution for me.
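For comparison, here is roughly what a regex approach could look like, sketched in Python rather than in PAD. The function name extract_land_number and the pattern are my own assumptions, based on the “Land Number”, whitespace, digits format described above; this is not what my flow uses.

```python
import re
from typing import Optional


def extract_land_number(line: str) -> Optional[str]:
    """Return the digits that follow 'Land Number', or None if there is no match."""
    # \s+ matches the whitespace after the label; (\d+) captures the run of digits.
    match = re.search(r"Land Number\s+(\d+)", line)
    return match.group(1) if match else None


print(extract_land_number("Land Number   123456789"))
```

Unlike the entity action, this returns the digits as text, so no separate number-to-text conversion is needed.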
