
What a Robot Sees: Using OCR in RPA


In our post about How to Train Your Robot, we briefly touched on the basic ideas that govern robotic process automation (RPA) and how UiPath Desktop makes using RPA easy for anyone.  It’s worth digging a little deeper into some of the ways UiPath brings automation within the grasp of any computer user, whether you’re a coding master or a self-proclaimed novice.

Perhaps the most useful and versatile tool in the UiPath platform is the Record function. Rather than mapping out a process step-by-step, which can be very time-consuming, any UiPath user can teach a robot to “do as I do” by recording the process as it happens.  The software robot follows your clicks and actions on the screen (aka the presentation layer) and then turns them into an editable workflow.  If you’re working entirely in local programs, that’s as much as you’d need to know.  When accessing remote systems and databases, like Citrix or the open web, a UiPath robot can really show off its abilities.

With remote applications, UiPath’s Record function has a difficult time distinguishing things like buttons and text fields.  The whole application window looks like one big button to the robots.  However, these robots are equipped with optical character recognition (OCR), which allows a computer to distinguish a ‘B’ from a ‘D’, for example, even if the size or font is different.  While recording, a UiPath user can run OCR and select the appropriate text within the window, and the robot will be able to locate that text every time afterward.  It still works even if the text appears in a different place, which makes OCR a much more reliable way to automate remote applications than relying on fixed screen positions.
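To make the "find the text wherever it is" idea concrete, here is a minimal Python sketch.  It is not UiPath's implementation.  It assumes an OCR engine has already returned each recognized word along with its bounding box; the `OcrWord` type, the `find_text` helper, and the sample coordinates are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class OcrWord:
    """One word as a hypothetical OCR engine might report it (pixels)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def find_text(words: List[OcrWord], target: str) -> Optional[Tuple[int, int]]:
    """Return the (x, y) center of the first run of words matching target.

    Matching is case-insensitive. Returns None if the text is not on screen.
    """
    wanted = target.lower().split()
    if not wanted:
        return None
    for i in range(len(words) - len(wanted) + 1):
        run = words[i:i + len(wanted)]
        if [w.text.lower() for w in run] == wanted:
            # Merge the bounding boxes of the matched words.
            left = min(w.left for w in run)
            top = min(w.top for w in run)
            right = max(w.left + w.width for w in run)
            bottom = max(w.top + w.height for w in run)
            return ((left + right) // 2, (top + bottom) // 2)
    return None


# The same label at two different positions (made-up sample data).
screen_a = [OcrWord("File", 10, 5, 30, 12), OcrWord("Submit", 200, 300, 60, 20)]
screen_b = [OcrWord("Submit", 40, 80, 60, 20), OcrWord("Cancel", 120, 80, 60, 20)]

print(find_text(screen_a, "Submit"))  # (230, 310)
print(find_text(screen_b, "submit"))  # (70, 90)
```

Because the lookup keys on the recognized text rather than on a remembered position, the same call finds the label on both screens.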

And it’s not just text that UiPath can recognize, but also images.  Again, in remote applications, everything can look the same to an RPA robot, but UiPath solves this problem with excellent image recognition software.  You simply indicate the image you want your robot to identify in the application window, like a “Create expense report” button, and no matter where it appears on the screen in later processes, the UiPath robots can find it.
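The same idea can be sketched for images.  The Python below is a toy illustration of template matching, not the algorithm UiPath uses: it treats the screen and the target image as small grids of pixel values and slides the target across the screen until every pixel lines up.  The `find_image` and `draw` helpers and the pixel values are hypothetical.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # rows of pixel values


def find_image(screen: Grid, template: Grid) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the top-left corner of the first exact match."""
    th, tw = len(template), len(template[0])
    sh, sw = len(screen), len(screen[0])
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            # Compare the template row by row against this window.
            if all(screen[r + dr][c:c + tw] == template[dr] for dr in range(th)):
                return (r, c)
    return None


def draw(rows: int, cols: int, template: Grid, row: int, col: int) -> Grid:
    """Build a blank screen and paste the template at (row, col)."""
    screen = [[0] * cols for _ in range(rows)]
    for dr, line in enumerate(template):
        screen[row + dr][col:col + len(line)] = line
    return screen


# A made-up 2x2 "button" placed at two different spots on a 4x6 screen.
button = [[1, 2],
          [3, 4]]

print(find_image(draw(4, 6, button, 1, 3), button))  # (1, 3)
print(find_image(draw(4, 6, button, 2, 0), button))  # (2, 0)
```

This sketch demands an exact pixel-for-pixel match to keep things short.  A practical tool would more likely score how similar each window is and accept anything above a threshold, so that small rendering differences do not break the match.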

If you’d like to see OCR in action, watch this tutorial video.  UiPath can do a lot more than just recognize letters and numbers! 

Sophisticated character and image recognition software is really at the heart of why RPA works today.  You could say that robots have become more perceptive in recent years, though we’re still years away from robots that can make complex decisions based on those perceptions.  Then again, this kind of software has made self-driving cars at Google a reality, so maybe that future is not so far away.
