
Data Validation with Rule Sets

Blog: The Data Center

"Garbage in, garbage out" was first uttered by an IBM employee back in 1965, and this computer science maxim is just as valid today as it was nearly 55 years ago. In fact, we are swimming in more data than ever, generated by new software applications and IoT devices that add to the noise every second. With all of this data comes the need to ensure it is cleaned, summarized, or otherwise prepared for further analysis or action.

Many software applications lack data validation capabilities, or those capabilities weren't configured when the systems first went live. As a result, data records are often duplicated, with multiple entries for the same customer, vendor, product, complaint, or ticket. We have all experienced this phenomenon: duplicates are likely living in nearly all of our systems, creating havoc when we try to roll data up for analysis or associate different data elements.

It doesn't have to stay this way. Rule sets within Decisions can be used to clean operational and reporting data. There are typically three methods for cleaning and maintaining data in both operational and reporting systems. In each method below, a set of data is run through a set of data validation rules, each of which operates on a different field or attribute within a single data set.
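A rule set of this shape can be sketched in a few lines. This is a minimal illustration, not the Decisions rule engine itself: the field names and the two rules below are hypothetical stand-ins for rules you would author in the product.

```python
# Minimal sketch of a rule set: each rule targets one field of a record.
# The field names and rules are illustrative, not the Decisions API.

def not_blank(value):
    return bool(value and value.strip())

def valid_zip(value):
    return value.isdigit() and len(value) == 5

# Each rule operates on a different field of the same data set.
RULES = [
    ("name", not_blank),
    ("zip", valid_zip),
]

def validate(record):
    """Return the list of field names that fail their rule."""
    return [field for field, check in RULES if not check(record.get(field, ""))]

print(validate({"name": "Acme Corp", "zip": "30165"}))  # []
print(validate({"name": "", "zip": "3016"}))            # ['name', 'zip']
```

The same `validate` function can back any of the three methods below; only where and when it is called changes.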

Method 1 – Validate Upon Entry: This is always the ideal case, although some software applications cannot call external services and therefore cannot use this method. If the application does support it, data can be passed to the rules engine for processing before it is saved back to the application database.
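Validate-upon-entry can be sketched as a gate in front of the save path. In this sketch the external rules-engine call is stubbed as a plain function; in a real deployment it would be a service call from the application to the rules engine, and the function and field names here are hypothetical.

```python
# Sketch of Method 1: run each record through the rules engine before
# it is persisted. The engine is stubbed locally for illustration.

def rules_engine(record):
    """Stand-in for the external rules engine: flag blank customer names."""
    return {"valid": bool(record.get("customer", "").strip())}

def save_record(record, db):
    """Only records that pass validation reach the database."""
    result = rules_engine(record)
    if result["valid"]:
        db.append(record)   # persist clean data
    return result["valid"]  # caller can surface a correction prompt on False

db = []
save_record({"customer": "Acme Corp"}, db)  # saved
save_record({"customer": "   "}, db)        # rejected before it hits the database
```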

Method 2 – Clean & Replace: With this method, nightly jobs grab newly entered data from your operational system (CRM, ERP, etc.) and process it through a rule set within Decisions. These rules validate addresses, company names, and any other commonly duplicated or misspelled data items, standardizing each to a common spelling. Once cleaned, the data can be replaced, deleted, or repaired in the operational system.
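The nightly standardize-and-replace pass might look like the following sketch. The canonical-spelling table and the duplicate fingerprint are illustrative assumptions; a production rule set would carry far richer matching logic.

```python
# Sketch of Method 2: standardize company names to a canonical spelling,
# then collapse records that become exact duplicates after cleaning.

CANONICAL = {
    "ibm corp": "IBM Corporation",   # illustrative lookup table
    "i.b.m.": "IBM Corporation",
}

def standardize(record):
    key = record["company"].strip().lower()
    record["company"] = CANONICAL.get(key, record["company"])
    return record

def clean_batch(records):
    seen, cleaned = set(), []
    for rec in map(standardize, records):
        fingerprint = (rec["company"], rec["email"])
        if fingerprint not in seen:   # drop duplicates of the same customer
            seen.add(fingerprint)
            cleaned.append(rec)
    return cleaned                    # write this back over the source rows
```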

Method 3 – Clean for Reporting: In this method, operational data is run through the rule set and cleaned before it is added to a data warehouse or data lake. The data is left as-is in the operational system but is cleaned on its way into the warehouse. If the data warehouse is where business people look when making operational decisions, this can be perfectly acceptable.
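The key property of this method is that the source rows are never touched: cleaning happens on a copy destined for the warehouse. A minimal sketch, with hypothetical field names and rules:

```python
# Sketch of Method 3: the operational rows are left untouched; a cleaned
# copy is produced for the warehouse load.

def clean_for_warehouse(operational_rows, rules):
    warehouse_rows = []
    for row in operational_rows:
        cleaned = dict(row)           # copy: never mutate the source system's data
        for field, fix in rules.items():
            cleaned[field] = fix(cleaned.get(field, ""))
        warehouse_rows.append(cleaned)
    return warehouse_rows

rules = {"state": str.upper, "name": str.strip}
src = [{"name": " Acme ", "state": "ga"}]
print(clean_for_warehouse(src, rules)[0])  # {'name': 'Acme', 'state': 'GA'}
# src still holds the original, uncleaned operational values
```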

In each method above, the rules themselves are the same; the only difference is when and where the data validation takes place in your process. If you have a particularly tricky data cleaning project, we would love to hear about it. Please feel free to reach out to us.

The post Data Validation with Rule Sets appeared first on Decisions Blog.

