Writing fast constraints with OptaPlanner: the secret recipe

Blog: Drools & jBPM Blog

Do you want OptaPlanner to run faster?
Do you want to increase your score calculation speed, reaching great solutions sooner?
Let me show you how to optimize your Constraint Streams constraints for performance and scalability.
Turns out you only need to remember one advice:

Do less

The key to well-performing constraints is limiting the amount of data that flows through your Constraint Streams,
which starts with joins.
Consider a school timetabling problem, where a teacher must not have two overlapping lessons.
This is how the lesson could look in Java:

    @PlanningEntity
    class Lesson {

        ...

        Teacher getTeacher() { ... }

        boolean overlaps(Lesson anotherLesson) { ... }

        boolean isCancelled() { ... }

        ...

    }

The simplest possible Constraint Stream we could write to penalize all overlapping lessons would then look like:

    constraintFactory.from(Lesson.class)
        .join(Lesson.class)
        .filter((leftLesson, rightLesson) ->
            !leftLesson.isCancelled()
            && !rightLesson.isCancelled()
            && leftLesson.getTeacher()
                .equals(rightLesson.getTeacher())
            && leftLesson.overlaps(rightLesson))
        .penalize("Teacher lesson overlap", HardSoftScore.ONE_HARD)

What this Constraint Stream does is:

It creates all possible pairs of Lessons from the planning solution.
Then it filters out all the lessons that are cancelled, where the teachers do not match, or which do not overlap.
It penalizes all the remaining lesson pairs.

Do you see the problem here?
The join creates a cross product between lessons,
producing a match (also called a tuple) for every possible combination of two lessons,
even though we know that many of these matches will not be penalized.
This shows the problem in numbers:

In order to process a thousand lessons, our constraint first creates a cross product of 1 million pairs,
only to throw away pretty much all of them before penalizing!
If we can reduce the size of the cross product by half, only half of the time will be spent processing it.
This is where the original advice comes into play: do less, by avoiding unrestricted cross product.
Here’s how.

Table 1. Fast growth of cross product
Number of lessons	Number of possible pairs
10	100
100	10 000
1 000	1 000 000

Filter before joining

As you can see from the first example, cancelled lessons are eventually filtered out after the join.
Let’s see if we can remove them from the cross product instead.
For the first lesson in the join (also called “left”), this is straightforward;
we simply bring the cancellation check before the join like so:

    constraintFactory.from(Lesson.class)
        .filter(lesson -> !lesson.isCancelled())
        .join(Lesson.class)
        .filter((leftLesson, rightLesson) ->
            !rightLesson.isCancelled()
            && leftLesson.getTeacher() == rightLesson.getTeacher()
            && leftLesson.overlaps(rightLesson))
        .penalize("Teacher lesson overlap", HardSoftScore.ONE_HARD)

The cancelled lessons are no longer coming in from the left, which reduces the cross product.
However, some cancelled lessons are still coming in from the right through the join.
Here, we will use a little trick and join not with a Lesson class, but with a filtered nested Constraint Stream instead:

    constraintFactory.from(Lesson.class)
        .filter(lesson -> !lesson.isCancelled())
        .join(
            constraintFactory.from(Lesson.class)
                .filter(lesson -> !lesson.isCancelled()))
        .filter((leftLesson, rightLesson) ->
            leftLesson.getTeacher() == rightLesson.getTeacher()
            && leftLesson.overlaps(rightLesson))
        .penalize("Teacher lesson overlap", HardSoftScore.ONE_HARD)

As you can see, we’ve created a new Constraint Stream from Lesson, filtering before it entered our join.
We have now applied the same improvement on both the left and right sides of the join,
making sure it only creates a cross product of lessons which we care about.
But we can still do better!

Prefer Joiners to filters

Filters are just a simple check if a tuple matches a predicate.
If it does, it is sent downstream, otherwise the tuple is removed from the Constraint Stream.
Each tuple needs to go through this check, and that means every pair of lessons will be evaluated.
When a Lesson changes, all pairs with that Lesson will be re-evaluated, but not anymore:

    constraintFactory.from(Lesson.class)
        .filter(lesson -> !lesson.isCancelled())
        .join(
            constraintFactory.from(Lesson.class)
                .filter(lesson -> !lesson.isCancelled()),
            Joiners.equal(Lesson::getTeacher))
        .filter((leftLesson, rightLesson) ->
            leftLesson.overlaps(rightLesson))
        .penalize("Teacher lesson overlap", HardSoftScore.ONE_HARD)

Notice that the Teacher equality check moved from the final filter to something called a Joiner.
We are still saying the same thing – a Lesson pair will only be sent downstream if the Lessons share the same Teacher.
Unlike the filter, this brings the performance benefit of indexing.
Now when a Lesson changes, only the pairs with the matching Teacher will be re-evaluated.
So even though the cross-product remains the same, we are doing much less work processing it.

The final filter now only performs one operation on the final cross product,
and the Lesson pairs that get this far are already trimmed down in the most efficient way possible.

Remove more, earlier

In some cases, you may have an option to pick the order of your Joiners.
In these situations, you should put first the Joiner that will remove more tuples than the others.
This will reduce the size of your cross products faster.

Consider a new situation, where lessons also have rooms in which they happen.
Although there are possibly dozens of teachers, there are only three rooms.
Therefore the join should look like this:

    constraintFactory.from(Lesson.class)
        .join(Lesson.class,
            Joiners.equal(Lesson::getTeacher),
            Joiners.equal(Lesson::getRoom))
    ...

This way, we first create “buckets” for each of the many teachers,
and these buckets will only contain a relatively small number of lessons per room.
If we did it the other way around, there would be a small amount of large buckets,
leading to much more iteration every time a lesson changes.

For that reason, it is generally recommended putting Joiners based on enum fields or boolean fields last.

Conclusion

The key to efficient constraints is the reduction of cross product.
There are three main ways of reducing cross product in Constraint Streams:

Filtering before joining.
Preferring Joiners earlier to filtering later.
Applying the more restrictive Joiners first.

There are other optimization techniques as well, and we will discuss some of them in the future,
but none of them will give as big a benefit as reducing the size of cross products.

The post Writing fast constraints with OptaPlanner: the secret recipe appeared first on KIE Community.