bpmn dmn standard blog posts

Translating Excel Examples into DMN Logic

Blog: Method & Style (Bruce Silver)

In my Low-Code Business Automation methodology, the first step is something I call whiteboarding – developing examples of the logic created in Excel with an eye toward generalizing the logic to handle any possible example and then translating to DMN. This is actually the hardest part of the project, and the only part requiring subject matter expertise. The rest of it is purely mechanical.

In the beta testing of my Business Automation training, some students have complained about the difficulty of doing this. In the whiteboard spreadsheets, if a cell value is not simply hard-coded, it is often a simple reference to some other cell, or maybe adding two cell values together. But why does the formula reference that particular cell? It’s much easier to answer that question in plain English than to express the generalized logic in Excel formulas. Unfortunately, Excel formulas have never been great at defining generalized table-handling logic.

This is why I was excited to hear about some new capabilities of the Excel formula language, now being rebranded as Power FX, to handle data tables, using what Microsoft is calling dynamic arrays. New built-in functions like FILTER and SORT now offer the chance to manipulate table data in a way that, even Microsoft now admits, was virtually impossible before. I set out to see whether using the new dynamic array functions in Excel to generalize the logic before translating to DMN would simplify my methodology. This post lays out what I found.

Here is a simplified version of the example we use in my training, in which business events representing stock trades are recorded in tables that manage the user’s stock portfolio. Each account in the system is initialized with $100,000 in Cash. If the user buys 100 shares of ticker X at $200/share, the portfolio immediately afterwards contains $20,000 of ticker X and $80,000 of Cash. If he next sells 50 shares of ticker X at $220/share, that returns $11,000 in Cash, for a total of $91,000, and his remaining 50 shares of X are now worth $11,000, for a total portfolio value of $102,000. If he makes no more trades and ticker X drops to $210/share, he still has $91,000 Cash and $10,500 worth of X, for a total portfolio value of $101,500. You don’t need to be a Wall Street genius to comprehend this.

In my simplified training whiteboard, there are 5 trades, including initialization. The logic maintains 3 tables: Trade, simply a transaction log of the trades; Position, which calculates the change in two account positions – the traded ticker and Cash – at the time of each trade; and Portfolio, which reports the current value of each ticker and Cash, based on current prices. (There is a fourth table also, Prices, which holds the current price of each ticker in the Portfolio.) After 5 trades, the tables look like this:

The Trade table is self-explanatory. After initialization, the TradeType is either Buy or Sell. Each trade adds one record to the table. Shares is always positive, but the Value (magnitude calculated as Shares * Price) is negative for a Sell.

Like the Trade table, the Position table is immutable. Two records are appended with each trade, one for the traded ticker and one for Cash, but existing records are unchanged. TradeID links the Position record to a record in the Trade table. So, looking at this table, Trade 2 is a Buy of 10 shares AMZN at $3399.79/share, a Value at the time of the trade of $33,997.90. This amount is subtracted from the previous Cash balance, leaving $66,002.10. In the Position table, Shares means the net shares remaining after the trade. Trade 3 is a Sell of 5 shares AMZN at $3401.07/share, or a Value of -$17,005.35. So in the Position table, AMZN Shares decreases by 5 from the prior value of 10, leaving a net value of 5, and Cash increases by $17,005.35, for a net value of $83,007.45.

Unlike the first two tables, records in the Portfolio table are updated with each trade and with every change of the current Price. There is one Portfolio record for each ticker plus Cash. Here Shares and Price are the current net share count and price at the moment indicated by the Timestamp, so the sum of the Value column is the total portfolio value at that time.

Now the modeler’s challenge is to look at these tables and generalize the logic. If he is unlucky, the values in the subject matter expert’s whiteboard table are hard-coded. That’s not good. More often, you can cajole him into providing formulas referencing other cells.

So consider cell U8 in the Position table above, the net Shares of AMZN remaining after Trade4. Typically the subject matter expert’s formula would be written

=U6 - N5

where U6 is the previous Shares of AMZN in the Position table (5) and N5 is the number of Shares sold in Trade4 (5), leaving a net value of 0. Why subtract the shares instead of add? That’s because it is a Sell not a Buy. A Buy would add, not subtract. Now hopefully you see the problem: The Excel formula is based on the subject matter expert’s knowledge of these particular table cells. It is not generalized.

The modeler must deduce the logic from the formula above, or he may be able to get the subject matter expert to do it for him. In plain English, the generalized logic for the Shares column of the Position table would be something like this:

If Ticker is not Cash and TradeType is Buy, Position.Shares is the value of Portfolio.Shares of that ticker before the trade PLUS the value of Trade.Shares.
If Ticker is not Cash and TradeType is Sell, Position.Shares is the value of Portfolio.Shares of that ticker before the trade MINUS the value of Trade.Shares.
If Ticker is Cash, Position.Shares is Portfolio.Shares of Cash before the trade MINUS Trade.Value.

In plain English, that’s pretty straightforward. Can we generalize it in the Excel formula language using the new dynamic array functions? Yes, we can, but it’s harder than you think. The formula for cell U8, AMZN Position.Shares after Trade4, is:

=IFERROR(FILTER(Portfolio3[Shares], Portfolio3[Ticker]=Event4[Ticker]),0)
 +IF(Event4[TradeType]="Buy", Event4[Shares], -1*Event4[Shares])

In the expression above, Portfolio3 refers to the Portfolio table before Trade4, and Event4 is Trade4. The key is the new FILTER function, which selects rows of a table. To use the column names rather than cell coordinates, we first need to turn those cell ranges into Excel tables. (Just select the range, Insert table. Check Use header rows, uncheck Total row, and assign the table a name.) The syntax is:

FILTER(table, condition)

returning rows of table satisfying the condition. In the Excel formula language, Portfolio3[Shares] is a single column table, the Shares column of Portfolio3. The condition selects rows of the Ticker column of Portfolio3 that match the trade ticker, Event4.Ticker. What if this is a new Buy and there are no rows of Portfolio3 that match? This is where the Excel function IFERROR comes in. The first argument of IFERROR is the returned value if FILTER finds a match, and the second argument is the returned value if FILTER does not return a match, which we want to be 0. The second part of the expression adds or subtracts the traded shares, using the IF function, with the syntax

IF(condition, value if true, value if false)

Let’s compare this Excel formula with the corresponding FEEL:

(if Portfolio3[Ticker=Event4.Ticker]!=null then Portfolio3[Ticker=Event4.Ticker].Shares[1] else 0)
 + if Event4.TradeType="Buy" then Event4.Shares else -1*Event4.Shares

The syntax of filter expressions is different but overall the two are about equal in complexity, and you could argue that going from the generalized Excel formula to FEEL is easier than going from plain English directly to FEEL.

Now look at cell U9, the Position.Shares of Cash after Trade4. The generalized Excel formula is

=FILTER(Portfolio3[Shares], Portfolio3[Ticker]="Cash") - Event4[Value]

Here we know that Portfolio3 always has a row for Cash, so we don’t need IFERROR. The corresponding FEEL formula is

Portfolio3[Ticker="Cash"].Shares - Event4.Value

Here I would give a slight friendliness edge to FEEL, but again going from the generalized Excel to FEEL is probably easier than going from plain English directly to FEEL.

The harder case is Shares in the Portfolio table. Here let’s look at Portfolio5, meaning after Trade5. In plain English, Portfolio5.Shares for a ticker (including Cash) is the most recent value of Position5.Shares for that ticker, where Position5 is the new Position table after Trade5. The Excel formula is this:

=INDEX(SORT(FILTER(Position5, Position5[Ticker]=[@Ticker]),1,-1),1,4)

This is messy. Starting from the innermost expression and working outward, we filter Position5 to extract rows matching the Ticker value for the current row in Portfolio5. That is the meaning of the cryptic [@Ticker] in the filter. Next, to get the most recent one, we sort that filtered table based on the column number (1, the second argument), in descending order (-1, the third argument). Finally, we use Index to extract row 1 of that sorted filtered list (second argument), column 4 (third argument). That’s pretty ugly. Here it is in FEEL:

Position5[item.Ticker=Ticker].Shares[-1]

In this case it’s a clear win for FEEL, and going directly from plain English to FEEL is probably easier than trying to use the Excel formula. Excel would clearly benefit from having a better way to select the last item in a list.

In the end, would it be advantageous in the training to encourage students to develop generalized Excel formulas before going to FEEL? I’m not sure. The FEEL-haters insist that Excel formulas are MUCH more business-friendly than FEEL, but I have my doubts. Another thing arguing against is the fact that Microsoft’s documentation is not great. Some Power FX functions work only in PowerApps, not in Excel, and some that work in both use different syntax. For example, how do you use the Filter function with multiple conditions? The Power FX documentation says any of the following work:

FILTER(table, condition1, condition2)
FILTER(table, AND(condition1, condition2))
FILTER(table, condition1 And condition2)
FILTER(table, condition1 && condition2)

In Excel, NONE of those work. What works, as shown in the Excel Formula Help, is this:

FILTER(table, (condition1) * (condition2))

I’m debating what to do about generalized Excel formulas, hoping that some reader more experienced in Excel table operations can show me a better way to handle it. If that’s you, please add your comments.