As a developer, it is often just as important to understand the unspoken implicit requirements of a new feature or project as it is to understand the explicitly stated ones. Sometimes a customer or Subject Matter Expert (SME) will make an innocuous request and couch it in terms that lead you down the wrong path. “We just need a checkbox so we can see if this is done.” “Can you add a place where we can enter free-form notes?” “Can you show me the date this was approved?” Each of these examples sounds like a straightforward request: slap a boolean, string or date field on an entity, make the necessary service level and UI changes, and boom! New feature added, Jira ticket resolved.
Let’s take a step sideways for a second and review some patterns from Volume 3 of “The Data Model Resource Book” by Len Silverston, wherein he lays out four levels of modeling for statuses. In the examples below, I’ll be modeling an Order entity that can be in one of four states: Created, Approved, Cancelled or Completed. The first level tracks each status event with a date field on the entity:
When an order is created, the created_date field is populated. When the order is approved, the approved_date is populated, etc. This allows you to keep track of when each state transition occurred, but isn’t very flexible, requiring a schema change to add new states. It can also lead to ambiguities if it isn’t populated correctly (e.g. if the cancelled_date and completed_date are both set, is the order cancelled or completed?)
The second level tracks the current status and date the status was entered in the entity. Statuses are constrained to a limited set of valid values according to a status type:
Level 2
The main advantage of this pattern is that the set of statuses in your workflow is unconstrained by the data model. You can add new ones without schema changes. The downside is that it doesn’t provide a history of transitions, which leads to the third level:
Level 3
This combines the benefits of the first and second levels without the drawbacks. You can add new statuses without schema changes and record transitions. Also note that I added a column to track the user who invoked the status change. The fourth level abstracts tracking of status transitions into a single table for all entities:
Level 4
In this example, order_id, shipment_id, invoice_id and task_id each refer to a different entity. It captures the same information as the level three pattern, but with a different table structure. Now back to the original problem with identifying unspoken customer requirements.
One common mistake that I’ve seen several times, and made myself more than once, is the failure to recognize when a request for a date field is actually a request for a status and all its attendant information. It starts with the date an order was completed, then wanting to know the date it was approved, and before long, you’ve fallen into level one without even realizing it was happening. Now when you want to move to something more robust (i.e. level three or four) you have to modify the schema, migrate a bunch of data, rewrite services, etc. It will be hard to rationalize to the customer the time and effort to correct the problem, and you may get stuck with a lousy model. That’s why I always take extra care with requirements for date and time information, and try to frame them with the following questions:

  1. Will the date drive the workflow surrounding the entity?
  2. Do we need to know when the date changes and who changed it?
  3. Does this date capture an event at a single point in time? Or does it cover a period of time?

If the answer to the first question is “yes”, you almost certainly need to be tracking the status of the entity, and the answer to question two will be “yes” as well (if not today, then at some point in the future). In such a case, I would strongly recommend that you use the third or fourth data modeling levels. You will look like a visionary when the customer wants to know the average time it takes to approve an order, and you will be able to modify workflows without any changes to the underlying data model.
You can also take it a step further by modeling the rules about valid state transitions, like so:
This makes reasoning about the workflow straightforward, allows you to make changes to the flow without changing the validation logic, and can form the basis for workflow configuration tooling.
By the way, if the answer to the first question from above is “no”, the third question will help you accurately model the business process. For example, the estimated delivery date for a shipment might be a single point in time, whereas the cable guy is going to show up within a six month window. Apologies to all the cable guys out there, I’m just kidding.

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *