Monday, December 7, 2009

Human Resolution or Disambiguation -- False Positive and False Negative Identification: Health Care and Information Technology Perspectives

Disambiguation of IDs is the process of resolving multiple potential matches into a match with the correct person. In general, statistical matching algorithms are likely to require substantially more frequent disambiguation than a system that uses theoretically perfect universal IDs, and that disambiguation is often done by human intervention. It imposes significant costs and operational inefficiencies, particularly if, for example, a physician must resolve the ambiguities.

Note 1: Many of the efficiency and safety benefits theoretically possible with health information technology (HIT) systems depend on eliminating such human involvement and its concomitant slowness, expense, and propensity for error.

Note 2: What follows applies to IDs in general, even though I’ve chosen the healthcare industry for much of this discussion.

Disambiguation sometimes entails implementing significant new workflows that may require substantial time and resources. When human involvement is required, many of the potential benefits of automation are lost. For example, at the point of care, disambiguation is often done by asking the patient further questions regarding personal characteristics and/or health care history.

The potential for error in the statistical matching methods (see my December 1 post on unique patient IDs) has important safety implications, which are a chief concern for many in the health care profession. Two types of errors are involved in statistical matching: false positives, in which there is a link to the wrong patient’s records, and false negatives, in which not all of a patient’s records are found. A graphic representation of these types of errors and of how they relate to the probabilities and threshold for matching is shown in the figure below.

The horizontal scale shows the score of a particular match. Each matching attribute contributes its weight, or value, to the score, so the more attributes that match, the higher the score and the higher the probability that the patient is correctly matched to that record. A low score indicates a low probability of a match (and a high probability that the record does not match). It is possible to use a threshold above which the record is assumed to match and below which it is assumed not to match, which leads to the shaded areas above and below the threshold.
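As a minimal sketch of the scoring idea just described, here is how a weighted score and a single threshold might look in Python. The attribute names, the weights, and the threshold value are all assumptions made up for illustration; they do not come from any real matching product. I have also assumed that a score exactly at the threshold counts as a match.

```python
# Hypothetical attribute weights; a real system would derive these from data.
WEIGHTS = {"last_name": 3.0, "first_name": 2.0, "birth_date": 4.0, "zip_code": 1.0}


def match_score(query, record, weights=WEIGHTS):
    """Sum the weights of the attributes on which the two records agree."""
    return sum(
        weight
        for attr, weight in weights.items()
        if query.get(attr) is not None and query.get(attr) == record.get(attr)
    )


def is_match(query, record, threshold=7.0):
    """Single-threshold rule: a score at or above the threshold is assumed to match."""
    return match_score(query, record) >= threshold


query = {"last_name": "Smith", "first_name": "Ann",
         "birth_date": "1970-02-03", "zip_code": "94110"}
record = {"last_name": "Smith", "first_name": "Anne",
          "birth_date": "1970-02-03", "zip_code": "94110"}

# The first names differ, so only three of the four weights are counted.
print(match_score(query, record), is_match(query, record))
```

In this toy case the misspelled first name lowers the score, but the remaining attributes still carry the pair over the assumed threshold.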

The area shaded to the right of the threshold is the region corresponding to false positives, or picking up the wrong patient’s records. The shaded area to the left of the threshold is the region of false negatives, or the records of the patient that are not picked up because of some non-matching personal attributes. Balancing the two types of errors means tuning the threshold: raising it reduces false positives at the cost of more false negatives, and lowering it does the reverse.
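To make the two error regions concrete, the following sketch counts false positives and false negatives at several thresholds. The scored pairs are invented toy data, not measurements, and the function name is my own.

```python
# Toy labeled pairs: (match score, True if the record really belongs to the patient).
# The scores are invented for illustration only.
SCORED_PAIRS = [
    (2.0, False), (3.5, False), (4.0, True), (5.0, False),
    (5.5, True), (6.0, False), (7.0, True), (8.5, True),
]


def count_errors(pairs, threshold):
    """Return (false_positives, false_negatives) for a single threshold."""
    # False positive: accepted as a match, but it is someone else's record.
    false_positives = sum(1 for score, same in pairs if score >= threshold and not same)
    # False negative: rejected, but it really is the patient's record.
    false_negatives = sum(1 for score, same in pairs if score < threshold and same)
    return false_positives, false_negatives


for threshold in (3.0, 5.25, 7.5):
    fp, fn = count_errors(SCORED_PAIRS, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

Because the true and false pairs overlap in score in this toy data, moving the threshold only shifts errors from one type to the other.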

Another approach illustrated in this figure is to define a region of ambiguity within which possible matches are tagged for human resolution, or disambiguation. Whether matching uses a single threshold or two thresholds, it is not possible to avoid encountering false-positive and false-negative matches. Adjusting the threshold or thresholds can result in a different proportion of false-positive and false-negative errors, but cannot be used to eliminate them because they result from the inherent characteristics of the population that lead to the two S-shaped curves.
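The two-threshold approach can be sketched as a three-way classification. The threshold values and the outcome labels below are assumptions chosen for illustration; how scores that fall exactly on a threshold are treated is also my own choice.

```python
def classify(score, lower=4.5, upper=6.5):
    """Two-threshold rule with a region of ambiguity between lower and upper."""
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if score >= upper:
        return "match"
    if score < lower:
        return "no match"
    # Anything in between is tagged for a person to resolve.
    return "needs human review"


for score in (3.0, 5.0, 6.5, 8.0):
    print(score, classify(score))
```

Widening the band sends more candidates to a person and fewer to automatic decisions, which is exactly the cost trade-off discussed at the top of this post.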

As noted above, many end-to-end business processes require human interaction.

Task Assignment and Routing

Human workflow supports declarative assignment and routing of tasks. In the simplest case, a task is assigned to a single participant (user or group). However, there are many situations in which more detailed task assignment and routing is necessary (for example, when a task must be approved by a management chain or worked and voted on by a set of people in parallel, as shown in the figure below). To illustrate, I’ve chosen the human workflow tools in the Oracle SOA Suite (see the figures below), which provide declarative, pattern-based support for such scenarios.

I’ll briefly elaborate here with an introduction to human workflow and continue the discussion in my next post, where I'll talk about how you might implement such a system.

Participant Type

In simple cases, a participant maps to a user, group, or role. However, workflow supports declarative patterns for common routing scenarios such as management chain and group vote. The following participant types are available:

Single approver

This is the simple case where a participant maps to a user, group, or role. Because at least one human being is involved, the task entails much more than someone looking at a screen and clicking a mouse.

For example, a vacation request is assigned to a manager. The manager must act on the request task three days before the vacation starts. If the manager formally approves or rejects the request, the employee is notified of the decision. If the manager does not act on the task, the request is treated as rejected, and notification actions similar to those for a formal rejection are taken.
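The deadline rule in the vacation example can be sketched in a few lines. This is plain Python, not the Oracle SOA Suite API; the function name, parameter names, and outcome strings are hypothetical, and I have assumed that an action taken after the deadline is ignored.

```python
from datetime import date, timedelta

# The manager must act at least this long before the vacation starts.
ACTION_WINDOW = timedelta(days=3)


def resolve_request(vacation_start, today, manager_action=None, acted_on=None):
    """Return "approved", "rejected", or "pending" as of `today`.

    manager_action is "approve", "reject", or None if the manager has not acted;
    acted_on is the date of that action.
    """
    deadline = vacation_start - ACTION_WINDOW
    acted_in_time = (
        manager_action in ("approve", "reject")
        and acted_on is not None
        and acted_on <= deadline
    )
    if acted_in_time:
        return "approved" if manager_action == "approve" else "rejected"
    if today > deadline:
        # No timely action: the request is treated as rejected.
        return "rejected"
    return "pending"


start = date(2009, 12, 24)
print(resolve_request(start, date(2009, 12, 10), "approve", date(2009, 12, 9)))
print(resolve_request(start, date(2009, 12, 22)))
```

In a real workflow engine the expiration would be declared on the task rather than coded by hand, but the decision logic is the same.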


Parallel

This participant indicates that a set of people must work in parallel. This pattern is commonly used for voting.

For example, multiple users in a hiring situation must vote to hire or reject an applicant. You specify the voting percentage that is needed for the outcome to take effect, such as a majority vote or a unanimous vote.
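The voting rule can be sketched as a percentage test. Again, this is illustrative Python rather than any product's API; the function name and vote labels are hypothetical, and I have assumed that falling short of the required percentage means the applicant is rejected.

```python
def vote_outcome(votes, required_percent):
    """Return "hire" if the share of "hire" votes meets required_percent, else "reject"."""
    if not votes:
        raise ValueError("at least one vote is required")
    hire_share = 100.0 * sum(1 for vote in votes if vote == "hire") / len(votes)
    return "hire" if hire_share >= required_percent else "reject"


votes = ["hire", "hire", "reject"]
print(vote_outcome(votes, required_percent=51))   # majority rule
print(vote_outcome(votes, required_percent=100))  # unanimous rule
```

The same three votes produce different outcomes under the two rules, which is why the required percentage is part of the task definition rather than left to the voters.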


Serial

This participant indicates that a set of users must work in sequence. Although working in sequence can be specified in the routing policy by using multiple participants in sequence, this pattern is useful when the set of people is dynamic. The most common scenario is management chain escalation, which is done by specifying, within this pattern, that the list is based on a management chain. More on routing later.
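Here is a minimal sketch of a dynamic, sequential list built from a management chain. The org chart, the names, and both functions are hypothetical, and stopping at the first rejection is my own assumption about how the sequence ends.

```python
# Hypothetical org chart: each person maps to his or her manager (None at the top).
MANAGER_OF = {"alice": "bob", "bob": "carol", "carol": None}


def management_chain(employee, levels, manager_of=MANAGER_OF):
    """Build the list of approvers by walking up at most `levels` managers."""
    chain = []
    current = manager_of.get(employee)
    while current is not None and len(chain) < levels:
        chain.append(current)
        current = manager_of.get(current)
    return chain


def route_in_sequence(approvers, decide):
    """Ask each approver in order; stop at the first one who does not approve."""
    for approver in approvers:
        if decide(approver) != "approve":
            return f"rejected by {approver}"
    return "approved"


chain = management_chain("alice", levels=2)
print(chain)
print(route_in_sequence(chain, lambda approver: "approve"))
```

The list of approvers is computed when the task is routed, so a change in the org chart changes the routing without any change to the process definition.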

FYI (For Your Information)

This participant also maps to a single user, group, or role, just as in single approver. However, this pattern indicates that the participant just receives a notification task and the business process does not wait for the participant's response. FYI participants cannot directly impact the outcome of a task, but in some cases can provide comments or add attachments.

Readers who are interested in learning more about the subject of human resolution or disambiguation in an otherwise automated system might look at the following two books, while waiting for my next post.