Monday, August 24, 2009

Reasoning for Ontology Engineering and Usage and The Challenges of Modern Medical Ontologies

In my August 6 post, I briefly introduced ontology editor Protégé 4.0 with the reasoners FaCT++ (implemented using C++) and Pellet (Java based). Today's post picks up this story and adds the RacerPro (commercial) reasoner to the mix. You can -- and I recommend that you do -- download your own copies of the latest versions of these tools. Links that enable you to do so are located at the end of this post.



Protégé 4.0 with three reasoner add-ins

A reasoner is a piece of software able to infer logical consequences from a set of asserted facts or axioms. In the present context, a reasoner makes inferences about classes and individuals in an ontology, tasks that are beyond the Web Ontology Language (OWL) model alone.
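As a tiny, hand-made illustration (the class and individual names below are mine, not taken from any published ontology), here is an OWL fragment with a comment noting what a reasoner adds to what is asserted:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://www.example.org/demo#"
         xml:base="http://www.example.org/demo">

  <!-- Asserted axioms: Cat is a kind of Mammal, Mammal is a kind of Animal. -->
  <owl:Class rdf:ID="Animal"/>
  <owl:Class rdf:ID="Mammal">
    <rdfs:subClassOf rdf:resource="#Animal"/>
  </owl:Class>
  <owl:Class rdf:ID="Cat">
    <rdfs:subClassOf rdf:resource="#Mammal"/>
  </owl:Class>

  <!-- Asserted fact: Felix is a Cat. -->
  <Cat rdf:ID="Felix"/>

  <!-- Nowhere asserted, but inferred by a reasoner:
       Cat is a subclass of Animal, and Felix is a Mammal and an Animal. -->
</rdf:RDF>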

Ontologies, as described in prior posts, are formal vocabularies of terms, often shared by a community of users, and, as such, ontologies play an important role in semantic interoperability and Web 3.0. One of the most prominent application areas of ontologies is medicine and the life sciences. For example, the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is a clinical ontology. Another example is the OBO Foundry -- a repository containing about 80 biomedical ontologies.

These ontologies are gradually superseding existing medical classifications and will provide the future platforms for gathering and sharing medical knowledge. Capturing medical records using ontologies will reduce the possibility for data misinterpretation, and will enable information exchange between different applications and institutions.

Medical ontologies are strongly related to description logics (DLs), which provide the formal basis for many ontology languages, most notably the W3C standardised OWL. All the above mentioned ontologies are nowadays available in OWL and, therefore, in a description logic. The developers of medical ontologies have recognised the numerous benefits of using DLs, such as the clear and unambiguous semantics for different modelling constructs, the well-understood tradeoffs between expressivity and computational complexity, and the availability of provably correct reasoners and tools (discussion to follow).

The development and application of ontologies crucially depend on reasoning. Ontology classification, i.e., organising classes into a specialisation/generalisation hierarchy, is a reasoning task that plays a major role during ontology development: it provides for the detection of potential modelling errors such as inconsistent class descriptions and missing sub-class relationships. For example, about 180 missing sub-class relationships were detected when the version of SNOMED CT used by the NHS was classified using the DL reasoner FaCT++. Query answering is another reasoning task that is mainly used during ontology-based information retrieval; e.g., in clinical applications query answering might be used to retrieve "all patients that suffer from nut allergies".
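To make the query example concrete, here is a sketch of how such a query can be expressed as a defined class that a reasoner then populates. The class and property names are mine, not drawn from SNOMED or any real clinical model:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://www.example.org/clinic#"
         xml:base="http://www.example.org/clinic">

  <owl:Class rdf:ID="Patient"/>
  <owl:Class rdf:ID="NutAllergy"/>
  <owl:ObjectProperty rdf:ID="suffersFrom"/>

  <!-- "All patients that suffer from nut allergies", written as a defined class. -->
  <owl:Class rdf:ID="NutAllergyPatient">
    <owl:equivalentClass>
      <owl:Class>
        <owl:intersectionOf rdf:parseType="Collection">
          <owl:Class rdf:about="#Patient"/>
          <owl:Restriction>
            <owl:onProperty rdf:resource="#suffersFrom"/>
            <owl:someValuesFrom rdf:resource="#NutAllergy"/>
          </owl:Restriction>
        </owl:intersectionOf>
      </owl:Class>
    </owl:equivalentClass>
  </owl:Class>

  <!-- Query answering amounts to asking the reasoner for every individual
       it can classify under NutAllergyPatient. -->
</rdf:RDF>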

Despite the impressive state of the art, modern medical ontologies pose significant challenges to both the theory and practice of DL-based languages. Existing reasoners can efficiently deal with some large ontologies, but many important ontologies are still beyond the reach of available tools (i.e., the tools are unable to classify some widely used ontologies).

Applications currently need to work around these limitations, e.g., by using subsets of ontologies that can be successfully processed. For example, the version of GALEN typically used in practice contains only about 20% of the axioms of the full version; this reduces the interaction between concepts and thus makes the ontology "processable". This is, however, highly undesirable in practice, because it reduces coverage, weakens the conceptualisation of the domain and may prevent the detection of modelling errors.

Furthermore, the amount of data used with ontologies can be orders of magnitude larger than the ontology itself. For example, the annotation of patients' medical records in a single hospital can easily produce data consisting of hundreds of millions of facts, and aggregation at a national level might produce billions of facts. Existing reasoners cannot cope with such data volumes, especially not if ontologies such as GALEN and FMA are used as schemata.

Having forewarned you about these limitations, I'd like to recommend the following video on reasoners -- free and commercial.

http://videolectures.net/iswc08_moller_itsr/

Some readers of this blog might not be familiar with terms that appear in the video, starting with ABox and TBox. Roughly speaking, the TBox is the terminological part of an ontology (the class and property axioms), while the ABox contains the assertions about individuals, i.e., the data.



For those readers especially, the following links to the Protégé, Pellet and RacerPro sites can be used to install and examine this software (and its accompanying documentation) before watching the video.

Protégé http://protege.stanford.edu/

Pellet http://clarkparsia.com/pellet/

RacerPro http://www.racer-systems.com/

Then, as you watch the video, you could follow along using your own running code. For some, this will require more than a single session.

Recommended reading - basics of description logics:
www.inf.unibz.it/~franconi/dl/course/dlhb/dlhb-01.pdf
www.inf.unibz.it/~franconi/dl/course/dlhb/dlhb-02.pdf
http://www.cs.man.ac.uk/~horrocks/Slides/IJCAR-tutorial/Print/p1-introduction.pdf

Thursday, August 13, 2009

Semantic Web Technologies: Ontologies, Agents, Web Services


Here's a Web app developed at Stanford University some time ago. It's made up of a single ontology with entries for foods and wines, an ontology agent that returns text from this ontology and a portal that manipulates the text returned by the agent.

http://onto.stanford.edu:8080/wino/index.jsp

The ontology underlying the agent used here contains hierarchies and descriptions of food and wine categories, along with restrictions on how particular instances might be paired together.

For readers wanting an easy-to-read discussion on what exactly an agent is and how it works with ontologies, see

http://www.nature.com/nature/webmatters/agents/agents.html

The next three figures are screen shots taken from this Web app. It doesn't always employ state-of-the-art technologies like OWL (discussed in earlier posts) under the hood, but it provides a simple example of the most basic building blocks found in many of today's Semantic Web apps (also discussed in earlier posts).



Figure I - The home page




Figure II - After clicking on the "Roast Duck" link




Figure III - After clicking on the "Web Inventory Search" link

The ontology defines the concept of a wine. According to the specification, a wine is a potable liquid produced by at least one maker of type winery, and is made from at least one type of grape (such grapes are restricted to wine grapes elsewhere in the ontology).

The declaration additionally stipulates that a wine comes from a region that is wine-producing and, most importantly to the agent, that a wine has four properties: color, sugar, body, and flavor.
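In OWL, that kind of declaration looks roughly like the fragment below. The names loosely follow the Stanford/W3C wine ontology but are simplified here, so treat this as a sketch rather than the actual file:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://www.example.org/wine#"
         xml:base="http://www.example.org/wine">

  <owl:Class rdf:ID="PotableLiquid"/>
  <owl:Class rdf:ID="Winery"/>
  <owl:ObjectProperty rdf:ID="hasMaker"/>
  <owl:ObjectProperty rdf:ID="madeFromGrape"/>
  <!-- The four properties the agent cares about. -->
  <owl:ObjectProperty rdf:ID="hasColor"/>
  <owl:ObjectProperty rdf:ID="hasSugar"/>
  <owl:ObjectProperty rdf:ID="hasBody"/>
  <owl:ObjectProperty rdf:ID="hasFlavor"/>

  <owl:Class rdf:ID="Wine">
    <!-- A wine is a potable liquid ... -->
    <rdfs:subClassOf rdf:resource="#PotableLiquid"/>
    <!-- ... produced by at least one maker of type Winery ... -->
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasMaker"/>
        <owl:someValuesFrom rdf:resource="#Winery"/>
      </owl:Restriction>
    </rdfs:subClassOf>
    <!-- ... and made from at least one type of grape. -->
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#madeFromGrape"/>
        <owl:minCardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:minCardinality>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>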

The concept of a meal course underlies pairing of a food with a wine. Each course is a consumable thing comprising at least one food and at least one drink, the latter of which is stipulated to be a wine.

When the user selects a type of course, or an individual food that gets mapped to a type of course, the agent will consult that course definition for restrictions on the constituent food or wine. All such course types map back to this concept, like objects to their superclasses in object-oriented programming.

Suppose the user has selected pasta with fra diavolo, or perhaps pasta with spicy red sauce directly. The concept of such a food is defined elsewhere in the ontology. Furthermore, such courses must be a subclass of those with specific restrictions on the properties of their wines.
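Continuing the sketch above (same namespace declarations, omitted here), such a course class might be written as follows; again, the names are simplified stand-ins for the real ontology's terms:

  <owl:Class rdf:ID="MealCourse"/>
  <owl:ObjectProperty rdf:ID="hasDrink"/>
  <owl:Class rdf:ID="WineColor"/>
  <WineColor rdf:ID="Red"/>

  <!-- A spicy red-sauce pasta course: any drink served with it must be a wine
       whose colour is Red (body and flavour restrictions would be added the same way). -->
  <owl:Class rdf:ID="PastaWithSpicyRedSauceCourse">
    <rdfs:subClassOf rdf:resource="#MealCourse"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasDrink"/>
        <owl:allValuesFrom>
          <owl:Class>
            <owl:intersectionOf rdf:parseType="Collection">
              <owl:Class rdf:about="#Wine"/>
              <owl:Restriction>
                <owl:onProperty rdf:resource="#hasColor"/>
                <owl:hasValue rdf:resource="#Red"/>
              </owl:Restriction>
            </owl:intersectionOf>
          </owl:Class>
        </owl:allValuesFrom>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>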

One wine that matches the above restrictions is the Pauillac. This individual wine is simply defined as a Pauillac whose maker is Chateau Lafite Rothschild. Together with other statements in the ontology, this allows the reasoner (discussed in an earlier post) to deduce many additional facts: that this is a Medoc wine from Bordeaux, in France, and that it is red, to name a few.

The concept of a Pauillac specifies that all such wines feature full bodies and strong flavors and are made entirely from cabernet sauvignon grapes. Further, Pauillacs are a particular subset of Medocs, distinguished by their origin in the Pauillac region. It is through this additional subclass relationship that Pauillacs are defined elsewhere as red and dry.
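A sketch of the Pauillac itself, again continuing the same fragment (the real ontology's names and axioms differ in detail):

  <owl:Class rdf:ID="Medoc"/>
  <owl:Class rdf:ID="WineBody"/>
  <WineBody rdf:ID="Full"/>
  <Winery rdf:ID="ChateauLafiteRothschild"/>

  <!-- Pauillacs are Medocs that are full-bodied (the strong-flavour and
       cabernet sauvignon restrictions would be written the same way). -->
  <owl:Class rdf:ID="Pauillac">
    <rdfs:subClassOf rdf:resource="#Medoc"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasBody"/>
        <owl:hasValue rdf:resource="#Full"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>

  <!-- The individual wine: everything else (red, dry, full-bodied, from Bordeaux)
       is deduced by the reasoner from the class definitions, not asserted here. -->
  <Pauillac rdf:ID="ChateauLafiteRothschildPauillac">
    <hasMaker rdf:resource="#ChateauLafiteRothschild"/>
  </Pauillac>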

Following the above example through the ontology reveals a straightforward logical path for pairing the Pauillac with the selected course. Because these items were specified in a standardized, machine-readable format, it is an equally straightforward task for any compliant automated reasoner.

Why use an ontology?

The functionality provided by the wine agent is not unlike that which could be provided by a simple look-up table. Indeed, food/wine pairings are traditionally published in some form of tabular chart where marks appear at the intersections of columns and rows representing compatible varieties of food and wine. The wine agent demonstrates that at least this simple task can be accomplished with semantic markup technology, but what about more complicated applications like, for example, those required by electronic health records (EHR)? For now, I'll postpone any discussion of EHR and stick with this food-wine pairing app.

Suppose that, if not the entire web, then at least some number of cooperating parties were using semantic markup to participate in this project. Rather than the traditional approach of trying to build one enormous database of foods and wines, the definitions would be distributed across the participating parties. A restaurant or retailer offering an on-line menu could mark each food item with standardized machine-readable definitions. Similarly, a wine retailer could mark its wines according to the definitions exemplified above.

Such markings would benefit from well-known advantages of ontologies. For instance, through the subclassing, adding a new Pauillac to the inventory would not require wine.com (the source of the GUI shown in Figure III) to mark all of the wine's properties; it would just be another Pauillac as specified in the example, plus any differentiating features. But more importantly in terms of software agents, all the markings would be machine readable, and could be handled by systems from any organization.

Rather than relying on a human user to select a food or food type, the agent could crawl the web for foods marked within the wines namespace and pre-compile suitable pairings. This is where Web services (discussed in earlier posts) come in.

Footnote

The notion of agent-based computing has been adopted enthusiastically in the financial trading community, where autonomous market trading agents have been said to outperform human commodity traders by 7%.

Machines can monitor stock market movements much more quickly than humans, and if you can encode the kinds of rules that you want, then it is not unreasonable to imagine that computational traders will be able to outperform humans.

I commented on high-speed trading in my August 3 post.

Monday, August 10, 2009

SNOMED, OWL, Ontologies, Protege, etc. Hardware and Software Issues.

In my last two posts, I rather glibly talked about SNOMED, OWL, ontologies, Protege, etc. without mentioning any hardware or software issues that sometimes arise.

For most of us, these matters are best understood from our own "trial and error" experiences. Nonetheless, I've added links to a couple of informative anecdotes below: in one, the account of someone computing successfully with 8 dual CPUs and 15 GB of RAM; in the other, the account of someone computing unsuccessfully with far fewer resources.

Link 1: Modeling Massive Ontologies (SNOMED) at Kaiser

Link 2: SNOMED OWL On Protege

Footnote 1: Some large OWL files will not load into Protege on a 32-bit computer when doing so requires more than 4 GB of RAM, the theoretical addressing limit of a 32-bit system. To get around this limitation, a 64-bit machine is needed; and because Protege is a Java application, you also need a 64-bit Java virtual machine with a correspondingly larger heap (set with the JVM's -Xmx option). Fortunately, the Linux, Windows, and Mac operating systems are all available in 64-bit versions.

Footnote 2: The entire computing industry is moving from 32-bit to 64-bit technology, and it’s easy to see why. While many of today’s computers can hold far more than 4GB of physical memory, the 32-bit applications that run on them can address only 4GB of RAM at a time. 64-bit computing shatters that barrier by enabling applications to address a theoretical 16 billion gigabytes of memory, or 16 exabytes. 64-bit machines can also crunch twice the data per clock cycle, which can dramatically speed up numeric calculations and other tasks.

Friday, August 7, 2009

Ontologies, Ontology Languages, and Semantic Interoperability -- Segue to Electronic Health Records (EHR)


This post serves as a transition between my previous one on OWL and ontologies and my upcoming ones on electronic health records (EHR) - past, present and future.

OWL is the latest standard in ontology languages from the World Wide Web Consortium (W3C) - it is built on top of RDF (i.e., OWL semantically extends RDF).

These two languages are explained in

http://www.co-ode.org/resources/tutorials/intro/slides/OWLFoundationsSlides.pdf

As an example of an ontology that's central to building an interoperable EHR system, I'll cite the Systematized Nomenclature of Medicine (SNOMED) ontology, which is used in more than 50 countries around the world:




The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) Ontology includes a Core terminology of over 364,000 health care concepts with unique meanings and formal logic-based definitions organized into hierarchies. As of January 2005, the fully populated table with unique descriptions for each concept contained more than 984,000 descriptions. Approximately 1.45 million semantic relationships exist to enable reliability and consistency of data retrieval. SNOMED CT is available in English, Spanish and German language editions.

What is its structure?

Core content includes the concepts table, descriptions table, relationships table, history table, an ICD-9-CM mapping, and the Technical Reference Guide. And, it can map to other medical terminologies and classification systems already in use.

SNOMED is meant to be complementary to LOINC (Logical Observation Identifiers Names and Codes), another clinical terminology important for laboratory test orders and results.

See http://www.ihtsdo.org/ for a good deal more information on SNOMED.

For advanced IT readers

The DL that SNOMED uses is much less expressive than OWL. As a result, even though you can mechanically translate SNOMED into OWL, the resulting OWL ontology will be very unlike anything an OWL author would create from scratch, and it might also be a challenge to classify successfully under an OWL reasoner without a lot of manual editing.

Furthermore, even then it would be of limited value in supporting reasoning over OWL instances, as many kinds of assertions that would be routine in an OWL ontology (such as disjointness axioms and explicit domain/range constraints) do not exist in native SNOMED and would have to be created.
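For instance, axioms of the kind sketched below are taken for granted by OWL authors but would have to be added by hand to a mechanically translated SNOMED. The class and property names here are illustrative only, not actual SNOMED terms or codes:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://www.example.org/ehr#"
         xml:base="http://www.example.org/ehr">

  <owl:Class rdf:ID="ClinicalFinding"/>
  <owl:Class rdf:ID="Procedure">
    <!-- Disjointness: nothing can be both a procedure and a clinical finding. -->
    <owl:disjointWith rdf:resource="#ClinicalFinding"/>
  </owl:Class>

  <owl:Class rdf:ID="BodyStructure"/>
  <owl:ObjectProperty rdf:ID="findingSite">
    <!-- Explicit domain and range constraints. -->
    <rdfs:domain rdf:resource="#ClinicalFinding"/>
    <rdfs:range rdf:resource="#BodyStructure"/>
  </owl:ObjectProperty>
</rdf:RDF>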

One opinion holds that creating your own OWL ontology and using SNOMED as a mapping target leverages SNOMED in a more useful way for most conceivable applications.

Finally, for a good deal more on the topics covered so far, consider

Clinical Decision Support Systems
Theory and Practice
Series: Health Informatics
Berner, Eta S. (Ed.)
2nd ed., 2007
ISBN: 978-0-387-33914-6

Thursday, August 6, 2009

Semantic Interoperability -- Part III Ontologies -- Prelude to Electronic Health Records (EHR)

This post is a continuation of the introduction to ontologies that I posted on August 4 and July 20.

Readers new to this subject might also find the 2 ½ minute video What Is Web 3.0, Anyway? worthwhile.

An ontology is an explicit specification of a conceptualization (defined earlier), that is to say, a formal representation of a knowledge domain. Usually an ontology consists of: (i) classes, which represent the concepts of the domain (for example, in an ontology about the domain of Telecommunications, as in the listing below, a possible concept could be "Phone"); (ii) properties, which establish relationships between the concepts (for example, a "Phone" concept could have "Company" as a property); (iii) instances, which are concrete examples associated with each concept (for example, "Siemens" could be an instance of the "Company" concept); and (iv) axioms, which are restrictions on certain elements of the ontology, needed to specify the knowledge domain completely (for example, the Telecommunications ontology could define a restriction indicating that in this domain a "Phone" must always have at least one "Company").

Ontologies can be stored using XML-based markup languages such as OWL (the Web Ontology Language), which facilitates their reuse in different semantic platforms to annotate and search resources. These languages allow us to define tags to represent the different ontology elements. The listing below shows an extract of an OWL file containing the Telecommunications example ontology, created using the Protégé tool. As you can observe, in this language the concepts are delimited by the Class tag, the properties by the ObjectProperty tag, the instances by the tag corresponding to the associated class (in the example, the class Company has the instance "Siemens"), and the axioms by tags such as Restriction or subClassOf (the latter is used in the example to represent that "Cellphone" is a type of "Phone").



Content of an OWL file - a fragment of an Ontology about Telecommunications
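In case the screenshot is hard to read, the fragment below is a hand-written approximation of such a file. The element names follow the description above, but the URIs and details are made up, so it is a sketch rather than the exact listing shown in the image:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://www.example.org/telecom#"
         xml:base="http://www.example.org/telecom">

  <!-- Concepts (classes). -->
  <owl:Class rdf:ID="Phone"/>
  <owl:Class rdf:ID="Cellphone">
    <!-- Axiom: "Cellphone" is a type of "Phone". -->
    <rdfs:subClassOf rdf:resource="#Phone"/>
  </owl:Class>
  <owl:Class rdf:ID="Company"/>

  <!-- Property relating the two concepts. -->
  <owl:ObjectProperty rdf:ID="hasCompany">
    <rdfs:domain rdf:resource="#Phone"/>
    <rdfs:range rdf:resource="#Company"/>
  </owl:ObjectProperty>

  <!-- Axiom: every Phone must have at least one Company. -->
  <owl:Class rdf:about="#Phone">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasCompany"/>
        <owl:minCardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:minCardinality>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>

  <!-- Instance: Siemens is a Company. -->
  <Company rdf:ID="Siemens"/>
</rdf:RDF>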

Today one of the main uses of ontologies is to support the Semantic Web (aka Web 3.0), especially for annotating Web resources and facilitating the localization of these annotated resources when users formulate queries to semantic search engines. For this purpose, the Telecommunications example ontology includes two annotations, as instances of the "Phone" and "Cellphone" classes, which correspond to two documents ("Gigaset3015Classic.pdf" and "MobileC55.pdf", respectively) located on a hypothetical Web server ("http://www.telecosiemens.com").

The Reality

Researchers have written much about the potential benefits of using ontologies, and most of us regard them as central building blocks of the Semantic Web and other semantic systems. Unfortunately, the number and quality of actual, "non-toy" ontologies available on the Web today are remarkably low. This implies that the Semantic Web community has yet to build practically useful ontologies for a lot of relevant domains in order to make the Semantic Web a reality.

In striking contrast to the data within a stand-alone document, publications have yet to benefit from the opportunities offered by cyber infrastructure. While the means of distributing publications has vastly improved, publishers have done little else to capitalize on the electronic medium. In particular, semantic information describing the content of these publications is generally sorely lacking, as is the integration of this information with data in public repositories.

The Reasoner

One of the key features of ontologies is that they can be processed by a reasoner. One of the main services offered by a reasoner is to test whether or not one class is a subclass of another class. By performing such tests on all of the classes in an ontology it is possible for a reasoner to compute the inferred ontology class hierarchy. Another standard service that is offered by reasoners is consistency checking. Based on the description (conditions) of a class, the reasoner can check whether or not it is possible for the class to have any instances. A class is deemed to be inconsistent if it cannot possibly have any instances.
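Here is a made-up miniature illustrating consistency checking (the class names are mine, chosen only to make the clash obvious):

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://www.example.org/demo#"
         xml:base="http://www.example.org/demo">

  <owl:Class rdf:ID="Food"/>
  <owl:Class rdf:ID="Meat">
    <rdfs:subClassOf rdf:resource="#Food"/>
  </owl:Class>
  <owl:Class rdf:ID="Vegetable">
    <rdfs:subClassOf rdf:resource="#Food"/>
    <owl:disjointWith rdf:resource="#Meat"/>
  </owl:Class>

  <!-- Consistency checking: a class described as both Meat and Vegetable
       can have no instances, so the reasoner reports it as inconsistent
       (in Protégé it shows up in red under Nothing in the inferred hierarchy). -->
  <owl:Class rdf:ID="MeatAndVegetable">
    <rdfs:subClassOf rdf:resource="#Meat"/>
    <rdfs:subClassOf rdf:resource="#Vegetable"/>
  </owl:Class>
</rdf:RDF>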

Reasoning with Protégé 4.0

Reasoning with your ontology is one of the most commonly performed activities, and the ontology editor Protege 4.0 comes with two built-in reasoners, FaCT++ and Pellet. To classify your ontology, open the Reasoner menu and select one of the available reasoners. FaCT++ will automatically classify your ontology; Pellet requires that you select Classify. Once you have done this, the class hierarchy on the Entities tab changes to show the inferred class hierarchy. Unsatisfiable classes appear in red under Nothing, and everything else appears in the hierarchy under its inferred superclasses. The asserted class hierarchy is still available, stacked alongside the inferred one, as shown in the next screenshot.



Screenshot of an inferred class hierarchy

Instructions for getting started with the OWL editor in Protege 4:
http://protegewiki.stanford.edu/index.php/Protege4GettingStarted

A Practical Guide To Building OWL Ontologies:
http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial-p4.0.pdf

The Microsoft Word Add-in For Ontology Recognition – An Introduction

There are other tools (the Microsoft Word Add-in for Ontology Recognition, to name one) that might be suitable for your needs. It's an MS Word 2007 add-in that enables the annotation of Word documents based on terms that appear in ontologies.

This Word Add-in For Ontology Recognition is a free Microsoft download. With it, as shown in the figures below, you select and then download one or more ontologies which are thereafter available automatically from within your Word document.

The Microsoft Word Add-in For Ontology Recognition – An Overview

This add-in enables authors who use Microsoft Word for content creation to incorporate semantic knowledge into the content. This add-in should simplify the development and validation of ontologies, by making ontologies more accessible to a wide audience of authors and by enabling semantic content to be integrated in the authoring experience, capturing the author’s intent and knowledge at the source, and facilitating downstream discoverability.

The goal of the add-in is to assist authors in writing a manuscript that is easily integrated with existing and pending electronic resources. The major aims of this project are to add semantic information as XML mark-up to the manuscript using ontologies and controlled vocabularies (from the National Center for Biomedical Ontology) and identifiers from major biological databases, and to integrate manuscript content with existing public data repositories.

As part of the publishing workflow and archiving process, the terms added by the add-in, which carry the semantic information, can be extracted from Word files, as they are stored as custom XML tags within the content. The semantic knowledge can then be preserved as the document is converted to other formats, such as HTML or the XML format from the National Library of Medicine, which is commonly used for archiving.

The full benefit of semantically rich content will come from an end-to-end approach to preserving semantics and metadata through the publishing pipeline: capturing knowledge from the subject experts (the authors), preserving that knowledge when the work is published, and making it available to search engines and to the people consuming the content.

The Microsoft Word Add-in For Ontology Recognition – Screen Shots


The Word Add-in For Ontology Recognition User’s Guide for the Semantic Mark-up and XML Formatting of Scholarly Articles is a good place to start for further information on this tool.

Semantic Tagging

When a word or set of words is tagged by the add-in, the word is wrapped with tags that associate it with the ontology term. The example below shows the word "disease" being tagged with a term from the Human Disease Ontology.


If the Word file (docx) is to be transformed to other formats, this set of tags would need to be processed using XSLT or other technologies. Note that there are other CodePlex projects available that implement transformations of docx files to other formats, which one can use as a starting point.

Ontology Add-in for Microsoft Office Word 2007 Video

Tuesday, August 4, 2009

Electronic Health Records (EHR) – Semantic Interoperability – Part 2


Before discussing the connection(s) between electronic health records (EHR) and semantic interoperability directly, I'd like to spend a little time talking about Semantic Web interoperability (not necessarily the same thing as the interoperability of present and future EHR systems).

In general, semantics is the study of meaning. Semantic Web (also called Web 3.0) technologies help separate meanings from data, document content, or application code, using technologies based on open standards. If a computer understands the semantics of a document, it doesn't just interpret the series of characters that make up that document: it understands the document's meaning. See my July 20 post for a brief introduction to this material.



Benefits of the Semantic Web to the World Wide Web

The World Wide Web is the biggest repository of information ever created, with growing contents in various languages and fields of knowledge. Search engines might help you find content containing specific words, but that content might not be exactly what you want. What is lacking? The search is based on the contents of pages and not the semantic meaning of the page's contents or information about the page.

Once the Semantic Web exists, it can provide the ability to tag all content on the Web, describe what each piece of information is about and give semantic meaning to the content item. Thus, search engines become more effective than they are now, and users can find the precise information they are hunting. Organizations that provide various services can tag those services with meaning (service-oriented architectures -- SOA -- are discussed in my June 20 post and mentioned again below, this time in the context of semantics); using Web-based software agents, you can dynamically find these services on the fly and use them to your benefit or in collaboration with other services (see http://www.oracle.com/technology/pub/articles/matjaz_bpel1.html for a discussion of the orchestration and choreography of Web services).

Ontologies

The use of words to refer to concepts (the meanings of the words used) is very sensitive to the context and the purpose of these words.

An ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It may be used to reason about the properties of that domain, and may be used to define the domain. The word “reason” is an important part of this story and will come up again in a later post, when I talk about Protégé, a free, open source ontology editor and knowledge-base framework.

A domain ontology (or domain-specific ontology) models a specific domain, or part of the world (e.g., healthcare, banking or politics). It represents the particular meanings of terms as they apply to that domain. For example the word card has many different meanings. An ontology about the domain of poker would model the "playing card" meaning of the word, while an ontology about the domain of computer hardware would model the "video card" meaning.

For each domain of human knowledge, an ontology must be constructed, partly by hand and partly with the aid of dialog-driven ontology construction tools (to be discussed in an upcoming post).

Ontologies are not knowledge, nor are they information. They are meta-information: information about information. In the context of the Semantic Web, they encode, using an ontology language (to be discussed in an upcoming post), the relationships between the various terms within the information. Those relationships, which may be thought of as axioms (basic assumptions), together with the rules governing the inference process, both enable and constrain the interpretation (and well-formed use) of those terms by Info Agents (to be discussed in an upcoming post), allowing them to reason their way to new conclusions from existing information, i.e., to think. In other words, theorems (formal deductive propositions that are provable from the axioms and the rules of inference) may be generated by the software, thus allowing formal deductive reasoning at the machine level. And since an ontology, as described here, is a statement of logical theory, two or more independent Info Agents processing the same domain-specific ontology will be able to collaborate and deduce an answer to a query without being driven by the same software.
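A toy example of such a "theorem" (the names are mine, purely for illustration): from one axiom declaring a property transitive and two asserted facts, any compliant reasoner derives a third fact that is written nowhere in the file, and two independent agents loading the same file will both derive it.

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://www.example.org/geo#"
         xml:base="http://www.example.org/geo">

  <owl:Class rdf:ID="Region"/>

  <!-- Axiom: locatedIn is transitive. -->
  <owl:TransitiveProperty rdf:ID="locatedIn"/>

  <!-- Asserted facts: Munich is located in Bavaria, Bavaria in Germany. -->
  <Region rdf:ID="Munich">
    <locatedIn>
      <Region rdf:ID="Bavaria">
        <locatedIn>
          <Region rdf:ID="Germany"/>
        </locatedIn>
      </Region>
    </locatedIn>
  </Region>

  <!-- Theorem (never asserted): Munich is located in Germany. -->
</rdf:RDF>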

When an organization adopts an ontology-driven approach, it can capture and represent its total knowledge in a language-neutral form and deploy that knowledge in a central repository that provides the same semantic meaning across applications.

Semantics are the future of service-oriented integration

To properly model and manage a service-oriented architecture (SOA), enterprise architects must maintain active representations of the services available to the enterprise. Specifically, to discover and organize their services, the architects must use best practices that model and assemble services using metadata, encapsulate business logic in metadata for dynamic binding, and manage with metadata. Ontologies provide a very powerful and flexible way to aggregate, visualize, and normalize this service metadata layer.

Note: When delivering services as part of a service inventory, there is a constant risk that services will be created with overlapping functional boundaries, making it difficult to enable widespread reuse. Normalization addresses this problem: when services are delivered with complementary and well-aligned boundaries, normalization across the inventory is attained, and the number of required services is reduced.

Semantic technologies provide an abstraction layer above existing IT technologies, one that enables the bridging and interconnection of data, content, and processes across business and IT silos.

For advanced IT readers

For a rather technical discussion of this topic, see the video Artificial Neural Network based Techniques for Semantic Data Interoperability below, and parts of my article Using Neural Networks and OLAP Tools to Make Business Decisions, to which there is a link in the bibliography at the very bottom of this blog.





To be continued …



Monday, August 3, 2009

The game is sometimes rigged in favor of the house

High-speed trading: some institutions, including Goldman Sachs, have been using superfast computers to get the jump on other investors, buying or selling stocks a tiny fraction of a second before anyone else can react. Profits from high-frequency trading are one reason Goldman is earning record profits and likely to pay record bonuses.

And there’s a good case that such activities are actually harmful. For example, high-frequency trading probably degrades the stock market’s function, because it’s a kind of tax on investors who lack access to those superfast computers — which means that the money Goldman spends on those computers has a negative effect on national wealth.

So, what's this got to do with information technology in healthcare? Possibly, nothing. Remember, however, that one of the stated goals of EHR is to deliver better healthcare at lower cost.


I believe that we should keep in mind that technology - not high-speed computers per se - doesn't always benefit society. As I mention - without much elaboration - in my July 22 post, "EHR / EMR has the potential to facilitate the execution of today's healthcare scams. So, for the time being, we might well have a reason to temper our enthusiasm for the inevitable computerization of our still-largely-paper-bound healthcare records systems."

Saturday, August 1, 2009

Metcalfe's law vis-à-vis the value of semantic interoperability

A new economic model

Metcalfe's law states that the value of a network is proportional to the square of the number of connected users of the system (n²).

Metcalfe's law characterizes many of the network effects of communication technologies and networks such as the Internet, social networking, and the World Wide Web. It is related to the fact that the number of unique connections in a network of n nodes can be expressed mathematically as n(n − 1)/2, which is asymptotically proportional to n².
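As a quick worked check of that claim (my own arithmetic, added here for illustration):

\[
  \text{links}(n) \;=\; \binom{n}{2} \;=\; \frac{n(n-1)}{2},
  \qquad
  \frac{\text{links}(n)}{n^2} \;\to\; \frac{1}{2} \quad \text{as } n \to \infty .
\]

\[
  \text{e.g. } n = 100:\quad \frac{100 \times 99}{2} = 4950 \;\approx\; \frac{100^2}{2} = 5000 .
\]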



The law has often been illustrated using the example of fax machines: a single fax machine is useless, but the value of every fax machine increases with the total number of fax machines in the network, because the total number of people with whom each user may send and receive documents increases.

Metcalfe's law is more of a heuristic or metaphor than an iron-clad empirical rule. In addition to the difficulty of quantifying the "value" of a network, the mathematical justification measures only the potential number of contacts, i.e., the technological side of a network. However, the social utility of a network depends on how many nodes are actually in meaningful contact. For instance, if Chinese-speaking and non-Chinese-speaking users don't understand each other, the utility of connections between the two groups is near zero, and the law has to be calculated for the two sub-networks separately.

When considering electronic health record (EHR) interoperability, two nodes are in contact in a meaningful way if the nodes themselves, not just human beings sitting at those nodes, can understand the content of a message from the other node in an unambiguous way. In other words, semantic interoperability is what counts.

Apropos of EHR, Rod Beckstrom, the recently appointed president of the Internet Corporation for Assigned Names and Numbers (ICANN), used his address to the Black Hat USA 2009 conference to propose a new economic model for valuing computer networks and the internet.

"Who cares how many nodes there are?" Beckstrom said. "If you look at a value of the network, focus on the transactions. The value of the network equals the net value added to each user's transactions, summed for all users."

For example, some networks grow the number of users but become less valuable since the value of their transactions is so small.

Microsoft chairman Bill Gates dropping his Facebook account in July was a case in point, Beckstrom said. The number of 'friends' became so great that the network lost its value.