The poor performance of healthcare.gov

by Graham

Anybody in the USA who has not just landed from Mars will know what I am about to write about. For those of you outside of the USA, suffice it to say that a major government initiative’s web site is currently a train-wreck, with performance too poor to support more than a fraction of the expected number of visitors. This is leading to all sorts of what I term Light, Heat and Sound, and since we are dealing with real Politics (not just organizational politics), it is getting ugly.

I posted this earlier today on my personal Facebook page, and I am adding it here as well. I would also recommend that Twitter users watch Clay Shirky for insight. He has already made some telling comments.
I am reading that various parties to the healthcare.gov debacle are blaming ‘lack of testing’. This is BS. As some readers may know, I am an Enterprise Testing Consultant in my day job. Let me expound on this debacle in general terms.
It is clear to me from what little I have read about healthcare.gov that the site is currently unable to handle the volume of users and transaction rates. This is not a “testing problem”. It is probably not a “coding problem”. The root cause is potentially multi-faceted, involving some combination of poor design, poor decisions on runtime architecture, inadequate or poorly configured infrastructure, or inadequate connectivity bandwidth.
“Lack of testing” cannot be a root cause, and here’s why. Testing is not part of the process of building a solution, although it can be (and should be) integrated into that build process. It is a major part of the process of Validating a solution (i.e. did we build the right solution, does it function correctly as per its detailed specification, and does it meet or exceed user or leadership expectations?).
Testing is a Risk Management activity. If properly defined, planned and executed, Testing allows leadership to make informed decisions about whether and when to implement a solution, based on clearly defined functionality and performance criteria, the same criteria that should have been input to the solution definition and design process.
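To make the idea of a criteria-based decision concrete, here is a minimal Python sketch of how performance test results might be compared against agreed acceptance criteria to produce a go-live recommendation. Everything in it is an assumption for illustration: the `Criterion` class, the `go_live_report` function, the criterion names, and all of the numbers are invented, and none of them are actual healthcare.gov requirements or measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One acceptance criterion and the value measured during testing (hypothetical)."""
    name: str
    measured: float
    threshold: float
    higher_is_better: bool

    def passed(self) -> bool:
        # Throughput-style criteria must meet or exceed the threshold;
        # latency- and error-style criteria must stay at or below it.
        if self.higher_is_better:
            return self.measured >= self.threshold
        return self.measured <= self.threshold


def go_live_report(criteria):
    """Summarise which criteria failed, so decision-makers see the risk plainly."""
    failures = [c.name for c in criteria if not c.passed()]
    return {"recommend_go_live": not failures, "failures": failures}


# Invented numbers, purely for illustration.
results = [
    Criterion("concurrent_users_supported", 1_100, 50_000, higher_is_better=True),
    Criterion("p95_response_seconds", 8.0, 2.0, higher_is_better=False),
    Criterion("error_rate_percent", 0.5, 1.0, higher_is_better=False),
]

report = go_live_report(results)
print(report)
```

A report like this only informs. The decision to go live, and the accountability for it, still sits with leadership.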
Full and properly structured testing, involving Integration Testing and Performance/Throughput Testing, would have demonstrated the existence of performance and accessibility issues. In fact, whatever testing was done may already have shown up some of those issues. However, at some point, a leadership individual, a number of individuals, or even one or more committees (we are, after all, dealing with government) made the decision to Go Live with the site. If those decision-makers knew at the time that they made the decision to Go Live that the site could not meet the requirements, then they made a defective decision and the accountability for that decision is theirs.
Now…there are two possible ways in which the Testing organization could have contributed to the defective decision:
(1) The scope of Testing might have been inadequate to show up the underlying issues. This is common in high-pressure projects, and most of the time it is the result of leadership decisions to reduce the scope of Testing. Bluntly, if you are a Development or Delivery leader, having to cope with a shower of bugs from a Testing team is an emotionally wrenching event, on a par with somebody saying to you “your child is ugly”. Leaders have a natural desire to reduce the risk of Bad News from any direction, and they tend to regard Testing teams as sources of Bad News.
(2) If the information that the site was defective was known to the Testing team but that team withheld the information, then the Testing team is also culpable and accountable. However, no truly professional test manager or test director that I have ever worked with would try to withhold that sort of information. It would be like a doctor knowingly poisoning a patient. It’s a line you do not cross.
The underlying root causes of the poor performance of healthcare.gov in all probability have nothing much to do with Testing. The root causes lie elsewhere, with one or a combination of the design, architecture and infrastructure provisioning teams.
At this point I need to make a statement that may seem blindingly obvious to many of us in I.T. but which appears to not be easy to communicate to either legislators or electorates.

Requiring government bodies that commission an I.T. solution to nearly always accept the lowest bid will nearly always result in the delivery of a low-quality solution.


This seems to be a statement of the obvious, but history shows that key decision-makers cannot process it.
One final comment about the aftermath. In a situation like this, public hearings dominated by table-thumping, publicity-whoring Congressmen are unlikely to find the underlying answer. Nobody with half of a functioning brain is going to make any substantive statement in public to those kinds of vultures if they can avoid it. As a general rule, the sorts of questions asked by Congresscritters are not primarily designed to discover root causes. They are designed to Make Some Guys Look Bad and to engage in ritual public humiliations of The Bad Guys. In other words, what you get is a half-baked investigation lacking most of the checks and balances (such as due process and verification of facts and evidence) that would really yield a useful result.
Sometimes the right answer to a debacle like this is a tough but fair behind-the-scenes investigation. Public floggings may make some people feel better, but like all pissing contests, they never achieve anything.