Failed by design

I recently wrote about an Indian nuclear submarine that was put out of action for 10 months because someone failed to properly close a hatch. Or, more accurately, because no one built in a mechanism that would prevent it from being unknowingly submerged with an open hatch.

Far more seriously, but on a similar note, last weekend in Hawaii, residents and visitors received texts telling them that the state was about to come under missile attack and that they should take cover. It turned out that a government employee had accidentally hit the wrong button, thereby unleashing an emergency communications protocol that warned of impending danger. It took a full 38 minutes to notify the population that this had been a false alarm.

During that time people did all they could to prepare. That included a father having to decide which of his children he would go and rescue, on the basis that he didn’t have time to rescue all of them. A heartbreaking story with potential longer-term consequences, as the “abandoned” kids ask themselves why he didn’t choose to rescue them.

Even more worryingly, the text message was a small part of a wider warning system that unleashed a series of communications protocols. In a world of heightened tension with North Korea and Russia, it’s really not good for people in a position to take retaliatory action to be given the impression that the US is under attack.

This NYTimes article on how a South Korean passenger jet came to be shot down by the Soviets in the 1980s makes chilling reading. It’s really not that far-fetched to assume that someone might think they were doing the right thing by unleashing a “fire and fury” response to what has all the hallmarks of an act of war.

The idea that someone could accidentally press a button with such potentially serious consequences is disturbing. So I did a little research; surely there must be some form of failsafe mechanism?

It’s actually much, much worse than you might think. This tweet, from what I understand is a well-respected investigative news source (interestingly, one owned by the founder of eBay), shows the interface used to set off the alert:

Yes, you read that right. There isn’t a button. It’s a webpage with hyperlinks. The one the operator was supposed to use had the word DRILL in front of it. The one he used didn’t. It’s that simple.

There are so many things to criticise here. It’s a master class in utterly appalling design.

I had initially questioned why it took 38 minutes to give the “all clear”. But having seen the interface used to set off the alert, I’m willing to believe that resetting it is probably not as easy as it ought to be.

If we allow people to take decisions that have serious consequences, then we need to think about the manner in which we allow them to take those decisions. It’s very easy to blame the individual responsible; but if it is too easy to do the wrong thing, then the architect of the system that allowed the poor decision to be made so easily should arguably also bear some responsibility. How can it be harder to take money out of an ATM than it is to unleash an emergency alert?

That’s slightly unfair: it turns out that there was an “are you sure?” prompt, to which the operator clicked “yes”. Which isn’t really that surprising. After all, you’ve selected the option you think is right, so being asked to confirm it is unlikely to yield a different result. Unless the consequences of that choice are made crystal clear. Which I suspect didn’t happen.
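There is a well-known design pattern that addresses exactly this problem: instead of a reflexive yes/no prompt, make the operator re-type the name of the action (the way some websites make you type a repository’s name before deleting it). Below is a minimal sketch of the idea in Python; the function names, messages and the “Missile Alert” label are my own hypothetical illustrations, not how the Hawaii system actually works.

```python
def confirm_destructive_action(action_name: str, typed_confirmation: str) -> bool:
    """Require the operator to re-type the exact action name.

    Unlike a yes/no dialog, this forces the operator to consciously
    restate what they are about to do, so a mis-click cannot be
    confirmed by reflex.
    """
    return typed_confirmation.strip() == action_name


def trigger_alert(action_name: str, typed_confirmation: str, is_drill: bool) -> str:
    """Toy alert trigger: drills are the safe default, live alerts need the stronger check."""
    if is_drill:
        return f"DRILL: {action_name} (exercise only)"
    if not confirm_destructive_action(action_name, typed_confirmation):
        return "ABORTED: confirmation did not match"
    return f"LIVE ALERT SENT: {action_name}"
```

The point of the design is asymmetry: the routine path (the drill) stays one click, while the rare, catastrophic path demands deliberate effort, so doing the wrong thing is strictly harder than doing the right one.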

Fortunately it seems as though lessons are being learned from this. As this article explains, the responsible individual is being protected and has been reassigned, and new controls are being introduced. In short, they are treating the incident as an opportunity to learn:

“Looking at the nature and cause of the error that led to those events, the deeper problem is not that someone made a mistake; it is that we made it too easy for a simple mistake to have very serious consequences,” Mr. Miyagi wrote. “The system should have been more robust, and I will not let an individual pay for a systemic problem.”

All too often we hear that “human error” is the cause of something going wrong. When we use that term, it’s important to consider where fault really lies. Don Norman, a user interface expert, rightly points to the part poor design plays (whilst plugging his rather good book!)

Norman coined the term “user-centred design” which is the concept of ensuring that when you design something, you think about the end user. If something just looks good or it’s technically brilliant but terrible to use, you’re going to get bad results and it won’t achieve its ultimate aim. Or to put it another way: if you want people to do the right thing, make it easy for them.


Submarinated salt

A (not so) small but interesting footnote to my article on Human Risk in the Military comes from India, where a submarine has been out of action for 10 months because it submerged without all of its hatches closed. Apparently submarines don’t do quite so well when they’ve got salt water on the inside.

Of course there’s the obvious point that someone should’ve closed it. But I’d also ask why anyone designed something that didn’t mitigate a very simple and obvious potential risk. You’d think they’d have some form of alarm that would prevent it from submerging with a hatch open. I’ve seen more sophisticated warning systems on cars…

Full article here


Human Risk Military Edition

When I discovered that one of my favourite podcasts, This American Life, had released an episode entitled Human Error In Volatile Situations, I thought it might be interesting. After all, it sounds remarkably like a story of Human Risk. Which it is. In an episode that is nothing short of astonishing, I learned more than I ever wanted to know about risk in the US military.

The episode covers two separate issues: the first is how a very minor error caused an explosion that could’ve detonated a nuclear weapon. The second is sleep deprivation in the US Navy (caused in part by what might best be described as a macho culture, but equally by a lack of resources) and the impact it is having on safety.

What’s remarkable about the episode is that it reveals quite how fragile things are. Near misses, in an environment where the consequence of an actual miss would be immense, seem to be commonplace.

This is worrying to say the least. One of the “givens” that we all live our lives by is that someone, somewhere has taken responsibility for managing safety. It’s what allows us to nonchalantly get on planes because we know that air traffic control exists, or buy food at the supermarket because we know that there are food safety standards that mean what we eat is fit for human consumption. In most cases we probably don’t know what the relevant authority is called or how it goes about its business. But we know that it exists. Or at least we work on the presumption that it does.

You’d think that the military would have even tighter controls around their risks: partly because the stakes are bigger when you’re dealing with weaponry that can cause real damage, but also because the knock-on consequences of an error are pretty serious. Firing off a nuclear missile by mistake would be really bad. But the potential response, whether you’ve accidentally fired it at another country or just set it off in your own, could be even more devastating. It just doesn’t bear thinking about.

It’s one of the reasons a lot of veterans choose risk management as a career and do very well at it: the military understands real risk, and it’s ingrained in their culture. What is apparent from this podcast is that, at least in these two examples, they don’t always manage risk as well as we’d expect.

You can find the episode right here. If it’s piqued your interest in any way, then I’d also thoroughly recommend a book called On The Psychology Of Military Incompetence by Professor Norman Dixon. First published in 1976, its continuing relevance is demonstrated by the fact that it remains virtually compulsory reading at many military staff colleges. Though I’m not sure whether those responsible for the situations so vividly depicted in the podcast have had the benefit of reading it.

In his book, Dixon covers similar ground, focussing on famous events from history. Here’s the summary from the book’s cover:

The Crimea, the Boer War, the Somme, Tobruk, Singapore, Pearl Harbour, Arnhem, the Bay of Pigs: just some of the milestones in a century of military incompetence, of costly mishaps and tragic blunders.

Are such blunders simple accidents – as the ‘bloody fool’ theory has it – or are they an inexorable result of the requirements of the military system?

In this superb and controversial book Professor Dixon examines these and other mistakes and relates them to the social psychology of military organization and to the personalities of some eminent military commanders. His conclusions are both startling and disturbing.

Both the book and the podcast are very engaging and thoroughly recommended. Perhaps unsurprisingly given the nature of what they do and the pressures they’re under, as Human Risk stories go, the experiences of the armed forces are right up there.


Human Risk at the movies

A Business Week article entitled “How Hollywood insures against actors behaving badly” caught my eye over the festive period. It explains how, following the allegations made against Harvey Weinstein and others, film studios are having to make contingency plans for when stars’ reputations become so tarnished that they threaten the prospects of a yet-to-be-released movie.

What’s interesting is that, unlike other situations where an individual’s bad behaviour has such a negative impact on their employer that action is required, once a movie is released you’re stuck with the actors in it.

If a TV or radio host misbehaves, you can take them off air. Which is what happened with NBC’s Matt Lauer after allegations of sexual harassment forced his employer to act. Netflix reacted similarly to the scandal surrounding Kevin Spacey by taking his character out of all future series of House of Cards. Arguably timing played in their favour, as the news didn’t break just as they were about to release a new series.

It’s a well-worn tactic deployed by corporates when their senior management become more of a liability than an asset; take Uber’s founder and former CEO Travis Kalanick, who was removed by investors once it became clear that his behaviour, and the culture within the firm, had a downside.

New management can sometimes rescue a firm’s reputation. However, if you’re about to release a movie and one of your stars’ reputations becomes tarnished, then you probably have little choice but to reshoot it without them or write the project off. Both are expensive, though throwing in more money to potentially achieve some form of return (arguably “good” money after “bad”) probably makes sense given the economics of the film industry. Which is what the article explains happened to a movie starring none other than Kevin Spacey: his scenes were re-shot entirely with another actor, at some cost.

Hollywood is having to adjust to this dynamic: after all, you can’t predict with certainty which of your stars might fall from grace, the excuse of “it happened in another era” long having worn thin. One way is to redesign completion bonds: in essence, insurance policies designed to bridge funding shortfalls when films can’t be completed within budget.

Another is to seek redress from the actors themselves. This is akin to the mechanisms which financial services regulators put in place after the crisis to ensure that bankers’ incentives were better aligned with those of shareholders. Many countries now require a proportion of bonuses to be deferred, so that there is no payment if it subsequently transpires that things that looked profitable in the short term turn out not to be in the longer term.


One idea floated by Andrew Bailey, a former head of one of the UK financial services regulators and the current head of the other, was “malus”: a negative bonus. In effect, regulators would require repayment of bonuses in circumstances where, with the benefit of hindsight, they shouldn’t have been paid out in the first place. That’s hard to do when people can spend what you pay them. Which is why deferrals, and a tougher regime which in extremis could see people go to jail, were deemed a more practical solution.
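The mechanics of deferral-with-malus are easy to illustrate with a toy model. The sketch below is my own simplified illustration (the function name, the equal-tranche vesting schedule and the all-or-nothing forfeiture rule are assumptions for clarity, not how any actual regulatory regime works): part of the bonus is paid up front, the rest vests in annual tranches, and if losses crystallise, the unvested tranches are cancelled rather than clawed back from money already spent.

```python
def deferred_bonus_payout(bonus, deferral_rate, years, clawback_year=None):
    """Toy model of a deferred bonus with malus.

    A fraction `deferral_rate` of the bonus is held back and vests in
    equal annual tranches over `years`; if losses crystallise in
    `clawback_year`, the tranches from that year onward are forfeited.
    Returns the total amount actually paid out.
    """
    upfront = bonus * (1 - deferral_rate)
    tranche = bonus * deferral_rate / years
    paid = upfront
    for year in range(1, years + 1):
        if clawback_year is not None and year >= clawback_year:
            break  # malus: the remaining unvested tranches are cancelled
        paid += tranche
    return paid
```

So with a 100 bonus, 60% deferred over three years, a problem surfacing in year two means the banker keeps the 40 upfront plus one 20 tranche, and forfeits the remaining 40. That is the practical advantage over pure clawback: the firm never has to recover money that has already been spent.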

The challenge that both Hollywood and banking have in common is that the behaviour of one individual can have an impact that goes way beyond their own pay packet. We’ve seen individual bankers take down entire firms (step forward, Nick Leeson), and one toxic movie star could sink an entire movie. So punishing the individual financially won’t necessarily adequately compensate other stakeholders. Which is why jail, or at least the threat of “you’ll never work in this industry again”, probably works as an ultimate deterrent in banking.

I suspect it will be a lot harder to find something workable in the film industry. For very obvious reasons, insurance tends to be forward-looking at the point at which it is underwritten. Pricing insurance against Human Risk is going to be challenging, especially when the risk may already have crystallised years ago.

Hollywood is going to need to be at its creative best to solve this one.