1999
Perrow, Charles.
Normal accidents : living with high-risk technologies / Charles Perrow
1. Industrial accidents.
2. Technology--Risk assessment.
3. Accidents.
HD7262 P55 1999
363.1--dc21
The paper used in this publication meets the minimum requirements of
ANSI/NISO Z39.48-1992 (R1997) (permanence of paper)
http://pup.princeton.edu
p.x
Nick and Lisa are inheriting our radioactive, toxic, and explosive systems, and I am aware that we are passing on a planet more degraded than we inherited. So I dedicate this book to them. I hope they can do more than Edith and I have been able to do.
p.4
The problem is just something that never occurred to the designers.
p.4
This interacting tendency is a characteristic of a system, not a part or an operator; we will call it the “interactive complexity” of the system.
pp.4-5
But suppose the system is also “tightly coupled”, that is, processes happen very fast and can't be turned off, the failed parts cannot be isolated from other parts, or there is no other way to keep the production going safely. Then recovery from the initial disturbance is not possible; it will spread quickly and irretrievably for at least some time. Indeed, operator action or the safety systems may make it worse, since for a time it is not known what the problem really is.
Probably many production processes started out this way──complexly interactive and tightly coupled. But with experience, better designs, equipment, and procedures appeared, and the unsuspected interactions were avoided and the tight coupling reduced. This appears to have happened in the case of air traffic control, where interactive complexity and tight coupling have been reduced by better organization and “technological fixes”.
p.5
The odd term normal accident is meant to signal that, given the system characteristics, multiple and unexpected interactions of failures are inevitable. This is an expression of an integral characteristic of the system, not a statement of frequency. It is normal for us to die, but we only do it once. System accidents are uncommon, even rare; yet this is not all that reassuring, if they can produce catastrophes.
p.7
Though the failures were trivial in themselves, and each one had a backup system, or redundant path to tread if the main one were blocked, the failures became serious when they interacted. It is the interaction of the multiple failures that explains the accident.
p.8
DEPOSE components (for design, equipment, procedures, operators, supplies and materials, and environment).
p.8
That accident had its cause in the interactive nature of the world for us that morning and in its tight coupling──not in the discrete failures, which are to be expected and which are guarded against with backup systems. Most of the time we don't notice the inherent coupling in our world, because most of the time there are no failures, or the failures that occur do not interact. But all of a sudden, things that we did not realize could be linked (buses and generators, coffee and a loaned key) became linked. The system is suddenly more tightly coupled than we had realized.
p.9
In complex industrial, space, and military systems, the normal accident generally (not always) means that the interactions are not only unexpected, but are incomprehensible for some critical period of time. In part this is because in these human-machine systems the interactions literally cannot be seen. In part it is because, even if they are seen, they are not believed. As we shall find out and as Robert Jervis and Karl Weick have noted,3 seeing is not necessarily believing; sometimes, we must believe before we can see.
3. Robert Jervis, Perception and Misperception in International Politics (Princeton, N.J.: Princeton University Press, 1976); and Karl E. Weick, “Educational Organizations as Loosely Coupled Systems”, Administrative Science Quarterly 21:1 (March 1976): 1-19.
p.9
But if, as we shall see time and time again, the operator is confronted by unexpected and usually mysterious interactions among failures, saying that he or she should have zigged instead of zagged is possible only after the fact. Before the accident no one could know what was going on and what should have been done. Sometimes the errors are bizarre. We will encounter “noncollision course collisions”, for example, where ships that were about to pass in the night suddenly turn and ram each other. But careful inquiry suggests that the mariners had quite reasonable explanations for their actions; it is just that the interaction of small failures led them to construct quite erroneous worlds in their minds, and in this case these conflicting images led to collision.
p.9
Patient accident reconstruction reveals the banality and triviality behind most catastrophes.
p.9
Small beginnings all too often cause great events when the system uses a “transformation” process rather than an additive or fabricating one.
p.10
In many transformation systems we generally know what works, but sometimes do not know why. These systems are particularly vulnerable to small failures that “propagate” unexpectedly, because of complexity and tight coupling. We will examine other systems where there is less transformation and more fabrication or assembly, systems that process raw materials rather than change them.
p.73
1967, according to a perceptive and disturbing article by one of the editors of Nuclear Safety, E. W. Hagen.4
p.73
Hagen concludes that potential common-mode failures are “the result of adding complexity to system designs”. Ironically, in many cases, the complexity is added to reduce common-mode failures.
p.73
The addition of redundant components has been the main line of defense, but, as Hagen illustrates, also the main source of the failure. “To date, all proposed ‘fixes’ are for more of the same──more components and more complexity in system design.”5 The Rasmussen safety study relied upon a “PRA” (probabilistic risk assessment), finding that core melts and the like were virtually impossible.
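A toy calculation makes Hagen's irony concrete. This is a minimal sketch, in Python, with invented failure rates (not from the Rasmussen study): under the independence assumption each redundant channel multiplies the computed risk down, but a single common-mode source that disables every channel at once puts a floor under the number that no amount of added redundancy can lower.

    # Toy risk sketch (illustrative numbers only). Independent-failure math
    # rewards redundancy; a common-mode term that fails all channels at once
    # dominates it after a couple of channels.
    p_channel = 1e-3   # assumed per-demand failure probability of one channel
    p_common = 1e-5    # assumed probability that one shared cause fails them all

    for n in range(1, 5):
        p_independent = p_channel ** n         # all n channels fail on their own
        p_total = p_independent + p_common     # the common mode sets the floor
        print(f"{n} channel(s): independent {p_independent:.0e}, total {p_total:.1e}")

Past two channels the common-mode term dominates, so “more components and more complexity” buy nothing, which is Hagen's point.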
p.73
The main problem is complexity itself, Hagen argues.
p.73
Proximity and indirect information sources are two others. For a graphic illustration of complexity in the form of unanticipated interactions from these sources, let us go to a different system, marine transport.
p.73
A tanker, the Dauntless Colocotronis, traveling up the Mississippi River near New Orleans, grazed the top of a submerged wreck. The wreck had been improperly located on some of the charts. Furthermore, it was closer to the surface than the charts indicated because its depth had been determined when the channel was deep, and the listing for mariners had not been corrected for those seasons when the river would be lower. The wreck, only a foot or so too high for the tanker, sliced a wound in the bottom of the tanker, and the oil began to seep out. Unfortunately, the gash occurred just at the point where the tank adjoined the pump room. Some of the oil seeped into the pump room.
pp.73-74
At first it was probably slow-moving, but the heat in the room made it less viscous [sticky], allowing it to flow more rapidly, drawing more oil into the pump room. When enough accumulated it reached a packing “gland” around a shaft near the floor that penetrated into the engine room next to it. The oil now leaked into the engine room. In the hot engine room, evaporation was rapid, creating an explosive gas.
p.74
There is always a spark in an engine room, from motors and even engine parts striking one another. (In fact, nylon rope can create sparks sufficient to cause explosions in tanker holds.) When enough gas was produced through evaporation, it ignited, causing an explosion and fire.
p.74
In this example, an unanticipated connection between two independent, unrelated subsystems that happened to be in close proximity caused an interaction that was certainly not a planned, expected, linear one. The operators of the system had no way of knowing that the very slight jar to the ship made a gash that would supply flammable or explosive substances to the pump and engine rooms; nor of knowing, until several minutes had passed, that there was a fire in the engine room; or of the extensive amount of oil involved.
p.74
They tried to put it out with water hoses, perhaps not realizing it was an oil fire, but the water merely spread the oil and broke it up into finer, more flammable particles.
p.74
This accident went on for several hours, because the crew made various mistakes (e.g., a key fire door had been tied open and was not closed by an escaping crew member, allowing the fire to spread), and they did not have a clear notion of the location of the many rooms and closets and passageways in that part of the ship.
p.74
Later in the accident a trained fire crew boarded the ship equipped with protective devices and proper extinguishing equipment. But when they opened one door there was a series of explosions. They immediately closed the door and made no further attempt to put out the fire in that part of the ship, believing that the oil was producing the explosive gases.
p.74
Subsequently, however, it was determined that there were no explosive gases involved by this time; instead, three small empty gas tanks stored just inside the door, before being exchanged for full ones, had exploded when the heat expanded the residual amount of freon, oxygen, and acetylene in them.6
p.74
Thus even in the recovery stage of the accident a nonlinear interaction intervened to mislead the recovery attempt. It was a sensible place to store the tanks; who could imagine that there might be a fire, that firefighters would not be certain that there were not explosive mixtures behind the closed doors, and that these tanks would go off just as the firemen were entering the passageway? A requirement that empty tanks be stored elsewhere would hardly make the ship safer; who could know where the next fire might be?
p.75
These represent interactions that were not in our original design of our world, and interactions that we as “operators” could not anticipate or reasonably guard against. What distinguishes these interactions is that they were not designed into the system by anybody; no one intended them to be linked. They baffled us because we acted in terms of our own designs of a world that we expected to exist──but the world was different.
I will refer to these kinds of interactions as complex interactions, suggesting that there are branching paths, feedback loops, jumps from one linear sequence to another because of proximity and certain other features we will explore shortly. The connections are not only adjacent, serial ones, but can multiply as other parts or units or subsystems are reached.
p.75
The much more common interactions, the kind we intuitively try to construct because of their simplicity and comprehensibility, I will call linear interactions. Linear interactions overwhelmingly predominate in all systems. But even the most linear of systems will have at least one source of complex interactions, the environment, since it impinges upon many parts or units in the system. The environment alone can constitute a source of failure that is common for many components──a common-mode accident. But even the most complex systems of any size will be primarily made up of linear, planned, visible interactions.
p.77
Or the nonlinear interactions may be intended but rarely activated, and thus operators or designers forget about them. It could have been foreseen by a designer that demineralized water might sometimes be needed in containment, so she could have made it possible to line up various valves to provide the water. But if it is rarely used or is usually lined up before a crew enters the containment, the faucet creates no problems; it is not an expected production sequence but an infrequently used system possibility (in this case, for maintenance, not production).
p.77
Thus, complex interactions may be unintended ones, or intended but unfamiliar ones.
p.77
While linear interactions occur overwhelmingly in an anticipated production sequence, there is another kind of interaction that does not occur in a production sequence but is nevertheless obvious, and thus can be defended against. This is a visible interaction, even though it is outside of the normal sequence.
p.77
If the operator of a crane sees that the part of the crane that holds up the load (cable, ratchet, motor, hook) has failed and the load is about to drop on a boiler on the floor, he knows what this interaction will produce. There is nothing mysterious about the connection between these events, though they are certainly not in any expected production sequence.
p.77
Knowing there is a remote possibility of a large load falling, the operator usually tries to avoid passing it over the boiler, but he cannot always do so. Similarly, the designers may have considered covering the boiler or moving it, but estimated that the problems involved were too great, given the remote possibility of a large load falling at just that point.
p.78
But the production of, say, pharmaceuticals or F-16 fighter planes is anything but simple, though it is linear.
p.78
Another warning is in order. Since linear interactions predominate in all systems, and even the most linear systems can occasionally have complex interactions, systems must be characterized in terms of the degree of either quality. It is not a matter of dichotomies. Furthermore, systems are not linear or complex, strictly speaking, only their interactions are. Even here we must recall that linear systems have very few complex interactions, while complex ones have more than linear ones, but complex interactions are still few in number.
p.78
Nor does the designation “complex system” necessarily imply highly sophisticated technology, numerous components, or many stages of production. We will characterize universities as complex systems, but as most large-scale organizations go, they have few of the above characteristics.
p.79
coping with hidden interactions
pp.93-94
1. Tightly coupled systems have more time-dependent processes: they cannot wait or stand by until attended to.
Reactions, as in chemical plants, are almost instantaneous and cannot be delayed or extended.
2. The sequences in tightly coupled systems are more invariant.
3. In tightly coupled systems, not only are the specific sequences invariant, but the overall design of the process allows only one way to reach the production goals.
Loosely coupled systems are said to have “equifinality”──many ways to skin the cat; tightly coupled ones have “unifinality”.
4. Tightly coupled systems have little slack.
In loosely coupled systems, supplies and equipment and human power can be wasted without great cost to the system.
pp.94-95
In tightly coupled systems the buffers and redundancies and substitutions must be designed in; they must be thought of in advance. In loosely coupled systems there is a better chance that expedient, spur-of-the-moment buffers and redundancies and substitutes can be found, even though they were not planned ahead of time.
p.95
But in tightly coupled systems, the recovery aids are largely limited to deliberate, designed-in aids, such as engineered safety devices (in a nuclear plant, emergency coolant pumps and an emergency supply of coolant) or engineered safety features (a more general category, which would include a buffering wall between the core and the source of coolant). While some jury-rigging is possible, such possibilities are limited, because of time-dependent sequences, invariant sequences, unifinality, and the absence of slack.
p.95
In loosely coupled systems, in addition to ESDs (engineered safety devices) and ESFs (engineered safety features), fortuitous recovery aids are often possible.
p.95
Tightly coupled systems offer few such opportunities. Whether the interactions are complex or linear, they cannot be temporarily altered.
p.95
This does not mean that loosely coupled systems necessarily have sufficient designed-in safety devices; typically, designers perceive they have a safety margin in the form of fortuitous safety devices, and neglect to install even quite obvious ones.
p.95
unplanned safety devices
spur-of-the-moment jury-rigged contraption
p.95
At TMI two pumps were put into service to keep the coolant circulating, even though neither was designed for core cooling. Subjected to intense radiation they were not designed to survive; one of them failed rather quickly, but the other kept going for days, until natural circulation could be established.
p.96
Table 3.2
Tight and loose coupling tendencies
-------------------------------------------------------------------------------
Tight Coupling                                Loose Coupling
-------------------------------------------------------------------------------
Delays in processing not possible             Processing delays possible
Invariant sequences                           Order of sequences can be changed
Only one method to achieve goal               Alternative methods available
Little slack possible in supplies,            Slack in resources possible
  equipment, personnel
Buffers and redundancies are                  Buffers and redundancies
  designed-in, deliberate                       fortuitously available
Substitutions of supplies, equipment,         Substitutions fortuitously
  personnel limited and designed-in             available
-------------------------------------------------------------------------------
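Read together with the interaction dimension, the table gives the two axes of the Interaction/Coupling chart used throughout the book. A minimal sketch of that reading (my own encoding, not Perrow's; the attribute flags and the majority-vote rule are assumptions for illustration):

    # Encode Table 3.2's coupling attributes plus the interaction dimension
    # (my formalization, for illustration only), and place a system in one
    # of the four cells of the Interaction/Coupling chart.
    from dataclasses import dataclass

    @dataclass
    class System:
        name: str
        complex_interactions: bool  # unplanned/unfamiliar interactions predominate?
        delays_impossible: bool     # time-dependent processes
        invariant_sequences: bool   # order of sequences cannot be changed
        unifinality: bool           # only one method to achieve the goal
        little_slack: bool          # little slack in supplies, equipment, personnel

        def cell(self) -> str:
            votes = [self.delays_impossible, self.invariant_sequences,
                     self.unifinality, self.little_slack]
            coupling = "tight" if sum(votes) >= 3 else "loose"
            interaction = "complex" if self.complex_interactions else "linear"
            return f"{interaction} interactions / {coupling} coupling"

    # Roughly where the book places these two systems:
    print(System("nuclear plant", True, True, True, True, True).cell())
    print(System("university", True, False, False, False, False).cell())

Nothing here is meant as a real metric; it only makes explicit that the chart cells come from bundles of attributes like those in the table, not from a single measurement.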
p.102
Chemical poisoning is already well covered in Michael Brown's book, Laying Waste,5 and the incredible story of PBBs in the cattle feed in Michigan is covered by Egginton.6
p.103
Many of our accidents will concern European firms simply because they discuss their problems more openly. When I once attended a joint West German-U.S. conference on safety in chemical plants, only one of the invited U.S. representatives came──and he was not sponsored by his organization but came on his vacation time! I even found it extremely difficult to get a simple plant tour of a refinery; after being turned down by firms I finally tagged along with a group of Stanford engineering students. My efforts with companies and trade and technical associations were generally met with the statement, “We do not want to wash our dirty linen in public.”
p.119
One part of the start-up process involved reducing fuel gas pressure. Note the lack of knowledge in the following, and the consequences for operators.
The procedure assumed that reduction in fuel gas header pressure would
directly affect the process exit temperature from the reformer. We have
learned since that this is not true over the entire range of fuel gas
pressures. What happens is that with a given draft condition, at the time
of purge gas extraction, any extra fuel over the air supply goes unburned
into the convection section and does not impart any change in process
exit temperatures. The operator can therefore very easily misjudge the
required pressure reduction by only monitoring process temperatures.31
p.119
Note the number of complex interactions in this part of the system, some of them unexpected, such as the effect of using the auxiliary boiler. Note also the design problems: fan speed, and the heat added by using zinc oxide for desulfurization. Next, we have an equipment failure, the stuck louvers. Consider the amount of instrumentation that would be needed to have proper checks upon all of these components and interactions. (Some instrumentation was added during the rebuilding, but for louvers there is not much you can reasonably do.) Consider, finally, the lack of understanding of the particular dynamics of small parts of the system, such as the afterburning problem that made operator misjudgment easy.
pp.119-120
The author who, as you recall, was an engineer at the plant, appears to share our concern:
Now, while this catastrophic failure is fresh in everyone's mind, it
is not apt to recur. [But] since the heat exchange relationships are
sometimes obscure to the operators because of the many variations, it
is feared that some time in the future this knowledge will change as
people change, and time will erode the memory of the catastrophe.
As the entire event is contemplated, anything above 900°F. is too
hot. It is possible to operate within this limit even though all of
the recommended changes have not been installed. [He strongly
recommended installing a desuperheater.] If the present state of
knowledge had been achieved prior to December 11, the failure would
not have occurred; it can be chalked up to lack of operating
experience and incomplete mechanical diagnosis.
On the other hand, if a system is so complex and integrally meshed
as to require superhuman operators* to constrain the process within
safe limits, then it needs some modification. As the new generation
of reformers with air preheat is about to be born, steam temperature
sensitivity is likely to continue unless some help is given the plant
operators.32
*Emphasis added.
p.120
After the paper was presented, a discussant noted it is not the newness of the plant that is the problem. Even in the older plants, he said, “We struggle to control it .... Runaways will take place and control by these caps is not the answer .... The way it is now we are in difficulties and I don't think anybody is sophisticated enough to operate the plant safely.”33
p.120
The biggest killer, an explosion in an I.G. Farben chemical plant in Germany in 1921, took 550 lives; the biggest series of explosions, in Texas City, Texas, killed 56 and injured over 3,000. As disasters go, petrochemical plants have contributed very little to human suffering, though we are excluding air pollution and other forms of contamination here.
p.120
Yet even some industry personnel are concerned that their scale, complexity, and proximity to human communities have been increasing steadily. The DuPont powder plants along the Brandywine River in Delaware in the 19th century had walls facing the river built of insubstantial material so that the frequent explosions would be vented over the river, and not damage adjacent powder sheds.
p.168
Air Safety Reporting System
The Air Safety Reporting System, ASRS, was established in 1975, and receives over 4,000 reports a year on safety-related incidents and near accidents. Similar systems had been established in Europe, tried in the United States, and used by at least one U.S. airline, United Airlines. In fact, in 1974, a TWA flight crashed on a Virginia mountain top as a result of a confusing map and misinterpretation of ATC reports. In the subsequent NTSB investigation it turned out that United pilots had been warned of the hazard by their program; TWA had no such program. The FAA had sponsored a program in the late 1960s, ostensibly nonpunitive in nature, but pilots and controllers did not support it. After the Virginia crash, they sponsored another, but this time allowed the respected National Aeronautics and Space Administration to supervise it. NASA selected the Battelle Memorial Institute as the contractor. This insured considerable independence from the FAA, and with guarantees of immunity except in extreme cases, the program succeeded.
p.169
As is true of all accident reporting systems, this is clearly a “political” data source in some respects, but neither I nor others involved find any reason to doubt its overall accuracy. Indeed, the extent of mea culpa in the reports is striking, as is the objectivity of the analysis. Once de-identified, a report is part of the public record. I have used these reports to a limited extent myself to investigate incidents where airline management was somehow involved; the cooperation of the ASRS was exceptional.
It would be extremely beneficial if such a virtually anonymous system were in operation for the nuclear power industry and the marine transport industry.
pp.182-183
Consider the following accident. Like all the examples I will use, it involves a variety of failures, but economic pressure is clearly important in this case.
p.183
An experienced, meticulous captain of a large tanker chose to take a less safe, but more direct route to Angle Bay, British Petroleum's deep-water terminal on the western tip of Wales. He would save about six hours by passing through the Scilly Islands; the normal route avoids them. He had been informed by British Petroleum's agent that if he did not reach Milford Haven, at the entrance of Angle Bay, in time to catch high water, he would have to wait five days because of the considerable fluctuation of the tides.
p.183
As it was, he figured he needed to arrive four hours early to shift cargo in the calm waters off Milford Haven. At sea, to reduce drag, more oil was in the midship tanks than the fore and aft tanks; thus the tanker drew 52 feet 4 inches at the deepest part of the hull. This was too deep a draft to make it into the harbor even at high tide, so some oil had to be pumped from the midship tanks to the fore and aft ones. This would save two inches! (One wonders what happens if they miss the precise center of the channel, or if there is a swell that might raise and lower the monster four inches.)
p.182
The captain decided to pass through the Scilly Islands, a rash of sandspits and rocks comprising forty-eight tiny islands. Four are inhabited, mostly by fishermen, and there have been 257 wrecks there between 1679 and 1933. Tales of false lights and plundered ships abound. Edward Cowan, in his lively tale, Oil and Water, notes that the following petition has been attributed to the Reverend John Troutbeck, a chaplain in the Scillies in the later 18th century:
We pray thee, O Lord, not that wrecks should happen, but if wrecks do
happen Thou wilt guide them into the Scilly Isles for the benefit of
the poor inhabitants.12
p.183
Navigation in the passage the captain took, in good weather, even at night is “perfectly simple” as long as one's position is frequently checked, says the navigator's bible, the Channel Pilot.
pp.183-184
But in the “perfectly simple” passage, he came across fishing boats (which one would expect to meet on occasion) and was unable to make his final turn to avoid some underwater rocks just when he wanted to; unfortunately, in his rush he was making full speed in the channel. Six minutes later, after another bearing was taken, he realized he had overshot the channel. When the helmsman received the order to come hard left on the wheel, nothing happened. The captain had forgotten to take it off automatic the last time he turned it himself. He then threw the switch to manual so it could be turned and helped the helmsman turn the wheel, but it was too late. The Torrey Canyon dumped its cargo of 100,000 tons of oil over the coastlines bordering the English Channel.
p.184
The accident involves the usual number of “if only” statements. If the captain had not forgotten to put the helm on manual, they might have turned in time; if the fishing boats had not been out that day, he could have made his turn earlier; if he had prudently slowed down once he saw the boats, he could have turned more sharply; once deciding to risk going through the Scilly Islands, he used a peculiar passage through them, and another might have been safer (even faster), and so on. We simply don't know why he did various things, and we do not, of course, know whether we should believe his explanations even if we had them.
p.184
Production pressures are clearly present, however. They contributed to a decision that increased the proximity of subsystems and reduced the amount of slack available, moving it towards the complex, tightly coupled cell of our Interactive/Coupling chart.
p.233
Earthquakes not only stem from “the restless earth”, as Nigel Calder calls it, but from restless humans who think on a small scale in an ecology that is large scale.
p.233
With some dams an unanticipated expansion of system boundaries creates what we shall call an “eco-system” accident, a concept relevant to the toxic waste problem as well as earthquakes. It is also the only way to explain one hilarious example, the loss of a large lake in a few hours.
p.233
Small dams built by industrial concerns and municipalities, generally with little catastrophic potential, are more likely to fail. We should not forget the small industrial dam that destroyed the whole community of Buffalo Creek in West Virginia in 1972.
pp.233-234
The fascinating sociological classic, Everything in Its Path, by Kai Erikson, tells the story of the history of the community, the disaster, and the poignant attempt of the survivors to establish new lives when the social web of their community was destroyed.3
p.234
Not insignificantly, Erikson gathered his data while helping in preparation of a lawsuit against the mining company, which had ample warning of the danger they had created; the suit was successful.
p.234
They also wanted to remind the Bureau that there was evidence that reservoirs actually cause earthquakes.
For example, in 1935 the Colorado River was dammed, creating the large reservoir called Lake Mead. In the next ten years, 6,000 minor earthquakes occurred in what was previously an earthquake-free area. The underlying rocks had 10 cubic miles of water set on top of them.
pp.234-235
More violent disturbances were created when the Kariba dam in Africa was built in 1963 through 1966. When the Koyna dam in India was being filled with water, there was a violent earthquake which cracked the dam and killed 177 people living nearby.5
p.235
The geologists were sufficiently alarmed to note at the end of their draft memo that a failure would cause an enormous flood. Further, they noted satirically: “Since such a flood could be anticipated, we might consider a series of strategically-placed motion picture cameras to document the process of catastrophic flooding.”6
p.241
radioactive dams
pp.242-244
Teton disaster
p.243
Denver, Colorado, had a mild earthquake in April 1963. It was a surprise, since there had not been an earthquake in the area in eighty-one years. Small ones continued for several years; one in 1967 did a little damage to the city. It turned out that the army caused them.
p.243
The army's Rocky Mountain Arsenal is 10 miles from Denver. It manufactures toxic materials, such as nerve gas, and had to get rid of large amounts of contaminated water. For a time they just put it into holding ponds, but this led to the death of crops, livestock, and wildlife. So they dug a well, 2 miles deep, and forced the garbage into it under high pressure. Six weeks later there was the first earthquake, and then an almost daily series of minor tremors. The source of the earthquakes was suspected within a year, but the army denied it could happen and went on pumping. The water, under high pressure, forced the old cracks in very old rocks to grow, and this allowed the rocks, under pressure from tectonic movements, to slide in jerky movements over one another. Even after the pumping stopped, for a time the pressurized water continued to force open the cracks. About two years after the army finally stopped the practice, the earthquakes also stopped.
p.243
National Center for Earthquake Research
The National Center for Earthquake Research took over part of the field for deliberate experimentation. When they pumped water in, earthquakes occurred; when they pumped the water back out, they stopped.
p.244
Since the Pacific plate moves northward at about 2.5 inches a year, it builds up tremendous pressure. In some places near San Francisco the rocks are estimated to be 13 feet out of adjustment.
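A back-of-the-envelope check, using only the two figures just quoted:

    \frac{13\ \text{ft} \times 12\ \text{in/ft}}{2.5\ \text{in/yr}}
      = \frac{156\ \text{in}}{2.5\ \text{in/yr}} \approx 62\ \text{years}

That is, on the order of sixty years of plate motion stored as strain in those rocks.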
p.244
Fortunately, the fault is close to the surface here, and appears to be self-lubricating below a depth of about 12 miles.22
p.244
We are even wary of seeding hurricanes, on the off chance that the storm might then change direction or get worse.
p.245
On the Interaction/Coupling chart in Chapter 3, mining is seen as fairly loosely coupled, and a bit more complex than linear. The argument for loose coupling is that when failures occur there is generally room for recovery because affected areas can be segregated; alternative sequences can be used for a time; some slack resources exist, and indigenous substitutions can be made in many cases. Yet it is not as loosely coupled a system as, for example, manufacturing firms, governmental agencies, or universities. There are time-dependent processes, there is only one way to perform certain activities, and the nature of the physical site eliminates some slack in resources.
p.245
But the setting can create unexpected, unplanned, and invisible interactions, largely in the matter of the flow of gases and pressures in the complex web of shafts and tunnels and ventilation holes. Some of these are probably unavoidable; there are simply too many interacting factors to anticipate just where an explosive mixture will be formed, and where it will explode; which doors will be blown out, forming unexpected paths for the force of the explosion and the dangerous gases to travel. Mines, then, partake of some aspects of both complex and linear systems.
p.246
I am sure that the first exceeds the second; a risk-taking, macho culture has probably developed to make sense out of what is a fairly inhuman activity, a view of this world that makes it conceivable that one could function in it.
p.246
(The article shows that half of the deceased had less than one year's experience at the job they were doing, but the statistics tell us nothing about the role of experience since we do not know the “base rate”──proportion of all miners that are inexperienced.)
p.248
The next accident occurred in a Wyoming uranium mine. A worker walking with a probe over his shoulder was run down and crushed by a bulldozer that was backing up.
p.249
As I have tried to indicate, even this selection suggests that experience and training are perhaps less relevant than job pressures, careless management and supervision, and, above all, the inherent dangers of these enterprises. None of these are system accidents. The system is not very tightly coupled nor complexly interactive.
p.259
p.259
The Mercury part of the space program, involving only orbital single-astronaut flights, was reviewed by NASA in a 440-page document that New York Times reporter John Finney called a “remarkably harsh indictment of American industry.”2 Among the problems: spare parts that were 50 percent defective; capsules with more than 500 defects; batteries with holes in them; vital electronic parts improperly soldered; valves improperly installed, leading to attitude-control problems on flights; dirty gas pressure regulators; and contaminated oxygen and water for breathing and drinking. The contractor denied they performed anything but superbly, but did not respond to the specific charges.
p.259
After the tragic fire on the launch pad that killed three astronauts, an Apollo inquiry board was appointed, and it in turn drew upon the research of twenty-one panels of experts──1,500 overall──when it prepared its report.
p.259
Rockwell International
p.260
According to a Babcock and Wilcox engineer, a short in a safety device (a testing circuit) depleted the power supplies by the time the Ranger reached the moon. The engineer notes that the more redundancy is used to promote safety, the more chance for spurious actuation; “redundancy is not always the correct design option to use”.4
p.260
The third was in a navigational satellite sent up in 1964 that failed to achieve orbit when its rocket engine failed. It reentered the atmosphere over the Indian Ocean and distributed 1 kilogram of plutonium-238 about the earth. By 1970 it was estimated that about 95 percent of it had settled on the ground or the earth's waters. The accident was estimated to produce a three-fold increase over the amount of plutonium contamination produced by all atmospheric nuclear weapons testing.5 This received almost no publicity, in contrast to the breakup of a Soviet nuclear-powered satellite in 1978 and another one in 1983. The first public mention of it may have been in a 1967 item in the journal Science.6
p.261
Some variants of organizational theory, reinforced by democratic values, generally assert that those closest to the disturbances in a system are in the best position to act upon them. Decentralization is the recommendation. But other variants and much of engineering logic assert that those close to disturbances cannot act fast enough, or with enough comprehension, to cope; therefore, designers should try to eliminate as many human tasks as possible and give them to machines, and managers should have the means to tell the system and/or its operators just what to do. Centralization is the recommendation. The conflict between these views runs through the space program. The regular appearance of mysterious interactions only escalated the conflict. We will consider this in the next section.
p.262
(I am drawing here on the immensely entertaining, and exceptionally perceptive book by Tom Wolfe, The Right Stuff. He will be our guide for this section.8)
p.262
Think first of the “great designers”, scientists and engineers who plan, or design, these spaceships. There are two basic sections, the rocket and the pod at its tip. The rocket has to go up, abort if anything is wrong, depositing the pod safely by parachute, or, if the launching goes well, shut off and detach itself from the pod at the proper moment, allowing the pod to hurtle like a cannonball into space. The “great designers” are also responsible for the ground control system that monitors the sensors in the rocket and the pod and intervenes if anything goes haywire. Within the rocket and pod they have put automatic systems that come on or shut off without anyone doing anything.
p.262
Next think of the ground controllers, or middle management──dozens of them sitting in front of keyboards and television screens with headsets on. They run the control and monitoring system that the designers have created. Finally, there is Ham, or Alan Shepard as the case may be, in the pod. Ham does not interfere in the system at all; Shepard is allowed to play with the thrusters that keep the pod from tumbling or turning, after the capsule is free of the rocket, or before that, to punch the abort button if there is an emergency. The abort button too could be automated. The hierarchy was clear: designers, controllers, and subjects. Project Mercury was supposed to be a scientific enterprise; astronauts were part of the test.
pp.262-263
Much of the complexity of the system that could create emergencies was the result of habitat and retrieval requirements once humans were aboard.
p.263
Though human subjects were not expensive in labor terms (...), they were expensive to keep alive up there and to recover in one piece.
p.262
If humans were to be significant nodes in the loop in design, equipment, supplies, and ground control, exercising some judgement, doing a bit of piloting in a “spacecraft” rather than a “capsule”, there were no signs of that plan in the first flights. What Wolfe suggests is that the public response to the selling of the space program caused that decision to be made. The first seven astronauts were instant heroes after they were selected. They were to ride on top of rockets that were always blowing up, and come down in a large can in the ocean, and to do that, they must be heroes.
p.262
The importance of the question goes to the organizational heart of all high-risk systems: If they are risky, to operators and other potential first-party victims, as well as to the immensely expensive investments, why not eliminate the operators? A package could be landed on the moon and ordered to send back pictures, perhaps even samples. Some argued this would be safer and cheaper. Operators are unreliable, as well as alive.
p.285
What keeps these outposts of our defense busy are primarily atmospheric disturbances, detected by our satellites, that produce an infrared “signature” somewhat similar to that produced by an actual missile launch. There are also flocks of birds to contend with, and space shots and testing, as well as miscellaneous anomalies in the atmosphere or the equipment. With so many warnings, the system probably has its responses worked out quite well. This was true on November 9, 1979, when the monitors indicated a massive Soviet attack.
November 9, 1979
p.285
The attack showed up simultaneously on monitors in the Colorado headquarters of NORAD, the National Command Center in the Pentagon, Pacific Headquarters in Honolulu, and “elsewhere”.27 A thousand Minuteman Intercontinental Ballistic Missiles (ICBMs), capable of hitting targets in Russia, were placed on low-level alert.
26. Gary Hart and Barry Goldwater, “Recent False Alerts from the Nation's Missile Attack Warning System”, Report to the Senate Committee on Armed Services (Washington, D.C.: Government Printing Office, 9 October 1980).
27. New York Times, 16 December 1979.
p.287
The cause of the false alarm could not be determined. But NORAD knew it was false because neither the satellites nor the radar picked up signals of land or submarine-based missiles being launched or penetrating our perimeters. NORAD kept the system in the same configuration for the next few days to see if it would reappear, or so they testified in Congress.28
p.287
Three days later the identical alarm recurred. SAC crews again started their engines, and again the alarm was determined to be false in less than three minutes. NORAD switched to a backup computer, and began the search for a possible malfunction. Eventually they found it. It was a defective, tiny silicon computer chip (cost, 46 cents), not in the computer itself, which had been set aside, but in the “multiplexer” that routes messages to the various command posts.
p.287
This multiplexer sends a message to the command posts on a continuous basis to confirm that the channel is open and usable. However, this message was in the same format as that used to indicate a real attack. Why it was the same format is not clear. It seems unlikely that this was an inadvertent similarity, though that is possible; more likely, the form of the message was similar to insure that that form could be sent. The message has a space to indicate the number of missiles; the number given is zero in the routine test message──we have no missiles today. Due to the malfunction of the chip, this zero changed to a “2” and then apparently to other numbers; the multiplexer sent this data out to some, but not all, of the command posts. The available testimony does not indicate what information other than “2s” for the number of missiles was also sent.
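Stated in protocol terms, the flaw is that the keepalive and the attack report shared one wire format, so corruption of a single field changed the meaning of the whole message. A minimal sketch of the failure and of both fixes (my reconstruction for illustration; the real message format is of course not public):

    # Illustrative reconstruction (invented format). The routine channel
    # check shares its layout with a real attack report, differing only in
    # the missile-count field, so one corrupted field mimics an attack.
    def make_keepalive() -> dict:
        return {"type": "attack_report", "missiles": 0}  # "no missiles today"

    def looks_like_attack(msg: dict) -> bool:
        return msg["type"] == "attack_report" and msg["missiles"] > 0

    msg = make_keepalive()
    msg["missiles"] = 2            # the faulty 46-cent chip corrupts the count
    assert looks_like_attack(msg)  # command posts cannot tell this from an attack

    # Fix 1: give the keepalive its own format, so no corrupted count can
    # mimic an attack. Fix 2 (NORAD's other change): monitor what is actually
    # transmitted, not just what the equipment was told to send.
    def make_keepalive_fixed() -> dict:
        return {"type": "channel_check"}

The second fix is the PORV lesson in miniature: instrument the actual state, not the commanded state.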
p.287
NORAD changed the routine message format so that it no longer resembled an indication of an actual attack. It also corrected another oversight, resembling the PORV warning light at the Three Mile Island plant. NORAD had been aware of what it told its equipment to send out, of course, but it had no monitors at its headquarters to show what actually was sent out. (At TMI, the operators were aware of what automatic system had told the valve to do, but had no way of knowing what it actually did.) NORAD operators had no way of knowing that, without intending it, the signal was saying, “We have two missiles for you today.” NORAD installed monitors, and presumably three new shifts of enlisted men and women now watch these and compare them with the signal that is supposed to go out.
p.288
Perimeter Acquisition Radar Attack Characterization System (PARCS)
PARCS is near Grand Forks, North Dakota,
a special radar system called Pave Paws
For those coming from the Gulf of Mexico, we have an older radar system.
p.288
The command centers can check all these.
p.293
In the chemical industry, work is underway to create micro-organisms that can serve as catalysts for chemical reactions, thereby doing away with some conventional transformative processes requiring high temperatures and pressures.30
30. Nicholas Wade, “Recombinant DNA: Warming Up for the Big Payoff”, Science 206 (29 November 1979).
p.295
The oil spill in Santa Barbara, the leaching of toxic wastes at Love Canal, and the contamination of Seveso, Italy, by dioxin are component failure accidents. They are the result of poor design, operator failure, or equipment failure.
p.295
Hooker Chemical Company knew the danger of the toxic waste they buried at Love Canal.
pp.295-296
The drug firm, Hoffmann-La Roche, in Switzerland was well aware of the danger of dioxin contamination in their plant in Seveso, Italy, and indeed plant officials were instructed to readily reimburse their neighbors for dead farm animals that continued to appear. Knowing the dioxin was a by-product of the pesticide the plant produced, they would not allow production to take place in clean little Switzerland, where their headquarters were, but instead had it produced in dirty northern Italy. When the chemical reactor exploded one weekend when no one was attending it, the safety device protected the plant by allowing the poison to blow up into the air through a stack, from where it drifted over the neighboring community. Plant officials avoided a panic by simply not informing the community.
p.296
The linkage is not only unexpected but once it has occurred it is not even well understood or easily traced back to its source. Knowledge of the behavior of the human-made material in its new ecological niche is extremely limited by its very novelty.
p.296
It is only since science has learned to replicate complex physical, chemical, and biological processes in the laboratory that its actions have been so consequential for the eco-system. The frequency of unintended interventions in the eco-system is likely to increase as the keys to more natural processes are discovered.
p.296
This form of ecological consciousness was at its peak in the early seventies when molecular biologists first raised the possibility that an as yet untested technique, the “splicing” together of genetic materials from unrelated species and their implanting in either an intended or unintended host, might be potentially catastrophic.
pp.296-297
Gene Splicing
The recombinant technology, popularly known as gene splicing, involves a set of techniques that allows scientists to use special enzymes to cut into pieces the long double strands of molecules that make up DNA and then to recombine the pieces with the DNA of a carrier, called the “vector”. These recombined molecules are then inserted into a host where they will presumably propagate. By combining the foreign DNA into a vector that typically replicates in the host organism, scientists are able to induce the “expression” of the foreign genetic material in its new host.
p.297
For example, if the foreign DNA carried the genetic information for human growth hormone, it might then be combined with a vector that replicated in some readily available host bacteria. The host bacteria would then begin manufacturing human growth hormone. This has actually been accomplished.
p.297
Paul Berg, biochemist at Stanford
p.297
Robert Pollack, microbiologist
p.297
Cold Spring Harbor Laboratory
p.297
Pollack was perhaps the first scientist to recognize the danger of creating a recombinant hybrid with unknown infectivity that might be capable of surviving in humans. His worried call to Berg touched off first anger and then concern in Berg.35
33. Kenney, “Genetic Engineering and Agriculture”, 1.
34. Quoted in Nancy Pfund, “Recombinant DNA: Miracles and Menace”, in Do No Harm: Health Risk and Public Choices, ed. Diana Sutton (Berkeley, California: University of California Press, 1984).
35. I am drawing upon the accounts of Nicholas Wade, The Ultimate Experiment (New York: Walker, 1977); John Lear, Recombinant DNA: The Untold Story (New York: Crown, 1978); and Sheldon Krimsky and D. Ozonoff, Genetic Alchemy: The Social History of the Recombinant DNA Controversy (Cambridge: M.I.T. Press, in press).
p.297
By June 1973 other scientists had expressed their concern to the National Academy of Sciences (NAS).
p.297
The NAS group convened in April 1974 at M.I.T.
p.298
The international conference came together seven months later, in February 1975, at the Asilomar conference center in Pacific Grove, California.
p.309
The field acknowledges the difference between voluntary risks such as skiing and hang-gliding, and involuntary ones such as leaching of chemical wastes.10 But it does not acknowledge the difference between the imposition of risks by profit-making firms who could reduce the risk, and the acceptance of risk by the public where private pleasures are involved (skiing) or some control can be exercised (driving).
pp.309-310
Something similar took place at the Ford Motor Company when it decided not to buffer the fuel tank in the Pinto, and at the General Motors Company when it rejected warnings from engineers that the Corvair would flip over for lack of a $15 stabilizing bar.13
p.310
Baruch Fischhoff, in a thoughtful examination of cost-benefit analysis (the article has the engaging title, “Cost-Benefit Analysis and the Art of Motorcycle Maintenance”), notes another consequence of the monetarization of social good by economists.14 Cost-benefit analysis is “mute with regard to distribution of wealth in society”, he notes. “Therefore, a project designed solely to redistribute a society's resources would, if analyzed, be found to be all costs (those involved in the transfer) and no benefits (since the total wealth remains unchanged).” Risks from risky technologies are not borne equally by the different social classes; risk assessments ignore the social class distribution of risk.
p.311
There are those who argue that we are losing our moral fiber because we no longer want to take risks with technologies.16 But it is striking that those who feel we have abandoned risk in our search for security are speaking only of technological risks associated with large corporations and private profits, or aggressive military postures.
p.311
The corporate and military risk-takers often turn out to be surprisingly risk-averse (to use the jargon of the field) when it comes to risky social experiments which might reduce poverty, dependency, and crime.
p.311
Though the following list of proposals may strike one as fanciful, they all involve substantial risks that liberals and leftists have suggested, but the corporate and military risk-takers would not want to try because of the consequences for the class structure, their power, and their values.
p.311
The proposals include guaranteed income maintenance plans (as found in Europe); truly progressive taxation; investment in poor and declining areas (U.S. banks pay one of the lowest income tax rates in the nation──about 3 or 4 percent──but refuse to risk investing in poor inner city areas); heroin maintenance programs to reduce crime; unilateral nuclear disarmament (a risky venture that could not only reduce the risk of nuclear accidents but promote economic prospects); withdrawal from Central America; and so on.
p.312
Along with highway fatalities, lung cancer from smoking is the favorite referent of the new body-counters. It is treated as a voluntary activity, like hang-gliding. But most of us who smoke today do so because we were barraged with advertisements and inducements that soon addicted us. In World War II every packet of field rations held its five cigarettes per meal, and the sale of cigarettes to the armed forces was heavily subsidized and untaxed. Airliners used to pass them out gratis, presumably to calm one's nerves after takeoff. No Hollywood hero was without them. The promotion was intense, and so were the private profits. So addicted was the whole economy that the government subsidies to tobacco farmers today far exceed all that is spent on warnings and research by the government.
p.312
Ironically, the health of an important sector of our economy (tobacco growing, cigarette sales, and advertising) depends upon the illness of the victims; the costs of stopping smoking are not only individual (because of addiction) but corporate.
p.312
An individual's addiction to smoking should not be compared to the costs industry must be forced to incur to reduce brown lung disease or make safer Christmas toys.
p.312
Driving is a key one. We appear to accept risks more readily when we think our skill will play some part in avoiding the hazard. We fear and reject risks where we are passive recipients of harm.
p.312
The plant, we feel, not unreasonably, should not blow up, the dam break, the air controller goof, the Ford executives fail to protect a gas tank from exploding; over these risks we have little control.
pp.312-313
But we are willing to take our risks with driving, skiing, and parachuting.
p.313
Driving to work for many of us is about as involuntary an activity as there is, but at least we have some control over it. On the other hand, although we voluntarily fly in an airliner to a distant vacation, we have no control over the aircraft or the airways.
p.313
We voluntarily attend large events at stadiums that sometimes burn or collapse, but we have no control over the architects or the construction firms, or the owners who always seem to lock the safety exits.
p.313
For active risks, those that the individual performing the activity has some control over, the marketplace provides at least a rudimentary, though imperfect, way of addressing safety issues.
p.313
These activities are beyond our control. For these, the government must step in.
p.313
In more and more areas of our life the government must step in. This is not the result of the cancerous growth of government, but rather is essential because our personal control over our environment and our activities is being steadily eroded by systems that we participate in, or are passively affected by. In some cases Congress recognizes this danger.
p.314
Perhaps risk assessors might advise industry that it is easy to predict an aversion to such passive risks as mercury poisoning, dioxin, DES, asbestos, and brown lung disease.
p.314
Two dangers of “active risks” should be noted. Consumers will not always voluntarily pay for safer products, and often are not attentive to the risks even if they are well known. I assume it will always be thus.
p.314
Second, active risks are attractive; we like to take some risks, if we feel we have personal control over them. This means that as the risk declines with better equipment, more people may be brought into the activity, those who now feel the risk is reduced to the level they will tolerate. The end result is that the accident level may not change with the new safety devices.
p.314
For example, as better ski equipment appeared, and the slopes were better groomed and more safely designed, the ski resort industry began to advertise heavily to attract novice skiers. More novices meant more accidents, including more accidents for the advanced skiers they ran into. While the play was safer, the risk increased because there were more inexperienced players. The safety record of any active risk activity must include the number of participants and the proportion of new, unskilled ones.
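A toy model of the ski example (all numbers invented) shows how the totals can move against the safety gains:

    # Toy model (invented numbers): safer gear halves everyone's accident
    # rate, but advertising grows the crowd and doubles the novice share.
    def accidents(participants, novice_share, novice_rate, expert_rate):
        return participants * (novice_share * novice_rate +
                               (1 - novice_share) * expert_rate)

    before = accidents(10_000, 0.20, 0.050, 0.010)  # old equipment
    after = accidents(16_000, 0.40, 0.025, 0.005)   # safer gear, more novices
    print(before, after)  # 180.0 vs 208.0: more accidents despite safer play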
p.314
The risk assessors, then, have a narrow focus that all too frequently (but not always) conveniently supports the activities elites in the public and private sectors think we should engage in. For most, the focus is upon dollars and bodies, ignoring cultural and social criteria.
p.314
The assessors do not distinguish risks taken for private profits from those taken for private pleasures or needs, though the one is imposed, the other to some degree chosen; they ignore the question of addiction, and the distinction between active risks, where one has some control, and passive risks; they argue for the importance of risk but limit their endorsement of the approved risks to the corporate and military ones, ignoring risks in social and political matters.
pp.314-315
Few of the risk assessors call for this outright; most imply it. Some state that the public must be involved, but only on the risk assessor's terms, and a few reject the implication and genuinely think the public has something to contribute (principally the Decision Research group, an important private company in Eugene, Oregon, that has done the best work in the field, I believe, though they hedge on the value of the contribution).
p.315
Humans in general do not reason well (even experts can be found to make simple mistakes in probabilities and interpretation of evidence); heroic efforts would be needed to educate the general public in the skills needed to decide complex issues of risk.
p.315
p.318
One important and unintended conclusion that does come from this work is the overriding importance of the context into which the subject puts the problem.
p.318
The decisions made in these cases were perfectly rational; it was just that the operators were using the wrong contexts. Selecting a context (“this can happen only with a small pipe break in the secondary system”) is a pre-decision act, made without reflection, almost effortlessly, as a part of a stream of experience and mental processing.
p.318
And defining the context is a much more subtle, self-steering process, influenced by long experience with trials and errors (much as the automatic adjustments made in driving or walking on a busy street).
p.318
For example, take the supposedly widespread failure to take the “base rate” into account (all the past events, such as the number of flights where there were no accidents, or the proportion of single people in a community).
pp.318-319
If the problem is presented in one way, the subject will use frequency and probabilistic reasoning to estimate a rate (since we do use them in many situations in life); if the problem is presented in another way, with some vague but misleading cues, the subject will rule out probabilistic reasoning.
p.319
Kahneman and Tversky
p.319
When no description of Jack is given, the subjects estimate the probability on the basis of the base rate.
p.319
When a supposedly “irrelevant” and “useless” description is given, they ignore the base rate. But the subjects probably figure that it would be silly for the experimenters to give a description that has no meaning, so they search for meaning, and find something.
p.319
Given these cues, the subjects may ignore the base rate because they assume the cues are meant to be used as guides, in spite of the base rate.
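< a worked Bayes'-rule illustration of the base-rate point; the 30-engineers/70-lawyers split comes from the widely cited version of the Kahneman-Tversky experiment, not from this text, and the equal likelihoods encode a description that is genuinely uninformative >

from fractions import Fraction

base_rate_engineer = Fraction(30, 100)   # assumed split: 30 engineers, 70 lawyers

# A "useless" description of Jack fits engineers and lawyers equally well,
# so the likelihoods cancel and Bayes' rule hands back the base rate.
p_desc_given_engineer = Fraction(1, 2)
p_desc_given_lawyer = Fraction(1, 2)

posterior = (base_rate_engineer * p_desc_given_engineer) / (
    base_rate_engineer * p_desc_given_engineer
    + (1 - base_rate_engineer) * p_desc_given_lawyer
)
print(posterior)                         # 3/10 -- normatively unchanged

< subjects nonetheless answer about one-half, treating the cue as a guide meant to be used >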
p.319
Finally, heuristics are akin to intuition. Indeed, they might be considered to be regularized, checked-out intuition. An intuition is a reason, hidden from our consciousness, for certain apparently unrelated things to be connected in a causal way. Experts might be defined as people who abjure intuitions; it is their virtue to have flushed out the hidden causal connections and subjected them to scrutiny and testing, and thus discarded them or verified them. Intuitions, then, are especially unfortunate forms of heuristics, because they are not amenable to inspection. This is why they are so fiercely held even in the face of contrary evidence; the person insists the evidence is irrelevant to their “insight”.
p.320
Baruch Fischhoff wonders if our intuitive judgements, or logic, might not be utilized even if they are denigrated by the experts. “It is worth asking”, he writes, “whether there is not a method in the people's apparent madness. Are there not decision-making criteria overlooked by formal analysis yet essential for human welfare or psychological well-being?”30 As we shall shortly see, there are such criteria, if we examine the work of Slovic, Fischhoff, and Lichtenstein carefully.
p.320
A significant event such as this is an indication to the public of what is possible; it is a signal that these plants can have serious troubles even though the experts say they will do so only very rarely.31 Since some experts appeared to think it could almost never happen, expert predictions might reasonably be questioned by the public. If experts are wrong, then this may not have been the one time in three hundred years, but the first one of many, many times over three hundred years. (Note the bounded rationality theorist is not siding with the public on the risks of nuclear power, only saying that it is not irrational that they could come to their conclusion.)
p.320
It is an efficient logic, given the fact that experts, like everyone else, are fallible and have been proven wrong in the past. It is efficient to question them. Such logic is also efficient because it motivates the public to demand: “Remove that threat; I don't want to live with a lot of threats; you are not counting in my psychic costs; find another energy source.”
pp.321-322
People vary in their cognitive abilities in absolute terms, but they also seem to vary with respect to different thinking abilities for different tasks. You and I may be equally intelligent, when measured over a number of areas, but you are good at counting while I (as I tell my quantitative colleagues) don't count. Yet I have learned how to visualize, or model, things in 3-dimensional space, or perhaps have an innate capacity for it. Because of my limitation in counting, I need you, and vice versa. Our limitations bring about social bonding. Bonding by diversity in skills (which is related to limitations in cognition, incidentally) is more stable and perhaps more satisfying than bonding by addition of equal talents. That is, the standard illustration of two people moving a rock that neither could move alone as the basis for social life is a very minimal one; any partner would do, and once the rock is moved, we can part. But bonding because sometimes we need to count and sometimes we need to visualize, so we had better have each other around when these tasks appear, is a strong basis for social life. If everyone were equally rational, we would not need economists. Since we are not, we need both economists who try to see where rational, quantitative solutions will work and sociologists who try to see how social bonding can be utilized and maximized.
p.322
A second cheer for our limitations stems from your propensity, for example, to see all problems as one of measurement and counting, and my propensity to see all problems as one of social interactions. If we have a common problem and it seems to have a lot of numbers, rates, proportions, and so on in it, you are likely to move quickly to a mathematical solution. Your “heuristics” are better than mine if numbers are included. But because of your expertise, you are very likely to end up deciding that the problem should be seen in a manner that allows a quantitative analysis. Your “framing” of the problem prejudges the problem and prejudices the answer. So does mine. For you, the choice of nuclear or coal can be measured by toting up the deaths per megawatt of power produced to date by each activity. The risks of DNA research can be measured by seeing how many experiments have gone on without any accidents. But I might define the power generation problem in terms of potential deaths in a rare but conceivable catastrophe, the fact that the deaths would involve related people (communities), and potential contamination of large land areas for generations to come.
p.322
A working definition of an expert is a person who can solve a problem faster and better than others, but who runs a higher risk than others of posing the wrong problem. By virtue of his or her expert methods, the problem is redefined to suit the methods.
p.323
Because I look for social relations, symbolic values, and human progeny, I define the problem as one of potential consequences, not observed ones.
p.326
The most important factor, which they labeled “dread risk”, was associated with:
• lack of control over the activity;
• fatal consequences if there were a mishap of some sort;
• high catastrophic potential;
• reaction of dread;
• inequitable distribution of risks and benefits (including the transfer of risks to future generations); and
• the belief that risks are increasing and not easily reducible.
p.326
p.97
Figure 3.1 Interaction/ Coupling chart
<same as Figure 9.1 on page 327>
p.327
Figure 9.1 Interaction/ Coupling chart
<see figure in the book>
< the figure places each system in a two-dimensional interaction/coupling space; the grid below preserves the four quadrants, but not the exact positions within them >

                        TIGHT COUPLING                       LOOSE COUPLING
LINEAR                  dams, power grids, rail              junior college, assembly-line
INTERACTIONS            transport, marine transport,         production, trade schools, most
                        airways, some continuous             manufacturing, single-goal agencies
                        processing (e.g. drugs, bread)       (motor vehicles, post office)

COMPLEX                 aircraft, DNA, chemical plants,      mining, military adventures, R&D
INTERACTIONS            nuclear plants, nuclear weapons      firms, multi-goal agencies (welfare,
(non-linear)            accidents, space missions,           DOE, OMB), universities
                        military early warning
p.328
The dimension of dread──lack of control, high fatalities and catastrophic potential, inequitable distribution of risks and benefits, and the sense that these risks are increasing and cannot be easily reduced by technological fixes──clearly was the best predictor of perceived risk. This is what we might call, after Clifford Geertz, a “thick description” of hazards rather than a “thin description”.36 A thin one is quantitative, precise, logically consistent, economical, and value-free. It embraces many of the virtues of engineering and physical sciences, and is consistent with what we have called component failure accidents──failures that are predictable and understandable and in an expected production sequence. A thick description recognizes subjective dimensions and cultural values and, in this example, shows a skepticism about human-made systems and institutions, and emphasizes social bonding and the tentative, ambiguous nature of experience. A thick description reflects the nature of system accidents, where unanticipated, unrecognizable interactions of failures occur, and the system does not allow for recovery.
p.330
If the complex interactions defeat designed-in safety devices or go around them, there will be failures that are unexpected and incomprehensible. If the system is also tightly coupled, leaving little time for recovery from failure, little slack in resources or fortuitous safety devices, then the failure cannot be limited to parts or units, but will bring down subsystems or systems.
p.332
Figure 9.2 Centralization/ decentralization of authority relevant to crises
p.334
Chemical plants are also not extremely high on complexity and coupling, but we saw safety engineers ruminating about the problem of bringing operators in more as the process got more complex, and we saw operators overriding the centralized, automatic system to decouple parts of the plant and to manually shut down affected parts on their own. I do not know enough about this industry to say that the problem of being both centralized and decentralized is expensive and continuous, but I would expect that it is.
p.335
We cycled endlessly through the problem of insuring rapid, unquestioning response to orders from on high (or orders in the procedures manual), and at the same time allowing for discretion to operators. Regarding discretion, the operators would have the latitude to make unique diagnoses of the problem and disregard the manual, and be free of orders from remote authorities who did not have hands-on daily experience with the system. We could recognize the need for both; we could not find a way to have both.
p.330
Quality control, operator training, design experience, and environmental controls will help, but will not be sufficient. These are benign steps to reduce the frequency of system accidents. But there is one other solution that may not be so benign: instituting highly centralized, authoritarian organizational structures.
p.330
The topic of organizations is quite relevant to high-risk systems, at least as important as the possibilities of technological fixes.
p.336
The influential Washington lawyer and reputed power-broker in the Lyndon Johnson administration, Harry C. McPherson, echoed the sentiment.
p.337
Commissioner Peterson, President of the National Audubon Society and former research director of duPont, was not convinced. “They've got some damn dangerous stuff out there in that building”, he mused. “There was a little sign on the desk as we were in the plant”, he continued, “saying right there, ‘Nuclear Power Is Safe.’” Commissioner Pigford (professor of nuclear engineering at the University of California, Berkeley, and former employee of a nuclear vendor) insisted that “they thought it was safe.”
p.337
Peterson retorted, “They knew damn well it was dangerous”, and went on to say, “It's easy to work safety when you're making a suit of clothes, but not nuclear energy.” Later he brought up the difference again: “There's one big helluva difference” between nuclear plants and garment factories or General Motors plants, “and the community knows that now.”
p.338
Two nuclear submarines have gone to the bottom of the ocean with all their crews.
p.339
At some point the cost of extracting obedience exceeds the benefits of organized activity.
p.339
Second, most of our technologies do not threaten our values, nature, or our lives. One can no more be anti-technological than one can be anti-culture or anti-nature; the hoe and the wheel and cooking meat and baking bread are technologies.
p.355
deadly methyl isocyanate gas (MIC) in Union Carbide's Bhopal, India, plant in December 1984,
p.360
Swiss Re, the world's second largest reinsurance firm
(by forming a syndicate of insurers)
to increasing the resources expended upon money management, or arbitrage.
pp.360-361
Because the profits from arbitraging the spread between the currencies of the dozens of countries where they reinsure and collect premiums are so great in total, it pays to insure more properties to obtain the premiums.
p.361
(The findings of a Marsh & McLennan survey of large property damages in the hydrocarbon-chemical industries also document the rise in accidents and the size of the human and property loss [Mahoney 1993].)
p.361
Historically, insurance companies have played a large, positively synergetic role in accident prevention. They have insisted on safe practices in order to reduce the losses they pay out, and workers and communities have benefited. But the global nature of the industry and the ease and speed of market clearing in currency values has “diversified” their business──given the huge sums involved and the many countries──into the highly profitable one of arbitrage.
p.361
(“Downsizing” can be a misleading term, suggesting that fat has been removed. But in many cases the number of workers does not decline in the system; fewer are employed by the main firm and the rest are employed by small contracting firms, which are nonunion.)
p.362
The workers in the contractor's organization are far less experienced, far less skilled, and paid at far lower rates than the chemical company's own employees──yet they do the most risky work. The industry claims a very low death and accident rate for its own employees, but the statistics reported do not include the deaths and accidents of the contractors.
p.362
Some of these are quite small operations and no reporting needs to be made; those that are reported go into a large general category where they cannot be attributed to petrochemical companies, let alone to, say, Shell Oil.
p.362
Both risks and statistics are thus exported.
pp.363-364
Frederick Wolfe, Eli Berniker, Mitchel Bloom, and Alfred Marcus
(Wolfe, Berniker, Bloom, and Marcus 1999)
p.363
Berniker
p.364
He writes: “It appears that the largest and most complex refineries, in the sample, are also the oldest, 70 years. Their complexity emerged as a result of historical accretion. Processes were modified, added, linked, enhanced, and replaced over a history that greatly exceeded the memories of those who worked in the refinery.” (It has been similar with computer programs and embedded chips, as we shall see when we take up Y2K.)
p.364
All in all, it was a striking bit of research by a chemical engineer (Wolfe) and his academic advisors.
p.365
it is not simply “capitalism”. It is organizational.
p.365
Some astute poll scanning by William Freudenburg finds that the best predictor of a concern with risks and environmental damage is not political values or technical expertise, but the extent to which those polled hold large organizations accountable for their actions (Freudenburg 1992; 1993; Freudenburg and Gramling 1994).
p.365
Companies have power, and those polled appear to say that with it goes great responsibility; in the area of risk to the public, they find the companies wanting.
p.365
Cognitive psychologists and others have discredited the public's ability to judge risks, and thus their credibility in policy matters, as noted in Normal Accidents; the work of Freudenburg and others I mention below contradicts this: the public makes more subtle judgements regarding risk than the cognitive psychologists do, and the public often places organizations at the center of the problem. (For a good early statement on the issue of risk and power, see Gephart 1984.)
pp.368-369
More valuable still is Sagan's development of two organizational concepts that were important for my book: the notion of bounded rationality (the basis of “Garbage Can Theory”), and the notion of organizations as tools to be used for the interests of their masters (the core of a “power theory”). Both garbage can and power theory are noted in passing in my book (10, 261, 330-339), but Sagan's work makes both explicit tools of analysis. Risky systems, I should have stressed more than I did, are likely to have high degrees of uncertainty associated with them and thus one can expect “garbage can” processes, that is, unstable and unclear goals, misunderstanding and mis-learning, happenstance, and confusion as to means. (For the basic work on garbage can theory, see Cohen, March, and Olsen 1988; March and Olsen 1979. For a brief summary, see Perrow 1986, 131-154.) A garbage can approach is appropriate where there is high uncertainty about means and goals and an unstable environment. It invites a pessimistic view of such things as efficiency or commitments to safety.
p.369
The second advantage of Sagan's theoretical work is that he recognizes more than I did the need to specify the role of group interests in thwarting commitments to safety. An interest group theory says that a variety of groups within and without the organization will tend to use it for their own ends and these may not be consistent with the official goals or the public interest (Perrow 1986, Ch. 8).
p.369
Sagan makes this a researchable question rather than an assumption, and the analysis of the role of group interests in the systems he studies is vital to his story and to our understanding of catastrophes.
p.373
The first book recounts how Clarke witnessed, in prolonged field work, the struggle between various organizations to “capture” the right to define the accident.
p.373
the janitors
the local health department
the state agency
the local medical society
an environmental decontamination organization
the health department
affected citizens' group
independent scientists with national reputations
Environmental Protection Agency
New York Governor
the county health commissioner (who was later fired after several months)
p.374
Clarke utilizes the James March et al. garbage can theory that we met in the previous section. It is probably the first attempt to use it on organizations rather than on people or groups within organizations; it is certainly the best.
p.374
By emphasizing uncertainty, changing participants, solutions looking for problems rather than the reverse, timing problems creating disconnects or coincidences, and unclear and shifting organizational interests, we get a fascinating view of the social construction of risk and disaster.
p.374
Clarke's work goes beyond social construction and garbage cans in two respects. The idea that reality is in large part a social construction is now so institutionalized in sociology that it threatens to mask the different degrees of “agency”, or the different power of agents (actors). Yes, what we observe is neither necessary nor “natural”; it is constructed out of negotiation and agreement. But it is important to see the resources of those doing the most construction; power and interests are still involved.
p.374
The case is similar to the ‘garbage can’ metaphor. It emphasizes the role of symbols, happenstance, variable participation, and so on, but this masks intent, power struggles, and conflicts over interests.
p.374
Clarke not only argues that the metaphor works best when trying to understand relationships among organizations rather than within them, but shows that the ‘garbage can’ nature of organizations (due to limited rationality) is itself a problem for powerful organizations. If ‘garbage can’ conditions exist, the organizations with less power (like the local health department), loosely organized associations (the fire fighters and citizens' groups), and even individual citizens have a chance to affect the definitions of acceptable risk.
pp.377-378
https://en.wikipedia.org/wiki/1994_Black_Hawk_shootdown_incident
Scott Snook
Snook, Scott, 377-78
([ a normal accident without any failures, or a normal preventable incident with multiple failures ])
p.377
Another political scientist, Scott Snook, of West Point, analyzes the 1994 shoot-down of two UN peacekeeping helicopters, full of officials, in northern Iraq, by two U.S. jets. The helicopters were on an announced flight, in the proper zone, under perfect weather conditions, in an area where there had been no military action for months, and they were not contacted by the fighters.
p.377
The AWACS flying above said they had no knowledge of friendly helicopters when the two jets spotted them; the jets made a close fly-by but still misidentified the lumbering ships as Russian, and when the helicopters did not respond to signals they could not hear, blew them away.
p.378
Snook makes another contribution. This is, he says, a normal accident without any failures. How can that be? Well, there are minor deviations in procedures that saved time and effort or corrected some problems without disclosing potential ones. Organizations and large systems probably could not function without the lubricant of minor deviations to handle situations no designer could anticipate. Individually they were inconsequential deviations, and the system had over three years of daily safe operations. But they resulted in the slow, steady uncoupling of local practice──the jets, the AWACS, the army helicopters──from the written procedures, a practice he labels “practical drift”. Local adaptation, he says, can lead to large system disaster. The local adaptations are like a fleet of boats sailing to a destination; there is continuous, intended adjustment to the wind and the movement of nearby boats. The disaster comes when, only from high above the fleet, or from hindsight, we see that the adaptive behavior imperceptibly and unexpectedly results in the boats “drifting” into gridlock, collision, or grounding. What makes this analysis plausible is the presence of multiple actors──jets, AWACS, helicopters, Air Force, Army, and diplomatic corps──in contrast to the familiar progressive deviation and routine that accumulates in an accident. Both have practical drift to a noncompliant state, but for the friendly fire accident it required just the right combination of minor deviations──none of them significant in themselves and, indeed, quite practical in isolation──to produce the accident. It is a story of the failure of coordination, not just the normalization of deviance. This is why it is so important to analyze the event in terms of the individual, group, and system levels; the latter two introduce the coordination problem disclosed by “practical” (useful) “drift” (imperceptible change).
p.379
They are questions of technique and management, and they are questions of humanizing work and finding ways to make the drive for efficiency compatible with safety and culture. But they rarely question how important efficiency goals should be in risky systems, and who has the power to impose those goals, and their contribution ([ condition ]) to production pressures.
p.379
We should address the improvement of safety but, in addition, address the role of production pressures in increasingly privatized and deregulated systems that can evade scrutiny and accountability.
p.379
Her view ignores the waste and corruption that Diamond details (Diamond 1986a; 1986b) and minimizes the other accounts that are critical of NASA and the managers' actions that night (Boisjoly 1987; Cooper 1986; Maier 1992; McDonald 1986). Instead, she builds what would be called a “social construction of reality” case that allowed the banality of bureaucracy to create a habit of normalizing deviations from safe procedures.
p.380
But I think that interpretation minimizes the corruption of the safety culture, and more particularly drains this case of the extraordinary display of power that overcame the objections of the engineers who opposed the launch.
p.380
Surprised at the pressures for a launch, the worried engineers fumbled the ball and failed to arrange the data in a way that would make it hard to dispute the role of temperature when all launches were considered. (See the remarkable analysis by Tufte on this point, 1997.) ([ see pp.54-57, Sidney Dekker, The field guide to human error investigations, 2002; the page numbers should be used as a guide, rather than as an absolute pointer ])
p.380
They pounded tables and raised their voices but were told to take off their engineering hats and put on their managerial ones.
p.380
Those who left the organization described that night quite differently. They made it clear, as did much other evidence she presents, that this was not the normalization of deviance or the banality of bureaucratic procedures and hierarchy or the product of an engineering “culture”; it was the exercise of organizational power. We miss a great deal when we substitute culture for power.
p.383
Of course, in interviews, their responses reflected the presumptions of my questions. Langewiesche is the better sociologist in this case; without making such assumptions he discovered that they were primarily there to pack as much traffic into the airports as they could. They judged themselves on the grounds of efficiently using the valuable landing slots, not system safety. Of course, safe landings and take-offs (most accidents occur in the landing phase) were a constraint, but the production pressures of saving airline fuel, increasing capacity, and avoiding passenger inconvenience were clearly foremost (Langewiesche 1998a).
pp.383-384
... I was still dismayed when a graduate student of mine, Leo Tasca, finished his dissertation long after the publication of Normal Accidents, disclosing the highly biased nature of what I thought was an impeccable source: National Transportation Safety Board and Coast Guard reports on marine accidents (Tasca 1990). Tasca took four collisions with stationary objects which the NTSB and the Coast Guard had investigated, and for which there were also court documents concerning public lawsuits and trials brought by the parties harmed by the collision. Damages were very large in each case. In each case the Coast Guard/NTSB finding of probable cause neglected or ignored completely those factors that would make the ship owners liable. In each case the court trial produced evidence that won the suit for the claimants, in effect overturning the government findings, and forced the ship owners to pay damages, which they otherwise would have avoided. I had harbored suspicions of a bias in favor of ship owners on the part of the NTSB and the Coast Guard, but was surprised at the conclusive evidence that Tasca marshaled and painstakingly analyzed in his four cases, one of which I had used, but fortunately with some skepticism about the official interpretation (Perrow 1984, 190, 192). I suspect that the paucity of powerful pluralistic organizational interests in the marine system (weak unions or none, fragmented shippers, weakened Coast Guard, etc.) encourages the biased investigations by the marine wing of the NTSB, whereas this would be less true of the air transport wing of the agency.
p.382
The accident rate for Africa, for example, is twenty-six times that of the U.S., yet air travel is growing very fast there.
p.382
In 1996, 70 percent of the accidents occurred in only 16 percent of the world air carriers! (All figures are from the excellent Flight Safety Digest, published by the Flight Safety Foundation [Matthews 1997].)
p.383
El Al cargo plane that crashed into an Amsterdam apartment complex in October 1992.
p.383
Recent news reports said that a short time after the crash, while rescue workers swarmed over the wreckage pulling people out, the Dutch authorities ordered them off and had the area secured.
p.383
A helicopter arrived, disgorging men in contamination suits, who scoured the area for a couple of hours, and then left, allowing the rescue teams to continue to search for survivors.
p.383
Authorities said they were there to check on the danger of explosion, though the plane had already exploded.
p.383
Much later unconfirmed news reports identified them as Israeli Mossad agents. The flight recorder and the plane's documents and waybills, which had been recovered, then turned up missing and have never been officially found.
pp.383-384
Rescue workers and citizens who had helped before and after the mysterious helicopter arrived and left began coming down with strange illnesses.
p.384
Charging years of cover-up by Israeli and Dutch authorities, a Dutch newspaper revealed in October 1998 that the cargo was not “perfumes and machine parts” as the Israeli government had declared (an error, a spokesman now explained), but included the elements needed to make the deadly nerve gas Sarin.
p.384
(A parliamentary inquiry held in May 1999 found that there were no Mossad agents and that the cargo was not toxic.)
p.384
In addition to the cargo, the plane had depleted uranium as ballast in the tail, 300 pounds of which was not recovered and is presumed to have burned, and thus been rendered airborne, where it is carcinogenic. (Boeing advised clients to discontinue its use in the 1980s, but Israel did not make the change in a number of its planes.)
p.384
Boeing and El Al have paid undisclosed damages to a number of the crash victims, though not all, and unfortunately for researchers, made them sign the standard secrecy clause (Simons 1999).
p.413
ALARA: as low as reasonably achievable