This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Copyright 1990 by the American Psychological Association, Inc. 0096-3445/90/$00.75

Journal of Experimental Psychology: General 1990, Vol. 119, No. 4, 367-374

Cognition and Metacognition at Extreme Altitudes on Mount Everest

Thomas O. Nelson, John Dunlosky, David M. White, Jude Steinberg, Brenda D. Townes, and Dennis Anderson
University of Washington

The FACTRETRIEVAL2 test battery, which assesses both retrieval of general information from memory and metacognition about that retrieval, was administered to people before and after a recent expedition to Mount Everest and at extreme altitudes above 6,400 m (higher than any mountain in North America or Europe). The major findings were as follows: First, the same extreme altitudes already known to impair learning did not affect either accuracy or latency of retrieval, and this robustness of retrieval occurred for both recall and forced-choice recognition. Second, extreme altitude did affect metacognition: The climbers showed a decline in their feeling of knowing both while at extreme altitude and after returning to Kathmandu (i.e., both an effect and an aftereffect of extreme altitude). Third, extreme altitude had different effects than alcohol intoxication (previously assessed by Nelson, McSpadden, Fromme, & Marlatt, 1986). Alcohol intoxication affected retrieval without affecting metacognition, whereas extreme altitude affected metacognition without affecting retrieval; this different pattern for extreme altitude versus alcohol intoxication implies that (a) hypoxia does not always yield the same outcome as alcohol intoxication and (b) neither retrieval nor metacognition is strictly more sensitive than the other for detecting changes in independent variables.

This research was supported by grants to Thomas O. Nelson from the National Aeronautics and Space Administration, the U.S. Air Force Office of Scientific Research, the National Institute of Mental Health, and the University of Washington. We thank the members of the 1988 Northwest American Everest Expedition for participating in the research (especially S. Ruoss for overseeing the data collection before Thomas O. Nelson arrived at basecamp), R. Schoene for valuable discussions, L. Nelson for technical assistance in preparing the stimulus materials, V. Dewey for collecting part of the Kathmandu data, and R. Kennedy and H. Roediger for comments on the manuscript. This article is dedicated to Charlie Schertz, who was one of the climber-subjects in this research and who died on March 27, 1990, in an avalanche on Manaslu. Correspondence concerning this article should be addressed to Thomas O. Nelson, Psychology (NI-25), University of Washington, Seattle, Washington 98195.

In contrast to turn-of-the-century psychology's narrow focus on "normal" human behavior, a major goal of modern psychology is to explain people's behavior in the full range of environments, including extreme situations that force people toward their psychological limits. Simultaneously, especially during the last decade, a strong interest has arisen concerning behavior in naturalistic situations (e.g., Cohen, 1989; Gruneberg, Morris, & Sykes, 1988; Neisser, 1982), including the use of laboratory-developed procedures in naturalistic environments (Cohen, 1989, p. 12). One environment that is both extreme and natural is the high-altitude mountain environment. Because of the unusual characteristics of that environment (e.g., low barometric pressure, danger, hypoxia), one longstanding reason for climbing mountains is to do scientific research (e.g., see Magie, 1963, pp. 70-75). The aforementioned characteristics are maximal at Mount Everest, which has the most extreme altitude and an unusually high degree of danger, with an average of two fatalities per expedition (as tallied by West, 1986).¹ Morrow remarked about Mount Everest, "For every two climbers to reach the summit, another has died in the attempt. And climbers continue to die at an alarming rate, even on the nontechnical routes" (1986, p. 63). This high fatality rate is well-known to climbers, and the concomitant psychological reaction to this potential danger is fear: "On a mountain, a man meets fear. The more he understands about what he is doing, the more fear he will experience. Because the abler he is then to recognise danger" (Seigneur, p. 121 of Messner, 1988). Even the most accomplished climbers will experience fear. For example, Reinhold Messner, the first person to reach the summit of Mount Everest without supplementary oxygen, remarked, "The muscles in my stomach tighten with fear.... Fear in the belly! It's as if the fear goes up and down inside me. From head to belly. Belly to head" (Messner, 1980). Although only limited degrees of human fear can be induced in laboratory situations (i.e., human-subject constraints disallow manipulations of extreme danger), the effects of life-threatening danger can sometimes be examined in naturalistic situations such as extreme-altitude mountaineering.²

¹ In terms of actual danger for the 1988 post-monsoon season during which we conducted our research, there were expeditions from several countries climbing on Mount Everest: the United States (our expedition), Korea, France, Spain, and a joint Czechoslovakia/New Zealand expedition. Our expedition and the Korean expedition had no fatalities; there were nine fatalities on the expeditions from the other countries.

² Extreme altitude (also called the "Death Zone") refers to altitudes above 18,000 feet (5,500 m). All of our extreme-altitude tests occurred between the altitudes of 6,500 and 7,100 m. (As points of reference, the summit of Mount Rainier is 4,392 m, and Mount McKinley, the highest mountain in North America, is 6,194 m.)

People's reactions to extreme altitude can produce errors in their performance. For instance, West (1985) found "clear evidence of brain impairment at Camp 2 [6,300 m]" during his physiological research at Mount Everest and concluded, "It is not surprising, therefore, that there are many examples of bad decisions being made at extreme altitudes, which indicate a befuddled state of mind" (p. 125). Tom Hornbein wrote that during his climb of the West Ridge of Mount Everest he was concerned about "whether I'd remember how to blow up my air mattress at 26,000 feet" (Hornbein, 1965, p. 32). But which specific aspects of memory and judgment are affected and which are not?

In terms of memory, previous research at Mount Everest demonstrated that the learning of new information becomes impaired (Townes, Hornbein, Schoene, Sarnquist, & Grant, 1984; related findings that are consistent with that demonstration have been reported by Kennedy, Dunlap, Banderet, Smith, & Houston, 1989; by Cavaletti, Moroni, Garavaglia, & Tredici, 1987; and by Oelz & Regard, 1988). However, Townes et al. pointed out that they had investigated only half of the memory situation and that future research should be done by "separating learning from the retrieval of information" (p. 35). Moreover, for many applied situations, retrieval may be even more important than learning (e.g., retrieval of previously learned first-aid information). One major goal of the present research was to explore the effects of extreme altitude on the retrieval of information from memory.

Another major goal was to explore the effects of extreme altitude on people's judgments about whether or not they could retrieve information from memory (e.g., feeling of knowing, FOK; Hart, 1965; reviewed in Nelson, 1988). These self-awareness judgments are part of metacognition, which refers to the monitoring and control of one's own cognitive processes (Zechmeister & Nyberg, 1982). In contrast to traditional memory processes, such as the learning of new information, which appears to be related to physiological activity in the hippocampus (Townes et al., 1984, p.
35), metacognitive processes may be more related to physiological activity in the frontal lobe (Janowsky, Shimamura, & Squire, 1989). Therefore, an independent variable might affect retrieval without affecting the person's metacognitions about that retrieval, or vice versa. In the climbing environment, Morrow emphasizes, "At the core of mountaineering is an ongoing process of evaluation. One tries to determine the location of the line between the skills one has and the risks one faces. The secret is to recognize the line and to know when to turn back. Too soon, and nothing is gained; too late, and you've been reckless and die" (1986, p. 79). This metacognitive evaluation may be difficult at extreme altitudes if, as Houston (1987) has suggested, "Lack of oxygen strikes the highest centers of the brain first—and thus judgment [and some other functions] are clouded early" (p. 124). One independent variable that can have qualitatively different effects on cognition versus metacognition is acute alcohol intoxication. Nelson, McSpadden, Fromme, and Marlatt (1986) found that acute alcohol intoxication (1 ml/kg) disrupts retrieval without affecting metacognitive judgments about that retrieval. Although mountaineers have reported anecdotal evidence that their judgment and self-monitoring are disrupted at altitude, no research has been conducted on that topic. Such research seems timely because of the hypothesis that "hypoxia resembles overindulgence in alcohol" (Houston, 1987, p. 176).

Finally, in addition to exploring the effects of being at high altitude, the present research explored the aftereffects of having been at high altitude. Townes et al. (1984) had shown that the disruptive effects of altitude on learning persisted even upon return to Kathmandu (i.e., the climbers' ability to learn new information had been impaired, and this impairment continued even after descending to low altitude; for additional confirmation about the aftereffects on acquisition because of having gone to extreme altitude, see Cavaletti et al., 1987, and Oelz & Regard, 1988). Accordingly, the participants in our research were examined both as they went to increasingly higher altitudes and when they returned to the lower altitudes from which they began.

Method

We used a modified version of the FACTRETRIEVAL2 test battery, which assesses both the retrieval of general-information facts from long-term memory and people's metacognitions about their retrieval (Wilkinson & Nelson, 1984). This is the same test battery that was used previously to assess the effects of acute alcohol intoxication (Nelson et al., 1986); we used a paper-and-pencil version of FACTRETRIEVAL2, and the climbers' responses were recorded by hand and on a tape recorder, instead of on a computer.

Subjects and Experimenters

The subjects, 9 men and 3 women, were the 12 climbers from the 1988 Northwest American Everest Expedition, all of whom lived in the United States, and all of whom expected to climb above 6,000 m on Mount Everest. Although their occupations were not mountaineering, they were highly experienced climbers (e.g., median number of years of climbing was 16 years). All of them had at least two years of college education, with the highest formal education being the M.D. degree for five, the J.D. degree for one, the M.A. degree for two, and the B.A. degree for three. At each testing location, every participant served as both a subject (being tested by another climber) and as an experimenter (testing another climber). Each climber was to be tested once at each of six locations (however, see next section). The testing of climbers was counterbalanced according to a prearranged schedule: (a) Each experimenter tested someone else on only those items that the experimenter himself/herself had already been tested on; (b) an attempt was made to test each climber by a different person at each location and to have all climbers serve approximately equally often as experimenters (namely, six times, once at each of the six locations). To accomplish this, each climber received an individualized schedule of who would test him or her at each location, which set of items would be administered, and backup plans in case the sought-after tester was unavailable due to being elsewhere on the mountain at that time.

Locations for Testing

The locations originally planned for testing were: (a) 48 hr after arrival in Kathmandu (elevation, 1,200 m), but before departure on the 2-week trek to basecamp; (b) 48 hr after arrival at basecamp (5,400 m); (c) 48 hr after arrival at Camp 2 (6,500 m); (d) a second time at 6,500 m or higher, at either Camp 2 or Camp 3 (7,100 m), near the time of summit attempts (approximately 2 weeks after Camp 2 had been established); (e) at basecamp, after having been at the highest point that a given climber attained during this expedition; and (f) at Kathmandu at the end of the expedition (approximately 1 week after returning from the high camps).

Three unplanned (but not unexpected for a Himalayan expedition) changes occurred that affected the final data matrix: First, 1 climber decided to remain at basecamp (therefore no high-altitude data were obtained from that climber, and accordingly he is not included in the final data matrix). Second, 3 of the remaining 11 climbers did not return to basecamp until the evening before the expedition departed basecamp (therefore they did not have time to administer/receive the second test at basecamp, and accordingly the second test at basecamp is not included in the final data matrix). Third, 1 of the 11 climbers who contributed high-altitude data returned to basecamp before going through the second high-altitude test. Thus, the final data matrix consisted of data from 54 testing sessions, that is, five sessions from each of 10 climbers and four sessions from the 11th climber.
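The tally of 54 sessions can be retraced with a few lines of arithmetic. The sketch below is only a bookkeeping aid; the variable names are illustrative and not part of the original materials.

```python
# Bookkeeping sketch for the final data matrix (variable names are illustrative).
planned_climbers = 12   # all expedition members were to be tested
planned_locations = 6   # locations (a) through (f)

# Change 1: one climber stayed at basecamp and was dropped entirely.
climbers_in_matrix = planned_climbers - 1

# Change 2: the second basecamp test was dropped for everyone.
locations_in_matrix = planned_locations - 1

# Change 3: one remaining climber missed the second high-altitude test.
missing_sessions = 1

final_sessions = climbers_in_matrix * locations_in_matrix - missing_sessions
print(final_sessions)
```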

Stimulus Materials, Training of Climbers, and Apparatus

The items were 238 of the 240 general-information questions contained in FACTRETRIEVAL2 (Wilkinson & Nelson, 1984). For instance, one question was "What is the capital of Finland?" (answer: Helsinki). This test battery seemed especially appropriate because previous research had already shown that performance in this task is steady-state across different subsets of these kinds of questions, that is, there are negligible learning-to-learn/practice effects (Bahrick & Hall, 1990; Nelson & Narens, 1990). The normative difficulty of the questions had been determined previously (Nelson & Narens, 1980), and they were arranged into seven subsets (34 items per subset), with each subset having approximately the same normative difficulty. Every stimulus card, laminated in waterproof plastic, showed one general-information question and eight randomly ordered recognition alternatives, printed in easily readable 24-point Geneva type. The order of the seven subsets varied across climbers such that a given climber was tested on a different subset of items during every testing session, and across climbers a given subset was used approximately equally often at each testing location. One subset was used for training the climbers in the procedure. During training, which occurred in Seattle shortly before the expedition left for Nepal, the first author trained each climber individually by running the climber through that subset in the role of subject and then by having the climber serve in the role of experimenter and run the first author through the same subset.
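One way to satisfy both constraints (a different subset for each climber at every session, and roughly equal use of each subset at each location) is a cyclic, Latin-square style rotation. The sketch below is an assumption for illustration only: the article does not say how the orders were generated, and the function names and the cyclic scheme are hypothetical.

```python
from collections import Counter

# Figures taken from the text; the rotation scheme itself is an assumption.
N_CLIMBERS = 12
N_LOCATIONS = 6
N_SUBSETS_TOTAL = 7
ITEMS_PER_SUBSET = 34
N_TEST_SUBSETS = N_SUBSETS_TOTAL - 1  # one subset was reserved for training


def cyclic_schedule(n_climbers, n_locations, n_subsets):
    """Return schedule[climber][location] -> 0-based subset index.

    Each climber starts at a different offset and steps through the
    subsets in order, so no climber repeats a subset across locations.
    """
    if n_locations > n_subsets:
        raise ValueError("need at least one subset per location to avoid repeats")
    return [
        [(climber + location) % n_subsets for location in range(n_locations)]
        for climber in range(n_climbers)
    ]


def usage_by_location(schedule):
    """Count how often each subset is used at each location."""
    n_locations = len(schedule[0])
    return [
        Counter(row[location] for row in schedule)
        for location in range(n_locations)
    ]


schedule = cyclic_schedule(N_CLIMBERS, N_LOCATIONS, N_TEST_SUBSETS)
usage = usage_by_location(schedule)
for location, counts in enumerate(usage):
    print(location, sorted(counts.items()))
```

With 12 climbers and six test subsets, this rotation uses every subset equally often at each planned location. A real schedule would also need to handle the backup plans and dropped sessions described above, which this sketch ignores.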
The major apparatus, which necessarily had to be minimal because of the Everest environment, consisted of a Nagra-brand tape recorder so as to provide both a backup to the climber's recording of the subject's responses (for purposes of assessing any possible recording errors by the experimenter)³ and a source for later scoring to determine the subject's response latencies (this use of the tapes was not mentioned before completion of the research). The only other apparatus, aside from pencils and the testing materials, was a blindfold. The blindfold was used during testing so that the subject could not receive any feedback concerning his or her answer to a given question (i.e., seeing the experimenter's facial reaction after guessing an answer might affect the subject's confidence judgment about the accuracy of that guess).

Procedure

Before testing. The experimenter put a "Do not disturb" sign outside the door of the hotel room (Kathmandu) or tent (all other testing locations). Then the experimenter removed the testing materials for that location (these were stored in waterproof plastic envelopes, all of which were kept in a waterproof canvas bag when not in use) and set up the tape recorder.


During testing. First, the experimenter gave the subject an "Instructions to Subject" card to review. Next, the subject read aloud in his or her normal speaking voice a one-page speech sample, whose primary purpose for the present research was to get the subject settled in the testing environment. Then the memory experiment began. The subject put on a blindfold (this remained on during the recall phase and the FOK phase to eliminate all visual feedback from the experimenter that otherwise might have affected the subject's judgments of confidence or FOK judgments). After shuffling the deck of 34 memory cards, the experimenter slowly read aloud the question on the first card, and the subject took as long as he or she wanted to try to recall the answer to the question. The subject was encouraged to guess whenever possible, and to take as much time as necessary to try to think of an answer. When the subject made a guess, the experimenter wrote it on the data sheet (or wrote "right" or wrote "no guess" if the subject could think of no guess at all). Immediately after making a guess, and without any feedback from the experimenter, the subject made a confidence judgment about whether his or her previous guess was correct. (Note that these confidence-judgment data will be reported elsewhere and will not be mentioned again here.) This procedure was repeated for each of the 34 questions. Following the recall phase, the experimenter assembled the deck of stimulus cards for the FOK phase, which contained only those questions that the subject had not answered correctly. The resulting deck was shuffled briefly, the experimenter read a question aloud, and the subject rated his or her subjective likelihood of recognizing the correct answer to the question if shown a pool of eight plausible answers.
The subject said 1 if such recognition would be completely by chance, 6 if such recognition were certain to be correct, or 2, 3, 4, or 5 to reflect in-between states of confidence about the likelihood of subsequent recognition. After the subject had made FOK judgments on all of the previously nonrecalled items, the experimenter shuffled that deck of cards again, and the subject made another FOK judgment on each of those items. For each item, the subject was asked to concentrate on only the current FOK judgment. After the FOK phase, the subject was told to remove the blindfold. Finally, the subject had an eight-alternative forced-choice recognition test on each previously nonrecalled item. The experimenter shuffled the deck of nonrecalled items and passed them, one at a time, to the subject. The subject read

³ An a priori concern (e.g., Houston, 1988, p. 366) was that not only might a climber who is serving as the subject be affected by altitude, but so might a climber who is serving as the experimenter, such that the experimenter might make frequent errors on the data sheets. The tape recordings of the testing sessions allowed us to assess this potential problem. We listened to each tape recording and compared the subject's spoken responses with what the experimenter wrote on the data sheets. Of the more than 5,800 responses recorded by the climbers in their role as the experimenter, only 22 experimenter recording errors occurred, that is, an error rate of less than 0.4%. (Note that 8 of these 22 errors occurred during their first test sessions at Kathmandu.) Thus the climbers were remarkably accurate (99.6%) in their role as the experimenter, and the problem of frequent experimenter errors did not occur.
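The error-rate figures in this footnote can be checked directly. The sketch below assumes exactly 5,800 responses; because the text says "more than 5,800", the computed rate is an upper bound on the true rate. The variable names are illustrative.

```python
# Check of the experimenter error rate (5,800 is a lower bound on the
# response count, so the rate computed here is an upper bound).
recording_errors = 22
responses_lower_bound = 5800

error_rate_upper_bound = recording_errors / responses_lower_bound
accuracy_lower_bound = 1 - error_rate_upper_bound

print(f"error rate <= {error_rate_upper_bound:.2%}")
print(f"accuracy   >= {accuracy_lower_bound:.1%}")
```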

aloud the item number (so that the experimenter could record the subject's response at the correct location on the data sheet) and then read aloud the response alternative that he or she believed was most likely to be correct for that question. The subject was told in advance that the correct answer would never be the one that he or she gave during recall, because the only items being tested for recognition were those that the subject had not correctly recalled the answers to. Upon completion of the recognition test, the subject went through two short visual-perception experiments (unrelated to the present experiment and not discussed further here), which ended the session.

Results

Unless otherwise mentioned, all differences that are reliable have p < .05, and all differences that are unreliable have p > .10.

Retrieval

Three measures of retrieval were obtained: percent correct recall of answers to general-information questions, latency of correct recall, and percent correct recognition of nonrecalled answers. Each is discussed in turn, and the data are summarized in dot charts (Cleveland & McGill, 1985), with a zero-slope line of best fit shown wherever the null hypothesis cannot be reliably rejected.

Percentage correct recall. As indicated in the left panel of Figure 1, there were no reliable⁴ changes in the climbers' mean percent correct recall across the five testing locations, F(4, 36) = .36, MSe = 95.45.

Latency of correct recall. The middle panel of Figure 1 shows the means of the individual climbers' median latency of correct recall, which is sometimes regarded as a more sensitive measure of retrieval than the aforementioned accuracy measure (see MacLeod & Nelson, 1984). This measure did not differ reliably across locations, F(4, 24) = .99, MSe = 1.57 (similarly, Friedman chi-square = 2.6, p = .63).

Percentage correct recognition on nonrecalled items. The final measure of retrieval is shown in the right panel of Figure 1. There were no reliable changes in the climbers' mean percent correct recognition on nonrecalled items across the five testing locations, F(4, 36) = 1.24, MSe = 161.42. This measure is somewhat noisier than the recall measure in the left panel of Figure 1, probably because of (1) recognition performance being based only on approximately half as many items (namely, the incorrectly recalled items) and (2) guessing during the recognition test.

[Figure 1 (image not reproduced): three dot-chart panels with one row per test location (1st Kathmandu, Basecamp, 1st 6400+ m, 2nd 6400+ m, 2nd Kathmandu). Horizontal axes: percent correct recall (0 to 100), median latency of correct recall (sec), and percent correct recognition (0 to 100).]

Figure 1. Retrieval as a function of test location. (Panel A shows percent correct recall, Panel B shows latency of correct recall, and Panel C shows percent correct recognition for nonrecalled items. There are no reliable effects of altitude on any of these three aspects of retrieval. Vertical lines are least-squares zero-slope lines of best fit.)

⁴ We estimated the number of subjects that would be required for statistical reliability (p < .05) for the largest difference between one of the first two tests and one of the two extreme-altitude tests. Given the observed difference between the means (i.e., Basecamp vs. 1st 6400+ m in Figure 1) and the observed standard deviations, the estimated number of subjects is 68 subjects for the within-subjects design (Loftus & Loftus, 1988, pp. 293-294) that we used and 195 subjects per group for a between-subjects design (Hays, 1973, eq. 10.13.1 and 10.15.2). Notice, however, that the direction of such a difference is opposite to what would be expected if extreme altitude hinders recall.

Feeling of Knowing

The structure underlying the climbers' FOK appears to be quite stable, as indicated by the remarkably high retest reliability on those judgments: The mean Goodman-Kruskal gamma correlations (see Nelson, 1984) between the first set of FOK judgments and the second set of FOK judgments on the same items were +.96, +.95, +.98, +.98, and +.89, respectively, for the locations ordered as in Figure 2. The magnitude of these correlations does not change reliably with altitude, F(4, 36) = .99, MSe = .01, which disconfirms the a priori possibility that climbers would be less reliable in their judgments when at extreme altitude than at the lower altitudes (if anything, the trend is for retest reliability to increase at altitude). Due to this near-perfect reliability, henceforth only the first set of FOK judgments will be used when referring to FOK.

To determine the effect of altitude on FOK per se, we computed the median FOK for each subject at each location. Figure 2 shows the mean (across subjects) of the individual subjects' medians at each location. As indicated by the lines of best fit, three orthogonal comparisons (MSe = .485) yielded the following conclusions about the effects of altitude on FOK: (a) There was no reliable difference between the first Kathmandu test and the basecamp test, F(1, 36) < 1; (b) there was no reliable difference among the last three tests, F(2, 36) < 1; and (c) FOK declined reliably between the first two tests and the last three tests (i.e., was lower during the two high-altitude tests and the subsequent Kathmandu test), F(1, 36) = 16.84. The effect of altitude on FOK was also confirmed nonparametrically by an omnibus Friedman test, chi-square = 14.26, p
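For readers unfamiliar with the retest-reliability statistic used above, the following is a minimal sketch of the Goodman-Kruskal gamma for two sets of ordinal ratings on the same items. It uses the standard definition (concordant minus discordant pairs, divided by their sum, with tied pairs ignored). The function name and the example ratings are hypothetical and are not data from the study.

```python
from itertools import combinations


def goodman_kruskal_gamma(first, second):
    """Gamma = (C - D) / (C + D) over all item pairs.

    A pair of items is concordant (C) if both rating sets order the two
    items the same way, discordant (D) if they order them oppositely.
    Pairs tied on either rating set are ignored.
    """
    if len(first) != len(second):
        raise ValueError("both rating sets must cover the same items")
    concordant = discordant = 0
    for (a1, b1), (a2, b2) in combinations(zip(first, second), 2):
        product = (a1 - a2) * (b1 - b2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical FOK ratings (1-6 scale) for six nonrecalled items,
# given twice by the same subject. These are made-up numbers.
fok_first = [1, 2, 2, 4, 6, 3]
fok_second = [1, 2, 3, 4, 6, 2]
gamma = goodman_kruskal_gamma(fok_first, fok_second)
print(round(gamma, 3))
```

In the study, a gamma of this kind would be computed per subject at each location and then averaged across subjects, which is how the mean values reported above should be read.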
