Nippon Foundation Library (Electronic Library)  Conference Proceedings Vol. I, II, III

AN APPROACH TO MEASURING SITUATIONAL AWARENESS IN OPERATIONAL CONTEXTS

Capt Anthony Patterson (Memorial University of Newfoundland, Canada)

Prof. Scott MacKinnon (Memorial University of Newfoundland, Canada)

Abstract: An accurate assessment of the current situation is a vital component in the management of ship operations. Flawed situational awareness leads to incorrect decisions and ultimately to incorrect actions. Researchers studying situational awareness, and its various components, are confronted with the task of developing methods that adequately measure situational awareness without altering the performance of the test subjects. In this paper, the authors describe the various methods that have been used to measure situational awareness, and propose a methodology to be used during simulation studies. The benefits of developing robust methods of measuring situational awareness have applications in both research and training.

1. INTRODUCTION

Human error has a profound impact on shipping. With estimates of its contribution ranging from 58% to 90%, human error is the dominant cause of accidents [1]. Accidents almost always lead to economic loss and, tragically, occasionally lead to loss of life and environmental damage. From the point of view of the ship owner, the economic losses from a shipping accident are incurred through repair bills, lost time, liability damages, and increased insurance premiums. Insurance premiums, for example, are expected (at the time of writing) to see significant increases of 50% or more, accompanied by a reduction in the scope of insurance coverage [2].

From the point of view of society, the economic losses associated with shipping accidents can include loss of livelihood for coastal communities (for instance, fishing and tourism revenues lost as a result of oil spills) and increased costs associated with more stringent inspection and regulatory regimes.

While there are many components to human error, the main elements are organizational and cognitive failures. Organizational failures lead to conditions in which errors are inevitable (or 'accidents waiting to happen') [3]. These types of failures are linked to the decisions made by managers and regulators regarding the controls (or lack thereof) that are imposed on a system to eliminate errors. The current debate on inadequate crew size is an example of an organizational error. The second main type of error is cognitive failure, which leads to incorrect decisions and/or inappropriate actions [4]. Cognitive failure has a variety of possible sources, of which breakdowns in situational awareness are considered by the authors to be a key component.

To illustrate our point consider the events surrounding the loss of the Tricolor and the subsequent collisions with its wreck in the Dover Strait. At the time of writing, official reports detailing the causes of the incidents had not been released. The details given in the illustrating case are derived from press accounts of interviews conducted at the time of the incidents, and are subject to revision with the release of the accident investigation report(s).

The Tricolor, a Norwegian registered ro-ro car carrier, collided with the Bahamian registered container ship Kariba in the Dover Strait on December 14, 2002 at approximately 0125 UTC [5]. The visibility at the time was reduced in heavy fog. Both vessels were travelling in the same direction, and for reasons not yet determined, the Kariba altered course to starboard and hit the Tricolor on her port side [6]. The crew of 24 abandoned the Tricolor which sank within 1 hour and 20 minutes, taking with her 2,862 luxury cars.

Shortly after the accident, Antoine Goulley, of France's Maritime Prefecture, said there had "clearly been a radar mistake, on one of the vessels." [6] The Bahamian authorities, in an interview almost a month after the accident had occurred said "that the accident appear[ed] rather to have been the result of an error of judgement on the part of the master of the other ship involved in the collision, the Bahamas-flagged container ship Kariba." [7] The Bahamian authorities indicated that fatigue may have been a factor in the incident.

The incident with the Tricolor did not end with the initial collision. Just over 2 days later, Den Helder MRCC reported that the general cargo vessel Nicola (2998 GRT, built 2000) struck the Tricolor at 0338 UTC, December 16, 2002 [8]. The Nicola was not seriously damaged, and was refloated. An interview with the manager of the Dover Coastguard station indicated that the Nicola should have been able to receive at least 14 warning broadcasts from British and French radio stations prior to grounding [9].

A third striking of the Tricolor was reported by Dover MRCC when the Turkish combination carrier Vicky (43387 GRT, built 1981) hit the Tricolor wreck at about 1900 UTC January 1, 2003 while on voyage from Antwerp to New York [10]. The Vicky was loaded with a cargo of 70,000 tonnes of kerosene. The Vicky struck the Tricolor despite warnings from French patrol vessels and five beacons placed around the wreck [11]. In an interview following the incident, the master of the Vicky is reported to have said: "I had no idea it was there... I saw some light buoys but I didn't understand what it meant or why they were there." [12]

The Tricolor incidents did not come as a surprise to maritime experts. In a debate in the British House of Lords, Lord Greenway stated [13]:

"While seafarers today are very much au fait with radar and electronic gismos, they are woefully lacking when it comes to mental arithmetic and interpreting what they see on their screens in relation to what is going on outside the window, if indeed, they look out the window."

An interview with a Channel Deep Sea Pilot by BBC news following the Vicky incident produced the following statement [14]:

"In the years that I've been negotiating the Dover Straits, the standard of watchkeeping has plummeted. On a run I did last month, I witnessed three near collisions because watchkeepers failed to keep a look out."

What happened that the Kariba, Nicola, and Vicky would all have the dubious distinction of colliding with the same ship? The crews of the Nicola and the Vicky were unaware that their vessels were standing into danger, even though information was available to indicate the presence of a wreck in their path. In this incident, the only losses were economic. In the winter of 1971, a similar situation involving the Paracas, Texaco Caribbean, Brandenburg and Niki resulted in heavy loss of life [15].

While a complete answer will have to await the results of the formal accident investigations, it would seem that there was a breakdown in situational awareness (for reasons unknown at this time) on board the vessels involved.

2. SITUATIONAL AWARENESS

Of interest to both the commercial and military naval sectors is how an operator maintains situational awareness (SA). There have been several attempts to define and operationalize the concept of SA [16-18]. The model of SA proposed by Endsley [19, 20] has attracted the greatest attention from researchers in the field. Her definition of SA includes "the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future" [19]. The main elements of this hierarchical model are represented by three levels: perception, integration and prediction. In this model, simply attending to a disturbance/signal is not sufficient; an operator has to determine its operational relevance and implications for the future [16]. Sarter and Woods [17] refer to this as cognitive science, the process of developing mental models to gain a better understanding of how something works.

Figure 1 represents a simplified operational model of how SA relates to the decision making process. For the purpose of this paper, constructs relating to mental models and short and long-term memory, among others, have been omitted from this model. This, however, is not intended to de-emphasize their importance in an operational context.

Fig. 1 Simplified model of the relationship between situational awareness and decision execution (after Endsley, 1995a).

2.1 Measuring Situational Awareness

2.1.1 Physiological Measures

Historically, measurements of SA have focused on physiological measures of attention and arousal. These include ocular tracking devices, electroencephalographic (EEG) measurements, as well as heart rate, blood pressure and hormone measurements [20]. Ocular tracking devices and electroencephalographic measurements generally indicate that a stimulus has been detected. However, they do not confirm whether the operator has identified the importance of the event or whether this detection initiates higher-level cognitive processing. Likewise, other physiological techniques that monitor the state of arousal of an operator are simply response variables, and it is difficult to associate these changes with a specific source. While increases in heart rate, blood pressure and catecholamine levels may be due to operational pressures, other stressors such as hot or cold environments, high motion states or dietary responses (e.g. caffeine or nicotine intake) may equally cause changes in these measures. Consequently, researchers have redirected their focus to performance-based measures to assess an operator's level of SA.

2.1.2 Performance Measures

Situational awareness is effectively an internal cognitive process that relates a time-history of events beginning with information received by perceptual resources and ending with a projection of future events. Higher order cognitive processes that have been influenced by memory processes, experiences, training and abilities, as well as command goals, objectives and expectations can influence this time-history. Thus, measurement of SA is difficult.

External task measures involve "artificially" changing the visual display by altering information at some point in the time-history and then measuring the time that elapses before an operator recognizes the change(s) in the situation [17, 20]. This approach may be too disturbing for the operator, since it effectively changes his/her SA and ultimately interrupts the decision making process.

Imbedded task measures seem to remove some of the intrusiveness of the previous technique, since they involve choosing certain tasks of the operator and determining how well the operator performs these tasks throughout a simulation session [20]. However, the major drawback of this approach is that it does not take into account that "a system that provides SA on one element may simultaneously reduce SA on another, unmeasured element" [20]. Furthermore, if the subject in the experiment figures out the purpose of the study, he/she may focus attention on a specific task [20].

While a significant body of literature has employed these kinds of performance measures, they are of limited utility for simulation training. Thus, researchers cannot rely solely on performance measures to determine an operator's SA.

2.1.3 Subjective Techniques

Subjective techniques, such as self-rating, may give researchers an understanding of how an operator views and defines his/her own SA. This method is very easy to implement and is more cost-effective than other approaches. Most often during self-rating trials, the outcome of the simulation session determines how an operator rates his/her SA. A successful outcome will usually produce a positive rating of the operator's awareness (even if the operator was not particularly aware), while an unsuccessful outcome will usually, regardless of the actual awareness level, receive a negative rating [20]. Self-ratings therefore most likely reflect a subject's confidence rather than SA.

Observer-rating techniques involve independent and knowledgeable observers who rate the performance of the operator during a simulation. The observers give their opinion on the operator's SA by evaluating the operator's actions and verbal statements. However, the observer can offer little insight into how the operator conceives of the situation or into the internal mental processes taking place within the operator [20]. This technique is good for diagnosing elements of operational procedures, for instance, how many times an operator sampled certain information in the environment. However, an operator may perceive and process certain events without verbalizing or acting out the process, and therefore it may seem to an observer that the operator missed that environmental information. An attempt to improve this technique comes from Sarter and Woods [17], who suggest the involvement of another crewmember in a simulation experiment to encourage verbalization of the variables sampled from the environment. However, this could also be a confounding approach, as the additional crewmember may direct the subject's attentional resources, thus altering the history of the SA [20].

Complementary to the subjective measures, questionnaires offer a quasi-objective measurement of an operator's SA. Detailed information can be extracted from the operator and then evaluated against the actual simulation exercise performed. Three questionnaire approaches are considered: posttest, on-line and freeze technique.

The posttest questionnaire is administered after the simulation exercise is finished. This allows the operator to provide as much information and feedback as needed regarding their performance during the simulation. Endsley [20] states that people lack the ability to accurately recall, with great detail, past events. She states that "there is a tendency to over generalize and over rationalize" under these circumstances. The time between the beginning of the simulation and the administering of the questionnaire might be too great and an operator may forget about important events that occurred in the simulation exercise [20]. Therefore, this technique loses its reliability the longer the researcher waits to administer it after the simulation.

To overcome the time factor in the previous technique, one could employ the on-line approach. This questionnaire is conducted while an operator is involved in the simulation [20]. However, there are elements of this method that could compromise the validity of the data collected. For example, the operator has to answer questions while maintaining other tasks, thus increasing the workload. Furthermore, specific questions could re-focus an operator's attention. This would not only alter the operator's level of SA, but would likely artificially improve it.

A third questionnaire technique, the freeze technique, could eliminate the problems of the posttest and on-line approaches. This method involves freezing the simulator at random points in time to conduct the questioning [20]. As the simulation is halted, all displays and feedback mechanisms become unavailable to the operator. The operator is then asked for various status reports. Sarter and Woods [17] are critical of this method because they feel that this is an intrusive procedure that changes the natural flow of the simulation and, therefore, does not effectively determine the operator's awareness level.
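The freeze procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the warm-up period, the whole-second resolution, the query names and both function names are hypothetical and do not come from Endsley [20].

```python
import random


def freeze_times(duration_s, n_freezes, warmup_s=300, seed=None):
    """Pick n_freezes distinct random freeze times (whole seconds), sorted.

    Assumption: no freeze is scheduled during the first warmup_s seconds,
    so the operator has time to build a picture of the situation.
    """
    if n_freezes > duration_s - warmup_s:
        raise ValueError("not enough simulation time for that many freezes")
    rng = random.Random(seed)
    return sorted(rng.sample(range(warmup_s, duration_s), n_freezes))


def score_freeze(answers, ground_truth):
    """Fraction of status-report queries answered correctly at one freeze."""
    if not ground_truth:
        return 0.0
    correct = sum(
        1 for query, truth in ground_truth.items() if answers.get(query) == truth
    )
    return correct / len(ground_truth)


# Hypothetical one-hour exercise with four freezes.
times = freeze_times(duration_s=3600, n_freezes=4, seed=1)

# Hypothetical status report collected at one freeze, compared with the
# simulator state at that instant.
result = score_freeze(
    answers={"nearest_target": "B", "own_speed_kn": 12},
    ground_truth={"nearest_target": "B", "own_speed_kn": 14, "targets_in_view": 3},
)
print(times, result)
```

Fixing the seed makes the freeze schedule repeatable across subjects, which may or may not be desirable depending on the study design.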

3. PROPOSED APPROACH TO QUANTIFYING SITUATIONAL AWARENESS

3.1 Introduction

It does not seem possible to use one methodological approach to measure SA. Thus, combining various metrics, taking into consideration the operational context (e.g., aviation, industrial, maritime) and the type of task (e.g., navigation, supervisory), seems to be a reasonable way forward. Measurements which assess outcomes of the decision process are not necessarily measures of SA and should be avoided. Outcomes are related to many input factors, including but not limited to SA.

The authors believe that imbedded task measures, post-task questionnaires and observer ratings can be employed successfully to measure SA for maritime navigation tasks.

3.2 Developing A Metric

3.2.1 Imbedded Task Measures

The watch handover is an example of an imbedded task measure that avoids many of the shortcomings noted by Endsley [20]. Because watch handovers are self-directed and have, as their goal, the transfer of SA from one person to another, the subjects will not be cued to unduly focus their attention on a specific task or tasks. The greatest advantage of assessing a watch handover is that the out-going watchkeeper is verbalizing his or her SA, which can be observed by an external party. Secondly, the SA of the two watchkeepers (outgoing and incoming) can be compared to assess the quality, characteristics and common elements of the transfer of SA.

Assignment of a numerical value to the SA observed during a change of the watch would most likely be done through the use of a checklist. The checklist would include items that relate to perception (e.g. target locations), interpretation (e.g. CPA and time to CPA), and projection (e.g. expected maneuvers). A value would be assigned for each element on the checklist, and the overall assessment of SA would be the sum of the scores. A separate checklist with possibly different items would be used for the incoming watchkeeper.
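As a minimal sketch of the checklist scoring described above, the following Python fragment sums per-item scores and also keeps a subtotal for each SA level. The item wording, the 0 to 2 scoring range and the names ChecklistItem and score_handover are illustrative assumptions, not part of the proposed methodology.

```python
from dataclasses import dataclass

# The three SA levels named in the text.
LEVELS = ("perception", "interpretation", "projection")


@dataclass(frozen=True)
class ChecklistItem:
    level: str        # one of LEVELS
    description: str  # hypothetical item wording
    score: int        # value assigned by the assessor (assumed range 0-2)


def score_handover(items):
    """Return (total, per-level subtotals) for one watchkeeper's checklist."""
    subtotals = {level: 0 for level in LEVELS}
    for item in items:
        if item.level not in subtotals:
            raise ValueError(f"unknown SA level: {item.level}")
        subtotals[item.level] += item.score
    return sum(subtotals.values()), subtotals


# Hypothetical checklist for an outgoing watchkeeper.
outgoing = [
    ChecklistItem("perception", "target locations reported", 2),
    ChecklistItem("interpretation", "CPA and time to CPA stated", 1),
    ChecklistItem("projection", "expected maneuvers described", 2),
]
total, by_level = score_handover(outgoing)
print(total, by_level)
```

Keeping the per-level subtotals alongside the sum makes it possible to see whether a low score stems from perception, interpretation or projection.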

3.2.2 Post-Test Questionnaire

A variation of a post-test questionnaire can be employed following the completion of the watch handover. A navigation task requires developing a mental picture of the situation that is unfolding. Asking the out-going watchkeeper to literally draw a picture of the operational status at the time of the handover could be used as a substitute for the more traditional verbally-based questionnaires. The pictorial representation can then be compared to both known conditions and other perceptions of the same situation (e.g. that of the incoming watchkeeper).

The numerical value of SA associated with the pictorial technique would be determined by comparing the mental model with the actual values. For example, were all the traffic vessels placed in the correct relative positions with the correct courses and speeds; was the vessel's proximity to navigation hazards correctly assessed; and how many future events were correctly identified? The full list of measures would be dependent upon the specific circumstances of the study.
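One way the comparison between the drawn picture and the actual values might be scored is sketched below in Python. The tolerances, the four attributes per target and the names angle_diff and score_picture are assumptions made for illustration; a real study would choose its own measures, as noted above.

```python
# Assumed tolerances for counting a drawn value as "correct" (hypothetical).
BEARING_TOL_DEG = 10.0
RANGE_TOL_NM = 1.0
COURSE_TOL_DEG = 15.0
SPEED_TOL_KN = 3.0


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def score_picture(drawn, actual):
    """Fraction of target attributes the watchkeeper placed within tolerance.

    Both arguments map a target name to a dict with the keys 'bearing'
    (deg), 'range' (nm), 'course' (deg) and 'speed' (kn). A target missing
    from the drawing scores zero on all four attributes.
    """
    possible = 4 * len(actual)
    if possible == 0:
        return 0.0
    correct = 0
    for name, truth in actual.items():
        sketch = drawn.get(name)
        if sketch is None:
            continue
        correct += int(angle_diff(sketch["bearing"], truth["bearing"]) <= BEARING_TOL_DEG)
        correct += int(abs(sketch["range"] - truth["range"]) <= RANGE_TOL_NM)
        correct += int(angle_diff(sketch["course"], truth["course"]) <= COURSE_TOL_DEG)
        correct += int(abs(sketch["speed"] - truth["speed"]) <= SPEED_TOL_KN)
    return correct / possible


# Hypothetical traffic picture: two actual targets, one of them drawn.
actual = {
    "A": {"bearing": 45.0, "range": 4.0, "course": 270.0, "speed": 12.0},
    "B": {"bearing": 350.0, "range": 6.0, "course": 180.0, "speed": 8.0},
}
drawn = {
    "A": {"bearing": 50.0, "range": 4.5, "course": 300.0, "speed": 11.0},
}
picture_score = score_picture(drawn, actual)
print(picture_score)
```

The same function could be applied to the incoming watchkeeper's drawing, so that the two pictures are scored against the same known conditions.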

Another variation of the post-test questionnaire approach would be to ask the watchkeepers to list their goals (e.g. collision avoidance, navigation, administrative) in order of priority. The purpose of measuring these goal profiles is to assess the level of congruence between the watchkeepers and to elucidate the perception filters being employed to update the SA.
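The text does not name a measure of congruence between goal profiles. One candidate, assumed here purely for illustration, is the Spearman rank correlation between the two priority orderings, which is 1 for identical orderings and -1 for fully reversed ones. The function name goal_congruence is hypothetical.

```python
def goal_congruence(ranking_a, ranking_b):
    """Spearman rank correlation between two priority orderings.

    Each argument lists the same goals, highest priority first, with no
    ties and no repeats.
    """
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two goals to compare orderings")
    if len(set(ranking_a)) != n or len(ranking_b) != n or set(ranking_a) != set(ranking_b):
        raise ValueError("both rankings must order the same distinct goals")
    position_b = {goal: rank for rank, goal in enumerate(ranking_b)}
    d_squared = sum((rank - position_b[goal]) ** 2 for rank, goal in enumerate(ranking_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical goal profiles for the two watchkeepers.
outgoing_goals = ["collision avoidance", "navigation", "administrative"]
incoming_goals = ["navigation", "collision avoidance", "administrative"]
congruence = goal_congruence(outgoing_goals, incoming_goals)
print(congruence)
```

With only a handful of goals the coefficient takes few distinct values, so it is better read as a coarse indicator than as a precise measurement.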

3.2.3 Observer Ratings

Finally, observer ratings are commonly used to assess SA in the training context. Instructors and assessors, for instance, observe students during simulation and make subjective assessments of the level of SA being achieved in order to evaluate the candidates' performance. This method, although useful, cannot be the sole measure of SA and needs to be combined with the measures suggested above.

It is relatively simple to construct a numeric rating system whereby values are assigned to subjective impressions of SA. For example, a scale from 0 to 5 could be used, where a value of 5 is awarded for "excellent" SA and 0 is awarded for no SA. To improve the effectiveness of the rating system, a descriptive statement corresponding to each verbal tag would need to be constructed (e.g. "excellent = evaluates all incoming data and demonstrates a complete understanding of all possible outcomes").
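A rating scale of this kind could be represented as a simple lookup table, sketched below in Python. Only the "excellent" descriptor is taken from the text; the other verbal tags, their placeholder descriptors, the averaging across observers and the name observer_score are assumptions.

```python
from statistics import mean

# Verbal tags and descriptors for ratings 0 to 5. Only the entry for 5 comes
# from the text; every other tag and descriptor is a hypothetical placeholder.
SCALE = {
    5: ("excellent", "evaluates all incoming data and demonstrates a "
                     "complete understanding of all possible outcomes"),
    4: ("good", "placeholder descriptor"),
    3: ("adequate", "placeholder descriptor"),
    2: ("marginal", "placeholder descriptor"),
    1: ("poor", "placeholder descriptor"),
    0: ("none", "no SA"),
}


def observer_score(ratings):
    """Mean rating across observers, after checking each value is on the scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in SCALE:
            raise ValueError(f"rating {rating!r} is not on the scale")
    return mean(ratings)


# Hypothetical ratings from three observers of the same exercise.
panel = [4, 5, 3]
panel_score = observer_score(panel)
print(panel_score, SCALE[5][0])
```

Averaging is only one option; a study concerned with disagreement between observers might report the spread of ratings as well.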

3.2.4 Combining the Metrics

While using a variety of metrics may be useful to measure elements of SA and to determine qualitative performance (such as good, average or poor SA), it is essential that the metrics be combined to produce a robust quantitative measure of SA. While the precise method of combining the metrics is subject to further investigation, it is envisioned that they would be combined through a table of values. In its simplest form, the table would simply sum the raw scores multiplied by a weighting factor.

Subject	Metric 1	Metric 2	...	Metric n	SA Score
A	m1·w1	m2·w2	...	mn·wn	Σ(mi·wi)

(where mi is the numerical value of the i-th SA measurement and wi is its weighting factor)
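The weighted sum in the table can be sketched in Python as follows. The metric names, raw scores and weights are invented for illustration, and the function name combined_sa_score is hypothetical; the text leaves the precise method of combination open to further investigation.

```python
def combined_sa_score(measurements, weights):
    """Sum of each raw SA measurement multiplied by its weighting factor."""
    if measurements.keys() != weights.keys():
        raise ValueError("every metric needs exactly one weight")
    return sum(measurements[name] * weights[name] for name in measurements)


# Hypothetical raw scores for subject A from three metrics.
measurements = {
    "handover_checklist": 5,   # sum of checklist item scores
    "pictorial": 0.375,        # fraction of picture attributes correct
    "observer_rating": 4,      # mean observer rating on a 0-5 scale
}

# Hypothetical weights. Because the metrics are on different scales, the
# weights here also serve to bring them onto a comparable footing.
weights = {
    "handover_checklist": 1.0,
    "pictorial": 8.0,
    "observer_rating": 0.5,
}

sa_score = combined_sa_score(measurements, weights)
print(sa_score)
```

Keying both tables by metric name, rather than by position, guards against a weight being applied to the wrong metric when the set of metrics changes between studies.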

Once a quantitative measure can be established, then mathematical methods can be used to describe SA under a variety of conditions. For example, the Yerkes-Dodson law predicts that increasing workload can have an impact on performance. As scenarios become increasingly complex, workload increases to the point where perception tools such as attention focussing must be used to cope with the increasing demands. By measuring SA under a variety of workload conditions, the impacts of workload on SA can be investigated. Models could be created to predict under what operational conditions SA breaks down due to workload factors. Such investigations may ultimately provide clues as to why incidents such as those involving the Tricolor occur, and what can be done to reduce the risk of recurrence.

Even more importantly, a numerical measure of SA can be used to determine the correlation between SA and performance. A simple linear relationship between SA and performance does not seem to be a satisfactory model. For instance, while it might be said that moderate SA leads to moderate performance, it cannot be said that perfect SA leads to perfect performance. Alternative functions, such as a logarithmic function as shown in Fig. 2, may better describe the relationship between SA and performance. Establishing the nature of the relationship between SA and performance would help to quantify the expected incremental change from systems or processes that are intended to improve a mariner's SA.

Fig. 2 Theoretical relationship between situational awareness and performance.
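The text proposes a logarithmic relationship without giving its form. The Python sketch below shows one possible curve with diminishing returns, in which performance at perfect SA stays below perfect performance. The functional form, the curvature parameter k, the ceiling p_max and the normalization of SA to the range 0 to 1 are all assumptions made for illustration.

```python
import math


def expected_performance(sa, k=9.0, p_max=0.9):
    """Hypothetical logarithmic SA-to-performance curve.

    sa    : normalized SA score in [0, 1]
    k     : curvature parameter (assumed); larger k saturates sooner
    p_max : performance reached at perfect SA (assumed to be below 1)
    """
    if not 0.0 <= sa <= 1.0:
        raise ValueError("sa must be normalized to the range [0, 1]")
    return p_max * math.log1p(k * sa) / math.log1p(k)


# Tabulate the curve at a few SA levels.
for sa in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"SA={sa:.2f} -> performance={expected_performance(sa):.3f}")
```

Under this form, a given improvement in SA yields a larger performance gain at low SA than at high SA, which is the kind of incremental change the paragraph above suggests quantifying.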

4. APPLICATION TO FUTURE RESEARCH

Over the past number of years, bridge systems have progressively decreased the apparent need for human supervisory controllers on the bridge. While these changes have the potential to support overall bridge operations, these levels of automation can dramatically increase command and control complexity and amplify the mental workload and cognitive stress of the officer on watch. Such relationships have been observed in the aviation industry [21], and the potential for a similar problem to arise in the marine industry has been noted [22]. Mechanisms thought to improve bridge performance can, in fact, lead to the proliferation of operator errors. In the context of aviation, pilots must know more about how these systems work, allocate greater attentional resources and be trained better to function within more complex operational contexts [21, 23]. Research must be conducted in maritime environments to examine the ergonomic suitability of these automated systems and the effectiveness of emerging bridge management strategies. Future research attention must focus upon how operators obtain and maintain SA and organize data provided by automated systems. It remains to be confirmed that automated bridge environments actually improve the decision-making process.

REFERENCES

[1] AH Patterson "Simulation and Modeling in Innovation" Presented at the 29th Annual General Meeting of the International Marine Simulator Forum, Keelung, Taiwan, pp. 23 - 27, 2002

[2] J Brewer "Owners brace themselves for record premium hikes" Lloyd's List, Thursday January 2, 2003

[3] RG Bea, KH Roberts "Human and Organization Factors (HOF) in Design, Construction, and Operation of Offshore Platforms (OTC 7738)" Presented at the 27th Annual OTC in Houston, Texas, pp. 1-4, 1995

[4] HJA Zieverink et al. "MASSTER - Improving Human Error Control in Maritime Simulator-based training" Wageningen, The Netherlands : Maritime Simulation Centre, 1997

[5] "WorldWatch - Marine - Kariba (Bahamas)" Lloyd's List, Monday December 16, 2002

[6] "Tricolor sinking flags ro-ro rule gap" Lloyd's List, Monday December 16, 2002

[7] A Spurrier "Third ship cleared of blame by Tricolor team" Lloyd's List, Tuesday January 7, 2003

[8] "WorldWatch - Marine - Nicola (Netherlands Antilles)" Lloyd's List, Tuesday December 17, 2002

[9] E Addley "That sinking feeling" The Guardian, December 17, 2002

[10] "WorldWatch - Marine - Vicky (Turkey)" Lloyd's List, Friday, January 3, 2003

[11] "WorldWatch - Marine - Vicky (Turkey)" Lloyd's List, Monday January 6, 2003

[12] D Osler "New Tricolor collision adds to Channel shipping alarm" Lloyd's List, Friday, January 3, 2003

[13] J Porter "Lords question crew skills" Lloyd's List, Monday, December 23, 2002

[14] "There's a near miss every trip" BBC News, January 9, 2003

[15] R Woodman "Reliving a nightmare after more than 30 years" Lloyd's List, Friday, January 3, 2003

[16] MJ Adams, YJ Tenney, RW Pew "Situation awareness and the cognitive management of complex systems" Human Factors 37(1), pp. 85-104, 1995

[17] NB Sarter, DD Woods "Situation awareness: a critical but ill-defined phenomenon" International Journal of Aviation Psychology 1(1), pp. 45-57, 1991

[18] T Sawaragi, K Murasawa "Simulating behaviors of human situation awareness under high workloads" Artificial Intelligence in Engineering 15(4), pp. 365-381, 2001

[19] MR Endsley "Toward a theory of situation awareness in dynamic systems" Human Factors 37(1), pp. 32-64, 1995

[20] MR Endsley "Measurement of situation awareness in dynamic systems" Human Factors 37(1), pp. 65-84, 1995

[21] NB Sarter, DD Woods "How in the world did we ever get into that mode? Mode error and awareness in supervisory control" Human Factors 37(1), pp. 5-19, 1995

[22] H O'Mahony "Multi-use screen savers" Lloyd's List, Tuesday December 17, 2002

[23] WA Olson, NB Sarter "Management by consent in human-machine systems: when and why it breaks down" Human Factors 43(2), pp. 255-266, 2001

AUTHOR'S BIOGRAPHY

Capt. Anthony Patterson was appointed as Director, Centre for Marine Simulation at the Fisheries and Marine Institute in October 2000.

Prior to joining the Marine Institute, Anthony served as an officer in the Canadian Coast Guard for 20 years. While at sea, he served in a variety of positions including Commanding Officer. In 1991, he assumed a number of duties in the administration of the Coast Guard's Search and Rescue (SAR) Program, including Superintendent SAR Research and Development, Rescue and Environmental Response Instructor, and Regional Superintendent Maritime SAR for the Newfoundland Region. Anthony is a past Director of the Alliance for Marine Remote Sensing.

As the Superintendent SAR R&D, Anthony managed a number of studies to improve search planning techniques. He was the project manager responsible for the implementation of computerized search planning methods in Canadian rescue centres. He was also the project manager responsible for the development of the Operator Proficiency Initiative that brought mandatory safety training to the recreational boating community. He has also been the Canadian delegate on a number of international missions, including the Middle East peace process operational working groups.

Anthony is a graduate of the Canadian Coast Guard College in Sydney, Nova Scotia. He holds a Diploma in Nautical Science, a Coast Guard Command Certificate, and a Master Mariner's certificate. Anthony and his wife live in St. John's with their four children.