

Annals of Occupational Hygiene Advance Access originally published online on March 30, 2009
Annals of Occupational Hygiene 2009 53(4):311-324; doi:10.1093/annhyg/mep011


© The Author 2009. Published by Oxford University Press on behalf of the British Occupational Hygiene Society

Occupational Exposure Decisions: Can Limited Data Interpretation Training Help Improve Accuracy?

Perry Logan1, Gurumurthy Ramachandran2,*, John Mulhausen1 and Paul Hewett3

1 3M, 900 Bush Avenue, St Paul, MN 55144, USA
2 Division of Environmental Health Sciences, University of Minnesota, Minneapolis, MN 55455, USA
3 Exposure Assessment Solutions, Morgantown, WV, USA

* Author to whom correspondence should be addressed. Tel: +612-626-5428; fax: +612-626-4837; e-mail: ramac002{at}umn.edu


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 METHODS
 RESULTS AND DISCUSSION
 CONCLUSIONS
 FUNDING
 APPENDIX 1. DATA INTERPRETATION...
 ACKNOWLEDGEMENTS
 REFERENCES
 
Accurate exposure assessments are critical for ensuring that potentially hazardous exposures are properly identified and controlled. The availability and accuracy of exposure assessments can determine whether resources are appropriately allocated to engineering and administrative controls, medical surveillance, personal protective equipment and other programs designed to protect workers. A desktop study was performed using videos, task information and sampling data to evaluate the accuracy and potential bias of participants’ exposure judgments. Desktop exposure judgments were obtained from occupational hygienists for material handling jobs with small air sampling data sets (0–8 samples) and without the aid of computers. In addition, data interpretation tests (DITs) were administered in which participants were asked to estimate the 95th percentile of an underlying log-normal exposure distribution from small data sets. Participants were given data interpretation (‘rule of thumb’) training, which included a simple set of rules for estimating 95th percentiles for small data sets from a log-normal population. A DIT was given to each participant before and after the rule of thumb training. Results of each DIT and qualitative and quantitative exposure judgments were compared with a reference judgment obtained through a Bayesian probabilistic analysis of the sampling data to investigate overall judgment accuracy and bias. There were a total of 4386 participant–task–chemical judgments for all data collections: 552 qualitative judgments made without sampling data and 3834 quantitative judgments with sampling data. The DITs and quantitative judgments were significantly better than random chance and much improved by the rule of thumb training. In addition, the rule of thumb training reduced the amount of bias in the DITs and quantitative judgments. The mean DIT % correct scores increased from 47 to 64% after the rule of thumb training (P < 0.001). The accuracy for quantitative desktop judgments increased from 43 to 63% correct after the rule of thumb training (P < 0.001). The rule of thumb training did not significantly impact accuracy for qualitative desktop judgments. The finding that even some simple statistical rules of thumb improve judgment accuracy significantly suggests that hygienists need to routinely use statistical tools while making exposure judgments using monitoring data.

Keywords: data interpretation training • decision making • desktop study • exposure assessment • judgment accuracy and bias • professional judgment


    INTRODUCTION
 
Exposure assessment is at the core of environmental and occupational hygiene practice, and thus, a critical skill of occupational hygienists is making accurate exposure judgments to ensure that workers are properly protected and resources are efficiently utilized. This study aims to investigate (i) the accuracy and bias of exposure judgments using the American Industrial Hygiene Association (AIHA) exposure assessment strategy with limited monitoring data for a specific set of exposure tasks and (ii) the impact of simple data interpretation training on the accuracy and bias in exposure judgments.

Exposure judgments are typically made across a continuum of available information and exposure data (Rock, 1986; Kromhout et al., 1987; Hawkins and Evans, 1989; Teschke et al., 1989; Macaluso et al., 1993; Cock et al., 1996; Friesen et al., 2003; Ramachandran et al., 2003; Ramachandran, 2008). Many exposure judgments are made with very limited monitoring data, such as retrospective exposure judgments for epidemiology studies and prospective exposure judgments for planning future manufacturing operations. In this paper, exposure judgments made without direct exposure monitoring data are referred to as ‘qualitative’ and judgments made with monitoring data are ‘quantitative’. When limited or no sampling data are available, occupational hygienists typically review basic characterization information, leverage surrogate data and use what can be referred to as ‘professional judgment’ to arrive at a qualitative exposure judgment (Ignacio and Bullock, 2006). As expected, such professional judgments are often subjective, resulting in exposure judgments having a wide range of accuracy, depending on many factors (Kromhout et al., 1987; Hawkins and Evans, 1989; Teschke et al., 1989; Macaluso et al., 1993; Cock et al., 1996; Walker et al., 2001, 2003; Friesen et al., 2003; Ramachandran et al., 2003).

Several well-known judgment strategies use a method of initially classifying workers into similar exposure groups (SEGs) based on observation of a task or group of tasks in a process (Corn and Esmen, 1979; Mulhausen and Damiano, 1998; Ignacio and Bullock, 2006). An SEG is a group of workers having the same general exposure profile for the agents being studied because of the similarity and frequency of tasks they perform, the materials and processes with which they work, the administrative and engineering controls employed and the similarity of the way they perform the tasks. Occupational hygienists review the workforce, materials, exposure agents, tasks, work practices, equipment and existing administrative and engineering exposure controls and then identify exposure groups that will be assessed, and possibly controlled, depending on the exposure judgments or exposure control category selected. The exposure judgment for any prospective SEG requires the selection of an occupational exposure limit (OEL) and a judgment by the hygienist about where the SEG exposure decision statistic falls in relation to the OEL. Chapter 5 in ‘A Strategy for Assessing and Managing Occupational Exposures’ (Ignacio and Bullock, 2006) contains a good discussion on OEL selection. The selection of a decision statistic is an important element for how judgments are performed and exposure controls are managed (Ramachandran, 2008). Using exposure assessment and control categories defined in the AIHA exposure assessment and management strategy, qualitative judgments can be documented in fractions or multiples of the selected health-based or compliance-based OEL (Mulhausen and Damiano, 1998; Ignacio and Bullock, 2006). The AIHA exposure control categories used in this study are illustrated in Table 1. Judgments are made by identifying the exposure control category in which the 95th percentile of the exposure distribution is most likely located for a given job or task. 
A judgment can be documented for each SEG, which can represent a single task that may be short in duration or may represent a group of tasks that comprise a full-shift exposure.



 
Table 1. AIHA exposure category rating scheme

 
Expert judgment
Professional judgment studies across many fields have identified similarities across professionals that are likely to be present in hygienists (Connolly et al., 2000; Gilovich et al., 2002). Studies investigating the accuracy of exposure judgments have shown that exposure assessment professionals tend to be more accurate than untrained non-professionals. However, some studies show that exposure judgments made by professionals may not always be highly accurate but can become more accurate with more data and training (Kromhout et al., 1987; Hawkins and Evans, 1989; Teschke et al., 1989; Macaluso et al., 1993; Cock et al., 1996; Walker et al., 2001, 2003; Friesen et al., 2003; Ramachandran et al., 2003). These studies focused on some measure of agreement between assessors or with monitoring data, but few studies looked at systematic biases in professional judgments and potential causes of the bias. Walker et al. (2001, 2003) reported that environmental exposure assessments made by experts appear to have systematic bias and suggested that heuristics may be playing a role, as found in other professions.

Judgments made in medicine, psychology, law and other fields utilize available information and data through high-level cognitive processes similar to those used to make an exposure judgment. Extensive work in the field of psychology indicates that heuristics or simple decision rules make the decision process efficient but can lead to specific biases according to the heuristics or decision rules used (Kahneman and Tversky, 1973; Kahneman et al., 1982; Griffin and Tversky, 1992; Gilovich et al., 2002). A quote from Nobel laureate Kahneman and his collaborator Tversky illustrates this finding: ‘In making predictions and judgments under uncertainty, people do not appear to follow the calculus of chance or the statistical theory of prediction. Instead, they rely on a limited number of heuristics which sometimes yield reasonable judgments and sometimes lead to severe and systematic errors’ (Kahneman et al., 1982). In their teaching experience and in applied studies, they found that both students and trained professionals tended to make common mistakes when interpreting data. Kahneman and Tversky proposed three heuristics that help explain the underlying mechanisms. These now famous heuristics are (i) availability, (ii) anchoring and adjustment and (iii) representativeness.

After a review of professional judgment studies in other fields, one may conclude that hygienists also use common heuristics when making exposure judgments (Kahneman et al., 1982; Griffin and Tversky, 1992; Otway and von Winterfeld, 1992; Gigerenzer et al., 1999; Baron, 2000; Connolly et al., 2000; Walker et al., 2001; Gilovich et al., 2002). Since hygienists periodically need to make quick judgments with limited information and data, they rely on simple processes and heuristics to efficiently make judgments. One could imagine several instances where common heuristics would be utilized when making exposure judgments.

  1. Judgment made using only the ‘available’ information in memory is an example of using the ‘availability’ heuristic. If an exposure judgment needs to be made quickly, a hygienist may use only the information that comes to mind rather than spending additional time and resources collecting more information and data. In this case, the judgment would only be as good as the information available in memory and, therefore, could be prone to memory bias.
  2. Judgments can also be made using the ‘anchoring and adjustment’ heuristic by focusing only on one piece of information or data point without taking all information and data into consideration. Again, the quality of the judgment will be dependent on the information or data used to ‘anchor’ the judgment.
  3. An example of the ‘representativeness’ heuristic could be seen where a hygienist fails to understand the most likely ‘representative’ distribution for a population, such as using a normal distribution when the population is most likely log-normal. In this case, the hygienist would likely underestimate an upper tail decision statistic, resulting in an underestimate of exposure. One can imagine many other scenarios where a hygienist could use a combination of these and other heuristics to make exposure judgments.
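
The representativeness example in item 3 can be illustrated with a minimal sketch. The three measurements are hypothetical, and the sample-based estimators below are only one simple way to estimate a 95th percentile; they are not taken from the study. For this particular sample the normal assumption gives the lower estimate, although that ordering is not guaranteed for every data set.

```python
import math
import statistics

Z95 = 1.645  # standard normal quantile for the 95th percentile


def p95_normal(samples):
    """95th percentile if the data are treated as normally distributed."""
    return statistics.mean(samples) + Z95 * statistics.stdev(samples)


def p95_lognormal(samples):
    """95th percentile if the data are treated as log-normally distributed."""
    logs = [math.log(x) for x in samples]
    return math.exp(statistics.mean(logs) + Z95 * statistics.stdev(logs))


samples = [5.0, 10.0, 20.0]  # hypothetical measurements in p.p.m.
print("normal assumption:    ", round(p95_normal(samples), 1))
print("log-normal assumption:", round(p95_lognormal(samples), 1))
```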

A few authors have proposed methods for combining qualitative information with quantitative data using probabilistic methods to arrive at an integrated judgment in an exposure category (Wild et al., 2002; Hewett et al., 2006). A powerful attribute of this type of approach is that exposure judgments can be defined in terms of the probability of the decision statistic falling in each exposure category. Hewett et al. illustrated the use of Bayesian statistical methods in the context of the AIHA strategy to combine qualitative and quantitative exposure judgments with monitoring data for a given SEG. This approach provides a transparent method for incorporating the relative certainty of the information or data used to produce a judgment probability chart (Fig. 1).


 
Fig. 1. Example qualitative exposure judgment probability chart illustrating an occupational hygienist’s exposure judgment given the information and data available. This chart shows that the hygienist is highly confident that the 95th percentile falls into category 1 (<10% of the OEL).

 
The Bayesian approach applied to exposure judgments presents a structured method for utilizing qualitative judgments (prior) and monitoring data (likelihood) to create an integrated judgment (posterior). The AIHA model integrated with Bayesian methods also provides a very powerful construct to test exposure judgment accuracy which may be based on qualitative exposure judgments or outputs from an exposure model. These exposure judgments for various scenarios can be directly compared with the Bayesian likelihood chart based on monitoring data to test the accuracy and bias for a group of assessments (Fig. 2). By systematically collecting various attributes of the assessments or exposure models, this method can help identify factors that impact the accuracy of exposure judgments and provide insight for specific training or follow-up (Hewett et al., 2006).
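
As a rough illustration of the prior, likelihood and posterior relationship, the sketch below multiplies category-level probabilities and renormalizes. This is a simplification made for illustration: Hewett et al. (2006) work with the GM and GSD parameter space, and the function name and all numbers here are hypothetical.

```python
def posterior(prior, likelihood):
    """Combine category-level prior and likelihood into a posterior (Bayes' rule)."""
    products = [p * l for p, l in zip(prior, likelihood)]
    total = sum(products)
    if total == 0:
        raise ValueError("prior and likelihood share no category with non-zero probability")
    return [value / total for value in products]


prior = [0.10, 0.60, 0.20, 0.10]       # hypothetical qualitative judgment
likelihood = [0.05, 0.30, 0.45, 0.20]  # hypothetical result from monitoring data
post = posterior(prior, likelihood)
print([round(p, 3) for p in post])
```

In this hypothetical case the monitoring data favor the third category, but the stronger prior keeps the second category as the most probable one in the integrated judgment.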


 
Fig. 2. Bayesian integrated AIHA strategy used to test exposure judgment accuracy. The figure illustrates a method for utilizing the Bayesian integrated AIHA strategy to compare exposure monitoring data analysis (likelihood) with exposure judgments (prior) made by an occupational hygienist for a given SEG.

 

    METHODS
 
Desktop study overview
A desktop study using videos and written and oral information was designed to study the accuracy and potential biases of exposure judgments for several tasks commonly found in industry. Participants made consecutive exposure judgments with increasing monitoring data while seated at desktops in conference rooms without the use of a computer or other statistical tools. Those judgments were then compared to a reference decision calculated for the task data using Bayesian statistical analysis (Hewett et al., 2006) in order to characterize their accuracy. During each data collection, exposure judgments were collected for a subset of tasks before and after data interpretation training in order to evaluate the impact of that training on judgment accuracy. Participants were also given pre- and posttraining standardized data interpretation tests (DITs) in order to further characterize each individual's ability to interpret a small data set (one to six measurements) without the use of a computer or other statistical tools.

Subjects were asked to participate at several annual industrial hygiene conferences. Participation was solicited from conference attendees with various levels of experience making exposure assessments using AIHA exposure assessment strategy categories. Participants were given a brief description of the desktop study and instructed that none of the participants in the study or their current employer would be personally identified in any publication related to the study. All participants were asked not to disclose any of the information used in the study to avoid influencing future participants. No compensation was provided, and participation was voluntary.

Participants gathered in a conference room where the desktop study took place. The judgment elicitation sessions lasted between 2 and 4 h depending on the number of exposure jobs reviewed. The study personnel presented an overview of the research goals, after which participants were instructed on the decision statistic to be used and procedures for providing probabilistic exposure judgments using the AIHA exposure categories in the data forms given to them. Information was collected on the participants’ educational background and their professional expertise in various aspects of occupational exposure assessment (Table 2). The participants were then administered a pretraining DIT that measures the ability of the participant to estimate, based on a small monitoring data set, the probability of the 95th percentile of the exposure distribution being located in each of the four AIHA exposure control categories.



 
Table 2. Summary of certifications, education and experience determinants for all participants (n = 75)

 
Following the pretraining DIT, participants were shown videos of a worker performing a task that had potential exposure to the chemical of interest. The eight tasks selected for this study included various types of material handling of liquids from drums or containers and cleaning tasks. The first judgment was made by each participant without any sampling data given, representing the initial or qualitative judgment. Subsequent quantitative judgments were made by participants as they were given a single personal sample data point, one at a time. Participants were not allowed to use computers or calculators but were able to make hand calculations on the data collection sheets provided.

After this, the participants were given a targeted data interpretation training that provided rules to estimate the most likely AIHA exposure control category. Simple ‘rules of thumb’ or heuristics were developed that could be applied easily to small data sets for estimating in which control category the 95th percentile most likely falls. The rules of thumb are presented in Appendix 1 and require, at most, four calculations that can be easily performed on paper or in one's head. Following this training, we again administered a different DIT and then asked the participants to provide exposure judgments on a different set of video desktop tasks. A flowchart describing the sequence of procedures followed during each judgment elicitation session is illustrated in Fig. 3.


 
Fig. 3. Flowchart of procedures for judgment desktop elicitation sessions.

 
Calculating the ‘reference’ judgment
To assess the accuracy of the hygienists’ judgments, they need to be compared against a reference or ‘correct’ judgment. The reference judgment is defined as the one obtained by estimating the probability of the true 95th percentile falling into each of the four AIHA exposure categories given all the available monitoring data. This is known as the likelihood distribution, which represents the relative probability of observing this set of data given all possible specific combinations of geometric mean (GM) and geometric standard deviation (GSD) (Hewett et al., 2006).
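
One way to picture this calculation is a grid search over GM and GSD, sketched below. It is not the authors' software. The category cut-offs (10, 50 and 100% of the OEL), the GM and GSD ranges, the grid resolution and the equal weighting of grid cells in log space are all assumptions made for illustration, and the data are expected to lie within the assumed GM range.

```python
import math

Z95 = 1.645
# Assumed upper bounds of categories 1-3 as fractions of the OEL (Table 1 is not reproduced here).
CATEGORY_BOUNDS = (0.10, 0.50, 1.00)


def category_of(p95, oel):
    """Return 0..3 for AIHA categories 1..4 under the assumed bounds."""
    ratio = p95 / oel
    for index, bound in enumerate(CATEGORY_BOUNDS):
        if ratio < bound:
            return index
    return len(CATEGORY_BOUNDS)


def log_likelihood(data, mu, sigma):
    """Log-normal log-likelihood with mu = ln(GM) and sigma = ln(GSD)."""
    total = 0.0
    for x in data:
        z = (math.log(x) - mu) / sigma
        total += -math.log(x * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z
    return total


def category_likelihoods(data, oel, gm_range=(0.05, 500.0), gsd_range=(1.05, 4.0), steps=200):
    """Share of the total likelihood falling in each category over a GM/GSD grid."""
    mu_lo, mu_hi = math.log(gm_range[0]), math.log(gm_range[1])
    sg_lo, sg_hi = math.log(gsd_range[0]), math.log(gsd_range[1])
    cells = []
    for i in range(steps):
        mu = mu_lo + (mu_hi - mu_lo) * (i + 0.5) / steps
        for j in range(steps):
            sigma = sg_lo + (sg_hi - sg_lo) * (j + 0.5) / steps
            p95 = math.exp(mu + Z95 * sigma)  # 95th percentile for this GM/GSD pair
            cells.append((log_likelihood(data, mu, sigma), category_of(p95, oel)))
    peak = max(ll for ll, _ in cells)  # subtract the peak to avoid numerical underflow
    totals = [0.0] * (len(CATEGORY_BOUNDS) + 1)
    for ll, cat in cells:
        totals[cat] += math.exp(ll - peak)
    grand = sum(totals)
    return [t / grand for t in totals]


low = category_likelihoods([1.0, 2.0, 1.5], oel=100.0)        # hypothetical data
high = category_likelihoods([150.0, 200.0, 300.0], oel=100.0)  # hypothetical data
```

The category with the largest share would then serve as the reference category for that data set.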

For each of the eight tasks, the number of samples used in quantitative judgments was smaller than the full data set used to calculate the reference judgment (Table 3). For task data sets that contained one or more values less than the limit of detection, censored data analysis using the maximum likelihood estimation method was used to select the AIHA category that had the highest probability for the 95th percentile (Hewett et al., 2006; Hewett and Ganser, 2007).
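
For readers unfamiliar with censored data analysis, the sketch below shows the kind of log-likelihood that a maximum likelihood fit would maximize when some values are below the limit of detection. It is a generic textbook form for left-censored log-normal data, not the cited authors' implementation, and the function and argument names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def censored_log_likelihood(mu, sigma, detected, detection_limits):
    """Log-likelihood for log-normal data with left-censored values.

    detected: measured values above the limit of detection.
    detection_limits: one limit of detection per non-detect.
    mu = ln(GM) and sigma = ln(GSD).
    """
    total = 0.0
    for x in detected:
        z = (math.log(x) - mu) / sigma
        total += -math.log(x * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z
    for lod in detection_limits:
        # A non-detect contributes the probability of a value falling below its limit.
        total += math.log(normal_cdf((math.log(lod) - mu) / sigma))
    return total
```

Evaluating this function over candidate GM and GSD values, for example on a grid, and keeping the pair with the largest value gives an approximate maximum likelihood estimate.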



 
Table 3. Summary of sampling data used for reference calculations

 
Data interpretation tests
A group of DITs were designed to better understand the accuracy and bias when estimating the decision statistic for a small data set without the use of computer-based statistical tools or any specific information about a job, task or chemical. It is possible that bias in estimating exposure with exposure data may also be present with qualitative exposure judgments where sampling data are not present. Each DIT had a total of eight data sets with the number of samples ranging from one to six samples. Data sets for each DIT were created either from actual sampling data normalized to an OEL of 100 p.p.m. or by using a spreadsheet that randomly selected data from a log-normal distribution with a GSD of 2.5 and various GMs. Each participant was instructed to review each data set and estimate the probability of the 95th percentile falling in each of the four exposure categories, based on an OEL of 100 p.p.m. The participants were asked to ensure that the probabilities for each data set total 100% and that one category has the highest probability. Fig. 4 illustrates a DIT that has eight data sets with an example judgment for Data Set #7 which has three samples. Reference judgments were calculated for each data set for comparison to the exposure judgments made by participants.
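
A simulated DIT of this kind could be generated as sketched below. The GSD of 2.5 and the sample sizes of one to six come from the description above; the specific GM values, the seed and the helper names are hypothetical.

```python
import math
import random


def make_data_set(gm, gsd, n, rng):
    """Draw n values from a log-normal distribution with the given GM and GSD."""
    return [rng.lognormvariate(math.log(gm), math.log(gsd)) for _ in range(n)]


def is_valid_judgment(probabilities, tolerance=1e-9):
    """Check that four probabilities total 100% and one category is strictly highest."""
    if len(probabilities) != 4 or abs(sum(probabilities) - 100.0) > tolerance:
        return False
    ranked = sorted(probabilities, reverse=True)
    return ranked[0] > ranked[1]


rng = random.Random(0)  # fixed seed so the hypothetical test is reproducible
# Hypothetical (GM in p.p.m., number of samples) pairs for eight data sets.
design = [(5, 1), (20, 2), (40, 3), (80, 4), (10, 5), (60, 6), (30, 3), (120, 2)]
dit = [make_data_set(gm, 2.5, n, rng) for gm, n in design]
```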


 
Fig. 4. Example data interpretation test with eight data sets.

 
The category with the highest probability was compared to the reference judgment highest category for each data set. A DIT score was then calculated for each participant's DIT using the following formula in equation (1):


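$$\text{DIT score} = \frac{\text{number of data sets for which the highest-probability category matches the reference category}}{\text{total number of data sets in the DIT}} \qquad (1)$$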
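
The scoring step described here, comparing each judgment's highest-probability category with the reference category and taking the fraction that match, can be sketched as follows. The function names and the example judgments are hypothetical.

```python
def top_category(probabilities):
    """Index of the category given the highest probability."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def dit_score(judgments, reference_categories):
    """Fraction of data sets whose top category matches the reference category."""
    if len(judgments) != len(reference_categories):
        raise ValueError("one reference category is needed per judgment")
    correct = sum(
        1
        for probs, ref in zip(judgments, reference_categories)
        if top_category(probs) == ref
    )
    return correct / len(judgments)


# Hypothetical judgments (percent per category) and reference categories (0-based).
judgments = [[70, 20, 10, 0], [10, 60, 20, 10], [5, 15, 50, 30], [0, 10, 30, 60]]
references = [0, 1, 3, 3]
score = dit_score(judgments, references)
```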

Pre- and posttraining task judgment data collections
Videos of a worker performing all or part of an actual exposure task were shown to each study group. In addition, for each of the tasks, written basic characterization information was given, including amounts of materials and mixture concentrations, physical properties of the exposure chemical, approximate duration of each task, equipment and ventilation. Participants were allowed to question the proctor directly but were asked not to communicate with other participants, to prevent them from influencing one another. The participants were asked to document each judgment as a probability of the 95th percentile (decision statistic) falling in each of the four exposure categories. This probability judgment method was the same method used for documenting judgments in the DITs. Participants were first asked to document their ‘initial judgment’, which is considered qualitative because no exposure data were yet available to participants. When all participants had completed the qualitative judgment, they were given the first sample point and asked to document their next judgment, which would be considered quantitative. All participants were instructed not to go back and change any prior judgments throughout the data collection process. This process continued until all available sample data for each task were utilized in the judgments. The same judgment elicitation process was used before and after training.

There were eight task and chemical combinations used in the study, all of which used different chemicals. The judgments made about each task were only for the task duration and not for the full shift. If a task lasted 15 min, participants were asked to estimate the exposure during the 15-min period.

Each judgment was compared to the reference standard for the task, and a category judgment score (CAT score) was calculated based on the percentage of correct judgments made by a participant for a given task.


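$$\text{CAT score} = \frac{\text{number of correct judgments made by the participant for the task}}{\text{total number of judgments made by the participant for the task}} \qquad (2)$$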


    RESULTS AND DISCUSSION
 
DITs and training
The DIT was designed to help study how hygienists interpret small data sets ranging from one to six samples. The output of the judgment is constructed in probabilities of the 95th percentile falling into each of the four AIHA categories. This provides a transparent method for incorporating the level of certainty for a given judgment. The DIT does not specify a chemical and assumes that all data are of proper duration for comparison with the given OEL of 100 p.p.m. Therefore, a DIT score provides a measure of a hygienist's ability to estimate the 95th percentile of log-normal data. Since each judgment is made of four categories, the DIT score (percent of correct judgments) expected from random chance is 0.25 or 25%. As a whole, study participants did better than random chance, with mean pre- and posttraining DIT scores for the group of 0.47 and 0.64, respectively (Table 4).



 
Table 4. Summary of t-tests comparing accuracy of pre- and posttraining DIT scores to random chance test score of 0.25

 
The percent of participant DIT scores >50% correct increased from 44% before training to 90% after training (Fig. 5). The 95% lower bounds for before and after training were 0.40 and 0.59, respectively, which are both well above what would be expected from random chance (0.25). The upper bound of the mean for before training of 0.55 was below the lower bound of 0.59 for the mean after training, indicating that the rule of thumb training provided a statistically significant positive impact to data interpretation accuracy (Table 4). Before the rule of thumb training, ~24% of DIT scores were at or above 75% correct; after training, the percentage of DIT scores at or above 75% correct jumped to 46% (Fig. 5).
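
The comparisons in this paragraph (mean score against the chance value of 0.25, and the pretraining upper bound against the posttraining lower bound) can be sketched as below. The scores are hypothetical, and the critical value is supplied by the caller because the source does not state how the bounds were computed.

```python
import math
import statistics


def t_against_chance(scores, chance=0.25):
    """One-sample t statistic and degrees of freedom versus a chance score."""
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    return (statistics.mean(scores) - chance) / standard_error, n - 1


def mean_interval(scores, critical):
    """Interval for the mean: mean plus or minus critical times the standard error."""
    mean = statistics.mean(scores)
    half_width = critical * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - half_width, mean + half_width


def intervals_separate(lower_group, higher_group):
    """True when the first interval's upper bound is below the second's lower bound."""
    return lower_group[1] < higher_group[0]


# Hypothetical pretraining scores for eight participants.
pre_scores = [0.25, 0.375, 0.5, 0.5, 0.625, 0.375, 0.5, 0.625]
t_value, dof = t_against_chance(pre_scores)
# 2.365 is the two-sided 95% critical value of Student's t for 7 degrees of freedom.
lower, upper = mean_interval(pre_scores, critical=2.365)
```

With the bounds reported above, the pretraining upper bound (0.55) lies below the posttraining lower bound (0.59), which is the non-overlap condition that `intervals_separate` checks.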


 
Fig. 5. Histogram of all DIT scores for pre- and posttraining.

 
The DIT score is one measure of accuracy for data interpretation judgments but does not indicate whether or not bias is present across the judgments. The individual DIT judgments were compared to the reference and plotted in a histogram to help illustrate potential bias occurring across all the judgments pre- and posttraining (Fig. 6). The DIT judgments made pretraining appear to be biased low: ~38% of DIT judgments were below the reference category, while only ~15% of DIT judgments were above it. This finding could be due to anchoring bias or base-rate error from improperly utilizing the representativeness heuristic for normal rather than log-normal data (Kahneman and Tversky, 1979; Griffin and Tversky, 1992; Otway and von Winterfeld, 1992). The rule of thumb training appeared to shift the bias to slightly overestimate exposure for small data sets: ~10% of DIT judgments were below the reference category, while ~25% of DIT judgments were above the reference category. There were a total of four different DITs with different data sets rotated between pre- and posttraining to minimize any specific test effects. Given the shift of bias from underestimation to overestimation, it appears that the DIT training had a significant and positive impact on DIT judgments.
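
The bias tabulation behind such a histogram amounts to counting judgments that fall below, at or above the reference category. A minimal sketch, with hypothetical category indices and a hypothetical function name:

```python
def bias_breakdown(judged, reference):
    """Fractions of judgments below, at and above the reference category."""
    if len(judged) != len(reference):
        raise ValueError("each judgment needs a reference category")
    n = len(judged)
    below = sum(1 for j, r in zip(judged, reference) if j < r)
    above = sum(1 for j, r in zip(judged, reference) if j > r)
    return {"below": below / n, "reference": (n - below - above) / n, "above": above / n}


# Hypothetical top categories chosen by participants and the matching reference categories.
judged = [0, 1, 1, 2, 3, 0, 2, 1]
reference = [1, 1, 2, 2, 3, 1, 1, 1]
breakdown = bias_breakdown(judged, reference)
```

A larger share below the reference than above it would indicate a tendency to underestimate exposure, as reported for the pretraining DIT judgments.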


 
Fig. 6. Percentage of all pre- and posttraining DIT judgments above, below and reference categories (n = 82).

 
Qualitative judgments and training
The qualitative judgments were made after the basic characterization information and videos were reviewed but before any sample data were made available to participants. As with the DIT scores, the probability of a correct judgment due to random chance is 0.25 or 25%. Table 5 shows that the mean of all pre- and posttraining qualitative scores for the group was 0.32 and 0.29, respectively. The 95% lower bound for the pretraining was 0.27, slightly above what is expected from random chance. The 95% lower bound for after training was 0.23, which is slightly below what is expected from random chance. It appears that the pretraining qualitative judgments were above random chance, while the posttraining score may not be different from random chance. This indicates that the rule of thumb training did not positively impact the qualitative judgment accuracy.



 
Table 5. Summary of t-tests comparing accuracy of pre- and posttraining qualitative judgments to random chance test score of 0.25

 
Each qualitative judgment was analyzed against the reference standard to determine if bias was present. A histogram of all individual qualitative judgments was plotted showing the bias occurring across all the judgments before and after training (Fig. 7). The qualitative judgments made before training appeared to be biased low since ~52% of qualitative judgments were below the reference category, while only ~16% of qualitative judgments were above the reference category. The rule of thumb training did not appear to shift the bias to slightly overestimate exposure as with the DIT score since ~47 and ~24% of qualitative judgments were below and above the reference category, respectively.


 
Fig. 7. Percentage of all pre- and posttraining qualitative task judgments above, below and reference categories.

 
This study did not allow for the use of surrogate sampling data or facilitate the use of physical–chemical exposure models to help support making qualitative judgments. This type of information is often available for similar agents or jobs and provides good insight into the possible exposures in a given job or task. This was not a reasonable option for this desktop study since presenting surrogate data could lead to availability or anchoring and adjustment biases (Gilovich et al., 2002). Additional field studies could be initiated to better investigate how using surrogate or exposure modeling data affects the accuracy of qualitative judgments.

Since the rule of thumb training focused only on proper data interpretation of sample data, it is not surprising that the qualitative judgment accuracy did not improve posttraining. To have a significant impact on accuracy and bias for qualitative judgments, attributes and rules that impact qualitative judgment accuracy would need to be identified. These attributes or determinants are more likely to be related to either experience that is specific to a selected task or the ability to utilize models for estimating potential exposures. It has been suggested that the use of physical–chemical models incorporating properties such as vapor pressure and scenario characteristics such as contaminant generation rate and ventilation rate can produce more accurate exposure assessments than limited sampling data (Jayjock, 1997; Nicas and Jayjock, 2002; Nicas, 2003). The framework used in this study could be easily modified to include determinants for the use of physical–chemical models or environmental determinants to test the impact on qualitative judgment accuracy.

Quantitative judgments and training
The quantitative judgments were made following the initial judgment when sample data were presented one sample point at a time. This is probably similar to how sampling data are utilized during many routine field investigations. Hygienists typically cannot wait to get a statistically large number of samples before an exposure judgment is made. Judgments are more typically made or updated each time a new monitoring data point is received from the laboratory (Post et al., 1991; Ramachandran, 2008). Hygienists are required to make an exposure judgment and specify whether controls are needed based on the available basic characterization information and whatever sample data are available at the time. Many judgments made with only a few data points have a high level of uncertainty depending on the sample in relation to the exposure limit and other relevant surrogate data and information. This illustrates why Bayesian statistics and other probabilistic methods are attractive to hygienists who are forced to make judgments with sparse sampling data (Ramachandran, 2008).

As with the DIT score and qualitative judgments, task judgments with sampling data had a random chance of being correct 0.25 or 25% of the time since only four exposure categories are available. The means of all pre- and posttraining quantitative scores for the group were 0.43 and 0.63, respectively (Table 6). The 95% lower bounds for both the pre- and posttraining means were well above what would be expected from random chance. The upper bound of the mean before training was 0.47, which is below the lower bound of 0.58 after training. This indicates that the rule of thumb training had a statistically significant positive impact on quantitative judgment accuracy. The percentage of task judgments scoring >50% correct went from 41% before training to 66% after training (Fig. 8), illustrating the positive impact of the rule of thumb training on quantitative judgment accuracy.
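A comparison of accuracy scores with the random-chance score of 0.25 can be sketched as a one-sample t statistic. The scores below are hypothetical placeholders, not study data, and the function name `t_vs_chance` is an assumption made for illustration.

```python
import math
from statistics import mean, stdev


def t_vs_chance(scores, chance=0.25):
    """One-sample t statistic for mean accuracy against a chance score."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    sd = stdev(scores)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("scores have zero variance; t is undefined")
    return (mean(scores) - chance) / (sd / math.sqrt(n))


# Hypothetical fraction-correct scores for five participants.
scores = [0.5, 0.25, 0.75, 0.5, 0.5]
print(round(t_vs_chance(scores), 3))
```

The statistic would then be compared with a t distribution on n - 1 degrees of freedom.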


Table 6. Summary of t-tests comparing accuracy of pre- and posttraining quantitative judgments to random chance test score of 0.25

 


Fig. 8. Histogram of pre- and posttraining task quantitative judgment accuracy.

 
Each quantitative judgment was also analyzed against the reference to illustrate potential bias. A histogram of all individual quantitative judgments was plotted, showing the bias occurring across all the judgments pre- and posttraining (Fig. 9). Judgments made before training appeared to be biased low: ~47% of all pretraining judgments were below the reference category, while only ~10% of all pretraining judgments were above the reference category. The rule of thumb training appears to significantly reduce bias of posttraining judgments: ~19 and ~19% were below and above the reference category, respectively.


Fig. 9. Percentage of all pre- and posttraining quantitative task judgments above, below and reference categories.

 
Evaluation of the DIT and quantitative judgment scores indicates that hygienists can make biased judgments when they do not use statistical tools. Inspection of the judgment sets made by individual participants suggests that some participants anchored to a specific data point even after the rule of thumb training. Task 1 in Fig. 10 illustrates how strongly this affected the judgments of the whole group. For the second sample in Task 1, 80% of the group's posttraining judgments were correct. That sample happened to be above the exposure limit, so the majority of participants, understanding probability, selected the correct category, 4. However, the judgment accuracy for the third sample dropped to ~43%, indicating that many of the participants were anchored to the third sample, which was ~10% of the exposure limit. This was one of the more dramatic examples of collective bias (Surowiecki, 2005) due to anchoring and base-rate error found in the desktop study.


Fig. 10. Charts of pre- and posttraining judgment accuracy and overestimation for Task 1.

 

    CONCLUSIONS
 
These results highlight the importance of a basic understanding of log-normal parametric statistics and of the routine use of statistical tools by occupational hygienists when making exposure judgments based on monitoring data. In this study, simple log-normal rule of thumb training significantly improved the accuracy of participants' desktop quantitative exposure judgments and reduced the bias toward underestimating exposures. One can readily postulate that more consistent and broader use of robust statistical tools would significantly increase the accuracy of quantitative judgments and further reduce bias. Bayesian decision tools incorporated into the AIHA exposure assessment and management strategy provide a comprehensive and transparent framework for ensuring accurate exposure judgments. In addition, it appears that systematically documenting judgments and providing feedback on judgment accuracy can provide valuable training and calibration for hygienists while helping identify specific ways to improve judgments (Plous, 1993; Gigerenzer et al., 1999; Connolly et al., 2000; Meyer and Booker, 2001; Hubbard, 2007).

Participants’ desktop qualitative judgments were little better than random chance and were not improved after statistical rule of thumb training. This is not surprising as the statistical rules of thumb are used in context with quantitative data. However, there is promise for improvement offered by the Bayesian integrated AIHA strategy as a framework that provides a powerful feedback mechanism for comparing qualitative judgments (Bayesian prior) to quantitative data interpretation (Bayesian likelihood). Further research is needed to explore the effect of this important learning feedback loop on the accuracy of occupational hygienists’ exposure judgments.

As with any study, care should be taken when interpreting findings. This study was performed on a relatively narrow range of tasks found in most industries. Therefore, the findings on qualitative judgment accuracy cannot be directly applied across all tasks and industries. The experience and education determinant distributions probably do not reflect the actual experience and education of all of today's practicing hygienists. A study that would provide more insight into the accuracy of all qualitative judgments would need to include representative tasks across all industries and hygienists with education and experience determinants representative of the current hygiene profession.

Studies have been performed in various fields to investigate methods for identifying and controlling bias for faulty decision rules or heuristics (Kahneman et al., 1982; Gilovich et al., 2002). Generally, the mechanisms prescribed to control bias and increase accuracy focus on eliminating flawed decision rules by documenting the processes used to make judgments, creating feedback mechanisms and providing analytical tools and heuristics for experts (Gilovich et al., 2002; Hubbard, 2007). Anecdotal evidence suggests that industrial hygienists may not systematically use statistical tools while making decisions based on small data sets (Ramachandran, 2008). This observation may be due to either a general discomfort with statistics or the limited number of statistical tools that can effectively analyze small data sets available to hygienists. Statistical tools that allow for small data sets and provide output in terms of probability could be useful in increasing their usage across the profession. Specific training can then be designed and implemented to focus on the specific elements that improve judgments. If there are several heuristics that are quite common in making judgments across many professions, it is likely that they are also used when making exposure judgments. The insights of heuristics and biases identified in many other studies and professions can help occupational hygienists identify tools and strategies that increase exposure judgment accuracy and reduce bias. This, in turn, will help occupational hygienists utilize available resources to better understand potential exposures and protect workers.

Subsequent discussions on the selection of a decision statistic proved valuable and may help to calibrate hygienists when interpreting exposure data as well as provide a ‘frequentist’ format for the decision statistic (Gigerenzer and Hoffrage, 1995, 1999). Periodic use of DITs can help calibrate hygienists and will likely increase quantitative judgment accuracy. The rule of thumb training was a simple exercise that provided significant accuracy improvements to data interpretation exercises and quantitative task judgments. In addition, other simple and well-designed data interpretation rules may also increase judgment accuracy. Even though the rule of thumb training provided significant improvements in accuracy, it is more important that statistical tools be used consistently when interpreting sampling data.

The framework presented in this study can be modified and applied to many other investigations of exposure judgment determinants, bias and accuracy. Analysis of DITs and exposure judgments in further studies may help clarify possible sources of bias from heuristics used by hygienists. Understanding the impact of the exposure judgment elicitation exercises and data interpretation training on exposure judgment accuracy in this desktop study can further refine exposure judgment studies performed in the field. The implementation of a rule of thumb or other heuristic training and analysis of other determinants that affect accuracy can help identify ways to continuously improve judgment efficiency and accuracy. It is hoped that the findings of subsequent professional judgment studies will provide hygienists with tools to more efficiently and effectively manage potential exposure, providing benefits to workers and companies alike. This especially includes field studies where physical–chemical modeling and other factors affecting qualitative exposure judgment accuracy could be analyzed and published for the benefit of workers, companies and the occupational hygiene profession.


    FUNDING
 
National Institute for Occupational Safety and Health of the Centers for Disease Control and Prevention (1RO1 OH008513).


    APPENDIX 1. DATA INTERPRETATION TRAINING
 
The data interpretation training consisted of a short (<30 min; <40 slides) overview, in which it was suggested that (i) the log-normal distribution is an appropriate statistical model for occupational exposure data, (ii) the 95th percentile exposure is an appropriate decision statistic for the exposure limits used in the exercises and (iii) the 95% upper confidence limit for the sample 95th percentile can be used to determine with high confidence the range in which the true 95th percentile might lie. The last component of the training consisted of an introduction to (i) several rules of thumb for rapidly estimating an approximate low, middle and high estimate of the 95th percentile without the aid of a calculator or computer program and (ii) guidance on using these estimates to pick the AIHA exposure control category that most likely contains the true 95th percentile.

The following rules of thumb for estimating the 95th percentile were presented:

  • If n is small (e.g. ≤6) and one or more measurements exceed the OEL, then the exposure rating should be Category 4.
  • Otherwise, estimate the median exposure and use it as a surrogate of the sample GM: sort the data and determine the median (the median is the middle value if n is odd and the average of two middle values if n is even).
  • Multiply the median by three multipliers: 2, 4 and 6.

The results comprise an approximate low, middle and high estimate of true 95th percentile. The objective is to compare the low, middle and high estimates to the AIHA exposure control categories with a goal of picking the category that most likely contains the true 95th percentile.
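The three rules above can be sketched as a short function. The category cutoffs used here (10%, 50% and 100% of the OEL) are an assumption, since this appendix does not state them, and the names `rule_of_thumb` and `exposure_category` are hypothetical. The sketch reports the category of each estimate and leaves the final choice to the reader, as the training did.

```python
from statistics import median

# Assumed upper bounds of Categories 1-3 as fractions of the OEL.
CATEGORY_CUTOFFS = (0.10, 0.50, 1.00)
# Low, medium and high multipliers from the rules of thumb.
MULTIPLIERS = (2, 4, 6)


def exposure_category(value, oel):
    """Map a concentration to exposure category 1-4 under the assumed cutoffs."""
    return 1 + sum(value > frac * oel for frac in CATEGORY_CUTOFFS)


def rule_of_thumb(data, oel, small_n=6):
    """Apply the rules of thumb to one data set."""
    if not data:
        raise ValueError("data set is empty")
    med = median(data)  # surrogate for the sample GM
    estimates = tuple(m * med for m in MULTIPLIERS)
    return {
        # Rule 1: small n with any measurement above the OEL means Category 4.
        "exceedance": len(data) <= small_n and any(x > oel for x in data),
        "median": med,
        "estimates": estimates,
        "categories": tuple(exposure_category(e, oel) for e in estimates),
    }


print(rule_of_thumb([2, 5, 10, 11, 13, 34], oel=100))
```

When `exceedance` is true the first rule applies and the rating is Category 4 regardless of the three estimates.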

The basis for the three multipliers is the relationship between the true GM and true 95th percentile for a log-normal distribution. The true GM is the median of log-normal distribution. If the true GSD is low (e.g. 1.5), the true 95th percentile will be approximately twice the GM. If the true GSD is large (e.g. 3), the true 95th percentile will be approximately six times the GM. For intermediate GSDs (e.g. 2 and 2.5), the true 95th percentile will be roughly four times the GM. These rules of thumb use the sample median as a surrogate estimate of the GM.
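For a log-normal distribution the 95th percentile equals GM × GSD^z, where z is the 95th percentile of the standard normal distribution (about 1.645). The following sketch computes that ratio for the GSD values mentioned above; the function name `p95_multiplier` is chosen here for illustration.

```python
from statistics import NormalDist

# 95th percentile of the standard normal distribution, about 1.645.
Z95 = NormalDist().inv_cdf(0.95)


def p95_multiplier(gsd):
    """Ratio of the 95th percentile to the GM for a log-normal distribution."""
    return gsd ** Z95


for gsd in (1.5, 2.0, 2.5, 3.0):
    print(f"GSD {gsd}: 95th percentile is about {p95_multiplier(gsd):.2f} x GM")
```

The exact ratios for GSDs of 2 and 2.5 are roughly 3.1 and 4.5, which bracket the rounded multiplier of four.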

During the training on the rules of thumb, additional guidance was offered: (i) if the variability within the data set is small (e.g. the low and high values differ by no more than a factor of 2 or 3), then the focus should be on the location of the low and medium estimates of the 95th percentile and (ii) if the variability within the data set is large (e.g. the low and high values differ by a factor that approaches or exceeds 10), then the focus should be on the location of the medium and high estimates of the 95th percentile.
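This guidance can be expressed as a check on the ratio of the highest to the lowest measurement. The thresholds below (a ratio of 3 for "small" and 10 for "large") are assumptions that turn the wording above into hard cutoffs, and the function name `focus_estimates` is hypothetical. Ratios in between are not covered by the guidance, so the sketch returns all three estimates.

```python
def focus_estimates(data, small_ratio=3.0, large_ratio=10.0):
    """Suggest which 95th percentile estimates to focus on for a data set."""
    if not data or min(data) <= 0:
        raise ValueError("need at least one positive measurement")
    ratio = max(data) / min(data)
    if ratio <= small_ratio:
        return ("low", "medium")  # small variability
    if ratio >= large_ratio:
        return ("medium", "high")  # large variability
    return ("low", "medium", "high")  # in between: guidance does not say


print(focus_estimates([2, 5, 10, 11, 13, 34]))  # ratio is 17
print(focus_estimates([9, 15, 19, 23, 36, 54]))  # ratio is 6
```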

The following eight data sets were used in one of the pretraining DIT exercises (OEL = 100 p.p.m.):

A: X = {2, 5, 10, 11, 13, 34}
B: X = {8}
C: X = {9, 18, 24, 43}
D: X = {82}
E: X = {1, 1, 2, 5}
F: X = {2, 11, 26, 35, 60, 118}
G: X = {6, 11, 28}
H: X = {9, 15, 19, 23, 36, 54}

The participants were initially required to inspect the data and decide, using their own set of decision rules, which AIHA exposure control category most likely contains the true 95th percentile. Presumably, at the posttraining stage of the study, the participants would be inclined to use the statistical training and rules of thumb, although there was no requirement that they do so. The correct or reference exposure category was assumed to be the most likely category determined using Bayesian decision analysis.
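To give a sense of how a reference category can be derived, the following is a minimal grid-based sketch of a Bayesian decision analysis. It is not the implementation used in the study: the flat prior over ln GM and ln GSD, the parameter bounds, the grid size and the category cutoffs (10%, 50% and 100% of the OEL) are all assumptions made for illustration.

```python
import math

Z95 = 1.645  # standard normal 95th percentile


def bda_category_probabilities(data, oel, n_grid=200,
                               gm_bounds=(0.001, 10.0),
                               gsd_bounds=(1.05, 4.0)):
    """Posterior probability that the true 95th percentile lies in each category.

    gm_bounds are fractions of the OEL. Both bounds and the flat prior on
    (ln GM, ln GSD) are illustrative assumptions.
    """
    logs = [math.log(x) for x in data]
    mu_lo, mu_hi = (math.log(b * oel) for b in gm_bounds)
    s_lo, s_hi = (math.log(b) for b in gsd_bounds)
    cutoffs = (0.10 * oel, 0.50 * oel, 1.00 * oel)  # assumed category bounds
    probs = [0.0, 0.0, 0.0, 0.0]
    for i in range(n_grid):
        mu = mu_lo + (mu_hi - mu_lo) * (i + 0.5) / n_grid
        for j in range(n_grid):
            sigma = s_lo + (s_hi - s_lo) * (j + 0.5) / n_grid
            # Log-normal log-likelihood up to terms that do not depend on mu or sigma.
            loglik = sum(-math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2
                         for y in logs)
            x95 = math.exp(mu + Z95 * sigma)
            k = sum(x95 > c for c in cutoffs)  # index 0-3 for Categories 1-4
            probs[k] += math.exp(loglik)
    total = sum(probs)
    return [p / total for p in probs]


probs = bda_category_probabilities([9, 18, 24, 43], oel=100)
print([round(p, 2) for p in probs])
```

The category with the highest posterior probability would be taken as the most likely one. Results depend on the assumed prior and bounds, so this sketch should not be expected to reproduce the study's reference categories exactly.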

The following table contains the rule of thumb calculations for the above data sets as well as the assigned exposure category determined by majority vote by one of the participant groups at the end of the statistics training. In this instance, the exposure category chosen by the participants matched the category selected using Bayesian decision analysis for all eight data sets, demonstrating that the rules of thumb are reasonably accurate when used to select the AIHA exposure category. Note that the purpose of the training was not to suggest that these rules of thumb be used for all future data analyses and interpretations, but instead was to provide the participants, within the context of the study, access to a common distributional model, decision statistic and set of decision rules.


Data set   Median   95th percentile          Exposure category
                    Low    Medium   High
A          10.5     21     42       63       2
B          8        16     32       48       2
C          21       42     84       126      3
D          82       164    328      492      4
E          1.5      3      6        9        1
F          30.5     61     122      183      4
G          11       22     44       66       2
H          21       42     84       126      3

According to the rules of thumb, Data Set F should automatically be given a Category 4 exposure rating due to the fact that one of the six measurements exceeded the OEL. Data Sets B and D consisted of a single measurement. An inspector-type decision rule would invariably result in an assigned exposure category of 1 and 3, respectively. Application of the rules of thumb resulted in the selection of the next higher exposure category.
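The contrast for the single-measurement data sets can be checked with a short sketch. Here an inspector-type rule is read as comparing the lone measurement directly with the category cutoffs. That reading, the cutoffs (10%, 50% and 100% of the OEL) and the function names are assumptions.

```python
# Assumed upper bounds of Categories 1-3 as fractions of the OEL.
CUTOFFS = (0.10, 0.50, 1.00)


def category(value, oel):
    """Exposure category 1-4 under the assumed cutoffs."""
    return 1 + sum(value > frac * oel for frac in CUTOFFS)


def compare_single_measurement(x, oel):
    """Return (inspector-type category, categories of the 2x, 4x and 6x estimates)."""
    inspector = category(x, oel)
    thumb = tuple(category(m * x, oel) for m in (2, 4, 6))
    return inspector, thumb


for label, x in (("B", 8), ("D", 82)):
    print(label, compare_single_measurement(x, oel=100))
```

Under these assumptions all three estimates for Data Sets B and D fall in the next higher category than the direct comparison gives.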


    ACKNOWLEDGEMENTS
 
A special thanks to the study participants who volunteered their time and energy and provided valuable feedback for this study.


    FOOTNOTES
 
The free full text of this article can be found in the online version of this issue.

Received September 22, 2008; in final form January 19, 2009


    REFERENCES
 

American Conference of Governmental Industrial Hygienists. Threshold limit values and biological exposure indices (2005) Cincinnati, OH: ACGIH.

Baron J. Thinking and deciding (2000) 3rd. Cambridge, UK: Cambridge University Press.

Cock J, Kromhout H, Heederik D, et al. Experts' subjective assessment of pesticide exposure in fruit growing. Scand J Work Environ Health (1996) 22:425–32.

Connolly T, Arkes HR, Hammond KR. Judgment and decision making: an interdisciplinary reader (2000) Cambridge, UK: Cambridge University Press.

Corn M, Esmen NE. Workplace exposure zones for classification of employee exposures to physical and chemical agents. Am Ind Hyg Assoc J (1979) 40:47–57.

Friesen MC, Demers PA, Spinelli JJ, et al. Validation of a semi-quantitative job exposure matrix at a Soderberg aluminum smelter. Ann Occup Hyg (2003) 47:477–84.

Gigerenzer G, Hoffrage U. How to improve Bayesian reasoning without instruction: frequency formats. Psychol Rev (1995) 102:684–704.

Gigerenzer G, Hoffrage U. Overcoming difficulties in Bayesian reasoning: a reply to Lewis and Keren (1999) and Mellers and McGraw (1999). Psychol Rev (1999) 106:425–30.

Gigerenzer G, Todd PM, ABC Research Group. Simple heuristics that make us smart (1999) New York, NY: Oxford University Press.

Gilovich T, Griffin D, Kahneman D. Heuristics and biases: the psychology of intuitive judgment (2002) Cambridge, UK: Cambridge University Press.

Griffin D, Tversky A. The weighing of evidence and the determinants of confidence. Cogn Psychol (1992) 24:411–35.

Hawkins NC, Evans JS. Subjective estimation of toluene exposures: a calibration study of industrial hygienists. Appl Ind Hyg (1989) 4:61–8.

Hewett P, Ganser G. A comparison of several methods for analyzing censored data. Ann Occup Hyg (2007) 51:611–32.

Hewett P, Logan P, Mulhausen J, et al. Rating exposure control using Bayesian decision analysis. J Occup Environ Hyg (2006) 3:568–81.

Hubbard DW. How to measure anything—finding the value of "intangibles" in business (2007) John Wiley & Sons.

Ignacio J, Bullock B, eds. A strategy for assessing and managing occupational exposures (2006) 3rd. Fairfax, VA: AIHA Press.

Jayjock MA. Uncertainty analysis in the estimation of exposure. Am Ind Hyg Assoc J (1997) 58:380–2.

Kahneman D, Slovic P, Tversky A. Judgment under uncertainty: heuristics and biases (1982) Cambridge, UK: Cambridge University Press.

Kahneman D, Tversky A. On the psychology of prediction. Psychol Rev (1973) 80:237–51.

Kahneman D, Tversky A. Prospect theory: an analysis of decision under risk. Econometrica (1979) 47:263–91.

Kromhout H, Oostendorp Y, Heederik D, et al. Agreement between qualitative exposure estimates and quantitative exposure measurements. Am J Ind Med (1987) 12:551–62.

Macaluso M, Delzell E, Rose V, et al. Inter-rater agreement in the assessment of solvent exposure at a car assembly plant. Am Ind Hyg Assoc J (1993) 54:351–9.

Meyer MA, Booker JM. Eliciting and analyzing expert judgment—a practical guide (2001) Society for American Statistical Association–Industrial and Applied Mathematics.

Mulhausen J, Damiano J, eds. A strategy for assessing and managing occupational exposures (1998) 2nd. Fairfax, VA: AIHA Press.

Nicas M. An impractical emphasis. The Synergist (2003) 47–52.

Nicas M, Jayjock M. Uncertainty in exposure estimates made by modeling versus monitoring. Am Ind Hyg Assoc J (2002) 63:275–83.

Otway H, von Winterfeld D. Expert judgment in risk analysis and management: process, context and pitfalls. Risk Anal. 12:83–93.

Plous S. The psychology of judgment and decision making (1993) New York, NY: McGraw-Hill.

Post W, Kromhout H, Heederik D, et al. Semiquantitative estimates of exposure to methylene chloride and styrene: the influence of quantitative exposure data. Appl Occup Environ Hyg (1991) 6:197–204.

Ramachandran G. Toward better exposure assessment strategies—the new NIOSH initiative. Ann Occup Hyg (2008) 52:297–301.

Ramachandran G, Banerjee S, Vincent JH. Expert judgment and occupational hygiene: application to aerosol speciation in the nickel primary production industry. Ann Occup Hyg (2003) 47:461–75.

Rock JC. Can professional judgment be quantified? Am Ind Hyg Assoc J (1986) 47:A370.

Surowiecki J. The wisdom of crowds (2005) New York: Random House Anchor Books.

Teschke K, Hertzman C, Dimich-Ward H, et al. A comparison of exposure estimates by worker raters and industrial hygienists. Scand J Work Environ Health (1989) 15:424–9.

Walker KD, Catalano P, Hammitt JK, et al. Use of expert judgment in exposure assessment: Part 2. Calibration of expert judgments about personal exposures to benzene. J Expo Anal Environ Epidemiol (2003) 13:1–16.

Walker KD, Macintosh D, Evans JS. Use of expert judgment in exposure assessment: Part I. Characterization of personal exposure to benzene. J Expo Anal Environ Epidemiol (2001) 11:308–22.

Wild P, Sauleau EA, Bourgkard E, et al. Combining expert ratings and exposure measurements: a random effect paradigm. Ann Occup Hyg (2002) 46:479–87.

