

Annals of Occupational Hygiene Advance Access originally published online on December 21, 2005
Annals of Occupational Hygiene 2006 50(3):305-321; doi:10.1093/annhyg/mei068


© 2005 British Occupational Hygiene Society Published by Oxford University Press


Original Article

Statistical Modelling of Formaldehyde Occupational Exposure Levels in French Industries, 1986–2003

JÉRÔME LAVOUÉ1, RAYMOND VINCENT2 and MICHEL GÉRIN1,*

1 Groupe de recherche interdisciplinaire en santé (GRIS), Département de santé environnementale et santé au travail, Faculté de médecine, Université de Montréal, PO Box 6128, Main Station, Montreal (QC), Canada H3C 3J7; 2 Institut national de recherche et de sécurité, Département de métrologie des polluants, Vandoeuvres-les-Nancy, France

* Author to whom correspondence should be addressed. Tel: +1 (514) 343-6134; fax: +1 (514) 343-2200; e-mail: michel.gerin{at}umontreal.ca


    ABSTRACT
 
Occupational exposure databanks (OEDBs) have been cited as sources of exposure data for exposure surveillance and exposure assessment in epidemiology. In 2003, all formaldehyde concentrations were extracted from COLCHIC, the French national OEDB. The data were analysed with extended linear mixed-effects models in order to identify influential variables and build a multi-sector picture of formaldehyde exposures. In total, 1401 personal and 1448 area concentrations were available for the analysis. The fixed effects of the personal and area models explained 57 and 53% of the total variance, respectively. Personal concentrations were related to the sampling duration (short-term levels higher than TWA levels), decreased with the year of sampling (–9% per year) and were higher when local exhaust ventilation was present. Personal levels taken during planned visits and for occupational illness notification purposes were consistently lower than those taken during ventilation modification programmes or because the hygienist suspected the presence of significant risk or exposure. Area concentrations were related to the sampling duration (short-term levels higher than TWA levels), and decreased with the year of sampling (–7% per year) and when the sampling flow rate increased. Significant within-facility correlation (correlation coefficient 0.4–0.5) and within-sampling campaign correlation (correlation coefficient 0.8) were found for both area and personal data. The industry/task classification appeared to have the greatest influence on exposure variability, while the sample duration and the sampling flow were significant in some cases. Estimates made from the models for the year 2002 showed elevated formaldehyde exposure in the fields of anatomopathological and biological analyses, operation of gluing machinery in the wood industry, operation and monitoring of mixers in the pharmaceutical industry, and garages and warehouses in urban transit authorities.

Keywords: COLCHIC • determinants of exposure • formaldehyde • mixed-effects models • occupational exposure database


    INTRODUCTION
 
Formaldehyde is an irritant gas recently classified as carcinogenic to humans (International Agency for Research on Cancer, 2005). Exposure in a wide array of workplaces is mainly due to its presence in amino and phenolic resins used in several products such as varnishes, glues and plastics. Formaldehyde is also found in sanitizing products, as an intermediate product in chemical synthesis, in histological fixative products and in embalming fluids. The recent change in the IARC classification of formaldehyde from group 2A (probably carcinogenic to humans) to group 1 (carcinogenic to humans), based on sufficient evidence that it causes cancer in humans, constitutes an incentive for improved exposure surveillance in workplaces where formaldehyde is present.

The COLCHIC database is the French national occupational exposure databank (Carton and Goberville, 1989). Set up in 1987, it contains the results of measurements taken since 1986 by the eight interregional chemical laboratories of the French regional health insurance funds and by the laboratories of the Institut national de recherche et de sécurité (INRS). As of 2001, COLCHIC contained >400 000 measurement results taken in 14 000 facilities, corresponding to 600 different substances (Vincent and Jeandel, 2001). In 1995, Carton presented a succinct summary of the 2700 formaldehyde measurements then available in COLCHIC (Carton, 1995).

The potential of occupational exposure databanks (OEDBs) as a source of exposure data has already been mentioned in the literature, mainly for exposure surveillance (Goldman et al., 1992; LaMontagne et al., 2002), exposure assessment for epidemiology (Stewart and Rice, 1990) and regulatory impact assessment (Botkin and Conway, 1995). Several limitations of nation-wide OEDBs have also been discussed (Ulfvarson, 1983; Stewart and Rice, 1990; Gomez, 1997). The main objective of our study was to explore the extent to which the ancillary information provided in COLCHIC allows the prediction of the exposure levels recorded in this OEDB, in order to gain insight into its usefulness for exposure assessment and to produce a multi-sector picture of formaldehyde exposures.


    METHODS
 
The COLCHIC database has been described in detail previously (Carton and Goberville, 1989; Carton, 1995; Vincent and Jeandel, 2001). COLCHIC covers only the workplaces under the authority of the French national security general insurance scheme, thus excluding State services (e.g. defense, education), agriculture, mines, energy production and national mass transit. For the purpose of this study, all formaldehyde concentrations recorded in COLCHIC from 1986 up to September 2003 were extracted from the database. The extract was refined by retaining only quantitative personal or area measurements for which the sampling device was a sampling tube. The analytical method used for the COLCHIC formaldehyde data was presented previously (INRS, 2003). Results reported only as ‘detected’ or ‘superior to’ were excluded (n = 126). A total of 65 records for which the concentration was >10 mg m⁻³ were investigated by contacting the institution that had made the measurements, which led to the exclusion or correction of 7 records.

The resulting dataset was split into ‘area’ and ‘personal’ datasets. Within each dataset, values reported as under a detection limit (concentration < x) were replaced by the detection limit divided by two (x/2) (Hornung and Reed, 1990). Values reported as ‘non-detected’ were replaced by the median of the reported detection limits divided by two [median(x)/2].
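
The substitution rules for censored results can be sketched as follows. This is a minimal illustration rather than the procedure actually run on COLCHIC: the `(status, value)` record layout and the status labels are hypothetical, and the median is taken over the detection limits reported within the dataset passed in.

```python
from statistics import median


def substitute_censored(records):
    """Replace censored results with substituted concentrations.

    Each record is a hypothetical (status, value) tuple:
      ("measured", c)        -> concentration c, kept unchanged
      ("below_limit", x)     -> reported as "< x", replaced by x / 2
      ("non_detected", None) -> replaced by median(reported limits) / 2
    """
    limits = [value for status, value in records if status == "below_limit"]
    fallback = median(limits) / 2 if limits else None

    substituted = []
    for status, value in records:
        if status == "measured":
            substituted.append(value)
        elif status == "below_limit":
            substituted.append(value / 2)
        elif status == "non_detected":
            if fallback is None:
                raise ValueError("no reported detection limit to take a median from")
            substituted.append(fallback)
        else:
            raise ValueError(f"unknown status: {status}")
    return substituted
```

Because the split into ‘area’ and ‘personal’ datasets precedes the substitution, such a function would be applied to each dataset separately.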

Each measurement in COLCHIC can be related to variables indicative of economic activity and task. The ECA variable is a codification of economic activities used to classify companies for occupational illness compensation purposes in France. It is based on the NAF classification with the addition of one character to allow for more precision. The NAF classification is the French four-character classification of economic activities (NAF, Rev. 1) (Ministère de l'économie des finances et de l'industrie, 2003) and is related to the European NACE classification [European Community (EC), 2002]. The TASK variable is a five-character code (with ~700 categories), specific to COLCHIC, which identifies the task or workstation corresponding to a sample (Carton and Goberville, 1989). In order to improve stratified analyses with these variables, a partial aggregation of data across categories was performed, yielding the modified variables Mod-ECA and Mod-TASK (Table 1). When <40 values were available for a five-character category, one character was removed to create a new, broader category. The process was repeated until the category contained 40 values or the code was reduced to two characters. This procedure maximized the number of data per stratum while keeping a precise codification for categories with more values. A peculiarity of this classification algorithm is that when results are presented for a broadened category (e.g. code reduced to two characters), they exclude data in this broad category which belong to finer categories containing >40 values. The variable ECA-TASK was obtained by combining Mod-ECA and Mod-TASK.
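
One possible reading of this aggregation procedure is sketched below. The function name, the level-by-level pass order and the example codes are assumptions made for illustration; the original implementation is not described in more detail than the paragraph above.

```python
from collections import Counter


def aggregate_codes(codes, min_count=40, min_len=2):
    """Broaden sparse category codes one character at a time.

    Working from the longest codes down, any category with fewer than
    `min_count` records loses its last character and is pooled with the
    other records sharing the shortened code. Categories that already hold
    `min_count` records keep their code, so a broadened category never
    contains the records of its well-populated finer categories.
    """
    current = list(codes)
    max_len = max((len(code) for code in current), default=0)
    for length in range(max_len, min_len, -1):
        counts = Counter(current)
        current = [
            code[:-1] if len(code) == length and counts[code] < min_count else code
            for code in current
        ]
    return current
```

With hypothetical codes, 40 records coded `AB123` would keep their five-character code, while 5 records coded `AB124` and 3 coded `AB135` would end up pooled in the two-character category `AB`, which by construction excludes the `AB123` records.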


 
Table 1. Variables tested in the statistical models

 
Statistical modelling
Separate statistical modelling was performed with the personal and area datasets. The response variable selected for the analysis was the natural logarithm of formaldehyde concentrations. The data were analysed with extended linear mixed-effects models, which allow modelling complex variance–covariance structures in the data (Pinheiro and Bates, 2000). The model framework used in this study can be described by the following equation:

$$y_{ijkl} = (\mathrm{Fixed.effects})_{ijkl} + (\mathrm{Lab.effect})_{i} + (\mathrm{Facility.effect})_{ij} + (\mathrm{Campaign.effect})_{ijk} + (\mathrm{Error})_{ijkl} \qquad (1)$$

$$i = 1, \ldots, M; \quad j = 1, \ldots, M_{i}; \quad k = 1, \ldots, M_{ij}; \quad l = 1, \ldots, M_{ijk}$$

where there are $M$ regional laboratories, $M_{i}$ facilities corresponding to the $i$th lab, $M_{ij}$ sampling campaigns in the $j$th facility corresponding to the $i$th lab and $M_{ijk}$ observations in the $k$th sampling campaign in the $j$th facility corresponding to the $i$th lab. The total number of observations is $N = \sum_{i=1}^{M} \sum_{j=1}^{M_{i}} \sum_{k=1}^{M_{ij}} M_{ijk}$, and $y_{ijkl}$ is the logarithm of the $l$th observation in the $k$th sampling campaign in the $j$th facility corresponding to the $i$th lab. The model assumptions are that (1) (Lab.effect), (Facility.effect) and (Campaign.effect) are distributed normally with mean 0, with respective standard deviations $\sigma_{lab}$, $\sigma_{facility}$ and $\sigma_{campaign}$; (2) (Lab.effect), (Facility.effect), (Campaign.effect) and (Error) are statistically independent; and (3) (Error) follows a multinormal distribution with mean 0.

All fixed and random effects tested for inclusion in the models are presented in Table 1, along with descriptive statistics for the continuous variables.

Several modelling approaches were used in order to address specific complexities in our data. First, the Mod-ECA and Mod-TASK were not tested simultaneously for inclusion in the statistical models. Indeed their simultaneous presence would imply that for each industrial category the model could predict exposure for every kind of task, which is not possible since many tasks are industry-specific. Therefore, for both the personal and area datasets, three models were constructed, one in which Mod-ECA was tested along with the other potential predictors, one with the variable Mod-TASK and one with the variable ECA-TASK.

Moreover, depending on the classification used, the number of data belonging to categories with sufficient data to permit estimation varied considerably. To address this issue and allow comparison of the different models, a dataset common to all industrial classification schemes was used. This dataset only included data belonging to categories of ECA-TASK with at least 30 values.

Finally, the measurement durations in COLCHIC ranged from a few minutes to several hours. Peak, short-term and TWA measurements might not be influenced by the same determinants. Therefore, the predictions of the final models were compared with those obtained with the same models fitted to a dataset restricted to sample durations >1 h.

The sample duration was tested both as a continuous (LENGTH; Table 1) and a nominal (TYPE; Table 1) variable. The categories of TYPE, initially defined by the cut-points 15, 30, 60 and 120 min, were refined during the modelling by comparison with models fitted with aggregated categories. For all nominal variables, estimates of the model coefficients for categories containing <10 values were not reported. Data classified in the ‘unknown’ or ‘other’ category of a nominal variable were excluded. The random effect structure tested in our model was a nested structure ‘campaign in facility in lab’ (equation 1). The correlations between measurements taken in the same group of the first, second and third levels of classification were estimated as follows:

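  1. Correlation of measurements taken in the same lab but different facilities and sampling campaigns

    $$\rho_{lab} = \frac{\sigma_{lab}^{2}}{\sigma_{lab}^{2} + \sigma_{facility}^{2} + \sigma_{campaign}^{2} + \sigma_{W}^{2}} \qquad (2)$$
    where $\sigma_{W}^{2}$ is the residual variance.

  2. Correlation of measurements taken in the same facility but during different sampling campaigns

    $$\rho_{facility} = \frac{\sigma_{lab}^{2} + \sigma_{facility}^{2}}{\sigma_{lab}^{2} + \sigma_{facility}^{2} + \sigma_{campaign}^{2} + \sigma_{W}^{2}} \qquad (3)$$

  3. Correlation of measurements taken during the same sampling campaign

    $$\rho_{campaign} = \frac{\sigma_{lab}^{2} + \sigma_{facility}^{2} + \sigma_{campaign}^{2}}{\sigma_{lab}^{2} + \sigma_{facility}^{2} + \sigma_{campaign}^{2} + \sigma_{W}^{2}} \qquad (4)$$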
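
The three correlations can be computed from the variance components as sketched below, assuming the usual form for nested random effects in which two measurements sharing a level also share every level above it, so that each correlation is the shared variance divided by the total variance. The function name and the numerical values are illustrative only and are not estimates from the study.

```python
def nested_correlations(sd_lab, sd_facility, sd_campaign, sd_residual):
    """Intragroup correlations for a 'campaign in facility in lab' structure.

    Inputs are standard deviations (the scale on which mixed-model software
    usually reports random effects); they are squared internally.
    """
    var_lab = sd_lab ** 2
    var_facility = sd_facility ** 2
    var_campaign = sd_campaign ** 2
    var_residual = sd_residual ** 2
    total = var_lab + var_facility + var_campaign + var_residual
    return {
        # same lab, different facilities and campaigns
        "lab": var_lab / total,
        # same facility, different campaigns
        "facility": (var_lab + var_facility) / total,
        # same sampling campaign
        "campaign": (var_lab + var_facility + var_campaign) / total,
    }


# Hypothetical variances of 0, 0.5, 0.3 and 0.2 on the log scale
example = nested_correlations(0.0, 0.5 ** 0.5, 0.3 ** 0.5, 0.2 ** 0.5)
```

By construction the three coefficients are ordered lab ≤ facility ≤ campaign, because each level adds a shared variance term to the numerator.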

The lme function of the S-plus software, which was used in our study, provides estimates of the different intergroup variabilities (e.g. $\sigma_{campaign}$). This implies variable intragroup correlation coefficients (e.g. $\rho_{campaign}$) when $\sigma_{W}$ is modelled as dependent on other variables. The intragroup correlation coefficients presented in the results section were therefore calculated for an average residual standard deviation.

The potential dependency of $\sigma_{W}$ on other variables (i.e. heteroscedastic model) was tested in several ways in our study: $\sigma_{W}$ modelled as different for each category of a nominal variable (all nominal variables in Table 1 were tested), as illustrated in equation 5; $\sigma_{W}$ varying linearly (equation 6) or exponentially (equations 7 and 8) with a continuous variable (the response and all continuous variables in Table 1 were tested).

Formula 5(5)
with $\beta_{i}$ to be estimated, i = 1 to the number of categories of the nominal variable

Formula 6(6)

Formula 7(7)

Formula 8(8)
with C and $\beta$ to be estimated and X the continuous variable tested. Only first-order interactions were tested for inclusion in the models.

REML optimization was used to choose the random effects and residual variance structures, and to estimate the final model parameters. ML optimization was used to compare models with different fixed-effects structures. Two series of models were constructed, one using the Akaike information criterion (AIC) and the other using the Bayesian information criterion (BIC). Model building first involved a manual forward stepwise routine that used the discriminating criterion to select the fixed-effects structure. The best random effect structure was then added by comparing the criteria of the possible models. The variance structure for the residual variance was assessed in the same way. Finally, the fixed effects were retested for removal or addition of variables, and the random effects and variance structure were adjusted again if the fixed-effects model had changed.
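
The difference between the two criteria can be illustrated with their textbook definitions. The sketch below is not the S-plus computation itself (software may count parameters or observations slightly differently, notably under REML), and the log-likelihoods and parameter counts are hypothetical; only the sample size of 1401 is taken from the personal modelling dataset.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 logL + 2 k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 logL + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical comparison: a richer model gains 50 log-likelihood units
# at the cost of 20 extra parameters, with n = 1401 observations.
delta_aic = aic(-950.0, 30) - aic(-1000.0, 10)              # negative: AIC prefers the richer model
delta_bic = bic(-950.0, 30, 1401) - bic(-1000.0, 10, 1401)  # positive: BIC prefers the simpler one
```

Each extra parameter costs 2 units under AIC but ln(1401), about 7.2 units, under BIC, which is why a term adding many parameters can be retained by one criterion and rejected by the other.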

In order to maximize the power of the analysis, the datasets were not split into a ‘model building set’ and a ‘validation set’. Rather, the whole datasets were used for the construction of the models. Internal validation was conducted by graphical assessment of residuals and estimates of random effects.

In order to illustrate the quantitative influence on exposures of the fixed effects coded as nominal variables, relative indices of exposure (RIE; equation 9) were calculated (Lavoué et al., 2005). For each variable, the category corresponding to the highest number of observations (the reference category) was assigned the value 100%.

$$\mathrm{RIE}_{levelA} = 100 \times \exp(\mathrm{Coeff}_{levelA} - \mathrm{Coeff}_{levelRef}) \qquad (9)$$
with $\mathrm{RIE}_{levelA}$ the relative index of exposure for level A of the variable in question, $\mathrm{Coeff}_{levelA}$ the estimated coefficient corresponding to the category A and $\mathrm{Coeff}_{levelRef}$ the estimated coefficient corresponding to the reference category. $\mathrm{Coeff}_{levelRef}$ is 0 when the reference category is included in the intercept. Thus, relative to the reference category, exposure levels associated with other categories are estimated as percentages.
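
A minimal sketch of the RIE calculation follows, assuming that the index is the exponentiated difference between two coefficients of the log-concentration model, scaled so that the reference category equals 100%. The function name and the coefficient values are hypothetical.

```python
import math


def relative_index_of_exposure(coeff_level, coeff_ref=0.0):
    """RIE in percent for one category relative to the reference category.

    Coefficients are on the natural-log concentration scale; coeff_ref is 0
    when the reference category is absorbed in the intercept.
    """
    return 100.0 * math.exp(coeff_level - coeff_ref)


rie_reference = relative_index_of_exposure(0.0)        # reference category itself
rie_doubled = relative_index_of_exposure(math.log(2))  # coefficient of ln(2)
```

Under this assumption a coefficient of ln(2) corresponds to an RIE of 200%, that is, exposure levels twice those of the reference category.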

All analyses were conducted using the statistical software S-plus 6.1 Professional Edition for Windows, Release 1 (Insightful Corp., Seattle, WA).


    RESULTS
 
Descriptive analysis
The extract from COLCHIC contained 7392 formaldehyde measurements corresponding to the preliminary criteria mentioned in the methods, with 44 and 56% of personal and area measurements, respectively. These were taken between 1986 and 2003 in 692 facilities covering 259 five-character economic activity codes. Respectively, 52 and 43% of the personal and area data belonged to categories of ECA-TASK with >30 values, leading to modelling datasets with 1401 and 1448 values.

During the compilation of the datasets, it was found that each result in COLCHIC corresponds to one analytical result instead of an ‘exposure’ result. Hence, several results may actually correspond to one ‘full shift exposure’ evaluated with several consecutive samples (e.g. one personal full shift exposure evaluated with a ‘morning’ and an ‘afternoon’ sample). Since there is no variable in COLCHIC allowing the automated pairing of this type of result, all data were taken as separate measures of exposure. Respectively, 2.7 and 4.3% of the personal and area measurements were reported as not detected or under a limit of detection.

Statistical modelling
Six models were developed to accommodate the separate analyses of Mod-ECA, Mod-TASK and ECA-TASK for both personal and area measurements. Table 2 summarizes the main features of the final models and presents the differences between models built based on AIC and BIC. The AIC models are described in detail below.


 
Table 2. Main features of the six final mixed-effects models for the TWA measurements

 
For both area and personal measurements, the model based on ECA-TASK explained the most variance (53 and 57%, respectively) and had the lowest AIC value (although generally not the lowest BIC value, because of the substantial number of additional parameters). In the following section only numerical results from the area and personal ECA-TASK models are presented, except when significant differences existed with models based on Mod-ECA or Mod-TASK. Tables 3 and 4 show, for personal and area measurements, respectively, the observed influence on exposure levels of all fixed effects except the ECA-TASK classification, reported as rates of change for continuous variables and as RIEs for nominal variables.


 
Table 3. Effects on exposure of the fixed effects in the personal ECA-TASK model

 

 
Table 4. Effects on exposure of the fixed effects in the area ECA-TASK model

 
In the personal model, both variables representing the sample duration (LENGTH and TYPE) were related to the response. The best fit was obtained with TYPE as a dichotomous variable defining short-term (<15 min) and TWA (>15 min) measurements, and interacting with LENGTH. This corresponds to different intercepts for short-term and TWA measurements as well as different slopes for the influence of the sample duration as a continuous variable. The TYPE variable also interacted with ECA-TASK, the reason for sampling (AGMOTIV) and the type of local exhaust ventilation (COLPROT). Estimated at 10 min for the short-term measurements and 100 min for the TWA measurements (the respective median sample durations for these categories), the median ratio of short-term to TWA concentrations was 2.5 (varying between 1 and 4 across categories of ECA-TASK). The variable ECA-TASK explained the highest proportion of the total variability compared with the other fixed effects (partial r² = 0.32).

For the area model, a structure similar to that of the personal model was found to best represent the influence of sample duration on concentrations: a combination of LENGTH and TYPE. In this case, refining the TYPE variable also led to a dichotomous variable, but the cut-point that best differentiated short-term and TWA area measurements was 60 min. The TYPE variable interacted with ECA-TASK, LENGTH and FLOW. Estimated at 30 min for the short-term measurements and 100 min for the TWA measurements, the median ratio of short-term to TWA concentrations was 1.9 (varying between 0.1 and 20 across categories of ECA-TASK). The variable ECA-TASK explained the highest proportion of the total variability compared with the other fixed effects (partial r² = 0.38).

Inclusion of a random effect structure resulted in an improvement of the fit for all models with an average reduction of the AIC and BIC of 13%. The variables identifying the sampling campaign and the facility were included in personal and area models. The estimated within-facility correlation coefficients were 0.5 and 0.4, respectively, for the personal and area models. Within-campaign correlation coefficients were 0.8 for both models. Inclusion of the variable lab as a random effect resulted in an improvement of the fit only for the area models, with an estimated within-lab correlation coefficient of 0.08.

Addition of a heteroscedastic structure for the error term also improved the model fit, with an average reduction in the AIC and BIC of 5%. For the personal model, the residual variance was different for each category of ECA-TASK and TYPE (equation 5) and varied exponentially with the sample duration (equation 7). The residual standard deviation ($\sigma_{W}$) decreased when the sample duration increased, at a rate of 24% per 5 min for the short-term measurements and 5% per 60 min for the TWA measurements. In addition to this difference in the rate of decrease, the residual standard deviation of the short-term measurements was higher than that of the TWA measurements by a factor of 2. This resulted in global GSDs, across categories of ECA-TASK, varying between 3 and 4.7 for the short-term measurements, and between 2.9 and 3.3 for the TWA measurements. For the area model, $\sigma_{W}$ was different for each category of ECA-TASK (equation 5) and varied exponentially with the sample flow (equation 7). When the sampling flow increased by 0.1 litre min⁻¹, $\sigma_{W}$ decreased by an amount that depended on the sample duration: 1.9% for <15 min, 6.4% for 15–30 min, 9.2% for 30–60 min, 9.4% for 60–120 min and 9.9% for >120 min. This resulted in global area GSDs, across categories of ECA-TASK, varying between 3.4 and 8.4 for the short-term measurements, and between 3.4 and 6.7 for the TWA measurements.
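
As a worked illustration of the personal-model rates, suppose the exponential variance function takes the form $\sigma_{W}(X) = \sigma_{0} \exp(\beta X)$, with $X$ the sample duration in minutes. This specific form and the symbol $\sigma_{0}$ are assumptions made for the example. A constant percentage decrease per step of duration then translates into a coefficient $\beta$ as follows:

```latex
% Short-term measurements: 24% decrease per 5 min
\frac{\sigma_{W}(X + 5)}{\sigma_{W}(X)} = e^{5\beta} = 1 - 0.24
\quad\Rightarrow\quad
\beta = \frac{\ln(0.76)}{5} \approx -0.055\ \mathrm{min}^{-1}

% TWA measurements: 5% decrease per 60 min
\frac{\sigma_{W}(X + 60)}{\sigma_{W}(X)} = e^{60\beta} = 1 - 0.05
\quad\Rightarrow\quad
\beta = \frac{\ln(0.95)}{60} \approx -8.5 \times 10^{-4}\ \mathrm{min}^{-1}
```

Under this assumed form, the residual spread of short-term samples shrinks roughly 60 times faster per minute than that of TWA samples.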

The personal and area combined (ECA-TASK) models were selected to predict personal and area concentrations for different combinations of economic sectors and tasks. They were chosen because they compare favourably with the other models and because their interpretation, which takes into account both the industry and task variables, seems more appropriate in the framework of occupational exposure assessment. For the personal model, annual population GMs and global and within-facility GSDs were estimated for the short-term measurements for the year 2002, with a sampling duration of 10 min (the median sample duration of short-term measurements), the influence of the type of LEV taken as an average effect (calculated by weighting the coefficients by their corresponding proportion of measurements in the modelling datasets) and the purpose of the sampling set to ‘systematic survey’. This category was chosen because the authors believe it to be associated with the monitoring of more representative and ‘average’ exposure conditions than the other categories. The TWA personal measurements were predicted under the same conditions but for a sample duration of 100 min (the median duration of TWA measurements). For the area model, the GMs and GSDs for the short-term measurements were estimated for the year 2002, with a sampling duration of 30 min (median duration of area short-term samples), a sampling flow rate set at the median of the values of the modelling dataset (0.98 litre min⁻¹) and the influence of the absence/presence of mechanical ventilation taken as an average effect (calculated by weighting the coefficients by their corresponding proportion of measurements in the modelling datasets). The TWA area measurements were predicted under the same conditions but for a sample duration of 100 min (the median duration of TWA measurements).
Population AMs and probabilities of exceedance of the French short-term exposure limit (1.2 mg m⁻³) for short-term measurements and of the 8 h exposure limit (0.6 mg m⁻³) for TWA measurements were also calculated. These estimates, along with the sample sizes and the raw GMs and GSDs for each category, are presented in Tables 5 and 6 for short-term and TWA personal measurements and in Tables 7 and 8 for area measurements, respectively.
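
The way the AMs and exceedance probabilities were derived is not detailed here. A common approach, assumed in the sketch below, is to treat exposures within a category as lognormal with the predicted GM and global GSD. The function names and the example GM and GSD are hypothetical; only the limit of 0.6 mg m⁻³ comes from the text.

```python
import math


def lognormal_am(gm, gsd):
    """Arithmetic mean of a lognormal distribution: GM * exp(0.5 * ln(GSD)^2)."""
    return gm * math.exp(0.5 * math.log(gsd) ** 2)


def exceedance_probability(gm, gsd, limit):
    """Probability that a lognormal exposure exceeds `limit`."""
    z = (math.log(limit) - math.log(gm)) / math.log(gsd)
    # Upper tail of the standard normal distribution, 1 - Phi(z)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical category with GM = 0.2 mg/m3 and GSD = 3, against the 8 h limit
am = lognormal_am(0.2, 3.0)
p_exceed = exceedance_probability(0.2, 3.0, 0.6)
```

Under the lognormal assumption the AM always exceeds the GM, and a category whose GM equals the limit has an exceedance probability of exactly 50% whatever its GSD.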


 
Table 5. Short-term personal exposure predictions for the year 2002 in combinations of industries and tasks

 

 
Table 6. TWA personal exposure predictions for the year 2002 in combinations of industries and tasks

 

 
Table 7. Short-term area exposure predictions for the year 2002 in combinations of industries and tasks

 

 
Table 8. TWA area exposure predictions for the year 2002 in combinations of industries and tasks

 
For both personal and area measurements, 24 ECA-TASK categories were available for comparing the TWA predictions of the final models with the predictions of models fitted to data restricted to sample durations >1 hour. The personal and area ‘>1 hour’ models were fitted to 628 and 856 values, respectively. With the ‘>1 hour’ prediction as the reference, the comparison for the personal measurements yielded an average bias of 17% and a rank correlation of 0.95. These values were –13% and 0.91, respectively, for the area measurements.


    DISCUSSION
 
Statistical modelling
The industrial classification scheme we used in this study led to the rejection of ~50% of the data for the statistical modelling. This was caused by the use of the combination of industry and task as the main ‘activity’ classification, the exclusion of data in categories with <30 values and, to a lesser extent, the algorithm used to refine the classifications. As an illustration, the use of only a two-digit industrial classification would have yielded a sample size of 1925 for the personal data, compared with the 1401 in our models. However, Tables 5–8 show several cases (e.g. health sector) that demonstrate the need for a combination of industry and task and for a fine classification. Moreover, models built by replacing our classification with two-digit variables (four digits for the combination of industry and task) always showed a worse fit in terms of AIC (and in several cases in terms of BIC). Finally, testing modelling datasets that did not exclude categories with few data resulted in hundreds of additional parameters to estimate, which caused convergence issues for the ML and REML estimation methods.

As expected, Table 2 shows that BIC models are ‘included’ in the AIC models. We chose the AIC models as the final models because of the absence of the interaction between sample type and industry/task in the BIC models. Hence, although the differences between short-term and TWA samples across industries might not be of sufficient amplitude compared with the number of added parameters to be retained by BIC, we thought that their inclusion would provide more accurate predictions. Moreover it is plausible that short-term and TWA measurements do not relate in the same way across occupational settings.

The percentages of total variability explained by the fixed effects of personal and area models (ranging from 35 to 57%; Table 2) compare favourably with similar studies in the field of occupational exposure assessment (ranging between 20 and 70%) (Burstyn and Teschke, 1999).

The measurement duration was a major determinant of exposure levels in our study. Testing several ways of classifying measurements according to their duration yielded a dichotomous structure for both personal and area measurements (<15 min versus >15 min and <60 min versus >60 min, respectively), with short-term measurements associated on average with exposures two times higher than TWA measurements. The difference between personal and area samples might be related to different strategies used to choose the sampling duration. Formaldehyde concentrations also decreased continuously with the sample duration at variable rates (~10% per hour for TWA measurements and 10% per 5 min for short-term measurements). A similar influence was observed by Raaschou-Nielsen et al. (a decrease of 50% corresponding to an increase of 108 min) and Lavoué et al. (a decrease of 5% corresponding to an increase of 60 min) (Raaschou-Nielsen et al., 2002; Lavoué et al., 2005). Hence, longer sampling times for personal measurements are more likely to include ‘no exposure’ periods, while for area measurements it is plausible that hygienists tend to sample for shorter periods in situations in which they know the ambient levels to be elevated. The stronger influence of duration on short-term measurements can be explained by the progressive inclusion of ‘no peak’ periods in samples lasting from a few minutes to more than 10 min. Comparison of the model predictions for the TWA data with those of similar models fitted to data restricted to sample durations >1 h showed good agreement. This suggests that our models adequately reflect differences between the short-term and TWA measurements in COLCHIC. However, it remains unclear how our observations for TWA measurements relate to full shift exposures, since 80% of the data in COLCHIC correspond to durations <2 h.

A significant decrease of exposure levels over time was present in all models, ranging from 7% per year to 9% per year. This pattern is consistent with the generic temporal trends in occupational exposures reported by Symanski et al. (1998, 2001).
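
To give a sense of scale for these annual rates, the sketch below compounds a constant percentage change over several years. It assumes that the reported rate corresponds to exp(β) − 1 for a log-scale year coefficient β and that the trend is constant over the whole period; both are assumptions, and the function names are hypothetical.

```python
import math


def log_slope_from_percent(percent_per_unit):
    """Log-scale coefficient for a constant relative change per unit.

    For example, -9 (percent per year) gives ln(0.91).
    """
    return math.log(1.0 + percent_per_unit / 100.0)


def relative_level(percent_per_unit, n_units):
    """Level after n_units, relative to the starting level (1.0 = unchanged)."""
    return math.exp(log_slope_from_percent(percent_per_unit) * n_units)


# Sixteen years separate 1986 from the prediction year 2002
level_at_9_percent = relative_level(-9.0, 16)
level_at_7_percent = relative_level(-7.0, 16)
```

Under these assumptions, levels in 2002 would be roughly 0.22 times those of 1986 at –9% per year and roughly 0.31 times at –7% per year.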

The variable identifying the type of LEV was influential in the personal models, with exposures associated with LEV higher than those associated with no LEV. This observation is probably explained by LEV being installed preferentially in the more contaminated workplaces. TWA measurements with enclosing LEV tended to be lower than those with non-enclosing LEV, but this trend was reversed for short-term measurements. The absence of this variable from the area models is plausible, since LEV is likely to prevent personal exposures rather than ambient levels. The variable identifying the presence of mechanical ventilation was present only in the area models, with significantly higher exposure associated with the absence of any mechanical ventilation. These results are plausible and tend to show that mechanical ventilation is influential, especially since the effect is observed in spite of the ‘selection bias’ mentioned for the LEV variable. This conclusion is however somewhat weakened by the absence of any observable effect of the general ventilation on personal concentrations.

The reason for sampling was related to formaldehyde personal levels. Visits corresponding to suspicion of health risk, suspicion of exposure and modification of the ventilation were associated with higher exposure levels than those corresponding to systematic surveys, notification of occupational illness and observation of health effects. Suspicion of exposure represented 63% of the analysed data. Higher exposures linked with modification of the ventilation are not unexpected since it is plausible that these changes would take place in contaminated workplaces. A parallel might be drawn between the higher levels observed for ‘suspicion of exposure’ in our study and several reports that IMIS concentrations measured after ‘complaints’ are higher than those obtained during ‘planned visits’ (Froines et al., 1990; Stewart and Rice, 1990; Melville and Lippmann, 2001). It is therefore likely that exposures deemed potentially significant by the hygienist (in the case of COLCHIC) or the employees (in the case of the IMIS) are actually higher than the concentrations measured during visits for which no preliminary assessment hinted at potentially high exposures. The conclusions that might be drawn from our observations are, however, limited by the fact that the variable was absent from the area models and would not have been included in the personal models if BIC had been used instead of AIC. Short-term measurements differed notably from TWA measurements only for the ‘suspicion of health risk’ category, with lower exposure relative to the ‘suspicion of exposure’ category.

The sampling flow rate was present in the area models, with an increase of 0.1 litre min–1 associated with a 6% decrease in TWA concentrations and almost no influence on short-term measurements. Interpretation of these results is not straightforward, especially given the fact that the effect was observed independently of the effect of the sample duration. The potential collinearity between these variables was explored in several ways. The Pearson correlation coefficients between duration and flow were –0.30 and –0.37 for the area and personal measurements, respectively. Taking the variable flow out of the area models containing both flow and duration only marginally modified the effect of duration. We therefore conclude that the moderate correlation between the sampling flow and sampling duration did not cause significant confounding between these variables in our models. The sampling flow varied equally between and within laboratories and had an interquartile interval of 0.5–1 litre min–1. The recommended sampling flow rates for the analytical method are between 0.2 and 1 litre min–1. The effects of the sampling flow were not changed when the analysis was restricted to sampling flows within these limits. Since the sampling method for formaldehyde in COLCHIC involves derivatization of formaldehyde with 2,4-dinitrophenylhydrazine (DNPH), a possible explanation might be incompleteness of the derivatization reaction due to a limited reaction rate. However, while this phenomenon has been documented for the sampling method involving derivatization with (2-hydroxymethyl) piperidine (NIOSH, 1994; US Department of Labor, 1994), no similar observations were found for the DNPH method. The most plausible explanation, according to the authors, is related to the sampling strategy: hygienists would tend to increase the sampling flow rate when they suspect low concentrations in order to ensure that a sufficient quantity of the substance is retained on the tube.
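One common way to gauge whether a correlation of this size between two predictors is a concern is the variance inflation factor (VIF). This diagnostic was not part of the analysis described above; it is shown here only as an illustration. The sketch assumes the simplest case of two predictors, for which the VIF reduces to 1 / (1 − r²); in models with more covariates the VIF depends on the multiple correlation with all other predictors and could be larger.

```python
def vif_two_predictors(r: float) -> float:
    """Variance inflation factor for a model with exactly two predictors
    whose Pearson correlation is `r`.

    Assumption: only these two predictors are considered, which is a
    simplification of the models discussed in the text.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Correlations between sampling duration and flow reported in the text.
correlations = {"area": -0.30, "personal": -0.37}

for label, r in correlations.items():
    vif = vif_two_predictors(r)
    # The standard error of each coefficient is inflated by sqrt(VIF).
    print(f"{label}: r = {r:+.2f}, VIF = {vif:.2f}, SE inflation = {vif ** 0.5:.2f}")
```

Both values are close to 1, which is in line with the conclusion that the moderate correlation between flow and duration did not materially distort the estimated effects.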

The area and personal models yielded very similar nested random effect structures, with a coefficient of correlation between measurements taken during the same sampling campaign of ~0.8 and of 0.4 between measurements taken in the same plant but during different sampling campaigns. For the area models, a weak correlation of 0.08 was detected between measurements taken by the same laboratory but in different sampling campaigns and plants. Lavoué et al. measured within-sampling campaign correlations of 0.17 and 0.56 for area and personal formaldehyde measurements in the wood panel industry in Quebec (Lavoué et al., 2005). Teschke et al. report a correlation of 0.31 between wood dust levels measured during the same inspection in data taken from OSHA's occupational exposure database IMIS (Teschke et al., 1999). The high within-sampling campaign correlation observed in our study may be partly due to the fact that most sampling campaigns lasted 1 day. Moreover, it was impossible to combine ‘analytical results’ corresponding to tubes used sequentially to evaluate exposure for a longer period. This probably also caused an increase of the observed correlation. The estimated correlation between measurements taken in the same plant shows that plant-specific determinants of exposure not accounted for by the variables present in our model influence formaldehyde exposures. The very low within-laboratory correlation observed in our study allows us to conclude that no strong ‘region’-specific differences exist in the sampling strategies used by the teams collecting data for COLCHIC.
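These correlations can be read as shares of the total random variance. The sketch below assumes a simple nested random-intercept structure (sampling campaign within plant) with a single residual variance, ignoring the laboratory effect and the heteroscedasticity of the error term; it is therefore only an approximate reading of the reported values. The function name is hypothetical.

```python
def variance_shares(rho_same_campaign: float, rho_same_plant: float) -> dict:
    """Split total random variance into plant, campaign and residual shares.

    Assumed model: total = var_plant + var_campaign + var_residual, with
      corr(same campaign)                   = (var_plant + var_campaign) / total
      corr(same plant, different campaign)  = var_plant / total
    """
    if not 0.0 <= rho_same_plant <= rho_same_campaign <= 1.0:
        raise ValueError("expected 0 <= rho_same_plant <= rho_same_campaign <= 1")
    return {
        "plant": rho_same_plant,
        "campaign": rho_same_campaign - rho_same_plant,
        "residual": 1.0 - rho_same_campaign,
    }


# Approximate correlations reported in the text for both types of models.
shares = variance_shares(rho_same_campaign=0.8, rho_same_plant=0.4)
for component, share in shares.items():
    print(f"{component}: {share:.0%} of the total random variance")
```

Under these assumptions, differences between plants and differences between campaigns within a plant each account for about 40% of the unexplained variability, leaving about 20% for variation within a campaign.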

Significant structures of heteroscedasticity of the error term were found in both area and personal models. In all models, the variable identifying the industrial activity was strongly predictive of the residual variance. Such differences in exposure variability across industries have already been reported (Kromhout et al., 1993). Short-term personal measurements were more variable than personal TWA measurements. In addition to this absolute difference, the variability of short-term measurements decreased continuously with the sample duration, this influence being almost non-existent for the TWA measurements. The influence of sample duration on the variability of exposure levels has a theoretical basis and has already been observed (Preat, 1987; Kumagai and Matsunaga, 1995). In the area models, an increase of the sampling flow was associated with a decrease of the residual variability, with a progressively greater influence for longer measurements. No plausible interpretation was found for this observation. The variability of exposure levels determines the number of samples necessary to assess an exposure situation with adequate precision. The estimated within-facility GSDs presented in Tables 5–8 will therefore be useful for industrial hygienists to help determine sample sizes a priori when devising sampling strategies in the sectors represented. Furthermore, with regard to statistical modelling, estimates of other parameters in the model depend on the variance–covariance structures in the case of unbalanced data. In our study, not taking into account the correlation structure would have caused an underestimation of the standard errors on the model coefficients by a factor of 2–3. It therefore appears important to take into account and explore such structures of variability when modelling occupational exposures.
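As an illustration of how a GSD could inform an a priori sample size, the sketch below uses a textbook normal approximation for the confidence interval of a geometric mean. It assumes independent, lognormally distributed measurements and treats the GSD as known; correlated measurements, such as those taken during the same campaign, would require more samples. The GSD values used are hypothetical and are not taken from Tables 5–8.

```python
import math
from statistics import NormalDist


def samples_needed(gsd: float, precision_factor: float, confidence: float = 0.95) -> int:
    """Approximate number of independent samples needed so that the
    confidence interval of the geometric mean spans from GM / precision_factor
    to GM * precision_factor.

    Assumptions: lognormal exposures, independent samples, known GSD,
    normal approximation on the log scale.
    """
    if gsd <= 1.0 or precision_factor <= 1.0:
        raise ValueError("gsd and precision_factor must both exceed 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    n = (z * math.log(gsd) / math.log(precision_factor)) ** 2
    return math.ceil(n)


# Hypothetical within-facility GSDs, chosen only to show the trend.
for gsd in (2.0, 2.5, 3.5):
    n = samples_needed(gsd, precision_factor=1.5)
    print(f"GSD = {gsd}: about {n} samples to estimate the GM within a factor of 1.5")
```

The required number of samples grows with the square of ln(GSD), which is why sector-specific estimates of variability matter when planning a sampling strategy.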

Validity of the statistical models
The internal validity of our models appears satisfactory considering the results of the graphical assessments of residuals and estimates of random effects. We thus conclude that the models developed in our study adequately reflect formaldehyde exposure levels found in the COLCHIC database.

The relevance of our results regarding actual exposure conditions in the general workplace is limited by the potential biases present in this OEDB, caused by the different strategies used to select the workplaces visited and the jobs, workers and tasks monitored. Very few authors have compared OEDB exposure data with external exposure sources to assess the extent of their inherent biases. Olsen et al. found that the levels of exposure to solvents in the furniture industry collected with a random sampling strategy were lower than those found in the Danish OEDB ATABAS, which in turn were similar to measurements taken in a random sample of facilities but during specific exposure-generating tasks (Olsen et al., 1991). Vinzents et al. compared xylene exposure data from five European OEDBs in three industrial sectors (Vinzents et al., 1995). They report lower median levels in the two databases for which the data are collected for insurance purposes, as opposed to compliance with exposure limits for the three other databases. Lavoué et al. observed that formaldehyde exposure data collected by governmental hygienists in the wood panel industry in Quebec were consistently higher than data collected in the same plants by a research team (Lavoué et al., 2005). Several studies of the US OEDB IMIS have reported insights on the potential biases contained in this database, mostly gained from the analysis of the influence on exposure levels of variables identifying the purpose of the sampling or characteristics of the workplace (Oudiz et al., 1983; Froines et al., 1986, 1990; Stewart and Rice, 1990; Gomez, 1997; Teschke et al., 1999; Coble et al., 2001). The use of statistical models in our study allowed estimation of the simultaneous influence of several variables on the concentrations stored in COLCHIC, making it possible to compensate to some extent for the ‘strategy’ and ‘selection’ biases in this OEDB.

Estimation of current exposure levels from the models
Tables 5–8 illustrate an attempt to draw a global portrait of exposure levels present in COLCHIC by taking into account influential variables (e.g. decreasing temporal trend) and by correcting for the ‘strategy’ biases inherent to this OEDB. The estimated GMs are consistently lower than the raw GMs, by median factors from 2 to 3.5 depending on the table. This decrease is expected since exposures were estimated for the year 2002 and a significant decreasing trend over years was observed. Personal predicted GMs are further decreased by our assumption that data collected during systematic surveys are less prone to upward biases than data collected when potential elevated exposure was suspected by the hygienists. Therefore the agreement, or lack thereof, between predicted and raw GMs should not be seen as a way of evaluating the predictive ability of the models but rather as an illustration of how the models control for the variables identified as determinants.

The predicted exposure levels in Tables 5–8 show several industries and activities associated with elevated exposure levels relative to the French limits, mainly anatomopathological and biological analyses in the health sector and several operations of gluing machinery in the wood carpentry and wood panel industry for personal exposures. In addition, ambient exposures were also elevated for the operation and monitoring of mixers in the pharmaceutical and chemical industry, and in garages in the urban public transportation sector. Most of these sectors have been mentioned in reviews on occupational exposure to formaldehyde (International Agency for Research on Cancer, 1995; Niemelä et al., 1997).

While we believe that the estimates presented in Tables 5–8 constitute a step towards the use of exposure levels in OEDBs for exposure assessment, several limitations prevent their direct use as a job-exposure matrix. First, as discussed previously, the models presented in our study were not validated with exposure data external to the COLCHIC database. Moreover, the validity of the correction we used to account for a potential ‘sampling strategy’ bias still has to be confirmed with studies on other substances in COLCHIC. Furthermore, COLCHIC does not provide identification (even anonymous) of individuals, which prevents estimation of within- and between-worker variances. Finally, COLCHIC does not cover all workplaces in France and our analysis was restricted to a subset of the economic activities with formaldehyde data (those with the most data) in COLCHIC. Other sources would have to be used to complete the portrait of formaldehyde occupational exposure (Valiante et al., 1992).

Several authors have underlined the potential usefulness of OEDBs for exposure surveillance, exposure assessment for epidemiology or regulatory impact assessment (Stewart and Rice, 1990; Goldman et al., 1992; Gomez, 1993; Botkin and Conway, 1995; LaMontagne et al., 2002). Our results support the idea that the analysis of exposure levels stored in OEDBs with refined statistical tools may significantly facilitate their interpretation within these frameworks.


    CONCLUSION
 
Through statistical modelling of area and personal exposure measurements contained in the French occupational exposure database COLCHIC, several determinants of occupational exposure to formaldehyde were identified and a multi-industry exposure portrait was elaborated. Short-term measurements were higher than TWA measurements. Personal exposures decreased over time, were higher in workplaces with local exhaust ventilation (LEV) than in those without, were inversely correlated with the sampling duration and depended on the reason for sampling, with samples taken during systematic surveys lower than those taken because exposure or health risk was suspected. Area measurements also decreased over time, were inversely correlated with the sampling flow and were lower when general mechanical ventilation was present. The use of extended linear mixed-effects models allowed the identification of a strong correlation between measurements taken during the same sampling campaign and a moderate correlation between measurements taken in the same plants. The elaboration of a multi-sector picture of formaldehyde exposure from COLCHIC by correcting for variables potentially linked to the inherent sampling biases present in this OEDB constitutes an important step towards a potential wider use of exposure databanks for exposure assessment. Further studies using other substances in COLCHIC or comparing other sources of formaldehyde exposure data with our estimates are needed.


    ACKNOWLEDGEMENTS
 
The authors would like to thank the Institut national de recherche et de sécurité (INRS) for providing access to the COLCHIC database. The authors would also like to thank Brigitte Jeandel and Marilyne L'Huillier for their help in extracting and managing the formaldehyde data in COLCHIC. J.L. was supported by the Institut de recherche Robert-Sauvé en santé et en sécurité du travail (IRSST).

Received May 30, 2005; in final form October 28, 2005


    REFERENCES
 

Botkin A, Conway H. (1995) Relevance of exposure data to regulatory impact analyses: overcoming availability problems. Appl Occup Environ Hyg; 10: 383–90.

Burstyn I, Teschke K. (1999) Studying the determinants of exposure: a review of methods. Am Ind Hyg Assoc J; 60: 57–72.

Carton B. (1995) COLCHIC Chemical Exposure Database: information on lead and formaldehyde. Appl Occup Environ Hyg; 10: 345–50.

Carton B, Goberville V. (1989) La base de données COLCHIC. Cahiers de notes documentaires Securite et hygiene du travail; 134: 29–38.

Coble JB, Lees PS, Matanoski G. (2001) Time trends in exposure measurements from OSHA compliance inspections of the pulp and paper industry. Appl Occup Environ Hyg; 16: 263–70.

European Community (EC). (2002) Commission regulation (EC) No 29/2002 of 19 December 2001. Official Journal of the European Community; L63. Available at: http://europa.eu.int/eur-lex/en/consleg/pdf/1990/en_1990R3037_do_001.pdf.

Froines JR, Wegman DH, Dellenbaugh CA. (1986) An approach to the characterization of silica exposures in US industry. Am J Ind Med; 10: 345–61.

Froines JR, Baron S, Wegman DH et al. (1990) Characterization of the airborne concentrations of lead in US industry. Am J Ind Med; 18: 1–17.

Goldman LR, Gomez L, Greenfield S et al. (1992) Use of exposure databases for status and trend analysis. Arch Environ Health; 47: 430–8.

Gomez MR. (1993) A proposal to develop a national occupational exposure databank. Appl Occup Environ Hyg; 8: 768–74.

Gomez MR. (1997) Factors associated with exposure in occupational safety and health administration data. Am Ind Hyg Assoc J; 58: 186–95.

Hornung R, Reed LD. (1990) Estimation of average concentration in the presence of nondetectable values. Appl Occup Environ Hyg; 5: 46–51.

INRS. (2003) Fiche Métropol 001: Analyse des aldéhydes. Institut national de recherche et de sécurité, Vandoeuvre. Available from: http://www.inrs.fr

International Agency for Research on Cancer. (1995) IARC Monographs on the evaluation of carcinogenic risks to humans. Vol. 62: Wood dust and formaldehyde. Lyon: World Health Organization.

International Agency for Research on Cancer. (2005) IARC Monograph on the evaluation of carcinogenic risks to humans Vol. 88: Formaldehyde, 2-Butoxyethanol and 1-tert-Butoxy-2-propanol. Lyon: World Health Organization.

Joint ACGIH-AIHA Task Group on Occupational Exposure Databases. (1996) Data elements for occupational exposure databases: guidelines and recommendations for airborne hazards and noise. Appl Occup Environ Hyg; 11: 1294–311.

Kromhout H, Symanski E, Rappaport SM. (1993) A comprehensive evaluation of within- and between-worker components of occupational exposure to chemical agents. Ann Occup Hyg; 37: 253–70.

Kumagai S, Matsunaga I. (1995) Changes in the distribution of short-term exposure concentration with different averaging time. Am Ind Hyg Assoc J; 1: 24–31.

LaMontagne AD, Herrick RF, Van Dyke MV et al. (2002) Exposure databases and exposure surveillance: promise and practice. Am Ind Hyg Assoc J; 63: 205–12.

Lavoué J, Beaudry C, Goyer N et al. (2005) Investigation of past and current exposures to formaldehyde in the reconstituted wood panels industry in Quebec. Ann Occup Hyg; 49: 587–600.

Melville R, Lippmann M. (2001) Influence of data elements in OSHA air sampling database on occupational exposure levels. Appl Occup Environ Hyg; 16: 884–99.

Ministère de l'économie des finances et de l'industrie. (2003) Décret n° 2002-1622 du 31 décembre 2002 portant approbation des nomenclatures d'activités et de produits. Journal officiel de la république française; Janvier 2003, 34.

Niemelä RI, Priha E, Heikkila P. (1997) Trends of formaldehyde exposure in industries. Occup Hyg; 4: 31–46.

NIOSH. (1994) Formaldehyde by GC: method 2541. Cincinnati, OH: National Institute for Occupational Safety and Health.

Olsen E, Laursen B, Vinzents PS. (1991) Bias and random errors in historical data of exposure to organic solvents. Am Ind Hyg Assoc J; 52: 204–11.

Oudiz J, Brown JW, Ayer HE, Samuels S. (1983) A report on silica exposure levels in United States foundries. Am Ind Hyg Assoc J; 44: 374–6.

Pinheiro JC, Bates DM. (2000) Mixed-effects models in S and S-plus. New York: Springer-Verlag.

Preat B. (1987) Application of geostatistical methods for estimation of the dispersion variance of occupational exposures. Am Ind Hyg Assoc J; 48: 877.

Raaschou-Nielsen O, Hansen J, Thomsen BL et al. (2002) Exposure of Danish workers to trichloroethylene, 1947–1989. Appl Occup Environ Hyg; 17: 693–703.

Rajan B, Alesbury R, Carton B et al. (1997) European proposal for core information for the storage and exchange of workplace exposure measurements on chemical agents. Appl Occup Environ Hyg; 12: 31–9.

Stewart PA, Rice C. (1990) A source of exposure data for occupational epidemiology studies. Appl Occup Environ Hyg; 5: 359–63.

Symanski E, Kupper LL, Rappaport SM. (1998) Comprehensive evaluation of long-term trends in occupational exposure: Part 1. Description of the database. Occup Environ Med; 55: 300–9.

Symanski E, Chan W, Chang C-C. (2001) Mixed-effects models for the evaluation of long-term trends in exposure levels with an example from the nickel industry. Ann Occup Hyg; 45: 71–81.

Teschke K, Marion SA, Vaughan TL et al. (1999) Exposure to wood dust in U.S. industries and occupation, 1979 to 1997. Am J Ind Med; 35: 581–9.

Ulfvarson U. (1983) Limitations to the use of employee exposure data on air contaminants in epidemiologic studies. Int Arch Occup Environ Health; 52: 285–300.

US Department of Labor. (1994) Acrolein and/or formaldehyde. Salt Lake City, UT: Occupational Safety and Health Administration, Organic Methods Evaluation Branch, OSHA Analytical Laboratory.

Valiante D, Richards TB, Kinsley KB. (1992) Silicosis surveillance in New Jersey: targeting workplace using occupational disease and exposure data. Am J Ind Med; 21: 517–26.

Vincent R, Jeandel B. (2001) COLCHIC-occupational exposure to chemical agents database: current content and development perspectives. Appl Occup Environ Hyg; 16: 115–21.

Vinzents PS, Carton B, Fjeldstad P et al. (1995) Comparison of exposure measurements stored in european databases on occupational air pollutants and definitions of core information. Appl Occup Environ Hyg; 10: 351–4.

