

Annals of Occupational Hygiene Advance Access originally published online on January 27, 2009
Annals of Occupational Hygiene 2009 53(2):173-180; doi:10.1093/annhyg/men085


© The Author 2009. Published by Oxford University Press on behalf of the British Occupational Hygiene Society

Multimodel Inference and Multimodel Averaging in Empirical Modeling of Occupational Exposure Levels

J. Lavoué* and P. O. Droz

Department of Work Environment, Institute for Work and Health, Universities of Lausanne and Geneva, Bugnon 19, Lausanne 1005, Switzerland

* Author to whom correspondence should be addressed. Tel: +1 (514) 890 8000 #15913; fax: +1 (514) 412 7106; e-mail: jerome.lavoue{at}umontreal.ca


    ABSTRACT
Empirical modeling of exposure levels has been popular for identifying exposure determinants in occupational hygiene. Traditional data-driven methods used to choose a model on which to base inferences have typically not accounted for the uncertainty linked to the process of selecting the final model. Several new approaches propose making statistical inferences from a set of plausible models rather than from a single model regarded as ‘best’. This paper introduces the multimodel averaging approach described in the monograph by Burnham and Anderson. In their approach, a set of plausible models is defined a priori by taking into account the sample size and previous knowledge of the variables that influence exposure levels. The Akaike information criterion is then calculated to evaluate the relative support of the data for each model, expressed as an Akaike weight, which is interpreted as the probability of the model being the best approximating model given the model set. The model weights can then be used to rank models, quantify the evidence favoring one over another, perform multimodel prediction, estimate the relative influence of the potential predictors and estimate multimodel-averaged effects of determinants. The whole approach is illustrated with the analysis of a data set of 1500 volatile organic compound exposure levels collected by the Institute for Work and Health (Lausanne, Switzerland) over 20 years, each concentration having been divided by the relevant Swiss occupational exposure limit and log-transformed before analysis. Multimodel inference represents a promising procedure for modeling exposure levels: it incorporates the notion that several models can be supported by the data and makes it possible to evaluate, to a certain extent, model selection uncertainty, which is seldom mentioned in current practice.

Keywords: exposure assessment • exposure determinant • linear models • model selection • multimodel inference


    INTRODUCTION
The last two decades have seen extensive use of empirical statistical models to report summaries of exposure data sets in the industrial hygiene literature. These models typically attempt to establish statistical links between the measured exposure levels and individual and environmental variables documented at the time of measurement. They can be used to predict exposure for a particular combination of variables or to identify influential predictors, the so-called exposure determinants. The most popular models in occupational hygiene are certainly linear multiple regression models (Burstyn and Teschke, 1999).

Most analyses of this kind involve a number of measurements complemented by a set of potential predictors, such as job, type of process or the presence/absence of ventilation. The output of the analysis is a subset of these variables identified as determinants, while the others are deemed non-influential, based on whether or not they appear in the final chosen model. The strategy used to choose the final model therefore has a fundamental impact on the conclusions drawn from the analysis. A common feature of exposure determinant–exploration studies is that the modeler usually has no specific hypothesis to test but rather is looking for influential predictors in a group of plausible candidates. This is especially the case when analyzing exposure data banks (e.g. the National Exposure Database, NEDB, in the UK), for which the analyst has no control over the type of ancillary information accompanying measurement results. The analyst therefore has to rely on some procedure measuring goodness-of-fit to the data that will help select a ‘best’ model. This differs, for instance, from some epidemiology studies in which the model is specified in advance by the researcher based on knowledge of disease mechanisms and confounding agents.

Data-driven model selection, i.e. choice of a model based on some manipulation of the data, even for the simplest linear regression, is still an open research area. It is not within the scope of this paper to review this topic, which is extensively covered, for example, in Linhart and Zucchini or McQuarrie and Tsai (Linhart and Zucchini, 1986; McQuarrie and Tsai, 1998). Briefly, the main issue is that the data are used at the same time to help choose a model and to estimate its parameters. This tends to yield results that reflect the data at hand but are not robust to external validation, i.e. do not reflect the population of interest. Despite ongoing debate on particular techniques, there seems to be a consensus that stepwise methods based on hypothesis testing (using P-values), the most popular in applied fields, are also the most problematic and the most likely to lead to spurious relationships and underestimated uncertainty (Harrell, 2001). A typical example of such a procedure would start from the model containing the variable with the lowest P-value in univariate analyses. Then all models containing that variable and one other would be fitted, and again the variable with the lowest P-value would be kept, and so on until no additional variable achieves the 0.05 cutoff. Variations of stepwise procedures include starting from a model with all variables and progressively removing the non-significant ones, or testing addition and removal of variables at each step.

Another limitation of these approaches is that a single best model will be chosen in the end, regardless of how far the ‘second best’ model actually was in terms of performance. Indeed, it is possible that several competing models were quite close, perhaps close enough that the difference in the goodness-of-fit criterion did not represent any meaningful evidence. In addition to losing interesting information, it is quite possible, when several models are close to one another, that another sample of data would have yielded another best model. This uncertainty, linked to the modeling strategy rather than parameter estimation, is seldom mentioned in published analyses.

In the last decade, a new approach attempting to address the issue of modeling uncertainty emerged from the field of Bayesian statistics. The principle involves averaging predictions across a set of models defined a priori, each model being weighted according to its quality, which is expressed as the probability of being the best model (Raftery et al., 1997). The estimated probabilities sum to 1 across the model set and can be used to appraise the amount of evidence in favor of a model (or a subset of models) compared to another. In particular, the ‘distance’ between the best model and the next ones can be quantified. For example, in a set of 100 models, two models could have probabilities of 0.50 and 0.45, with the others having much lower probabilities summing to 0.05. The analyst would conclude that the two models represent the majority of the evidence (i.e. 95%), but that there is no reason to choose one over the other: the model with the highest probability is only 0.50/0.45 = 1.1 times more likely to be the best model than the one with the second highest. The procedure, labeled Bayesian model averaging (BMA), has since found use in the field of environmental epidemiology, for the derivation of dose–response relationships (Clyde, 2000; Martin and Roberts, 2006), but not, to our knowledge, in the field of occupational exposure assessment. The development of BMA methods has been somewhat hampered by the computer-intensive calculations required, the lack of implementation in standard statistical packages and the limited availability of the required training.

More recently, Burnham and Anderson have proposed a procedure similar to but simpler than BMA for calculating model weights (Burnham and Anderson, 2002, 2004). In their method, so-called ‘Akaike weights’ are calculated for each model and interpreted as the probability of being the best approximating model given the data at hand and the model set. Burnham and Anderson's approach to multimodel inference, initially proposed in the field of ecology (Poeter and Anderson, 2005; Hollister et al., 2008), has also appeared in economics (Hansen, 2007), genetics (Posada and Buckley, 2004; Abdueva et al., 2006), social sciences (Burnham and Anderson, 2004) and psychology (Wagenmakers and Farrell, 2004), and Moon et al. have proposed a variation in the field of risk analysis for estimating effective doses for microbial infection (Moon et al., 2005). In addition to attempting, like BMA, to take modeling uncertainty into account, the approach presented by Burnham and Anderson can be readily implemented using standard statistical packages.

The main objective of this paper is to advocate the use of methods that explicitly account for the uncertainty linked to model selection in data-based empirical modeling in the field of occupational exposure assessment. To this end, we describe the simple and intuitive framework proposed by Burnham and Anderson and illustrate its use with a simplified analysis of retrospective multi-industry exposure measurements collected over the years at the Institute for Work and Health. We begin by presenting the illustrative data set and the linear model framework, then describe the multimodel method alongside its practical implementation with the data set.


    MODELING DATA SET
Since 1986, the Institute for Work and Health has maintained an exposure database containing all measurements made by its hygienists in various workplaces. There are currently ~8500 measurements in the data bank, including biomonitoring results, volatile organic compound (VOC) measurements and dust measurements. Information accompanying each measurement includes sampling and analytical method, date of sampling, type of measurement (source sampling, general area, personal...), sampling time, industry and job as coded by the Swiss Federal Institute of Statistics (OFS), origin of the measurement (request from private companies, in-house research projects) and a code identifying the person who took the measurement.

For the purpose of this study, we set out to explore the extent to which variables in the IST database could explain the variations of the recorded VOC concentrations. Since no single agent had enough data to permit this kind of analysis, we pooled all agents and standardized the concentrations by their Swiss 8-h occupational exposure limit (OEL) (each concentration was divided by the relevant exposure limit), thus creating a ‘compliance index’. After cleaning the data set, we kept for analysis only data corresponding to VOCs with an OEL in the list of OELs enforced in Switzerland by the Swiss Accident Insurance Fund (SUVA) (n = 2588). Further restriction to sample durations between 30 min and 12 h and to combinations of industry and job with at least 10 measurements yielded personal and area data sets of 698 and 722 measurements, respectively.
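The standardization step described above can be sketched in a few lines of Python. This is only an illustration: the function name `log_compliance_index`, the agent labels and the concentration and OEL values are hypothetical and are not taken from the IST database or from the SUVA list.

```python
import math


def log_compliance_index(concentration, oel):
    """Return ln(concentration / OEL), the log-transformed compliance index.

    Both values must be positive and expressed in the same unit.
    """
    if concentration <= 0 or oel <= 0:
        raise ValueError("concentration and OEL must be positive")
    return math.log(concentration / oel)


# Hypothetical measurements: (agent label, concentration, OEL), same unit.
measurements = [
    ("agent A", 95.0, 190.0),    # half the limit
    ("agent B", 1200.0, 1200.0), # exactly at the limit
    ("agent C", 17.0, 85.0),     # one fifth of the limit
]

# Pooling agents becomes possible once each value is scaled by its own OEL.
log_indices = [log_compliance_index(c, oel) for _, c, oel in measurements]
```

A value of 0 on the log scale corresponds to a measurement exactly at its exposure limit, and negative values correspond to measurements below the limit.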


    LINEAR MODELS FOR OCCUPATIONAL EXPOSURE
We modeled the log-transformed standardized concentrations as a linear function of other variables using the linear models framework (Neter et al., 1996). The chosen model structure can be described using equation (1). This structure has been the most frequently used in the literature for reporting relationships between occupational exposure levels and determinants (Burstyn and Teschke, 1999).

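$$\ln(C_i) = \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i \qquad (1)$$

where i = 1,...,M indexes the M measurements and p fixed effects are included in the model. $\ln(C_i)$ is the natural logarithm of the ith standardized concentration, $x_{ik}$ is the value of the kth fixed effect for the ith measurement and $\varepsilon_i$ is the error term. The main model assumption is that $\varepsilon_i$ follows a normal distribution independent of the fixed effects. As an illustration, for a model with industry as a categorical predictor and year of sampling as a continuous predictor, equation (1) would translate to:

$$\ln(C_i) = \beta_0 + \beta_{\mathrm{industry}(j)} + \beta_{\mathrm{year}} \times \mathrm{year}_i + \varepsilon_i \qquad (2)$$

where j = 1,...,K indexes the industry categories, $\beta_{\mathrm{industry}(j)}$ is the coefficient for the jth industry category and $\beta_{\mathrm{year}}$ is the slope of the linear temporal trend.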
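To make the industry-plus-year illustration concrete, the following minimal Python sketch evaluates the linear predictor of such a model and back-transforms it to the concentration/OEL scale. All coefficient values, the industry labels and the choice of expressing year as years elapsed since 1986 are assumptions made for the example, not estimates from the paper.

```python
import math


def predict_log_index(intercept, industry_effects, year_slope, industry, year):
    """Linear predictor: intercept + industry coefficient + slope * year."""
    return intercept + industry_effects[industry] + year_slope * year


# Hypothetical coefficients on the natural-log scale.
intercept = -2.0
industry_effects = {"reference": 0.0, "printing": 0.5, "painting": 0.9}
year_slope = -0.03  # change in ln(compliance index) per year

# Predicted ln(compliance index) for 'printing', 10 years after 1986.
log_pred = predict_log_index(intercept, industry_effects, year_slope,
                             industry="printing", year=10)

# Exponentiating gives the predicted geometric mean of the compliance index.
geometric_mean_index = math.exp(log_pred)

# On the log scale a slope translates into a constant relative change per
# year, here exp(-0.03), i.e. roughly a 3% decrease per year.
yearly_ratio = math.exp(year_slope)
```

Because the response is log-transformed, categorical coefficients act as multiplicative factors relative to the reference category once exponentiated.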


    IMPLEMENTING MULTIMODEL INFERENCE USING THE FRAMEWORK PRESENTED BY BURNHAM AND ANDERSON (BURNHAM AND ANDERSON, 2002)
Definition of the model set
The first step in implementing multimodel inference consists in defining a set of plausible models. This step is fundamental since all results of the multimodel analyses are conditional on the model set, i.e. all conclusions are drawn ‘given the model set’. According to Burnham and Anderson, knowledge of the subject matter should play a considerable role in limiting the size of the initial model set. In particular, they recommend that analyses where more models are fitted than the sample size be regarded as exploratory. This situation can easily arise when a number of variables are available, with no specific knowledge of which should be excluded or kept or which interaction terms should be considered. For example, testing all possible combinations of 10 variables would correspond to 2^10 = 1024 possible models, not counting potential interactions.

For our analysis, we selected the model set by taking into account two main constraints: the need to have a sufficient number of data per estimated parameter (we arbitrarily set a target at 10 data per parameter) and to limit the number of models relative to the sample size. Moreover, we selected an ‘all combinations’ scheme, i.e. all possible models for a set of variables without interactions, in order to be able to estimate multimodel-averaged effects of predictors and to quantify the relative importance of all variables compared to each other (see below for the constraints linked to each procedure). Table 1 presents the seven variables that were selected for the analysis along with descriptive statistics, representing 128 different models to be fit to the data (compared to ~700 data in both area and personal data sets).
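An ‘all combinations’ model set of this kind can be enumerated mechanically. The sketch below uses Python's standard `itertools.combinations`; the identifiers in `VARIABLES` are our own shorthand for the seven predictors named in the text, and the helper name `all_combinations` is hypothetical.

```python
from itertools import combinations

# Shorthand labels for the seven predictors (identifiers are illustrative).
VARIABLES = ["industry_job", "year", "duration", "volatility",
             "season", "sample_type", "reason"]


def all_combinations(variables):
    """Return every subset of the variables, from the empty
    (intercept-only) model up to the full model, without interactions."""
    models = []
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            models.append(subset)
    return models


model_set = all_combinations(VARIABLES)

# Each variable appears in exactly half of the models, which is one of the
# conditions for comparing the relative importance of variables.
models_with_year = [m for m in model_set if "year" in m]
```

With seven variables the set contains 2^7 = 128 models, and each variable is present in 64 of them.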


 
Table 1. Variables tested in the empirical statistical models

 
The full model (i.e. containing all variables in Table 1) explained 40.5 and 36.8% of the variation in the log-transformed personal and area compliance indices, respectively. These results are similar to those of other multi-industry modeling studies (see for example, Teschke et al., 1999) and show that the model set includes relevant predictors. The multimodel procedure then helps identify which submodels provide a good approximation of the data without containing superfluous predictors (i.e. parsimonious models).

Selecting a performance criterion: introducing the Akaike information criterion and Bayesian information criterion
Recognition of the issues associated with the use of hypothesis tests to build models has led to the development of alternative procedures. Among the wide array of published methods, so-called ‘information criteria’ have enjoyed much popularity. These quantities are calculated for each model and allow models to be compared with each other, the model with the lowest value usually being favored. The most widespread information criteria are the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), the latter also known as the Schwarz information criterion (Kuha, 2004). AIC (equation 3) was derived by Akaike as an asymptotic estimator quantifying information loss when a model is used to approximate the truth. Choosing the model with the lowest AIC therefore minimizes the estimated information loss over a set of models.

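$$\mathrm{AIC} = -2\ln\big(\mathcal{L}(\hat{\theta} \mid \mathrm{data})\big) + 2K \qquad (3)$$

where $\mathcal{L}(\hat{\theta} \mid \mathrm{data})$ is the maximized likelihood of the model given the data and K is the number of parameters of the model. The log-likelihood is a standard output of most statistical packages for numerous model structures. When the number of parameters of the largest model is such that n/K < 40, n being the sample size, Burnham and Anderson recommend the use of a modified version of AIC, AIC.c:

$$\mathrm{AIC.c} = \mathrm{AIC} + \frac{2K(K+1)}{n-K-1} \qquad (4)$$

BIC (equation 5) was derived in a Bayesian framework as an approximation of quantities measuring the odds of a model being the true model given the data.

$$\mathrm{BIC} = -2\ln\big(\mathcal{L}(\hat{\theta} \mid \mathrm{data})\big) + K\ln(n) \qquad (5)$$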

Both criteria include a measure of goodness-of-fit to the data (the likelihood) and a penalty term for the number of parameters. For BIC, the number of parameters is more penalizing than for AIC, therefore models selected with BIC tend to contain fewer variables than those selected with AIC.
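As a minimal sketch, the three criteria can be computed from a model's maximized log-likelihood using their standard textbook definitions. The function names and the numerical inputs below (log-likelihood, number of parameters) are hypothetical; only the sample size of 698 echoes the personal data set.

```python
import math


def aic(log_lik, k):
    """AIC = -2 ln(L) + 2K."""
    return -2.0 * log_lik + 2.0 * k


def aicc(log_lik, k, n):
    """Small-sample corrected AIC: AIC + 2K(K + 1) / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("n must exceed K + 1")
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def bic(log_lik, k, n):
    """BIC = -2 ln(L) + K ln(n)."""
    return -2.0 * log_lik + k * math.log(n)


# Hypothetical fitted model: log-likelihood of -1000 with 30 parameters,
# fitted to 698 measurements.
log_lik, k, n = -1000.0, 30, 698
values = {"AIC": aic(log_lik, k),
          "AIC.c": aicc(log_lik, k, n),
          "BIC": bic(log_lik, k, n)}

# Here n/K is about 23, below 40, so the corrected criterion would apply.
needs_correction = n / k < 40

# The BIC penalty per parameter, ln(n), exceeds the AIC penalty of 2
# as soon as n is 8 or more.
bic_penalizes_more = math.log(n) > 2.0
```

The last comparison shows why BIC tends to favor smaller models for any realistic sample size.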

Calculating model weights
The second step of this procedure involves fitting each model to the data and calculating model weights (equations 1 and 2 in the supplementary data, available at Annals of Occupational Hygiene online). These weights, denoted ‘wi’, can be interpreted as the probability that the model is the best approximating model given the data at hand and the initial model set. While they can be calculated using different performance criteria (see for example, Hjort and Claeskens, 2003 or Hansen, 2007), Burnham and Anderson advocate the use of AIC or AIC.c (Burnham and Anderson, 2004).
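The supplementary equations are not reproduced here, but the usual definition of Akaike weights is based on the difference between each model's criterion value and the lowest value in the set: each model receives a relative likelihood exp(−Δi/2), and the weights are these quantities normalized to sum to 1. The sketch below assumes that definition; the function name and the AIC.c values are hypothetical.

```python
import math


def akaike_weights(criterion_values):
    """Convert AIC (or AIC.c) values into weights that sum to 1.

    delta_i = value_i - min(values); w_i is proportional to exp(-delta_i / 2).
    """
    best = min(criterion_values)
    relative_likelihoods = [math.exp(-0.5 * (v - best))
                            for v in criterion_values]
    total = sum(relative_likelihoods)
    return [r / total for r in relative_likelihoods]


# Hypothetical AIC.c values for four candidate models.
aicc_values = [2062.8, 2063.7, 2066.0, 2070.1]
weights = akaike_weights(aicc_values)
```

Only differences between criterion values matter: adding a constant to every value in `aicc_values` leaves the weights unchanged.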

The calculated model weights provide a way of ranking models in the set, and the weight of evidence favoring a model over another can be estimated with the ratio of their respective weights. Specific subsets of models can be compared to other subsets in the same way (e.g. all models containing an interaction term versus those not containing it). Using guidelines for evidence ratios quoted by Lukacs et al., a ratio ~10 corresponds to limited-to-moderate support while >100 is required to report ‘strong support’ (Lukacs et al., 2007). The model weights can also be used to define a confidence set of models representing the majority of evidence. For example, if the sum of the five highest wi values is >0.95, the corresponding models form a 95% confidence set, and one could decide to base inferences only on these five models.
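Evidence ratios and confidence sets follow directly from the weights, as in this short sketch. The model labels and weights are hypothetical, and the helper names are our own.

```python
def evidence_ratio(weight_a, weight_b):
    """How many times more likely model A is than model B to be the
    best approximating model, given the model set."""
    return weight_a / weight_b


def confidence_set(weights, level=0.95):
    """Return the smallest list of top-ranked models whose cumulative
    weight reaches the requested level."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for name, weight in ranked:
        chosen.append(name)
        cumulative += weight
        if cumulative >= level:
            break
    return chosen


# Hypothetical Akaike weights for a six-model set (they sum to 1).
weights = {"A": 0.40, "B": 0.30, "C": 0.15, "D": 0.08, "E": 0.04, "F": 0.03}

ratio_best_vs_second = evidence_ratio(weights["A"], weights["B"])
set_95 = confidence_set(weights, level=0.95)
```

In this example the best model is only about 1.3 times more likely than the second, and five of the six models are needed to reach 95% of the evidence.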

Table 2 presents model weights calculated using AIC.c for the five personal and area models with the highest weights. As can be seen from Table 2, the models with the lowest AIC.c in the personal and area data sets are 1.6 (0.35/0.22) and 1.7 (0.46/0.28) times more likely to be the best fitting model than the models with the second lowest values. In both cases, the five models with the highest wi values represent ~90% (88 and 93% for personal and area data, respectively) of the evidence (i.e. the sum of their wi is ~90%). The model weights in Table 2 illustrate the kind of information that is lost when using a ‘single-model’ approach. Hence, the best fitting models for area and personal data represent <50% of the evidence and are only ~1.5 times more likely to be the best approximating model compared to the second best models. There is therefore no overwhelming reason to choose them as the only useful models.


 
Table 2. Akaike weights for the five best personal and area models

 
Identifying influential predictors
In empirical modeling of occupational exposure data, assessing the relative influence of variables is of major importance since it permits the identification of exposure determinants. In traditional analyses, potential predictors tend to be declared either influential or not based on their presence in the single model chosen by the modeling strategy. The framework described by Burnham and Anderson allows the importance of variables relative to each other to be quantified for a particular structure of the model set: the variables to be compared should be tested with a single parameterization (e.g. duration as a continuous or categorical variable), should all be included in the same number of models, should not be involved in interactions, and the models should have the same basis (e.g. all linear models). When these conditions are met, the weights of the models containing variable xj can be summed, yielding a quantity denoted W+(j) and called the relative importance weight for variable J. W+(j) can then be compared with W+(k) for another variable. As an illustration, for two variables J and K with W+(j) = 0.50 and W+(k) = 0.25, one might say that variable J is twice as important as variable K. These values therefore provide a measure of the weight of evidence supporting the presence of an actual relationship between a variable and the response relative to other variables tested (i.e. given the model set). Table 3 presents the W+(j) for the seven variables tested in this analysis, again calculated using AIC.c. For the personal data in Table 3, industry/job, year, duration and volatility appear clearly as the most important predictors, while season, sample type and reason have lower W+(j) values. For the area data, season and volatility have lower W+(j) values than other variables.
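The summation that yields W+(j) is easy to sketch. The example below uses a deliberately tiny, hypothetical ‘all combinations’ set built from two variables; the weights are invented for illustration.

```python
def relative_importance(models):
    """Sum the Akaike weights of all models containing each variable.

    `models` is a list of (variables, weight) pairs, where `variables`
    is a tuple of the predictors included in that model.
    """
    importance = {}
    for variables, weight in models:
        for variable in variables:
            importance[variable] = importance.get(variable, 0.0) + weight
    return importance


# Hypothetical all-combinations set for two variables (weights sum to 1).
models = [
    ((), 0.05),                    # intercept-only model
    (("year",), 0.15),
    (("duration",), 0.20),
    (("year", "duration"), 0.60),
]

w_plus = relative_importance(models)
```

Here W+(year) = 0.15 + 0.60 = 0.75 and W+(duration) = 0.20 + 0.60 = 0.80. The comparison is meaningful only because both variables appear in the same number of models.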


 
Table 3. Relative importance of the seven variables tested in the models

 
Multimodel-averaged inference
The model weights also form the basis on which to perform inference, i.e. predicting the response for a specific set of conditions (e.g. job A during year X with a sampling duration of Y...) not conditional on a single model but over the whole model set. The predictions are made from each model and then averaged, each prediction being weighted using its wi value. For each prediction, a variance can be estimated that is not conditional on a particular model and comprises within- and between-model (i.e. modeling uncertainty) components (see equations 3 and 4 in the supplementary data, available at Annals of Occupational Hygiene online). Based on limited simulation, Burnham found that confidence intervals calculated with these variance estimates generally were wider but more realistic than those obtained with traditional methods, which tended to yield less than desirable results.
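A minimal sketch of model-averaged prediction follows. It assumes the unconditional standard error takes the form given by Burnham and Anderson (2002), in which each model contributes its own sampling variance plus the squared distance between its prediction and the averaged prediction; whether the supplementary equations use exactly this form is an assumption. All numerical inputs are hypothetical.

```python
import math


def model_averaged_prediction(predictions, variances, weights):
    """Return (averaged prediction, unconditional standard error).

    predictions: prediction from each model for the same conditions
    variances:   sampling variance of each prediction, conditional on
                 its own model (within-model component)
    weights:     Akaike weights of the models, summing to 1
    """
    average = sum(w * p for w, p in zip(weights, predictions))
    # (p - average)**2 is the between-model component for each model.
    standard_error = sum(
        w * math.sqrt(v + (p - average) ** 2)
        for w, p, v in zip(weights, predictions, variances)
    )
    return average, standard_error


# Hypothetical predictions of ln(compliance index) from three models.
predictions = [-1.80, -1.65, -2.10]
variances = [0.04, 0.05, 0.06]
weights = [0.5, 0.3, 0.2]

average, unconditional_se = model_averaged_prediction(
    predictions, variances, weights)

# Standard error conditional on the best model alone, for comparison.
conditional_se_best = math.sqrt(variances[0])
```

When all models give the same prediction, the between-model component vanishes and the result reduces to a weighted average of the conditional standard errors.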

We do not present here predictions of the compliance index for specific industries in the Swiss database because discussion of these predictions and detailed analysis of the usefulness of the IST data bank to reflect occupational exposure in Switzerland is outside the scope of the present paper. Moreover, the predictions would have reflected temporal trends we believe are mainly due to a time-changing selection bias in the database. We nevertheless report here that the unconditional standard errors for the personal predictions, calculated to estimate 95% confidence intervals, were between 5 and 20% higher than the conditional standard errors obtained by using only the best fitting model (between 2 and 12% for the area data). They illustrate the added uncertainty caused by including model selection as a process subject to variability.

Multimodel-averaged estimators of effects
Multimodel-averaged estimates of effects (i.e. model coefficients) can also be calculated with the model weights (see equations 5–7 in the supplementary data, available at Annals of Occupational Hygiene online). The calculation is similar to making predictions but requires the models to be linear models (such as those presented here) and the coefficients to keep the same interpretation across the model set or at least across the subset over which the averaging is to be performed.

In the particular case of ‘all combinations’ model sets, Burnham and Anderson propose another way of calculating multimodel coefficient estimates that takes into account the relative importance weight of the variable of interest. Hence, the coefficients are averaged over the whole model set, being taken as 0 for models not including the variable. Using this approach, the multimodel-averaged coefficient will be ‘shrunk’ toward zero compared to the previous calculations (see equations 8–10 in the supplementary data, available at Annals of Occupational Hygiene online). The extent of the shrinkage will depend on the cumulative weight of the models without the variable. This method of estimation is appealing because averaging only over models in which a variable is present is likely to yield an upward model selection-related bias.
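The two averaging schemes can be contrasted in a short sketch. It assumes that the unshrunk estimate renormalizes the weights over the models containing the variable, while the shrunk estimate averages over the whole set with a coefficient of 0 where the variable is absent. The function name, variable name and coefficient values are hypothetical, and the supplementary equations may differ in detail.

```python
def averaged_coefficient(models, variable, shrink):
    """Model-averaged coefficient for one variable.

    `models` is a list of (coefficients, weight) pairs, where
    `coefficients` maps variable names to estimates and the weights
    sum to 1 over the whole model set.
    """
    present = [(coefs[variable], w) for coefs, w in models
               if variable in coefs]
    importance = sum(w for _, w in present)  # relative importance W+
    if importance == 0.0:
        raise ValueError("variable is absent from every model")
    weighted_sum = sum(b * w for b, w in present)
    if shrink:
        # Models without the variable contribute a coefficient of 0.
        return weighted_sum
    # Unshrunk: average only over models containing the variable.
    return weighted_sum / importance


# Hypothetical three-model set for a variable named 'x'.
models = [
    ({"x": -0.40}, 0.6),
    ({"x": -0.20}, 0.2),
    ({}, 0.2),           # model without 'x'
]

unshrunk = averaged_coefficient(models, "x", shrink=False)
shrunk = averaged_coefficient(models, "x", shrink=True)
```

Under these assumptions the shrunk estimate equals the unshrunk estimate multiplied by the relative importance weight (here 0.8), so a variable with a weight close to 1 is barely affected.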

Multimodel-averaged estimates of the effects of all variables but industry/job (which included >20 categories), shrunk and unshrunk, are presented in the supplementary data (available at Annals of Occupational Hygiene online). For illustration purposes, we limit our presentation here to the effects of duration and volatility for the area data set. For the area data, the compliance index was estimated to decrease by 14 and 15% with the shrunk and unshrunk methods, respectively, for a 50% increase in the sampling time. Both estimates are very close because duration had a relative importance weight very close to 1 (0.94). On the other hand, unshrunk estimates for volatility, with the ‘gas’ category at 100% exposure, were ‘high’ 183%, ‘medium’ 252% and ‘low’ 121%. Even with wide confidence intervals, these values, close to those obtained with the full model (i.e. containing all variables), would suggest an odd relationship between exposure and volatility. Because this variable had a low relative importance weight, the shrunk estimates are reduced to 97, 103 and 93%, almost equivalent to no effect. Burnham and Anderson underline the need for further work in this area of their approach (i.e. adequacy of W+(j) as shrinkage factor, exact variance of the shrunk estimator), but the general use of shrinkage to account for model selection bias (i.e. overestimation of effects due to the use of the same data set for model building and effect estimation) is well established in the statistical literature (Harrell, 2001).

Comparison between AIC, AIC.c and BIC
We compared the relative importance weights of all variables determined using AIC, AIC.c and BIC and also calculated the same weights using a bootstrap procedure, described in detail in the supplementary data (see also Table 2 of the supplementary data, available at Annals of Occupational Hygiene online). The bootstrap weights were for the most part very close to their ‘analytical’ counterparts. The weights based on BIC favored simpler models than those based on AIC or AIC.c (which yielded close results), as reflected in a smaller relative importance weight for most variables. AIC is known to generally select more complex models than BIC (Burnham and Anderson, 2004; Kuha, 2004). With regard to the AIC versus BIC question, we support the view of Kuha, who raised the possibility, when AIC and BIC disagree, of using the corresponding models as ‘bounds for a range of acceptable models’ (Kuha, 2004).

All analyses were conducted with versions 6.1 and 7.0 of the statistical software S-plus Professional Edition for Windows (Insightful Corp., Seattle, WA).


    DISCUSSION
We begin this section by quoting what Chatfield regarded as his main message in a paper read before the Royal Statistical Society in 1995 (Chatfield, 1995): ‘When a model is formulated and fitted to the same data, inferences made from it will be biased and overoptimistic when they ignore the data analytic actions which preceded inference. Statisticians must stop pretending that model uncertainty does not exist and begin to find ways of coping with it’.

In this paper, we presented a methodology that attempts to include modeling uncertainty in the analysis in a simple and intuitive way and that has already found its way into applied work in other scientific fields. While we find Burnham and Anderson's approach appealing, we in no way advocate exclusive use of their proposition as a universal panacea for model selection issues. An obvious limitation is that results from such analyses, albeit unconditional on a particular model, are conditional on the model set, and the modeling uncertainty is therefore only approximated. There is still, and probably will continue to be, debate on how empirical modeling should be approached, which multimodel method is most adequate and in particular whether AIC, BIC, other ML-based criteria or some form of Bayesian or bootstrap-based model averaging should be preferred (Guthery et al., 2005; Richards, 2005; Link and Barker, 2006; Ward, 2008). What seems important to us, as underlined by Ye et al., is that the principle of multimodel averaging is more crucial in improving the reliability of the results than the question of which particular multimodel approach to use (Ye et al., 2008).

In conclusion, we believe empirical modeling studies in the field of occupational exposure assessment should attempt to account for modeling uncertainty. In this regard, multimodel averaging using the approach of Burnham and Anderson provides in our view an easy to implement and intuitive methodology.


    SUPPLEMENTARY MATERIAL
Supplementary data can be found at http://annhyg.oxfordjournals.org/


    ACKNOWLEDGEMENTS
The authors would like to thank Drs Pascal Wild and Michel Grzebyk, of the Institut national de recherche et de sécurité (INRS, France) for their very helpful comments on earlier drafts of the present paper.

Received July 15, 2008; in final form October 10, 2008


    REFERENCES

Abdueva D, Skvortsov D, Tavare S. Non-linear analysis of GeneChip arrays. Nucleic Acids Res (2006) 34:e105.

Burnham KP, Anderson DR. Model selection and multimodel inference (2002) 2nd edn. New York, NY: Springer Science+Business Media Inc.

Burnham KP, Anderson DR. Multimodel inference: understanding AIC and BIC in model selection. Sociol Methods Res (2004) 33:261–304.

Burstyn I, Teschke K. Studying the determinants of exposure: a review of methods. Am Ind Hyg Assoc J (1999) 60:57–72.

Chatfield C. Model uncertainty, data mining, and statistical inference. J R Stat Soc (1995) 158:419–66.

Clyde M. Model uncertainty and health effects studies for particulate matter. Environmetrics (2000) 11:745–63.

Guthery FS, Brennan LA, Peterson MJ, et al. Information theory in wildlife science: critique and viewpoint. J Wildl Manage (2005) 69:457–65.

Hansen BE. Least squares model averaging. Econometrica (2007) 75:1175–89.

Harrell FE Jr. Regression modeling strategies—with applications to linear models, logistic regression, and survival analysis (2001) New York, NY: Springer.

Hjort NL, Claeskens G. Frequentist model average estimators. J Am Stat Assoc (2003) 98:879–99.

Hollister JW, August PV, Paul JF, et al. Predicting estuarine sediment metal concentration and inferred ecological conditions: an information theoretic approach. J Environ Qual (2008) 37:234–44.

Kuha J. AIC and BIC: comparisons of assumptions and performance. Sociol Methods Res (2004) 33:188–229.

Linhart H, Zucchini W. Model selection (1986) New York, NY: John Wiley & Sons.

Link WA, Barker RJ. Model weights and the foundations of multimodel inference. Ecology (2006) 87:2626–35.

Lukacs PM, Thompson WL, Kendall WL, et al. Concerns regarding a call for pluralism of information theory and hypothesis testing. J Appl Ecol (2007) 44:456–60.

Martin MA, Roberts S. Bootstrap model averaging in time series studies of particulate matter air pollution and mortality. J Expo Sci Environ Epidemiol (2006) 16:242–50.

McQuarrie ADR, Tsai CL. Regression and time series model selection (1998) Hackensack, NJ: World Scientific Publishing Company.

Moon H, Kim HJ, Chen JJ, et al. Model averaging using the Kullback information criterion in estimating effective doses for microbial infection and illness. Risk Anal (2005) 25:1147–59.

Neter J, Kutner M, Nachtsheim CJ, et al. Applied linear statistical models (1996) New York, NY: WCB McGraw-Hill.

Poeter E, Anderson D. Multimodel ranking and inference in ground water modeling. Ground Water (2005) 43:597–605.

Posada D, Buckley TR. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst Biol (2004) 53:793–808.

Raftery AE, Madigan D, Hoeting A. Bayesian model averaging for linear regression models. J Am Stat Assoc (1997) 92:179–91.

Richards SA. Testing ecological theory using the information-theoretic approach: examples and cautionary results. Ecology (2005) 86:2805–14.

Teschke K, Marion SA, Vaughan TL, et al. Exposure to Wood Dust in U.S. Industries and Occupation, 1979 to 1997. Am J Ind Med (1999) 35:581–9.

Wagenmakers EJ, Farrell S. AIC model selection using Akaike weights. Psychon Bull Rev (2004) 11:192–6.

Ward EJ. A review and comparison of four commonly used Bayesian and maximum likelihood model selection tools. Ecol Model (2008) 211:1–10.

Ye M, Meyer PD, Neuman S. On model selection criteria in multimodel analysis. Water Resour Res (2008) 44:W03428.

