More information

The two main branches of receptor models are Chemical Mass Balance (CMB) models and multivariate factor analytic models. CMB models assume that the number of sources and their compositions are known, a requirement that is often not completely fulfilled in practice. Factor analysis, by artful mathematics, attempts to apportion the sources and determine their compositions solely from a series of observations at the receptor site. It is a commonly used tool because the choice of the model dimension and the search for non-negative solutions by axis rotations can be based entirely on mathematical criteria. The drawback is that factor analysis attempts to extract more information from the data than is really there. In general the solution is not reliable and cannot be used for source apportionment, although some resemblance to existing real-world sources may be recognized.

All receptor models, including factor analysis, are based on the assumption that the original receptor site concentrations can be adequately explained by a linear combination of contributions from various relevant sources with constant source profiles. With the Constrained Physical Receptor Model (COPREM) (Wåhlin, 2003) the corresponding bilinear equation is solved iteratively by a weighted least-squares method in which chi-square (χ²) is minimised within the limits imposed by constraints. Chi-square is the total squared distance between the measurements and the model values, measured in units of the uncertainties. Built-in model constraints exclude non-physical solutions (negative components in the source profiles and source strengths), and additional constraints can be included to fix profile components in constant ratios, partly or entirely. An initial profile matrix is set up in which the source vectors have the main characteristics of known sources, and the additional constraints are set up to maintain these characteristics, or to prevent the profiles from mixing together during the iteration. In this way any a priori knowledge about the character of the sources can be used to achieve a polarised solution. Through the weighted least-squares method COPREM takes the uncertainties into account, which is of particular importance for measurements near the detection limit. Outliers, or other data that you might want to exclude from the fit, can be marked with infinite uncertainties.
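The chi-square definition above can be illustrated with a minimal Python sketch. It is not taken from the COPREM source code: the function name `chi_square`, the matrix orientation (compounds as rows, samples as columns) and the numbers are assumptions made for illustration only.

```python
import math


def chi_square(data, sigma, profiles, strengths):
    """Weighted sum of squared residuals for data ~ profiles x strengths.

    Assumed (hypothetical) layout:
      data, sigma : compounds x samples (sigma > 0, or math.inf to exclude)
      profiles    : compounds x sources
      strengths   : sources x samples
    Returns chi-square and the number of data points that entered the sum.
    """
    n_sources = len(strengths)
    chi2 = 0.0
    n_used = 0
    for i, row in enumerate(data):
        for j, measured in enumerate(row):
            s = sigma[i][j]
            if math.isinf(s):
                # An infinite uncertainty gives zero weight: the point is skipped.
                continue
            model = sum(profiles[i][k] * strengths[k][j] for k in range(n_sources))
            chi2 += ((measured - model) / s) ** 2
            n_used += 1
    return chi2, n_used


# Illustrative numbers only: one source, two compounds, two samples.
profiles = [[2.0], [1.0]]
strengths = [[1.0, 3.0]]
data = [[2.5, 6.0], [1.0, 100.0]]       # the value 100.0 is an outlier
sigma = [[0.5, 1.0], [0.2, math.inf]]   # ... so it gets infinite uncertainty
chi2, n_used = chi_square(data, sigma, profiles, strengths)
print(chi2, n_used)
```

Only the first data point deviates from the model here, by exactly one uncertainty unit, so it alone contributes to the sum; the outlier contributes nothing because its uncertainty is infinite.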

The iteration is performed by a program written in C and compiled as a 32-bit Windows executable that is called from the command prompt. It takes three matrices as input: a profile matrix containing the initial source vectors, a form matrix that specifies the constraints by marking each profile matrix element as fixed (0) or free (1), and a data matrix containing the measured values and their absolute uncertainties. A negative uncertainty is interpreted by the code as an infinite uncertainty and can be used to exclude selected data from the fitting process. The output is a non-negative source strength matrix and a non-negative profile matrix, plus chi-square and the number of degrees of freedom. The strength values of each source in the source strength matrix are normalized to an average of one, so that the profile matrix elements are the average source contributions. Furthermore, a ‘one-factor’ analysis is performed on the residuals to reveal any ignored source, and the result is expressed as an extra row in the source strength matrix and an extra column in the source profile matrix. As a new supplement to COPREM, a multiple weighted linear regression analysis, in which all constraints are ignored, is performed after the last iteration step. The calculated source strengths of the three sources are used as the independent variables, and the measured data as the dependent variable. The uncertainties of the data used in the regression analysis are scaled by a constant factor for each compound, so that they comply statistically with the chi-square values found by COPREM. The results are source profiles with uncertainties. The calculated uncertainties represent lower-bound values, because the rotational ambiguity and the uncertainties of the independent variables are ignored.
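Two of the conventions described above, the negative-uncertainty flag and the normalization of the source strengths, can be sketched in Python as follows. The helper names `to_sigma` and `normalize_strengths` and the matrix orientation (sources as rows of the strength matrix and as columns of the profile matrix) are assumptions made for illustration. The sketch does not reproduce the actual C program.

```python
import math


def to_sigma(value):
    """Interpret a negative input uncertainty as infinite (point excluded)."""
    return math.inf if value < 0 else value


def normalize_strengths(profiles, strengths):
    """Rescale each source so that its strengths average to one.

    The profile column of the same source is multiplied by the old average,
    so the product profiles x strengths is unchanged and each profile
    element becomes the average contribution of that source.
    Assumed layout: profiles is compounds x sources, strengths is
    sources x samples.
    """
    new_profiles = [list(row) for row in profiles]
    new_strengths = []
    for k, row in enumerate(strengths):
        mean = sum(row) / len(row)
        if mean == 0:
            # A source that never contributes is left untouched.
            new_strengths.append(list(row))
            continue
        new_strengths.append([v / mean for v in row])
        for i in range(len(new_profiles)):
            new_profiles[i][k] *= mean
    return new_profiles, new_strengths


# Illustrative numbers only: two compounds, two sources, two samples.
profiles = [[2.0, 1.0], [0.5, 4.0]]
strengths = [[1.0, 3.0], [2.0, 6.0]]
norm_profiles, norm_strengths = normalize_strengths(profiles, strengths)
print(norm_profiles)
print(norm_strengths)
```

Because the rescaling of a strength row is compensated in the matching profile column, the modelled concentrations, and hence chi-square, are the same before and after normalization.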

The COPREM package includes a Microsoft Excel workbook (“Coprem_from_excel.xls”) with Visual Basic macros that can be used to create the input data tables, call the COPREM program, and draw charts showing the results.

Revised 14.02.2024