Source identifications of airborne fine particles
using positive matrix factorization and U.S. Environmental Protection
Agency positive matrix factorization.
by Kim, Eugene; Hopke, Philip K.
ABSTRACT
The widely used source apportionment model, positive matrix
factorization (PMF2), has been applied to various air pollution data.
Recently, the U.S. Environmental Protection Agency (EPA) developed EPA
positive matrix factorization (PMF), a version of PMF that will be
freely distributed by EPA. The objectives of this study were to conduct
source apportionment studies for particulate matter less than 2.5
µm in aerodynamic diameter (PM₂.₅) speciation data using
PMF2 and EPA PMF (version 1.1) and to compare identified sources between
the two models. In the present study, ambient PM₂.₅ compositional
datasets of 24-hr integrated samples collected at EPA Speciation Trends
Network monitoring sites in Chicago, IL, and Portland, OR, were
analyzed. Both PMF2 and EPA PMF extracted eight sources for the Chicago
data and 10 sources for the Portland data. The model-resolved source
profiles were similar between the two models for both datasets. However, in
several sources, the average contributions did not agree well and the
time series contributions were not highly correlated. The differences
between the PMF2 and EPA PMF solutions were caused by the different
least-squares algorithms and the different nonnegativity constraints. Most
of the average source contributions resolved by both models were within
the 5-95% uncertainty range provided by EPA PMF, indicating that the sources
resolved by both models were reproducible.
INTRODUCTION
Interest in source apportionment studies of airborne particulate
matter (PM) is growing, with an increased focus on controlling the
sources of airborne PM, because many studies have shown a statistical
association between PM and adverse health effects. (1-5)
Among many source apportionment methods, positive matrix
factorization (PMF2) (6) has been shown to be a powerful alternative to
traditional multivariate receptor modeling of airborne PM. (7-10) PMF2
requires one to purchase a license. PMF2 has been used to analyze
ambient PM measurements in the Arctic, (11) in Hong Kong, (12) in
Thailand, (13) in Toronto (Ontario, Canada), (14) in Vietnam, (15) and
in the United States. (16-21)
A more flexible tool to fit multilinear models, the multilinear
engine (ME), (22) was developed to solve any problem that can be
expressed as a sum of products. It has been used to analyze the standard
bilinear factor analysis model (23,24) and multiway models. (25-27)
Recently, as part of an effort to provide free source apportionment
tools for the development and implementation of air quality standards,
the U.S. Environmental Protection Agency (EPA) developed EPA positive
matrix factorization (PMF; version 1.1), (28) which adopts a bilinear
model solved by ME and provides a graphical user interface.
The objective of this study was to examine the source apportionment
results obtained using PMF2 and EPA PMF. The major sources of PM
less than 2.5 µm in aerodynamic diameter (PM₂.₅) were
identified, and their contributions were estimated for two selected EPA
Speciation Trends Network (STN) sites located in Chicago, IL, and
Portland, OR. The identified source compositions and source
contributions were compared for each site.
EXPERIMENTAL WORK
Data Collection
STN PM₂.₅ samples were collected on a one-in-three-day
schedule with a Mass Aerosol Speciation Sampler (URG) and Spiral Aerosol
Speciation Samplers (Met One Instruments) at the monitoring sites
located in Chicago and Portland, respectively. The Chicago monitoring
site (Aerometric Information Retrieval System [AIRS] site code:
170310076; latitude: 41.754; longitude: -87.714) is located in an urban
residential area approximately 10 km southwest of downtown Chicago.
Highways are situated around the monitoring site. The Portland
monitoring site (AIRS site code: 410510246; latitude: 45.561; longitude:
-122.668) is located in an urban residential area approximately 5 km
southwest of the Portland International Airport. Highway 5 is located
750 m west of the site.
PM₂.₅ samples were collected on three different filters. The
Teflon filter was used for mass concentrations and for elemental
analysis via energy dispersive X-ray fluorescence (XRF) spectrometry.
The nylon filter was analyzed via ion chromatography (IC) for sulfate
(SO₄²⁻), nitrate (NO₃⁻), ammonium
(NH₄⁺), sodium (Na⁺), and potassium (K⁺). The
quartz filter was analyzed for organic carbon (OC) and elemental carbon
(EC) via the National Institute for Occupational Safety and
Health/Thermal Optical Transmittance protocol. (29)
Because the reported particulate OC concentrations were not blank
corrected, (30) and carbon denuders were not used in the sampling line
with the quartz filter, the integrated OC blank concentrations,
including trip and field blanks as well as the positive OC artifact, were
estimated using the intercept of the regression of OC concentrations
against PM₂.₅. (21,31,32) The estimated OC blank values, 1.38
µg/m³ at Chicago and 0.87 µg/m³ at Portland,
were subtracted from the reported STN OC concentrations before further
analyses.
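The blank estimation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say which regression variant was used, so ordinary least squares is assumed here, and the function names (`estimate_oc_blank`, `blank_correct`) and the demonstration values are hypothetical, not the study data.

```python
import numpy as np

def estimate_oc_blank(pm25, oc):
    """Fit oc = slope * pm25 + intercept by ordinary least squares
    (an assumption) and return (slope, intercept). The intercept is
    read as the integrated OC blank: the OC that would remain on the
    filter at zero PM2.5 mass."""
    pm25 = np.asarray(pm25, dtype=float)
    oc = np.asarray(oc, dtype=float)
    slope, intercept = np.polyfit(pm25, oc, 1)
    return slope, intercept

def blank_correct(oc, blank):
    """Subtract the estimated blank from every reported OC value."""
    return np.asarray(oc, dtype=float) - blank

# Synthetic values in ug/m3, for illustration only.
pm25_demo = np.array([5.0, 10.0, 15.0, 20.0, 30.0])
oc_demo = 0.2 * pm25_demo + 1.0          # true blank of 1.0 by construction
slope_demo, blank_demo = estimate_oc_blank(pm25_demo, oc_demo)
oc_corrected = blank_correct(oc_demo, blank_demo)
```

Applied to the real datasets, the same intercept would correspond to the reported blank values for Chicago and Portland; corrected values that fall below zero would still have to be handled by the sample screening rules.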
Receptor Modeling
PMF2 is a multivariate receptor model providing source profiles and
their contributions based on a weighted least-squares method that uses
uncertainties for each measurement as the data point weights. (6) ME
provides an approach to the fitting process that is more general and
flexible to solve a variety of receptor modeling problems. (22) ME uses
a structural equation input along with a set of constraints and can
solve widely different multilinear and quasimultilinear problems. In EPA
PMF, ME solves the standard bilinear model. Detailed explanations and
equations are presented in previous publications. (26,33)
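The equations themselves are left to the cited publications. As a hedged illustration, the sketch below evaluates the uncertainty-weighted sum of squared residuals that is generally written for the bilinear model in the PMF literature, Q = sum over i,j of ((x_ij - sum over k of g_ik f_kj) / s_ij)^2. The names `X`, `S`, `G`, and `F` (data, uncertainties, contributions, profiles) and the demonstration matrices are assumptions introduced here; the code only evaluates Q for given factors and does not reproduce the PMF2 or ME solvers.

```python
import numpy as np

def pmf_q(X, S, G, F):
    """Uncertainty-weighted sum of squared residuals for X ~ G @ F.

    X: samples x species concentrations
    S: samples x species uncertainties (all > 0)
    G: samples x factors source contributions
    F: factors x species source profiles
    """
    residual = X - G @ F
    return float(np.sum((residual / S) ** 2))

# Synthetic two-factor example, for illustration only.
G_demo = np.array([[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]])
F_demo = np.array([[0.6, 0.4, 0.0], [0.1, 0.2, 0.7]])
X_demo = G_demo @ F_demo                 # data reproduced exactly by G, F
S_demo = np.full_like(X_demo, 0.1)

q_exact = pmf_q(X_demo, S_demo, G_demo, F_demo)

# Perturb one measurement by two uncertainty units.
X_shift = X_demo.copy()
X_shift[0, 0] += 0.2
q_shift = pmf_q(X_shift, S_demo, G_demo, F_demo)
```

Because each residual is divided by its uncertainty, a data point with a large assigned uncertainty contributes little to Q, which is how the weighting lets imprecise measurements stay in the fit without dominating it.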
An infinite number of solutions to the factor analysis problem are
possible because of the free rotation of matrices. (34) Both PMF2 and
EPA PMF use nonnegativity constraints on the factors, which decrease
this rotational freedom. (6,22) PMF2 estimates the uncertainties
associated with source contributions and profiles from alternating
regression fits, in which each row of the source contribution matrix is
determined while the source profiles are held constant, and each column
of the source profile matrix is determined while the source
contributions are held constant. EPA PMF estimates the uncertainties
associated with its solutions using a bootstrapping method based on the
base case solution, which is selected from several random model runs.
These uncertainties also include the uncertainties originating from the
rotational freedom.
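The rotational freedom mentioned above can be made concrete with a small numerical sketch. For any invertible matrix T, the pair (G T, T⁻¹ F) reproduces the same product as (G, F), so the fit alone cannot distinguish them; requiring all elements to be nonnegative rules out some, but not all, such rotations. The matrices and helper names below are hypothetical and chosen only for illustration.

```python
import numpy as np

def rotate(G, F, T):
    """Return the rotated pair (G @ T, inv(T) @ F); the product is unchanged."""
    return G @ T, np.linalg.inv(T) @ F

def is_nonnegative(A, tol=1e-12):
    """True if every element of A is >= 0 within a small tolerance."""
    return bool(np.all(A >= -tol))

# Synthetic contributions (3 samples x 2 factors) and profiles (2 x 3 species).
G = np.array([[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]])
F = np.array([[0.5, 0.3, 0.2], [0.1, 0.4, 0.5]])

T_small = np.array([[1.0, 0.1], [0.0, 1.0]])   # mild rotation
T_large = np.array([[1.0, 1.0], [0.0, 1.0]])   # stronger rotation

G_small, F_small = rotate(G, F, T_small)
G_large, F_large = rotate(G, F, T_large)

same_fit_small = np.allclose(G_small @ F_small, G @ F)
same_fit_large = np.allclose(G_large @ F_large, G @ F)
allowed_small = is_nonnegative(G_small) and is_nonnegative(F_small)
allowed_large = is_nonnegative(G_large) and is_nonnegative(F_large)
```

Both rotations fit the data equally well, but only the mild one keeps every element nonnegative. This is the sense in which the constraints decrease, without eliminating, the rotational freedom, and why the remaining freedom has to be reflected in the uncertainty estimates.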
[FIGURE 1 OMITTED]
Because the STN data collected before July 2003 were not accompanied
by analytical uncertainties, a comprehensive set of analytical
uncertainty structures estimated by Kim et al. (32) was used in this
study. Based on the reported or estimated analytical uncertainties, the
input data and associated uncertainty matrices were estimated. Measured
concentrations below the method detection limit (MDL) were replaced by
half of the MDL values, and their uncertainties were set at five-sixths
of the MDL values. Missing
concentrations were replaced by the geometric mean of the
concentrations, and their accompanying uncertainties were set at four
times this geometric mean concentration.
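The substitution rules above can be sketched for a single species as follows. Two points are assumptions rather than statements from the text: missing values are encoded as NaN, and the geometric mean is taken over the non-missing values after the below-MDL substitution, which keeps every value entering the logarithm positive. The function name and the demonstration values are hypothetical.

```python
import numpy as np

def substitute_values(conc, unc, mdl):
    """Apply the below-MDL and missing-value rules to one species.

    conc, unc: 1-D arrays of concentrations and uncertainties, with
               NaN marking a missing concentration (assumed encoding)
    mdl:       method detection limit for this species (> 0)
    """
    conc = np.array(conc, dtype=float)   # np.array copies the inputs
    unc = np.array(unc, dtype=float)
    missing = np.isnan(conc)

    # Below-MDL values: concentration = MDL/2, uncertainty = 5/6 MDL.
    below = np.zeros(conc.shape, dtype=bool)
    below[~missing] = conc[~missing] < mdl
    conc[below] = mdl / 2.0
    unc[below] = 5.0 * mdl / 6.0

    # Missing values: concentration = geometric mean, uncertainty = 4x that.
    geo_mean = np.exp(np.mean(np.log(conc[~missing])))
    conc[missing] = geo_mean
    unc[missing] = 4.0 * geo_mean
    return conc, unc

# Synthetic example: one missing value and one value below an MDL of 0.5.
conc_out, unc_out = substitute_values(
    conc=[4.0, np.nan, 0.1, 1.0],
    unc=[0.4, np.nan, 0.05, 0.1],
    mdl=0.5,
)
```

The large uncertainties assigned to substituted values mean those points carry little weight in the weighted least-squares fit, so the placeholder concentrations have limited influence on the resolved factors.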
In this study, samples for which PM₂.₅ or OC data were not
available or were below zero or for which PM₂.₅ or OC data had an
error flag were excluded from the datasets. The samples collected on
July 4, 2002, and July 5, 2003, at Chicago and July 5, 2004, at Portland,
which were highly affected by fireworks displays, were also excluded
from this study. Overall, 18% and 8% of the original data were not included at
Chicago and Portland, respectively. IC SO₄²⁻ was excluded
from the analyses to prevent double counting of mass concentrations,
because XRF S and IC SO₄²⁻ showed good correlations (slope =
2.9, r² = 0.97, for Chicago data; slope = 2.6, r² = 0.95,
for Portland data). Also, IC Na⁺ and IC K⁺ were chosen over XRF Na and
XRF K because of their higher analytical precision.
Chemical species for which more than 90% of the values were below the MDL (Cd, Ce, Cs,
Hf, In, Ir, La, Hg, Nb, P, Rb, Ag, Tb, Y, and Zr at Chicago; Sb, Cd, Ce,
Cs, Co, Eu, Ga, Au, Hf, In, Ir, La, Hg, Mo, Nb, Rb, Sm, Ag, Ta, Tb, Sn,
Y, and Zr at Portland) were excluded. Chemical species that have
signal-to-noise (S/N) ratios below 0.2 (As, Ba, and Sc at Portland) were
considered bad variables and were excluded as recommended by Paatero and
Hopke. (35) Thus, 210 samples and 36 species collected between May 2001
and November 2003 were used for Chicago, and 269 samples and 26 species
collected between October 2002 and April 2005 were used for Portland,
in each case including PM₂.₅ mass concentrations.
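The first species screening rule above (excluding species with more than 90% of values below the MDL) can be sketched as a simple column-wise filter. The function name, the species labels, and the synthetic data are hypothetical, and the input is assumed to have no missing values. The signal-to-noise screen is not reproduced here because the text does not restate its definition.

```python
import numpy as np

def species_to_exclude(conc, mdl, names, max_fraction_below=0.90):
    """Return the names of species whose fraction of values below the
    MDL exceeds max_fraction_below.

    conc:  samples x species concentrations (no missing values assumed)
    mdl:   one MDL per species
    names: species labels, in column order
    """
    conc = np.asarray(conc, dtype=float)
    mdl = np.asarray(mdl, dtype=float)
    fraction_below = (conc < mdl).mean(axis=0)   # broadcasts MDL over rows
    return [n for n, f in zip(names, fraction_below) if f > max_fraction_below]

# Synthetic data: 10 samples, 3 hypothetical species with an MDL of 1.0 each.
conc_demo = np.ones((10, 3)) * 5.0    # species "A": always above the MDL
conc_demo[:, 1] = 0.1                 # species "B": 10 of 10 below the MDL
conc_demo[:9, 2] = 0.1                # species "C": 9 of 10 below the MDL
excluded = species_to_exclude(conc_demo, [1.0, 1.0, 1.0], ["A", "B", "C"])
```

With a strict greater-than comparison, a species sitting exactly at 90% below the MDL is retained; whether the original screening treated that boundary the same way is not stated in the text.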
COPYRIGHT 2007 Air and Waste Management Association. Reproduced with
permission of the copyright holder. Further reproduction or distribution
is prohibited without permission.