COPD usually progresses slowly over decades, and we really don’t know much about how it progresses with respect to important aspects of the disease such as emphysema and airways involvement. Most of what we currently know about the epidemiology of COPD development comes from large population-based studies with longitudinal spirometry (studies like the Normative Aging Study, the Framingham Heart Study, the Lung Health Study, the Copenhagen City Heart Study, and others). These studies show a smooth decline in lung function starting at around age 30, with notable variability in both the rate of decline and the level of maximal lung function achieved. Both factors (rate of lung function loss and maximal lung function attained) are now recognized to be important in accounting for the development of COPD. However, because there has been essentially no information on longitudinal changes in lung CT scans, we know very little about how different types of COPD develop. The purpose of this brief report is to review some data from COPDGene describing the progression of smokers without COPD into two distinct types of COPD, emphysema-predominant and airway-predominant COPD.
COPDGene is the largest COPD-focused study to have longitudinal measurements of chest CT and spirometry, so here we analyze data for 1,084 subjects in GOLD Stage 0 or 1 at baseline. We’ll use data from the baseline and 10-year visit to answer the following questions:
- How many subjects developed COPD over this 10-year period?
- How many developed airway-predominant COPD (APD) and emphysema-predominant COPD (EPD)?
- How well could these subjects be identified at baseline? Can we predict the development of COPD and APD/EPD over a 10-year period?
Some definitions:
- Airway-predominant COPD (APD) = GOLD stage 2,3, or 4 + CT Emphysema <=5% (by LAA-950).
- Emphysema-predominant COPD (EPD) = GOLD stage 2,3, or 4 + CT Emphysema >=10% (by LAA-950).
The folks in between we refer to as Mixed COPD. This was the least common group and isn’t included in this analysis. The link to the paper establishing these definitions is here. One additional complication is that the COPDGene 10-year visit CT scans were obtained at a lower dose than previous visits due to changes in imaging standards. As a result, LAA-950 measures are not available at the 10-year visit and we had to modify the definition to use Perc15 CT Emphysema cutoff values that were selected to approximate the 5% and 10% LAA-950 thresholds.
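To make the subtype rules concrete, here is a minimal sketch of the baseline (LAA-950) definitions as a small Python function. The function name and return labels are made up for illustration and are not from the COPDGene codebase. The Perc15 cutoffs used at the 10-year visit are not given in this report, so they are not reproduced here.

```python
def classify_copd_subtype(gold_stage: int, laa950_pct: float) -> str:
    """Assign a subtype label from GOLD stage and percent emphysema (LAA-950).

    Hypothetical helper: thresholds follow the definitions in the text
    (APD: <=5% emphysema, EPD: >=10% emphysema, Mixed: in between).
    """
    if gold_stage not in (0, 1, 2, 3, 4):
        raise ValueError("gold_stage must be an integer from 0 to 4")
    if gold_stage < 2:
        # GOLD 0 or 1: not counted as APD, EPD, or Mixed in this analysis
        return "GOLD 0/1"
    if laa950_pct <= 5.0:
        return "APD"
    if laa950_pct >= 10.0:
        return "EPD"
    return "Mixed"


# Example subjects (values invented for illustration)
print(classify_copd_subtype(2, 3.1))   # GOLD 2 with little emphysema
print(classify_copd_subtype(3, 14.0))  # GOLD 3 with substantial emphysema
print(classify_copd_subtype(2, 7.5))   # GOLD 2, between the two cutoffs
```

Keeping the GOLD check first mirrors the definitions: emphysema percentage only decides the subtype once a subject is in GOLD stage 2 or higher.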
We also need to describe the characteristics of the subjects at baseline, including some basic details about disease progression as well. The significantly different characteristics between groups are shown below.

Keeping in mind that these subjects were selected to be in GOLD spirometric stage 0 or 1 at baseline, we can see that the subjects destined to progress to APD or EPD have lower FEV1 % predicted values and greater pack-years exposure. Specifically for EPD, these subjects have markedly more emphysema and lower BMI. APD subjects do have thicker airways, and the EPD subjects have smaller airways than the “Stable” group that will not progress.
Now let’s take a look at how these subjects look 10 years later.

Finally, let’s look specifically at measures of 10-year change.

We should remember that some of these changes are built into the subject selection: we are specifically focusing on subjects that started in GOLD 0 or 1 and progressed to GOLD >=2 COPD of the APD or EPD subtype. So the fact of certain changes such as loss of lung function is less interesting than how things have changed. The most notable thing here is that the APD subjects have lost FEV1 without gaining much emphysema or airway wall thickness (these are segmental airways, FYI). The other notable thing is that the EPD group has developed more emphysema and, more interestingly, their airway walls have thickened the most.
Take home lessons:
- Emphysema begets emphysema.
- As subjects enter into COPD, emphysema begets airway wall thickness (big caveat is this is segmental airways).
- The airway-predominant subjects lose lung function without gaining emphysema or having obvious increased thickening of the larger airways. We don’t know what’s driving this progression.
In the image below we see these subjects projected into the 3D space defined by FEV1 % predicted, emphysema, and airway wall thickness. Subjects who progress to EPD are blue, APD is red, and stable subjects remaining in GOLD 0/1 are white.

The 10-year progression to APD/EPD is 80 subjects out of 1084, or about 7% of the study sample. As a reminder, the other group progressing to COPD is the Mixed group (emphysema between 5% and 10%), though this group was smaller than either APD or EPD (n=14 subjects). If we extrapolate those numbers to the population of heavy smokers over 50, this translates to a lot of people at the population level. At the moment, these data are one of our best approximations of how the APD and EPD subtypes develop in the population of older smokers. For the development of preventive treatments, the billion dollar question is whether we could actually identify these subjects before they progress, so that they could be treated early in disease while they still have lots of nice lung left to save.
To address the question of early identification, we used the elastic net approach to construct models that predict APD, EPD, or Stable status at 10 years from baseline predictor data. Since we only have 28 EPD and 52 APD instances, we weighted each point by the inverse of its class frequency in the dataset (for example, each EPD point receives the weight 1/(28/1084)), and we used five-fold cross-validation in the model building process. We only considered 12 predictors, but these results are still at risk of overfitting and will require validation, or ideally re-estimation, when more data on 10-year progression to APD/EPD become available. The final coefficients of the models for each subtype are shown below (FEV1pp_utah = FEV1 % predicted, FEV1_FVC_utah = FEV1/FVC, SmokgCigNow = current smoking status at baseline, pctEmph_Thirona = LAA-950, AWT_seg_Thirona = segmental airway wall thickness, BDR_pct_FEV1 = bronchodilator response).
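As a quick check on the weighting scheme described above, the following sketch computes the inverse-frequency weights for the two progressing classes from the counts given in the text. The variable and function names are illustrative only. The Stable class is left out because its exact count depends on how the 14 Mixed subjects are handled, which is not stated here.

```python
N_TOTAL = 1084                           # GOLD 0/1 subjects at baseline
N_PROGRESSED = {"APD": 52, "EPD": 28}    # 10-year progressors, from the text


def inverse_frequency_weight(n_class: int, n_total: int) -> float:
    """Weight = 1 / (class frequency), e.g. 1 / (28 / 1084) for EPD."""
    return 1.0 / (n_class / n_total)


weights = {
    label: inverse_frequency_weight(n, N_TOTAL)
    for label, n in N_PROGRESSED.items()
}

# Fraction of the sample that progressed to APD or EPD
progression_fraction = sum(N_PROGRESSED.values()) / N_TOTAL

for label, w in weights.items():
    print(f"{label}: weight per subject = {w:.1f}")
print(f"APD/EPD progression: {progression_fraction:.1%}")
```

Under this scheme each EPD subject counts for roughly twice as much as each APD subject, so that the rarer class is not swamped during model fitting.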

Using standard clinical variables plus CT measures of emphysema and segmental airway wall thickness, model performance in cross-validation was decent but not great. Given the class imbalance, we’ll report the area under the precision-recall curve (AUPR), which was 0.09, 0.34, and 0.86 for APD, EPD, and Stable subjects, respectively (one versus all). A more intuitive representation is the percentage of incorrect classifications that would be required to recover 50% of each class. These numbers are 92% and 78% for APD and EPD, respectively. It helps to recognize that in this study the baseline prevalence of the “destined for APD” and “destined for EPD” GOLD 0/1 subjects was 5% and 3%, respectively. So we’re talking about roughly 2-fold and 7-fold enrichment from these models, but the reality is that one would need substantially more effective enrichment in order to have a clinical trial where, say, 50% of the enrolled subjects would progress to the desired subtype (over 10 years).
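The enrichment figures can be reproduced with a little arithmetic. The sketch below assumes that "percentage of incorrect classifications" at 50% recall means the share of flagged subjects who do not belong to the class, that is, 1 minus precision. That reading, and the function name, are assumptions made for illustration.

```python
def enrichment_over_baseline(incorrect_fraction: float, prevalence: float) -> float:
    """Fold enrichment = precision among flagged subjects / baseline prevalence.

    incorrect_fraction: share of flagged subjects that are false positives
                        (assumed here to equal 1 - precision at 50% recall).
    prevalence:         baseline fraction of subjects destined for the subtype.
    """
    precision = 1.0 - incorrect_fraction
    return precision / prevalence


# Figures quoted in the text (rounded prevalences of 5% and 3%)
apd_enrichment = enrichment_over_baseline(0.92, 0.05)
epd_enrichment = enrichment_over_baseline(0.78, 0.03)

# Enrichment that would be needed for 50% of enrolled subjects to progress
apd_needed = 0.50 / 0.05
epd_needed = 0.50 / 0.03

print(f"APD: {apd_enrichment:.1f}-fold achieved, {apd_needed:.0f}-fold needed")
print(f"EPD: {epd_enrichment:.1f}-fold achieved, {epd_needed:.0f}-fold needed")
```

With these inputs the achieved enrichment comes out near 1.6-fold for APD and 7.3-fold for EPD, consistent with the rough 2-fold and 7-fold figures, while a trial with 50% progressors would need about 10-fold and 17-fold enrichment, respectively.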