The Analytic Garden: Omicron and Delta

The omicron variant of concern (VOC) has been circulating worldwide for about two months. I thought it would be interesting to compare its growth rate with that of the currently predominant delta variant.

On December 29, 2021, I downloaded metadata for 6,553,878 sequences from GISAID. It is fair to ask whether GISAID sequence data is a representative sample of the worldwide COVID-19 situation. Here are the top ten countries submitting sequences:

Country	Count
USA	2077490
United Kingdom	1586855
Germany	309670
Denmark	278104
Canada	230413
Japan	185716
France	174125
Sweden	138151
Switzerland	102008
India	99823


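The counts come from the metadata table, here called meta_data:

library(dplyr)
library(tidyr)

# Location looks like "Region / Country / State"; keep only the country
temp <- meta_data %>%
    select(Location) %>%
    separate(Location, c("Region", "Country", "State"), sep = "\\s*/\\s*", extra = "drop", fill = "right") %>%
    select(-c("Region", "State")) %>%
    group_by(Country) %>%
    count() %>%
    rename(Count = n)

# Ten countries with the most submitted sequences
arrange(temp, -Count)[1:10, ]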

The USA and the United Kingdom account for about 56% of all submitted sequences. For the most part, the top ten are WEIRD countries (Western, Educated, Industrialized, Rich and Democratic). Note the absence of China. That's the data we have, so that's the data we use, even though it's not representative of the whole world.
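As a quick check on that share, here is a minimal sketch that recomputes it from the numbers already given in this post. The variable names are made up for illustration and are not part of the original code.

```r
# Counts copied from the table above; the total is the number of sequences downloaded.
total_sequences <- 6553878
top_counts <- c(USA = 2077490, `United Kingdom` = 1586855)

# Fraction of all sequences submitted by the two largest contributors
top_two_share <- sum(top_counts) / total_sequences
round(100 * top_two_share, 1)   # 55.9
```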

Here are plots of the log of the sequence counts for each Variant of Concern (VOC) and Variant of Interest (VOI), for the entire data set and for the USA.

Notice the green dots on the right side of the plots. They represent the omicron variant. The gold points that level off at the top are the delta variant. The green dots rise rapidly. Since this is a log plot, the steep straight-line rise indicates exponential growth, at a higher rate than that seen with delta.

Exponential Growth

The straight lines on the log plot indicate exponential growth. It might be interesting to put some numbers on those growth rates.
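To spell out why a straight line on a log plot means exponential growth, here is the standard relationship. The symbols are introduced here for illustration and do not appear in the code: N(t) is the count at time t, N_0 is the count at t = 0, r is the growth rate per unit time, and T_2 is the doubling time.

```latex
\begin{align*}
N(t) &= N_0 \, e^{r t} \\
\log N(t) &= \log N_0 + r t \\
T_2 &= \frac{\log 2}{r}
\end{align*}
```

The slope of the line on the log plot is r, so a steeper line means a shorter doubling time.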

It appears that delta started to rise linearly on the log plot around April 2021 and leveled off, i.e. became completely dominant, at the beginning of August 2021. I say "appears" because I should test that assumption against other ranges and find the proper date ranges. Maybe later. Right now, I just want a rough measure of growth.

delta_fit <- subsequence_fit(variants_2021_12_19, "VOC Delta", start = "2021-04-01", end = "2021-08-01", title = "Delta")

The code to produce delta_fit and variants_2021_12_19 can be found at https://github.com/analytic-garden/omicron_and_delta.

subsequence_fit produces a plot of the data and returns the fitted model.
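For readers who don't want to open the repository, the fitting step might look roughly like the sketch below. This is not the actual subsequence_fit. The function name fit_growth_window, the column name Variant, and the synthetic data are all assumptions made for illustration, and the plotting is left out. Only the model formula is taken from the summaries in this post.

```r
# Hypothetical stand-in for subsequence_fit(); the real function lives in the repository.
# Assumes a data frame with columns Collection.date (Date), Variant (character)
# and Count (numeric).
fit_growth_window <- function(df, variant, start, end) {
  in_window <- df$Variant == variant &
    df$Collection.date >= as.Date(start) &
    df$Collection.date <= as.Date(end) &
    df$Count > 0                      # log(0) is undefined, so drop empty days
  window_df <- df[in_window, ]
  # Same model form as in the summaries shown in this post
  fit <- lm(log(Count) ~ Collection.date, data = window_df)
  list(fit = fit, data = window_df)
}

# Synthetic example: counts that double every 7 days, with no noise
dates <- seq(as.Date("2021-04-01"), as.Date("2021-05-30"), by = "day")
toy <- data.frame(
  Collection.date = dates,
  Variant = "VOC Delta",
  Count = 10 * 2^(as.numeric(dates - dates[1]) / 7)
)

toy_fit <- fit_growth_window(toy, "VOC Delta", "2021-04-01", "2021-05-30")

# The slope is the growth rate per day; log(2) / slope is the doubling time in days
doubling_days <- unname(log(2) / coef(toy_fit$fit)["Collection.date"])
doubling_days   # should be very close to 7
```

Because the toy counts are built to double every 7 days, recovering a doubling time near 7 is a simple check that the log-linear fit measures what we want.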

> summary(delta_fit$fit)

Call:
lm(formula = log(Count) ~ Collection.date, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.8220 -0.1270  0.0498  0.1784  0.5980 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -7.257e+02  1.296e+01  -55.99   <2e-16 ***
Collection.date  3.906e-02  6.902e-04   56.59   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2718 on 121 degrees of freedom
Multiple R-squared:  0.9636,	Adjusted R-squared:  0.9633 
F-statistic:  3202 on 1 and 121 DF,  p-value: < 2.2e-16

>

We can obtain similar results for omicron.

> summary(omicron_fit$fit)

Call:
lm(formula = log(Count) ~ Collection.date, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.20756 -0.17818  0.06492  0.28299  1.08133 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -4.024e+03  1.228e+02  -32.76   <2e-16 ***
Collection.date  2.125e-01  6.479e-03   32.80   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4839 on 38 degrees of freedom
Multiple R-squared:  0.9659,	Adjusted R-squared:  0.965 
F-statistic:  1076 on 1 and 38 DF,  p-value: < 2.2e-16

>

The slope of each line, the Collection.date coefficient, is the exponential growth rate per day. Using the coefficients, we can get the doubling time for each variant during its growth phase.

> log(2)/delta_fit$fit$coefficients[2]
Collection.date 
        17.7457 
> log(2)/omicron_fit$fit$coefficients[2]
Collection.date 
       3.261681 
>

The count of delta sequences doubled about every 18 days, while the count of omicron sequences is doubling about every three days.
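Another way to read the same slopes is as a percentage increase per day. The sketch below hard-codes the two Collection.date coefficients from the model summaries above. The variable names are illustrative only.

```r
# Slopes (per day) copied from the two model summaries above
slopes <- c(delta = 3.906e-02, omicron = 2.125e-01)

# Percent increase in sequence count from one day to the next
daily_growth_pct <- (exp(slopes) - 1) * 100
round(daily_growth_pct, 1)

# How many times larger the omicron growth rate is than the delta rate
rate_ratio <- unname(slopes["omicron"] / slopes["delta"])
round(rate_ratio, 1)
```

That works out to roughly 4% per day for delta and roughly 24% per day for omicron, with the omicron rate a little over five times the delta rate. These figures describe counts of sequences in the database, not infections.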

It's difficult to translate the count of sequences in the database into transmissibility. The two are related, but the relationship is unclear.

Code for producing the figures for this post is available at https://github.com/analytic-garden/omicron_and_delta.

Contributors

wfmu