Omicron and Delta

The omicron variant of concern (VOC) has been circulating worldwide for about two months. I thought it would be interesting to compare its growth rate with that of the currently predominant delta strain.

On December 29, 2021, I downloaded metadata for 6,553,878 sequences from GISAID. This raises the question of whether GISAID sequence data are a representative sample of the worldwide COVID-19 situation. Here are the top ten countries submitting sequences:

Country           Count
USA               2077490
United Kingdom    1586855
Germany           309670
Denmark           278104
Canada            230413
Japan             185716
France            174125
Sweden            138151
Switzerland       102008
India             99823



library(dplyr)
library(tidyr)

# Split Location on "/" into region, country and state, then count rows per country
temp <- meta_data %>%
    select(Location) %>% 
    separate(Location, c("Region", "Country", "State"), sep = "\\s*/\\s*", extra = "drop", fill = "right") %>% 
    count(Country, name = "Count")

# Top ten countries by number of submitted sequences
arrange(temp, desc(Count)) %>% head(10)

The USA and the United Kingdom make up about 56% of the total sequencing effort. For the most part the top ten are WEIRD countries (Western, Educated, Industrialized, Rich and Democratic). Note the absence of China. That's the data we have, so that's the data we use even though it's not representative of the whole world.
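As a quick check on the 56% figure, here is a minimal sketch that recomputes the shares from the counts in the table above. The variable names are mine and are not part of the original analysis.

```r
# Counts copied from the top-ten table above
top_ten <- c(USA = 2077490, `United Kingdom` = 1586855, Germany = 309670,
             Denmark = 278104, Canada = 230413, Japan = 185716,
             France = 174125, Sweden = 138151, Switzerland = 102008,
             India = 99823)
total_sequences <- 6553878  # rows of metadata downloaded from GISAID

# Share of all sequences contributed by each of the top ten countries
share <- top_ten / total_sequences
round(100 * share, 1)

# Combined share of the USA and the United Kingdom
us_uk_share <- sum(top_ten[c("USA", "United Kingdom")]) / total_sequences
round(100 * us_uk_share, 1)

# Combined share of the whole top ten
top_ten_share <- sum(top_ten) / total_sequences
round(100 * top_ten_share, 1)
```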

Here are plots of the log Counts by Variants of Concern (VOC) and Variants of Interest (VOI) for the entire data set and for the USA.


Notice the green dots on the right side of the plots. They represent the omicron variant. The gold points that level off at the top are the delta variants. The green dots rise rapidly. Since this is a log plot, the steeper straight-line rise indicates exponential growth with a larger growth rate than that seen with delta.

Exponential Growth

The straight lines on the log plot indicate exponential growth. It might be interesting to put some numbers on those growth rates.
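To see why a straight line on a log plot means exponential growth, here is a small simulation. It is only a sketch: the growth rate, starting count and noise level are made-up values, not estimates from the GISAID data. If counts follow N(t) = N0 * exp(r * t), then log N(t) = log N0 + r * t, so a linear fit to the log counts should recover r as the slope.

```r
# Simulated daily counts growing exponentially: N(t) = N0 * exp(r * t)
set.seed(1)
days   <- 0:39
r_true <- 0.2   # assumed daily growth rate, for illustration only
n0     <- 5     # assumed starting count, for illustration only
counts <- n0 * exp(r_true * days) * exp(rnorm(length(days), sd = 0.1))

# On the log scale the model is linear: log N(t) = log N0 + r * t
sim_fit <- lm(log(counts) ~ days)
r_hat   <- unname(coef(sim_fit)["days"])

r_hat            # estimated growth rate, should be close to r_true
log(2) / r_hat   # implied doubling time in days
```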

It appears that delta started to rise linearly on the log plot around April 2021 and leveled off, i.e. became completely dominant, at the beginning of August 2021. I say "appears" because I should test that assumption against other ranges and find the proper date ranges. Maybe later. Right now, I just want a rough measure of growth.

delta_fit <- subsequence_fit(variants_2021_12_19, "VOC Delta", start = "2021-04-01", end = "2021-08-01", title = "Delta")

The code to produce delta_fit and variants_2021_12_19 can be found at https://github.com/analytic-garden/omicron_and_delta.

subsequence_fit produces a plot of the data and returns the fitted model.
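For readers who don't want to open the repository, here is a rough guess at what a function like this might do. It is a hypothetical stand-in, not the actual subsequence_fit. The column names Collection.date and Count come from the lm formula in the summaries, while the Variant column, the show_plot argument and the toy data are assumptions of mine.

```r
# Hypothetical stand-in for subsequence_fit(); the real function is in the repository.
# Assumes columns Variant (character), Collection.date (Date) and Count (numeric).
subsequence_fit_sketch <- function(variants, variant, start, end,
                                   title = variant, show_plot = TRUE) {
  # Keep one variant inside the date window, dropping zero counts (log(0) is -Inf)
  keep <- variants$Variant == variant &
    variants$Collection.date >= as.Date(start) &
    variants$Collection.date <= as.Date(end) &
    variants$Count > 0
  df <- variants[keep, ]
  df <- df[order(df$Collection.date), ]

  # Log-linear fit: the slope is the exponential growth rate per day
  fit <- lm(log(Count) ~ Collection.date, data = df)

  if (show_plot) {
    plot(df$Collection.date, df$Count, log = "y", main = title,
         xlab = "Collection date", ylab = "Count")
    lines(df$Collection.date, exp(fitted(fit)))
  }
  list(fit = fit, data = df)
}

# Toy data with a known growth rate of 0.04 per day (made up for illustration)
toy <- data.frame(
  Variant = "VOC Delta",
  Collection.date = seq(as.Date("2021-04-01"), by = "day", length.out = 60),
  Count = round(10 * exp(0.04 * (0:59)))
)
toy_fit <- subsequence_fit_sketch(toy, "VOC Delta", "2021-04-01", "2021-05-30",
                                  show_plot = FALSE)  # set TRUE to draw the plot
toy_slope <- unname(coef(toy_fit$fit)["Collection.date"])
toy_slope
```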



> summary(delta_fit$fit)

Call:
lm(formula = log(Count) ~ Collection.date, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.8220 -0.1270  0.0498  0.1784  0.5980 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -7.257e+02  1.296e+01  -55.99   <2e-16 ***
Collection.date  3.906e-02  6.902e-04   56.59   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2718 on 121 degrees of freedom
Multiple R-squared:  0.9636,	Adjusted R-squared:  0.9633 
F-statistic:  3202 on 1 and 121 DF,  p-value: < 2.2e-16

> 

We can obtain similar results for omicron.



> summary(omicron_fit$fit)

Call:
lm(formula = log(Count) ~ Collection.date, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.20756 -0.17818  0.06492  0.28299  1.08133 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -4.024e+03  1.228e+02  -32.76   <2e-16 ***
Collection.date  2.125e-01  6.479e-03   32.80   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4839 on 38 degrees of freedom
Multiple R-squared:  0.9659,	Adjusted R-squared:  0.965 
F-statistic:  1076 on 1 and 38 DF,  p-value: < 2.2e-16

> 

The slope of each line, the Collection.date coefficient, is the exponential growth rate per day. Using the coefficients we can get the doubling time, log(2) divided by the slope, for each variant during its growth phase.

> log(2)/delta_fit$fit$coefficients[2]
Collection.date 
        17.7457 
> log(2)/omicron_fit$fit$coefficients[2]
Collection.date 
       3.261681 
>  

The count of delta sequences doubled about every 18 days while omicron doubles in about three days.
It's difficult to translate the count of sequences in the database to transmissibility. There is a relationship but it's unclear.
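To get a feel for how firm these doubling times are, here is a sketch that turns the slopes and standard errors printed in the two summaries into approximate 95% intervals. The intervals reflect only the scatter around the fitted lines. They say nothing about sampling bias or reporting delays in the database, so treat them as optimistic.

```r
# Slopes, standard errors and residual degrees of freedom copied from the
# two summary() outputs above
slope    <- c(delta = 3.906e-02, omicron = 2.125e-01)
se       <- c(delta = 6.902e-04, omicron = 6.479e-03)
resid_df <- c(delta = 121, omicron = 38)

# Approximate 95% confidence limits for each slope
t_crit   <- qt(0.975, resid_df)
slope_lo <- slope - t_crit * se
slope_hi <- slope + t_crit * se

# Doubling time is log(2) / slope, so the larger slope gives the shorter time
doubling <- data.frame(
  days  = log(2) / slope,
  lower = log(2) / slope_hi,
  upper = log(2) / slope_lo
)
round(doubling, 2)

# Growth per day in percent, and how many times faster omicron's rate is
daily_growth_pct <- 100 * (exp(slope) - 1)
round(daily_growth_pct, 1)
rate_ratio <- unname(slope["omicron"] / slope["delta"])
round(rate_ratio, 2)
```

In round numbers this is about 4% more delta sequences per day during its growth phase versus about 24% per day for omicron.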

Code for producing the figures for this post is available at https://github.com/analytic-garden/omicron_and_delta.

