Update 10 | 2020.06.17

Yale SARS-CoV-2 Genomic Surveillance Initiative


What’s new

This week, we added 38 new genomes that were collected in early April from Connecticut (CT). Of these, 35 cluster in the NY-Clade near other CT genomes (see figure 1), which matches the pattern we have seen since mid-March: local transmission linked with New York. The genomes in our analysis predate the SARS-CoV-2 peak in CT, which occurred in late April. Since this peak, the number of new cases reported per day has consistently declined.

⚠️ WARNING: These results should be considered preliminary, as they may change in light of new data.


We previously showed that outbreaks in CT were likely seeded by domestic introductions from the West Coast and New York (NY). We also showed that most genetic sequences collected from COVID-19 cases in southern CT up to late April were more closely related to other sequences in CT and NY than to those of viruses introduced from international or other domestic sources. This is supported by travel and case data.

Genomes collected in CT can be separated into two groups, called clades, based on their ancestral history. Within clades, groups of closely related viruses form lineages. Clades and lineages are labeled in the interactive phylogeny on our nextstrain page. The viruses that first circulated on the West Coast are part of the ‘A’ lineage (found in a group referred to as the WA-Clade, because the earliest detected virus in this group was collected in Washington state). These genomes were collected from some of the earliest cases identified in CT. Genomes collected in March and early to mid-April are in the ‘B.1’ lineage, which falls in the NY-Clade (so named because New York is the most likely recent origin of this group). As we collect more sequences, we are finding more and more genomes from the ‘B.1’ lineage. Among the 171 sequences we sampled from this lineage, there appear to have been a number of separate introductions, followed by sustained transmission clusters later in March and in early April.

SARS-CoV-2 lineages in CT

| Lineage | Sub-lineage | Introduced to CT from | Sampling dates | Number of sequences |
|---|---|---|---|---|
| A | A.1 | Initially introduced to the Western U.S./Canada at least twice, now sustained in CT | March 8 to April 9 | 23 (1 new) |
| A | None | East Asia/Oceania to U.S. Northeast | March 13 to April 6 | 6 (2 new) |
| B | B.1 (mostly B.1.3 and B.1.1) | New York | March 11 to April 17 | 171 (35 new) |
| B | B.2 | Southeastern Asia to U.S. Northeast | March 6 to April 10 | 33 |
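
The "new" counts in the table above can be checked against the 38 genomes added this week. The short sketch below only restates numbers from the table; the B.2 row lists no new genomes, so it is entered as 0, which is an assumption about how to read that row.

```python
# Newly added genomes per (lineage, sub-lineage), copied from the table above.
# Assumption: the B.2 row shows no "(n new)" note, so it is treated as 0 new genomes.
new_genomes = {
    ("A", "A.1"): 1,
    ("A", "None"): 2,
    ("B", "B.1"): 35,
    ("B", "B.2"): 0,
}

# Total number of genomes added in this update.
total_new = sum(new_genomes.values())

# Fraction of the new genomes that fall in the B.1 lineage (NY-Clade).
share_b1 = new_genomes[("B", "B.1")] / total_new

print(f"New genomes this update: {total_new}")
print(f"Share in B.1 (NY-Clade): {share_b1:.1%}")
```

The total comes to 38, and roughly 92% of the new genomes fall in the B.1 lineage.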

New genomes in CT

We added 38 new sequences from CT to our analysis and found that 35 of them are closely related to other sequences sampled in NY and CT, as can be seen in figure 2. These 35 sequences are part of the NY-Clade, which indicates that the NY-Clade has become the predominant clade in CT. The continued grouping of sequences also highlights persistent local transmission in CT: an increasing proportion of new CT genomes cluster with each other rather than with out-of-state genomes. We already knew that NY and other places seeded early CT outbreaks, and this finding reinforces NY’s central role in the continuation of CT outbreaks. The other three new genomes are most closely related to Asia/Oceania sequences that circulate in the U.S. Northeast. All 38 new sequences are from before the peak of SARS-CoV-2 in CT. Since the peak, the number of new cases has decreased.

Public Health Significance

Analysis of the CT genomes collected mainly in March and April indicates a significant amount of local transmission during those months. As in previous updates, we observed that many of the newly sequenced COVID-19 cases in CT trace back to interstate spread from New York. However, as the outbreak progresses and we gather more genomic data, we have seen increasing proportions of new sequences grouping with other CT genomes, rather than out-of-state genomes. This indicates that most infections are likely the result of local transmission in Connecticut.

Policy implications

Our earlier analyses demonstrated the vital importance of inter-municipality and inter-state cooperation and coordination with regard to testing, contact tracing, and social distancing interventions. It is clear that viruses spread across municipal and state boundaries, meaning that state- and municipality-level policies have implications for their neighbors. As policymakers relax social distancing measures, it is important to work together across geographic boundaries and coordinate interventions to limit further spread. With new COVID-19 cases decreasing in CT, understanding the role of public health measures in policy, and how they are implemented, will help prevent new spikes in the number of COVID-19 cases.

As we sequence more genomes, we will be looking to understand at what geographic level most spread occurs (e.g. within cities, a single state, a few states, nationwide, or new international introductions). These data will be coupled with case numbers to determine how longer-term intervention strategies are working and where the strategies can improve.


We analyzed 165 genomes that we previously sequenced and added 38 newly sequenced genomes. The 203 total genomes were generated using either the Nanopore MinION platform or the Illumina MiSeq platform. We used a targeted amplicon sequencing approach following an adapted ARTIC Network protocol. For the preliminary analysis, we also downloaded an additional 515 genomes available on GISAID from around the world and the US, to uncover patterns of viral spread within and from the Northeastern USA in recent weeks. Sequence alignment and phylogenetic analysis were performed using a nextstrain pipeline. Geographic information for each sequence was aggregated by zip code areas with more than 50,000 inhabitants, mostly matching existing CT towns and county borders.
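
As a rough illustration of the geographic aggregation step, the sketch below assigns each zip code to a larger area using the 50,000-inhabitant threshold mentioned above. The report does not describe the actual procedure, so the rule used here (keep the town when its combined population exceeds the threshold, otherwise fall back to the county) and all records, names, and populations are hypothetical.

```python
from collections import defaultdict

MIN_POPULATION = 50_000  # threshold stated in the text

# Hypothetical toy records: (zip_code, town, county, population).
# None of these values come from the report.
zip_records = [
    ("00001", "Town A", "County X", 40_000),
    ("00002", "Town A", "County X", 30_000),
    ("00003", "Town B", "County X", 12_000),
    ("00004", "Town C", "County Y", 8_000),
]


def assign_areas(records, min_population=MIN_POPULATION):
    """Map each zip code to its town if the town's total population
    exceeds the threshold, otherwise to its county (assumed rule)."""
    # Sum the population of all zip codes belonging to each town.
    town_population = defaultdict(int)
    for _zip_code, town, _county, population in records:
        town_population[town] += population

    # Choose the aggregation area for every zip code.
    areas = {}
    for zip_code, town, county, _population in records:
        if town_population[town] > min_population:
            areas[zip_code] = town
        else:
            areas[zip_code] = county
    return areas


areas = assign_areas(zip_records)
for zip_code, area in areas.items():
    print(zip_code, "->", area)
```

This simple rule does not check that the fallback county itself exceeds the threshold; a real implementation would need to handle that case.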

Data availability

The directories consensus_genomes and metadata in our GitHub repository contain all of our current SARS-CoV-2 genomes and metadata. The directory auspice contains a JSON file that was produced using the nextstrain pipeline. A list of GISAID accession numbers of the genomes used in this report can be downloaded from a link at the bottom of our nextstrain page.
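
For readers who want to explore the JSON file in the auspice directory, the following is a minimal sketch of how tips in such a tree might be counted by location. It assumes the file follows the Auspice v2 layout (a top-level "tree" object whose nodes carry "children" and "node_attrs" entries of the form {"value": ...}); the attribute name "division" and the toy tree are assumptions, not taken from the repository.

```python
from collections import Counter


def count_tips(node, attr, counts=None):
    """Recursively count tips (nodes without children) by a node attribute."""
    if counts is None:
        counts = Counter()
    children = node.get("children", [])
    if not children:
        # Tip: read the attribute value, or fall back to "unknown".
        value = node.get("node_attrs", {}).get(attr, {}).get("value", "unknown")
        counts[value] += 1
    for child in children:
        count_tips(child, attr, counts)
    return counts


# Toy stand-in for a parsed Auspice v2 JSON (hypothetical names and values).
toy_dataset = {
    "version": "v2",
    "meta": {},
    "tree": {
        "name": "NODE_0",
        "children": [
            {"name": "tip_1", "node_attrs": {"division": {"value": "Connecticut"}}},
            {
                "name": "NODE_1",
                "children": [
                    {"name": "tip_2", "node_attrs": {"division": {"value": "New York"}}},
                    {"name": "tip_3", "node_attrs": {"division": {"value": "Connecticut"}}},
                ],
            },
        ],
    },
}

tip_counts = count_tips(toy_dataset["tree"], "division")
print(dict(tip_counts))
```

With a real file, the parsed result of json.load would take the place of toy_dataset, and the attribute name should be checked against the keys actually present in the file.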


Mary Petrone, Anne Wyllie, Chantal Vogels, Ed Courchiane, Sarah Prophet, Isabel Ott, and Chaney Kalinich performed the viral RNA extractions. Tara Alpert and Joseph Fauver prepared samples for sequencing and assembled the SARS-CoV-2 genomes. Cole Jensen and Anderson Brito performed the phylogenetic analysis. Cole Jensen, Chaney Kalinich, Mary Petrone, Anderson Brito, and Nathan Grubaugh wrote and reviewed this report. Chaney Kalinich and Peter Neugebauer developed and maintain this COVIDTrackerCT website. Mario Peña-Hernández leads all Spanish translation. Nathan Grubaugh leads the Yale SARS-CoV-2 Genomic Epidemiology Initiative. Finally, we also thank the authors of the genomes in our complementary dataset for making their data freely available to other researchers: a full list of authors is provided at the bottom of our dedicated nextstrain page.
