Yale SARS-CoV-2 Genomic Surveillance Initiative
Recent outbreaks in Danbury, CT
We are continuing to monitor the spread of a SARS-CoV-2 outbreak in Danbury, CT that began in late July 2020. Twenty-eight new SARS-CoV-2 genomes were sequenced from Danbury samples collected between September 12th and 19th. These genomes complement the 27 genomes reported in Update 12, which came from samples collected between August 14th and 17th. As we previously reported, the samples from August identified four transmission chains, the largest of which arises from a lineage that has been circulating in the New York and Connecticut region since March. Nineteen of the 28 new genomes reported here join that cluster, further supporting the conclusion that this outbreak was primarily caused by a resurgence of local lineages. We continue to see small clusters of 1-2 genomes likely of European and Asian origin and a cluster of 3 genomes which likely originated in June in Brazil. Large gaps in sampling from recent months limit the conclusions that we can draw from these data, such as whether Brazilian variants of SARS-CoV-2 traveled directly from Brazil to the U.S. or whether these variants circulated in other nations or U.S. states before arriving in Connecticut.
Measurements of social distancing use anonymized mobile device location data to track how frequently people come into close contact. These data show a large number of social contacts in certain areas of Danbury in June, just prior to the beginning of the outbreak. The preponderance of genomes belonging to a local lineage and the increased frequency of contacts both suggest that the outbreak in Danbury is a result of changes in adherence to public health measures such as social distancing.
⚠️ WARNING: These results should be considered preliminary, as they may change in light of new data.
Our laboratory received 38 patient samples collected in September from Quest via the Connecticut Department of Health for the continued investigation of the outbreak in Danbury, CT. We were able to obtain 28 new SARS-CoV-2 genomes using our rapid sequencing approach. Combining these new virus genomes with other SARS-CoV-2 genomes sequenced by our group, including 27 from Danbury (collected in August), 234 from other regions of CT, 10 from NY, as well as 5857 from all over the world, we used phylogeographic analysis to uncover the possible origins of viruses causing recent outbreaks in Danbury.
We reported in Update 12 that the samples from August formed 4 distinct clusters. The majority of these cases clustered together, arising from a lineage that was introduced in New York in February or March 2020 (cluster #1, Figure 1A) and continues to circulate widely in the Northeast of the United States. A total of 19 new genomes from September joined cluster #1 (Figure 1B), indicating that the majority of cases continue to arise from this local lineage.
Figure 1. Phylogeny for all Danbury genomes shows one large cluster and several smaller groupings. (A) The global tree shows the distribution of Danbury genomes from both Update 12 and this current update. The recent genomes were recruited to existing clusters (#1 and #4) and one new cluster was formed (#5). Three genomes do not cluster with any other Danbury samples. (B) A subtree of cluster #1 that belongs to a lineage from the Northeast of the U.S.
Cluster #4, which originated in the United Kingdom, still appears to be circulating in Danbury, as supported by the 3 new genomes that joined this cluster. The remaining genomes were distributed among lineages that we had not seen in our previous data from Danbury in August. Three genomes formed a new cluster (cluster #5, Figure 2), and the remaining 3 genomes separated into distinct lineages. Cluster #5 has early origins in Brazil, suggesting a travel-based introduction into Connecticut. Due to the sparse sampling between May and September, we cannot rule out the possibility that these variants were introduced to CT earlier and are only now being detected.
Of the three remaining genomes (Figure 1A, single genomes), one is related to the same Northeastern lineage as cluster #1, but groups more closely with a different set of U.S. genomes in NY and CA. Two genomes are both related to lineages originating from the United Kingdom that have also been circulating throughout many areas of Europe and the Middle East.
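The cluster assignments described above can be checked with a short tally. This is a minimal sketch that uses only the counts stated in this report; the dictionary keys are illustrative labels and are not identifiers from the analysis pipeline.

```python
# Number of new September genomes per grouping, as stated in this report.
# The key names are illustrative labels only.
new_genomes = {
    "cluster_1_northeast": 19,
    "cluster_4_uk": 3,
    "cluster_5_brazil": 3,
    "unclustered_singles": 3,
}


def summarize(counts):
    """Return (count, percent share) per grouping, plus the overall total."""
    total = sum(counts.values())
    shares = {
        name: (n, round(100 * n / total, 1)) for name, n in counts.items()
    }
    return shares, total


shares, total = summarize(new_genomes)
for name, (n, pct) in shares.items():
    print(f"{name}: {n} genomes ({pct}%)")
print(f"total: {total}")
```

The total matches the 28 new genomes, and roughly two thirds of them fall in cluster #1.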
Figure 2. Subtree showing the new cluster of Danbury genomes. Three genomes formed a new cluster (#5, red box) which has early origins in Brazil in April and May. The paucity of genomes from May-August creates uncertainty about when the virus traveled to CT and whether it traveled via an intermediary country.
Our data show at least five transmission chains involved in the Danbury, CT outbreak. There is continued transmission of lineages from the Northeast and the U.K. (cluster #1 and cluster #4) as well as newly identified cases relating to Brazil (cluster #5). Only with continued sampling can we determine whether the single genomes in Figure 1A are isolated events or the beginning of new transmission chains. These cases could represent new introductions into Connecticut, or lineages that have been circulating cryptically in our community and are only detected during a large outbreak.
Public Health Significance
By sequencing viral genomes from recent outbreaks, we detected new transmission chains that had not been identified in previous months. SARS-CoV-2 can circulate cryptically for weeks or months before it is identified in a major outbreak. The virus does not respect geographic borders, and continues to spread from state to state within the US through short- or long-range travel (e.g. commuting and flights). Our phylogenetic results confirm this pattern and highlight the vital importance of inter-municipality and inter-state cooperation to curb the viral spread, via testing, contact tracing, and social distancing interventions.
The sequencing of SARS-CoV-2 genomes is particularly beneficial when sampling is done continuously (weekly). With viruses being sequenced regularly, we would be able to fill the sampling gaps and get a better understanding of the geographic level at which most spread occurs (e.g. within cities, a single state, a few states, nationwide, or new international introductions). With samples and analyses being released in near real time, as was done in March and April, genomic epidemiology can provide valuable information for planning direct interventions to prevent further spread.
The 28 new genomes were generated using a Nanopore MinION platform. We used a targeted amplicon sequencing approach following an adapted ARTIC Network protocol. To perform preliminary analysis, we also downloaded other genomes available on GISAID and NCBI, from around the world and the US, to uncover recent patterns of viral spread in Connecticut. Sequence alignment and phylogenetic analysis were performed using a Nextstrain pipeline. Geographic information for each sequence from CT was aggregated by zip code areas with more than 50,000 inhabitants, mostly matching existing CT towns and county borders.
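The geographic aggregation step can be illustrated with a minimal sketch. It assumes a lookup table from zip code to town, county, and town population, and it assumes that sequences from areas at or below the 50,000-inhabitant threshold are labeled by county instead. Both the lookup table (placeholder zip codes, names, and populations) and the county fallback are hypothetical and are not taken from the actual pipeline.

```python
from collections import Counter

MIN_POPULATION = 50_000  # threshold stated in the methods text

# Hypothetical lookup: zip code -> (town, county, town population).
# All values are placeholders, not real Connecticut data.
ZIP_LOOKUP = {
    "00001": ("Town A", "County X", 80_000),
    "00002": ("Town A", "County X", 80_000),
    "00003": ("Town B", "County X", 12_000),
    "00004": ("Town C", "County Y", 55_000),
}


def location_label(zip_code):
    """Return the town if it exceeds the threshold, otherwise the county.

    The county fallback is an assumption made for this sketch.
    """
    town, county, population = ZIP_LOOKUP[zip_code]
    return town if population > MIN_POPULATION else county


def aggregate(sequence_zips):
    """Count sequences per aggregated location label."""
    return Counter(location_label(z) for z in sequence_zips)


counts = aggregate(["00001", "00002", "00003", "00004", "00003"])
print(dict(counts))
```

Aggregating to larger areas in this way keeps location labels coarse enough that individual patients cannot be identified from small zip codes.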
The directories consensus_genomes and metadata in our GitHub repository contain all of our current SARS-CoV-2 genomes and metadata. The directory auspice contains a JSON file that was produced using the Nextstrain pipeline. A list of GISAID/NCBI accession numbers of genomes used in this report can be downloaded from a link at the bottom of our Nextstrain page.
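For readers who want to explore the JSON file in the auspice directory, the sketch below counts tips (genomes) per state. It assumes the file follows the Auspice v2 layout, in which the top-level "tree" entry is a nested structure of nodes with "children" and per-node traits stored as node_attrs[trait]["value"], and it assumes a "division" trait is present. The small in-memory tree is a stand-in for illustration, not data from this report.

```python
from collections import Counter


def iter_tips(tree):
    """Yield leaf nodes of a nested Auspice-style tree (iterative, no recursion)."""
    stack = [tree]
    while stack:
        node = stack.pop()
        children = node.get("children", [])
        if children:
            stack.extend(children)
        else:
            yield node


def trait(node, name):
    """Read a trait stored as node_attrs[name]["value"]; None if absent."""
    return node.get("node_attrs", {}).get(name, {}).get("value")


def tips_per_division(tree):
    """Count tips by their 'division' trait (assumed to hold the state)."""
    return Counter(trait(tip, "division") for tip in iter_tips(tree))


# Tiny stand-in for the "tree" entry of an Auspice JSON file.
example_tree = {
    "name": "root",
    "children": [
        {"name": "tip1", "node_attrs": {"division": {"value": "Connecticut"}}},
        {
            "name": "inner",
            "children": [
                {"name": "tip2", "node_attrs": {"division": {"value": "Connecticut"}}},
                {"name": "tip3", "node_attrs": {"division": {"value": "New York"}}},
            ],
        },
    ],
}

division_counts = tips_per_division(example_tree)
print(dict(division_counts))
```

To apply this to the real file, load it with the standard json module and pass its "tree" entry to tips_per_division. The file name is not given here and would need to be looked up in the repository.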
Tara Alpert processed and sequenced the samples and processed the sequencing data. Anderson Brito performed the phylogenetic analysis. Tara Alpert, Anderson Brito, and Nathan Grubaugh wrote and reviewed this report. Chaney Kalinich and Peter Neugebauer developed and maintain this COVIDTrackerCT website. Nathan Grubaugh leads the Yale SARS-CoV-2 Genomic Epidemiology Initiative. Finally, we thank the authors of the genomes in our complementary dataset for making their data freely available to other researchers; a full list of authors is provided at the bottom of our dedicated Nextstrain page.