Abstract
In December 2019, a novel respiratory disease was identified in Wuhan, China, and rapidly spread across the globe. The etiological agent was identified as the coronavirus SARS-CoV-2, which is now known to cause COVID-19. Since its identification, over 511 million positive cases and over 6 million deaths have been reported as of 1 May 2022. For my thesis, I retrospectively investigated the evolution and diversity of the virus in Nebraska between 19 March 2020 and 31 March 2022. I focused on outbreaks and intrahost variation by performing whole viral genome sequencing. Overall, I sequenced 4,845 PCR-positive patient viral samples collected from the CHI Health Creighton University Medical Center Laboratory. From 19 March 2020 to 31 March 2021, 960 samples were sequenced for genomic analysis. Seventy-three lineages were identified, beginning with A.1. The lineage composition shifted over the year, with B.1 and B.1.2 making up the majority of the lineages. This study period concluded with the first appearance of the Variant of Concern Alpha. Variant analysis indicated a mutation rate of 2.1 mutations per month. We also analyzed patient medical data to better understand factors affecting patient outcomes. Using data collected from 12,690 patient medical records, we were able to evaluate patient demographics and treatments. This analysis indicated that patients with increased age (>65), high BMI, a history of diabetes, or a history of stroke were at higher risk of death from COVID-19. From 1 April 2021 to 31 March 2022, 3,461 samples were sequenced and 103 lineages were identified. Alpha samples continued through the beginning of the study period before the rise of Delta, followed by Omicron. Variant analysis showed increased mutation rates concentrated at the 3’ end of the genome but an overall decrease of 0.04 mutations per month.