Notes to The Human Genome Project
1. The Bermuda Accord, agreed to in February 1996 by the world’s major sequencing laboratories, which at the time included Venter’s TIGR, required the public release of sequence data every 24 hours (see Reardon 2017; Jones et al. 2018).
2. The heterochromatic, gene-poor regions that make up about 10 percent of the genome (0.2 of 3.1 billion base pairs) were too difficult to sequence with available technologies because of the highly repetitive stretches of DNA they contain. These regions are located especially at the tips and centromeres of the chromosomes. The euchromatic, gene-rich regions of the genome (2.9 of 3.1 billion base pairs) were 99 percent completed, at an accuracy of 99.99 percent. About 400 gaps remained, separating contiguous fragments with an average size of more than 27 million bases; these gaps resulted from difficulties in amplifying the regions for sequencing, perhaps because of unusual shape or toxicity to the bacteria used (Anonymous 2003; Wade 2003). The “human reference genome assembly” has been updated on an ongoing basis since 2003; efforts are made not only to fill in missing sequences in the genome but also to represent population genomic diversity (Schneider et al. 2017). In 2020, GRCh38.p12, the current version of the human reference sequence, was found to have 191 euchromatic gaps and 592 non-euchromatic gaps (Zhao et al. 2020). Technological advances have increased scientists’ ability to sequence the highly repetitive, heterochromatic regions of the genome, with the first gapless, telomere-to-telomere assembly of a human chromosome, the X chromosome, completed in 2020 (Miga 2021).