Article
Authors: Olsen, Hugh E ; Wenger, Aaron M ; Haukness, Marina ; Eichler, Evan E ; Murphy, Terence D ; Rhie, Arang ; Chin, Chen-Shan ; Mc Cartney, Ann M ; Cechova, Monika ; Shafin, Kishwar ; O'Neill, Rachel J ; Phillippy, Adam M ; Miga, Karen H ; Thibaud-Nissen, Françoise ; Schatz, Michael C ; McNulty, Brandy M ; Hook, Paul W ; Hunt, Sarah E ; Harris, Robert ; Guarracino, Andrea ; Jain, Miten ; Rautiainen, Mikko ; Medvedev, Paul ; Tomaszkiewicz, Marta ; Nurk, Sergey ; Surapaneni, Likhitha ; Potapova, Tamara ; Gershman, Ariel ; Taravella Oill, Angela M ; Walenz, Brian P ; McDaniel, Jennifer ; Weissensteiner, Matthias H ; Lucas, Julian K ; Shepelev, Valery A ; Asri, Mobin ; Makalowski, Wojciech ; Hansen, Nancy F ; Harvey, William T ; Fungtammasan, Arkarachai ; Markovic, Christopher ; Olson, Nathan D ; Martin, Fergal J ; Diekhans, Mark ; Zook, Justin M ; Formenti, Giulio ; Gerton, Jennifer L ; Hubley, Robert M ; Sedlazeck, Fritz J ; Wilson, Melissa A ; Alexandrov, Ivan A ; Hwang, Stephen ; Ryabov, Fedor ; Hourlier, Thibaut ; Zarate, Samantha ; Makova, Kateryna D ; Grady, Patrick G S ; Sauria, Michael E G ; Haggerty, Leanne ; Paulin, Luis F ; McCoy, Rajiv C ; Kesharwani, Rupesh K ; Bzikadze, Andrey V ; Shumate, Alaina ; Watwood, Allison C ; Altemose, Nicolas ; Garrison, Erik ; Li, Heng ; Koren, Sergey ; Hoyt, Savannah J ; Salzberg, Steven L ; Taylor, Dylan J ; Hartley, Gabrielle A ; Allen, Jamie ; Chen, Nae-Chyun ; Garcia Giron, Carlos ; Halabian, Reza ; Vollger, Mitchell R ; Flicek, Paul ; Porubsky, David ; Storer, Jessica M ; Zhu, Yiming ; Logsdon, Glennis A ; Munson, Katherine M ; Lewis, Alexandra P ; Mikheenko, Alla ; Heinz, Jakob ; Timp, Winston
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.