25-Aug-2022

Long Read Sequencing Helps HLA Typing

Summary

Author Name: Dianna Gellar

Editor: Dianna Gellar

Last Updated: 26-Aug-2022

HLA (human leukocyte antigen) is a cluster of tightly linked genes on the short arm of human chromosome 6. It is the central basis on which the immune system recognizes and distinguishes self from non-self (allogeneic) material. The HLA region is highly polymorphic and underpins the adaptive (acquired) immune system. It is strongly associated with a variety of autoimmune diseases, tumors, and infectious diseases. HLA also plays a crucial role in organ and bone marrow transplantation and is associated with serious adverse reactions to many drugs. HLA typing is therefore valuable for research on immune-related diseases, screening of vaccine and drug targets, evolutionary studies, and transplantation.

However, because of this high degree of polymorphism, HLA forms a complex antigen typing system, and for a long time there was no molecular typing method that combined high throughput, high accuracy, and low cost. Sequencing manufacturers have often used the HLA region as a benchmark target to demonstrate the stability and accuracy of their technologies. Even so, neither serological screening nor first-generation (Sanger) or next-generation sequencing reliably reaches "gold standard" accuracy, and ambiguous or erroneous results remain unavoidable.

With long-read sequencing, whole human genome sequencing can accurately detect genetic variation between a sample and the reference genome, or between individuals, by aligning reads of 10-20 kb to the reference. This includes structural variation (SV) and copy number variation (CNV) that is difficult or impossible to detect with short-read next-generation resequencing.
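As a rough illustration of how a single long read aligned to the reference can expose a structural variant, the sketch below scans an alignment's CIGAR string (the SAM-format description of matches, insertions, and deletions) for large indels. The 50 bp size cutoff is a commonly used convention assumed here, and the function name, CIGAR string, and coordinates are hypothetical. Real SV callers also use split and supplementary alignments, coverage, and evidence from many reads.

```python
import re

# CIGAR operations that advance the position on the reference (SAM format).
REF_CONSUMING = {"M", "D", "N", "=", "X"}

def large_indels(cigar, ref_start, min_size=50):
    """Return (type, reference_position, length) for indels >= min_size bp."""
    events = []
    ref_pos = ref_start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in ("I", "D") and length >= min_size:
            events.append(("INS" if op == "I" else "DEL", ref_pos, length))
        if op in REF_CONSUMING:
            ref_pos += length
    return events

# Hypothetical alignment of one ~15 kb read starting at reference position 1000.
events = large_indels("5000M320D7000M85I3000M", ref_start=1000)
print(events)
```

Because the read covers both flanks of each event, the deletion and the insertion are each reported from one alignment, with no need to infer them from read-pair distance or depth.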

With long read lengths and no PCR amplification step (which avoids amplification-induced errors), long-read human resequencing has become a new strategy for mining genetic variation in the human genome. Long reads can directly span large structural variants, tandem repeats, GC-rich regions, highly homologous regions, and highly polymorphic regions.

To achieve high-precision typing of HLA genes, a sequencing technology must satisfy two conditions. The first is high accuracy, so that SNPs are determined correctly. The second is complete haplotype analysis, which requires reads long enough to phase variants; any assembly ambiguity or uneven coverage can lead to typing failure.
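The haplotype condition can be made concrete with a small sketch. Two heterozygous sites can be phased directly only if at least one read covers both, which requires the distance between them to be smaller than the read length. The site positions and read lengths below are hypothetical, and the model ignores coverage and sequencing errors.

```python
def phase_blocks(het_positions, read_length):
    """Group heterozygous sites into blocks that single reads could link.

    Adjacent sites join the same block when their distance is smaller than
    the read length, i.e. when one read can cover both.
    """
    blocks = []
    for pos in sorted(het_positions):
        if blocks and pos - blocks[-1][-1] < read_length:
            blocks[-1].append(pos)
        else:
            blocks.append([pos])
    return blocks

# Hypothetical heterozygous SNP positions within one gene (in bp).
sites = [120, 410, 980, 2600, 3350]

short_blocks = phase_blocks(sites, read_length=150)    # short-read scale
long_blocks = phase_blocks(sites, read_length=15000)   # long-read scale
print(len(short_blocks), len(long_blocks))
```

With 150 bp reads every site ends up in its own block, so the phase between sites is unknown. With 15 kb reads all five sites fall in a single block, which is what complete haplotype analysis requires.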

Long-read sequencing (PacBio SMRT and Oxford Nanopore sequencing), together with long-read amplicon sequencing, addresses this molecular typing challenge and enables targeted sequencing of HLA. Long amplicons span complete HLA class I and class II genes, allowing high-precision HLA genotyping and unambiguous allele segregation without imputing haplotype phase. HLA typing by long-read sequencing detects variants within the 5' UTR, introns, and 3' UTR regulatory regions, fully characterizes minor variants in polyclonal samples (e.g., cancer), and obtains direct evidence of novel HLA alleles by de novo assembly.
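Allele segregation from full-length amplicon reads can be sketched as follows. Each read is reduced to its bases at the heterozygous sites, giving a "signature", and the most frequent signatures are taken as the alleles. The signatures and the helper name are hypothetical, and this is a toy model rather than a production typing method, which would also use consensus building and comparison against an HLA allele database.

```python
from collections import Counter

def segregate_alleles(read_signatures, n_alleles=2):
    """Return the n_alleles most frequent signatures, most frequent first.

    Each signature is the string of bases one full-length read shows at the
    heterozygous sites, so a signature is already a phased haplotype.
    """
    counts = Counter(read_signatures)
    return [signature for signature, _ in counts.most_common(n_alleles)]

# Hypothetical signatures at four heterozygous sites; "ACGA" is a read with
# one sequencing error and is outvoted by the two true alleles.
reads = ["ACGT", "ACGT", "ACGT", "TGCA", "TGCA", "ACGA"]
alleles = segregate_alleles(reads)
print(alleles)
```

Because each read spans the whole amplicon, the two alleles are read off directly and no statistical inference of phase is needed.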

Single-molecule real-time (SMRT) sequencing is now well suited to high-throughput HLA typing and enables more accurate allele identification.