Structural variations (SVs) or copy number variations (CNVs) greatly impact the functions of the genes encoded in the genome and are responsible for diverse human diseases. Although a number of existing SV detection algorithms can detect many types of SVs using whole genome sequencing (WGS) data, no single algorithm can call every type of SV with high precision and high recall. We comprehensively evaluate the performance of 69 existing SV detection algorithms using multiple simulated and real WGS datasets. The results highlight a subset of algorithms that accurately call SVs depending on specific types and size ranges of the SVs and that accurately determine breakpoints, sizes, and genotypes of the SVs. We enumerate potential good algorithms for each SV category, among which GRIDSS, Lumpy, SVseq2, SoftSV, Manta, and Wham are better algorithms in deletion or duplication categories. To improve the accuracy of SV calling, we systematically evaluate the accuracy of overlapping calls between possible combinations of algorithms for every type and size range of SVs. The results demonstrate that both the precision and recall for overlapping calls vary depending on the combinations of specific algorithms rather than the combinations of methods used in the algorithms. These results suggest that careful selection of the algorithms for each type and size range of SVs is required for accurate calling of SVs. The selection of specific pairs of algorithms for overlapping calls promises to effectively improve the SV detection accuracy.

Genomic structural variations (SVs) are generally defined as deletions (DELs), insertions (INSs), duplications (DUPs), inversions (INVs), and translocations (TRAs) of at least 50 bp in size. SVs are often considered separately from small variants, including single nucleotide variants (SNVs) and short insertions and deletions (indels), as these are often formed by distinct mechanisms. INVs and TRAs are balanced forms, with no net change in a genome, and the remaining SVs are imbalanced forms. Imbalanced DELs and DUPs are also referred to as copy number variations (CNVs), with DUPs comprising tandem and interspersed types depending on the distance between the duplicated copies. INSs are categorized into several classes based on the insertion sequences: mobile element insertions (MEIs), nuclear insertions of mitochondrial genome (NUMTs), viral element insertions (referred to as VEIs in this study), and insertions of unspecified sequence.

SVs are largely responsible for the diversity and evolution of human genomes at both the individual and population levels. The genomic difference between individuals caused by SVs has been estimated to be 3–10 times higher than that caused by SNVs. Consequently, SVs could have a greater impact on gene functions and phenotypic changes than SNVs and short indels do. Accordingly, SVs are associated with a number of human diseases, including neurodevelopmental disorders and cancers.

Two types of methods have been used to detect SVs: (1) array-based detection, including microarray comparative genome hybridization (array CGH), and (2) sequencing-based computational methods. Array-based methods are advantageous for high-throughput analysis, but they detect only certain types of SVs, have a lower sensitivity for small SVs, and have a lower resolution for determining breakpoints (BPs) than the sequencing-based methods. Although sequencing requires more time and money than array-based methods, detecting a broad range of SVs requires sequencing-based methods, as in recent projects aimed at identifying SVs on a population scale. Sequencing-based methods take several conceptual approaches to derive information about SVs from short read sequencing data.
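The idea of taking overlapping calls between two algorithms and scoring them by precision and recall can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `SVCall` record, the helper names, the 80% reciprocal-overlap threshold, and the toy coordinates are hypothetical and are not taken from the study or from any of the named tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    """Hypothetical minimal SV record (not a format used by any real caller)."""
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # inclusive
    svtype: str  # e.g. "DEL", "DUP", "INV"


def reciprocal_overlap(a: SVCall, b: SVCall) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if no match is possible."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start + 1), shared / (b.end - b.start + 1))


def overlapping_calls(calls_a, calls_b, min_ro=0.8):
    """Keep the calls in calls_a that are supported by at least one call in calls_b."""
    return [a for a in calls_a
            if any(reciprocal_overlap(a, b) >= min_ro for b in calls_b)]


def precision_recall(calls, reference, min_ro=0.8):
    """Precision: fraction of calls matching a reference SV.
    Recall: fraction of reference SVs matched by a call."""
    true_calls = sum(
        any(reciprocal_overlap(c, r) >= min_ro for r in reference) for c in calls)
    found_refs = sum(
        any(reciprocal_overlap(r, c) >= min_ro for c in calls) for r in reference)
    precision = true_calls / len(calls) if calls else 0.0
    recall = found_refs / len(reference) if reference else 0.0
    return precision, recall


# Toy data, invented for this example only.
reference = [
    SVCall("chr1", 1000, 2000, "DEL"),
    SVCall("chr1", 5000, 5600, "DEL"),
    SVCall("chr2", 300, 900, "DUP"),
]
caller_a = [
    SVCall("chr1", 1010, 1990, "DEL"),
    SVCall("chr1", 8000, 8500, "DEL"),  # false positive
    SVCall("chr2", 310, 905, "DUP"),
]
caller_b = [
    SVCall("chr1", 990, 2010, "DEL"),
    SVCall("chr1", 5000, 5600, "DEL"),
    SVCall("chr2", 300, 900, "DUP"),
    SVCall("chr3", 100, 400, "DEL"),    # false positive
]

shared_calls = overlapping_calls(caller_a, caller_b)
single = precision_recall(caller_a, reference)
combined = precision_recall(shared_calls, reference)
```

In this toy data, intersecting the two call sets discards the false positive that only `caller_a` reports, so precision rises from 2/3 to 1.0 while recall stays at 2/3. Real call sets would typically also need breakpoint tolerance for insertions and translocations, which a span-based reciprocal overlap does not handle.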