Scientists from deCODE genetics, a subsidiary of Amgen, and the University of Iceland have developed an artificial intelligence method that predicts RNA splicing, a fundamental process in gene regulation, more accurately than existing tools. The approach, named Spliceformer-45k, is built on a transformer architecture and outperforms SpliceAI in the reported benchmarks.
RNA splicing is a critical step in gene expression. Through alternative splicing, a single gene can produce multiple transcripts and therefore multiple proteins, which greatly expands protein diversity.
Misregulation of splicing is associated with various diseases, including cancer and neurodegenerative disorders. Accurate prediction of splicing patterns is essential for understanding genetic diseases and developing targeted therapies.
Using data from nearly 18,000 Icelandic RNA samples and the Genotype-Tissue Expression (GTEx) project, Spliceformer-45k demonstrated superior performance in detecting splice junctions and disease-related splicing variants. Key findings include:
Higher Precision-Recall Performance: Achieved an area under the precision-recall curve (PR-AUC) of 0.834, compared with 0.820 for SpliceAI-10k.
Improved Disease Variant Detection: Outperformed SpliceAI in identifying pathogenic variants from the ClinVar database.
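For readers unfamiliar with the PR-AUC metric cited above, the following is a minimal sketch of one common way to estimate it, known as average precision. It is not the authors' evaluation code. The function name pr_auc and the toy labels and scores are hypothetical and chosen only for illustration, and the sketch assumes there are no tied scores.

```python
def pr_auc(labels, scores):
    """Estimate the area under the precision-recall curve (average precision).

    labels: 1 for a true splice site, 0 otherwise (toy data, not from the study).
    scores: model confidence for each example; higher means more likely positive.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    true_pos = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            # Precision at this rank, weighted by the recall gained (1 / total_pos).
            area += (true_pos / rank) / total_pos
    return area


# Hypothetical example: three true sites among six candidates.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(round(pr_auc(toy_labels, toy_scores), 3))
```

A score of 1.0 means every true site is ranked above every false candidate. PR-AUC is often preferred over accuracy when positives are rare, as splice sites are relative to all positions in a transcript.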
The method gives genetic researchers and clinical diagnosticians a more accurate tool for studying splicing-related diseases and for identifying the variants that cause them.