Multigene testing by next-generation sequencing (NGS) yields a large amount of information and can detect multiple molecular alterations. The subsequent clinical interpretation, which is needed to select a treatment strategy, is time-consuming. Existing databases often contain inconsistent information and are not regularly updated. Using ESCAT levels of evidence requires a deep understanding of the nature of the alterations and does not indicate which therapy option to select when multiple biomarkers with the same level of evidence are detected. To address these issues, we created the Clinical Relevance of Alterations in Cancer (CRAC) database, which describes the relevance of alterations detected in specific genes that are often analyzed as part of NGS panels. A team of oncologists and biologists assigned a CRAC score from 1 to 10 to each biomarker (a type of genomic alteration characteristic of specific genes) for 15 malignancies; the average score was entered into the database. CRAC scores numerically reflect two factors: therapy availability and the prospects of treatment with experimental drugs for patients with a particular tumor type. A total of 134 genes and 15 of the most common tumor types were selected for CRAC. Biomarker-nosology associations with CRAC scores of 1-3 are the most frequent (n=2719 of 3495; 77.8%), and those with the highest CRAC scores, 9 and 10, are the least frequent (n=52 of 3495; 1.5%). To estimate the practical effectiveness of the CRAC database, 208 comprehensive molecular profiling reports were retrospectively analyzed, and the applicability of CRAC was compared with the ESCAT level-of-evidence system. The highest CRAC scores corresponded to the highest ESCAT levels of evidence: scores of 8-10 corresponded to evidence levels I and II.
Within a given level of evidence, biomarkers were not represented by a single CRAC score; the widest ranges of CRAC scores were observed for biomarkers of evidence levels IIIA and IV, from 2 to 10 and from 1 to 9, respectively. The use of CRAC scores made it possible to identify an additional 95 alterations with CRAC scores of 1-5 in the studied patients. The database is available at https://crac.oncoatlas.ru/.
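The scoring step described above (several experts score a biomarker from 1 to 10, and the average is stored) can be illustrated with a minimal sketch. The function names `crac_score` and `rank_biomarkers`, the biomarker labels, and all numbers below are hypothetical and are not taken from the CRAC database; the sketch only shows how averaged scores could order biomarkers that share the same ESCAT level.

```python
from statistics import mean


def crac_score(expert_scores):
    """Average the 1-10 scores given by individual experts for one biomarker."""
    if not expert_scores:
        raise ValueError("at least one expert score is required")
    if any(not 1 <= s <= 10 for s in expert_scores):
        raise ValueError("scores must lie in the range 1-10")
    return mean(expert_scores)


def rank_biomarkers(panel):
    """Sort detected biomarkers by averaged score, highest first."""
    scored = {name: crac_score(scores) for name, scores in panel.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical expert scores for three biomarkers found in one report.
example_panel = {
    "biomarker_A": [9, 10, 9],
    "biomarker_B": [4, 5, 3],
    "biomarker_C": [2, 1, 2],
}
ranking = rank_biomarkers(example_panel)
```

Here `ranking` lists the biomarkers from the highest to the lowest averaged score, which is one way a numeric score could break ties that a categorical level of evidence leaves open.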
The proteogenomic search pipeline developed in this work was applied to the reanalysis of 40 publicly available shotgun proteomic datasets from various human tissues, comprising more than 8000 individual LC-MS/MS runs, of which 5442 .raw data files were processed in total. The reanalysis focused on searching for ADAR-mediated RNA editing events, clustering them across samples of different origins, and classifying them. In total, 33 recoded protein sites were identified in 21 datasets. Of those, 18 sites were detected in at least two datasets, representing the core human protein editome. In agreement with prior work, neural and cancer tissues were found to be enriched with recoded proteins. Quantitative analysis indicated that the recoding rate of specific sites did not directly depend on the levels of ADAR enzymes or of the targeted proteins themselves; rather, it was governed by differential and as yet undescribed regulation of the interaction of the enzymes with mRNA. Nine recoding sites conserved between humans and rodents were validated by targeted proteomics using stable isotope standards in the murine brain cortex and cerebellum, and an additional one was validated in human cerebrospinal fluid. Complementing previous data of the same type from cancer proteomes, we provide a comprehensive catalog of recoding events caused by ADAR RNA editing in the human proteome.
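The "detected in at least two datasets" criterion for the core editome can be sketched as a simple filter. This is an illustration under stated assumptions, not the pipeline used in the work: the function `core_sites`, the dataset identifiers, and the site labels are all hypothetical.

```python
from collections import defaultdict


def core_sites(detections, min_datasets=2):
    """Return sites observed in at least `min_datasets` distinct datasets."""
    datasets_per_site = defaultdict(set)
    for dataset_id, site in detections:
        # A set ignores repeated detections of a site within one dataset.
        datasets_per_site[site].add(dataset_id)
    return sorted(
        site
        for site, datasets in datasets_per_site.items()
        if len(datasets) >= min_datasets
    )


# Hypothetical (dataset, recoded site) detections.
example_detections = [
    ("dataset_1", "site_X"),
    ("dataset_2", "site_X"),
    ("dataset_1", "site_Y"),
    ("dataset_1", "site_Y"),
    ("dataset_3", "site_Z"),
    ("dataset_2", "site_Z"),
]
core = core_sites(example_detections)
```

Counting distinct datasets rather than raw detections keeps a site seen many times in a single dataset (`site_Y` in this example) out of the core set.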