About the data
The dataset comprises 2000 patients and 500 healthy controls.
All patients have a manually curated ICD-10 diagnosis code.
Data Modalities:
The clinical lab data:
Between 0 and 30 terms and parameters per subject. Abnormal findings are translated into HPO terms and introduced into the knowledge graph.
Blood Test Measurements
Urine Test Measurements
Unstructured data
Medical History Questionnaire (about 900 questions)
Doctor’s Letters
Genomics data
- The 10 to 20 most prominent genes per subject, selected from about 6000 mutations. Genetic variants with a CADD score of 6 or higher were retained, and for each gene only the variant with the highest score was kept.
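
The variant selection described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used: the `Variant` type, the `select_top_variants` function, and its parameters are hypothetical names. It also assumes that "most prominent" means ranking genes by their best CADD score, which the description does not state.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical record for one variant call of a subject."""
    gene: str
    variant_id: str
    cadd: float


def select_top_variants(
    variants: Iterable[Variant],
    min_cadd: float = 6.0,   # threshold stated in the text
    max_genes: int = 20,     # upper end of the 10-20 range in the text
) -> List[Variant]:
    """Keep the highest-scoring variant per gene among variants at or above min_cadd."""
    best = {}
    for v in variants:
        if v.cadd < min_cadd:
            continue  # drop variants below the CADD threshold
        current = best.get(v.gene)
        if current is None or v.cadd > current.cadd:
            best[v.gene] = v  # keep only the top variant for this gene
    # Assumption: genes are ranked by their best CADD score, highest first.
    ranked = sorted(best.values(), key=lambda v: v.cadd, reverse=True)
    return ranked[:max_genes]


# Illustrative input only; variant IDs and scores are made up.
example = [
    Variant("BRCA1", "v1", 12.3),
    Variant("BRCA1", "v2", 25.1),
    Variant("TP53", "v3", 5.9),
    Variant("CFTR", "v4", 8.0),
]
selected = select_top_variants(example)
```

In this example, `selected` holds one variant for BRCA1 (the higher-scoring `v2`) and one for CFTR, while the TP53 variant is dropped because its score is below 6.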
Proteomics data
- Between 1000 and 2000 proteins with quantitative values for each subject.
Last modified November 12, 2024: changed about the data (9fdf52a)