Big Data Core

Biomedical research has rapidly become a data-intensive field. Sequencing of complete cancer genomes has produced a wealth of information and offers ample opportunities for both research and healthcare. Institute-wide coordination of data stewardship and of data and computational infrastructure is pivotal for making optimal use of these data for pediatric cancer research and healthcare. The Big Data Core, part of the Kemmeren group, consolidates these activities and provides bioinformatics analyses for the Máxima biobank and diagnostic lab.
What we offer
Data stewardship
The data-intensive activities within the Máxima also require a firm commitment to data stewardship for proper data life cycle management, sharing and reuse of data based on the FAIR (Findable, Accessible, Interoperable and Reusable) principles. The Big Data Core coordinates all data stewardship activities for the preclinical research domain within the Máxima. These include providing data management plans (DMPs) for research projects, advising on compliance of Máxima data collections with research data management (RDM) policies, aligning DMPs with the FAIR data principles, raising awareness of RDM and liaising on other institute-wide data management policies.

Data infrastructure
In collaboration with and with support from IDT, the “research data integration platform” has been set up to harmonize research data structures and to share research data for use by all data-intensive research within the Máxima. Several components of this infrastructure are essential for biobanking and data-intensive research projects: unique patient and sample identifiers across data sources, FAIR data resources and tooling, and standardized APIs for modular extension of the platform and for automation. The platform also clearly separates the clinical and research domains through a pseudonymization layer, which accommodates differences in development speed between research and healthcare. This allows for more flexibility in research, while at the same time facilitating translation of research findings to clinical utility.
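
As an illustration of how a pseudonymization layer can keep the clinical and research domains separate while still giving every patient one stable identifier across data sources, here is a minimal Python sketch. It is not the platform's actual implementation: the keyed-hash (HMAC) approach, the function name `pseudonymize`, the `RES` prefix and the identifier format are all assumptions made for this example.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes, prefix: str = "RES") -> str:
    """Derive a stable research pseudonym from a clinical patient identifier.

    Hypothetical helper: the secret key is assumed to be held only on the
    clinical side, so the research domain cannot recompute or reverse the
    mapping from pseudonym to patient.
    """
    # A keyed hash (HMAC-SHA256) yields the same pseudonym for the same
    # patient and key, without exposing the original identifier.
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncate to keep the identifier short (an arbitrary choice for this sketch).
    return f"{prefix}-{digest[:16]}"


if __name__ == "__main__":
    key = b"example-key-kept-in-the-clinical-domain"  # assumed, for illustration only
    print(pseudonymize("patient-0042", key))
```

Because the mapping is deterministic for a given key, records from different data sources that refer to the same patient receive the same research identifier, while rotating or withholding the key keeps the link back to the clinical identity inside the clinical domain.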

Biobank bioinformatics
The biobank bioinformatics team is responsible for providing standardized analyses for biobank WES, WGS and RNA-seq samples. These activities involve not only detecting somatic variants (SNVs, indels, CNVs and SVs) in tumor and organoid samples, but also setting up and coordinating the entire infrastructure for data management, high-throughput computing and workflow management for genomics biobank data.

Translational bioinformatics
Together with Dr. Tops, head of the diagnostics lab, the translational bioinformatics team is responsible for providing standardized bioinformatics analyses within a diagnostic setting. By using the same infrastructure components that were set up for the biobank bioinformatics activities, we have developed a unique platform shared between diagnostics and research. It allows us to quickly translate key findings and technological improvements from research into diagnostic and clinical applications, thereby directly benefiting patient care. We provide WES- and RNA sequencing-based diagnostics for all patients within the Máxima.
Personnel
Coordination & data stewardship
Patrick Kemmeren, Principal Investigator & head Big Data Core
Jet Zoon, data steward research

Translational Bioinformatics team
Jayne Hehir-Kwa, co-PI & team lead
Eugène Verwiel, bioinformatician
Douwe van der Leest, bioinformatician

Biobank Bioinformatics team
Hinri Kerstens, senior post-doc & team lead
Shashi Badloe, bioinformatician
Alex Janse, bioinformatician

Data science & bioinformatics
John Baker-Hernandez, bioinformatician

Contact 

Patrick Kemmeren


