CTO Clinical Proteome Informatics Training

November 3, 2025

Flamingo Paradise Beach Hotel, Protaras, Cyprus

Welcome to the CTO Clinical Proteome Informatics Training Day, a hands-on introductory training event on selected topics in omics data analysis for researchers, clinicians, and students. During the training, participants will work through practical applications of mass spectrometry data analysis, variant-aware protein identification, and multi-omics integration.

The training day includes bring-your-own-data breaks, in which participants can discuss their own data and analyses with the experts and ask for advice and possible solutions.

This day is structured to provide both foundational knowledge and advanced techniques relevant to clinical and translational research.

Training Objectives
By the end of the day, participants will be able to:

· Perform DDA and DIA proteome analysis

· Perform proteogenomics data analysis

· Explore multi-omics data integration strategies

Program

Training Lead: David L. Tabb
Affiliation: University Medical Center Groningen
Estimated Duration: around 2 hours
Overview
Proteomics technologies such as LC-MS/MS (liquid chromatography coupled with tandem mass spectrometry) are a natural fit for clinical biology because these technologies powerfully detect changes in protein concentration and modification. This hands-on training will introduce the key steps required to quantify and differentiate proteins: matching tandem mass spectra to candidate peptide sequences and inferring proteins from them, integrating chromatograms to estimate quantities for peptides and proteins, and statistically testing for differences in protein quantities between two cohorts.
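The quantification and testing steps named above can be illustrated with a minimal Python sketch. It is not how FragPipe or FragPipe Analyst work internally: it assumes a plain trapezoidal rule for chromatogram integration and an exact permutation test for the two-cohort comparison, and every function name and number below is hypothetical.

```python
from itertools import combinations
from math import log2
from statistics import mean


def peak_area(times, intensities):
    """Integrate a chromatographic peak with the trapezoidal rule."""
    area = 0.0
    for k in range(1, len(times)):
        width = times[k] - times[k - 1]
        area += width * (intensities[k] + intensities[k - 1]) / 2.0
    return area


def log2_fold_change(cohort_a, cohort_b):
    """Mean log2 quantity in cohort B minus mean log2 quantity in cohort A."""
    return mean(log2(q) for q in cohort_b) - mean(log2(q) for q in cohort_a)


def permutation_p_value(cohort_a, cohort_b):
    """Exact two-sided permutation test on the difference of cohort means.

    Enumerates every way of relabelling the pooled values, so it is only
    practical for small cohorts (an assumption of this sketch).
    """
    pooled = list(cohort_a) + list(cohort_b)
    n_a = len(cohort_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(cohort_a) - mean(cohort_b))
    hits = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
        count += 1
    return hits / count


# Toy chromatogram: retention time (min) and intensity (arbitrary units).
area = peak_area([0.0, 1.0, 2.0, 3.0], [0.0, 10.0, 10.0, 0.0])  # 20.0

# Toy protein quantities in two cohorts of three samples each.
fold = log2_fold_change([100.0, 100.0, 100.0], [400.0, 400.0, 400.0])
p = permutation_p_value([1.0, 2.0, 3.0], [10.0, 11.0, 12.0])
```

With three samples per cohort there are only 20 possible relabellings, so the smallest achievable two-sided p-value is 0.1. This shows why cohort size limits what a statistical test can detect.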
Training Dataset and Software
This session will use data from PXD044981 (https://doi.org/10.1021/acs.jproteome.4c00048), which applied both DDA (the classic LC-MS/MS identification technique) and DIA (a method intended to quantify thousands of peptides at once) to measure a yeast background with human proteins spiked in at different quantities. Dr. Tabb will guide participants through identifying and quantifying these data sets in FragPipe (https://fragpipe.nesvilab.org/), followed by statistics and visualization in the FragPipe Analyst web interface.
Preparation & Requirements
Participants should bring laptops so that they can carry out this training on their own computers; for best results, we suggest at least 8 GB of RAM. FragPipe runs on Microsoft Windows and Linux. We recommend that participants install FragPipe on their laptops before the training, in case the installation requires an administrator password.
Training Lead: Yassene Mohammed
Affiliation: Center for Proteomics and Metabolomics, Leiden University Medical Center, the Netherlands; Gerald Bronfman Department of Oncology, McGill University, Canada
Estimated Duration: around 2 hours
Overview
Proteogenomics combines proteomics and genomics to improve protein identification and uncover novel protein-coding regions, splice variants, and mutations relevant to disease. This training session introduces participants to the core concepts and workflows of proteogenomics, including:
-              Generating customized protein databases
-              Using PEFF to encode known variants and PTMs
-              Searching mass spectrometry data against these databases
-              Performing variant-aware searches using the Comet search engine
-              Identifying variant peptides and variant proteins
-              Interpreting search results in the context of proteogenomics
-              Developing quantitative targeted proteogenomics assays
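The idea behind a customized, variant-aware database can be sketched in a few lines of Python. This is only an illustration, not the PEFF format or the Comet workflow used in the session: the toy sequence, the substitution, and all function names are hypothetical, and the digestion rule assumed here is the common simplification that trypsin cleaves after K or R unless the next residue is P.

```python
def apply_substitution(sequence, position, ref, alt):
    """Return the sequence with a single amino acid substitution.

    `position` is 1-based, as in common variant notation.
    """
    if sequence[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return sequence[:position - 1] + alt + sequence[position:]


def tryptic_peptides(sequence, min_length=6):
    """In-silico tryptic digest with no missed cleavages."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return [p for p in peptides if len(p) >= min_length]


def variant_peptides(canonical, variant, min_length=6):
    """Peptides produced by the variant sequence but not the canonical one."""
    known = set(tryptic_peptides(canonical, min_length))
    return [p for p in tryptic_peptides(variant, min_length) if p not in known]


# Hypothetical toy protein and a hypothetical D9N substitution.
canonical = "MAGTELKSDPLVRAAEGHWK"
variant = apply_substitution(canonical, 9, "D", "N")
print(variant_peptides(canonical, variant))  # ['SNPLVR']
```

Only peptides that differ from the canonical digest can serve as evidence for a variant, so identifying such peptides is what separates a variant-aware search from a standard one.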
Training Dataset
Participants will work with a curated dataset from deep proteomics of nine cancer cell lines, acquired in our laboratory. The dataset will allow participants to explore:
-              Canonical and non-canonical protein identifications
-              Variant peptides derived from somatic mutations
-              Integration of transcriptomic evidence to support proteomic discoveries
Software & Workflow
We will use Comet in the Trans-Proteomic Pipeline (TPP) and guide participants through:
-              Database generation
-              Mass spectrometry search with customized databases
-              Annotation and visualization of variant peptides
Preparation & Requirements
-              Bring your own laptop
-              Please install the Trans-Proteomic Pipeline (TPP, http://www.tppms.org/) in advance
Training Lead: George M. Spyrou
Training Lecturers: Efi Athieniti, PhD candidate, and George Spyrou, PhD

Affiliation: Bioinformatics Department, The Cyprus Institute of Neurology and Genetics
Estimated Duration: around 2 hours (1 hour lecture, 1 hour tutorial)
Overview
The advent of omics technologies has provided the stepping stone for the emergence of Systems Bioinformatics. These technologies provide a spectrum of information ranging from genomics, transcriptomics, and proteomics to epigenomics, pharmacogenomics, metagenomics, and metabolomics. Systems Bioinformatics is the framework in which systems approaches are applied to such data, setting the level of resolution as well as the boundary of the system of interest, and studying the emergent properties of the system as a whole rather than the sum of the properties of its individual components. Key approaches in Systems Bioinformatics construct multiple networks, one for each level of the omics spectrum, and integrate them, leading to more meaningful downstream analysis of underlying disease mechanisms, candidate biomarkers, and repurposed drugs.
Participants will learn how to:
-              Construct and analyze networks from different types of molecular and clinical data
-              Apply methods of data integration
-              Visualize and interpret integrated omics profiles
-              Apply advanced pathway analytics
-              Explore use cases in mechanism understanding, biomarker discovery, and drug repurposing
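As a rough illustration of network construction from molecular data, the following Python sketch builds a small correlation network and counts node degrees. It shows one common approach, not necessarily the method taught in this session: the Pearson correlation threshold of 0.8, the toy abundance profiles, and all names are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_x * sd_y)


def correlation_network(profiles, threshold=0.8):
    """Connect two molecules when |Pearson r| meets the threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


def node_degrees(profiles, edges):
    """Number of edges attached to each node (0 for isolated nodes)."""
    degrees = {name: 0 for name in profiles}
    for a, b, _ in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Hypothetical abundance profiles across four samples.
profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],
    "P3": [4.0, 3.0, 2.0, 1.0],
    "P4": [1.0, 3.0, 2.0, 2.0],
}
edges = correlation_network(profiles)
degrees = node_degrees(profiles, edges)
```

In this toy data, P1, P2, and P3 are mutually connected while P4 stays isolated. One such network can be built per omics layer before the layers are integrated.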
Training Dataset
Participants will work with available datasets that demonstrate integrative workflows and highlight the biological insights that emerge from network-based and other integrative methods.
Software & Workflow
The session will include hands-on exercises using:
-              Cytoscape
-              Web-based bioinformatics servers
-              MOFA2
Preparation & Requirements
Bring your own laptop with Cytoscape and the MOFA2 R package installed.

Bring Your Data Breakouts:

Between sessions, we offer extended breaks with “Bring Your Data” breakout groups. These informal, small-group consultations allow PIs, postdocs, and PhD students to discuss their own datasets and analysis challenges with domain experts.