Google Summer of Code 2017 Wrap up

Ino de Bruijn edited this page Oct 15, 2019 · 1 revision

The cBioPortal for Cancer Genomics is a resource designed to provide broad community access to cancer genomic data. It provides a unique user-friendly and biology-centric computational user interface, with the goal of making genomic data more easily accessible to translational scientists, biologists, and clinicians. The public instance of cBioPortal is now one of the most popular online resources for cancer genomics data and attracts more than 3,000 unique visitors (cancer researchers and clinicians) per day. In addition, there are dozens of local instances installed in medical centers, universities, government institutions, and pharmaceutical companies around the globe.

The cBioPortal software is available under an open source license via GitHub. The software is now developed and maintained by a multi-institutional team, consisting of Memorial Sloan Kettering Cancer Center (MSK), the Dana Farber Cancer Institute, Princess Margaret Cancer Centre in Toronto, Children's Hospital of Philadelphia, The Hyve in the Netherlands, Bilkent University in Ankara, Turkey, and Weill Cornell Medicine.

We have again had a very exciting couple of months mentoring students through Google Summer of Code (GSoC). This time we had 7 students, 6 of whom did an awesome job. Our students came from India, Sri Lanka, Turkey, and the USA. At the end of the summer they all enthusiastically presented their work to our global team on Google Hangouts. Here we present a summary of their work.

Project 1: Visualizing circulating tumor DNA results in the cBioPortal patient view

Pamela Wu, a PhD student in computational biology in Dr. David Fenyo’s lab at the NYU Medical Center, returned this summer after doing a very successful project with us last year. This time she helped us improve our visualizations for circulating tumor DNA (ctDNA). ctDNA is DNA that has escaped from tumor cells and is free-floating in the bloodstream, meaning that genetic information about the developing tumor can be obtained without the need to invasively extract tumor tissue. A patient can now have a series of biopsies over time instead of just one or two taken whenever they have surgery. One can imagine using this approach to monitor cancer progression as well as to detect cancer early. In her project, Pamela developed several prototypes to display the insights obtained from these “liquid biopsies” on cBioPortal’s existing Patient View Page.

For more info read Pamela Wu’s blog post: https://pambot.github.io/posts/gsoc2017-cbioportal

Project 2: Improved genome overview in patient view page

In another project covering the Patient View Page, Ishu Kalra, a first-year Bachelor’s student at the Indian Institute of Technology at Guwahati, worked on improving the genomic overview. The current genomic overview shows all genomic changes across the genome, but doesn’t allow for user interaction such as zooming. In this project Ishu made several improvements to the popular open source igv.js viewer to support our use case in cBioPortal. See the pull request with his work here: https://github.com/JiaoJiao123/igv.js/pull/11

Project 3: Integrating MolecularMatch clinical trials data into cBioPortal

The third and final project on the Patient View Page covered one of the most sought-after features: a way to link the genomic profile of a patient to a clinical trial. In this project, Sathya Bandara, an undergraduate student from the Department of Computer Science and Engineering at the University of Moratuwa, implemented the improvements required to send the mutational information of a patient to an external service. She worked together with people from MolecularMatch to connect to their API as a proof of concept. See her blog post for more information: https://medium.com/@technospace/gsoc-2017-integrating-molecularmatch-clinical-trials-data-into-cbioportal-fd32da44ea9b

Project 4: Collaborative construction of cancer pathways with PathwayMapper

cBioPortal has a network view that gives insights into cancer pathways, but these pathways are automatically generated. To get more informative and comprehensible pathways, an extra step of manual curation is often necessary. PathwayMapper is an open source tool for interactive creation, editing, and sharing of cancer pathways. Leonard Dervishi, a Master’s student at Bilkent University in Turkey, helped us over the summer to make several substantial improvements to the tool, including, but not limited to, searching for genes, grid guidelines, and resizing of nodes. For a comprehensive list of the added features, read Leonard Dervishi’s summary: https://docs.google.com/document/d/1JElWO3knKrANMkvtp231tpH6N2Er5CSQM3WRHT-75gs/edit

Project 5: Standardizing clinical data from cBioPortal for easier comparison and analysis

We collect a lot of cancer genomics data from a variety of studies in cBioPortal. One big problem with clinical data is how to standardize data coming from many different institutions and hospitals. In this project, Brian Sirovetz, a doctoral student in Computational Chemistry at Rice University in Texas, developed several analysis scripts that show the commonality between clinical data from different cancer genomics studies, which can help inform cBioPortal’s curation team. The result of his summer’s work can be found in this pull request: https://github.com/cBioPortal/clinical-data-normalization/pull/1/
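
To give a rough idea of what measuring "commonality" between studies can look like, here is a minimal Python sketch. It is not taken from Brian's scripts: the study names, attribute names, and helper functions are all hypothetical and chosen purely for illustration. It counts how many studies use each clinical attribute name and computes a simple Jaccard overlap between two studies' attribute sets.

```python
from collections import Counter

# Hypothetical clinical attribute names per study (illustrative only,
# not real cBioPortal data).
study_attributes = {
    "study_a": {"AGE", "SEX", "OS_STATUS", "TUMOR_STAGE"},
    "study_b": {"AGE", "GENDER", "OS_STATUS"},
    "study_c": {"AGE", "SEX", "SMOKING_HISTORY"},
}


def attribute_frequencies(attrs_by_study):
    """Count in how many studies each attribute name appears."""
    counts = Counter()
    for attrs in attrs_by_study.values():
        counts.update(attrs)
    return counts


def jaccard(a, b):
    """Jaccard similarity of two attribute sets (1.0 means identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


freqs = attribute_frequencies(study_attributes)

# Attributes that every study shares are the easiest to standardize.
shared_by_all = sorted(
    name for name, n in freqs.items() if n == len(study_attributes)
)

# Attributes used by only one study (e.g. GENDER vs. SEX) are
# candidates for manual review by a curation team.
used_once = sorted(name for name, n in freqs.items() if n == 1)
```

In this toy data, near-duplicate names such as `SEX` and `GENDER` show up as low-frequency attributes, which is the kind of signal that could point curators toward attributes worth harmonizing.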

Project 6: Implement a Pipeline to Extract and Transform GDC Data

A popular source of cancer genomics data is the Genomic Data Commons Portal (GDC), to which many researchers submit their cancer genomics data. To facilitate easy analysis of this data in cBioPortal, Dixit Patel, who obtained his Bachelor’s in Computer Engineering from Mumbai University, developed a pipeline to import raw data from GDC into the cBioPortal database. His final work can be found on GitHub: https://github.com/cBioPortal/gdc-et-pipeline