A unified Canadian clinical genomic database and community resource for standardizing and sharing genetic interpretations


The purpose of the Canadian Open Genetics Repository (COGR) is to design technologies that will help medical researchers and physicians diagnose, treat and cure both rare and common diseases. This work belongs to genomic medicine, a healthcare field built on our growing knowledge of the DNA comprising the human genome.


Canadian scientists have been making exciting discoveries about the relationship between genetic mutations and disease. However, applying genetic discoveries clinically, in order to improve patient outcomes, requires an extremely complex set of skills and technical tools, many of which are still beyond our grasp. The particular challenge in this project concerns the development of technologies needed to handle the vast amounts of data produced in the analysis of human DNA.

Genomic medicine is still an emerging field. Scientists working at different institutions have developed lab techniques, naming systems and checklists that differ significantly from one another. These scientists have in turn created specialized databases whose utility is compromised by the many differences between them. As a result, sharing such data across the biomedical community has until now been difficult, if not impossible.

To unlock the potential benefits of these resources, the project team proposes to create a unified, open-access, clinical-grade genetic database, i.e., a large repository based on a commonly shared platform and designed to hold all types of information related to human genetic variants and their relationship to disease. This database would draw from the genetic holdings in place at clinical labs and hospitals across Canada.

Activity 1. Design of variant assessment procedures.

To allow clinical and research laboratories alike to classify human genetic variants of all kinds and from all sources in a scalable, robust and automated manner, one key short-term scientific objective is to design and build a variant assessment tool. This tool will be freely available to project participants, and periodic version updates will be distributed. Having multiple stakeholders assess variant significance in a systematic, comprehensive, and consistent manner will foster knowledge aggregation from different individuals, institutions, and areas of expertise. The overall aim is to facilitate the process of transforming variant data holdings into a unified format, while eliminating discrepancies, omissions and duplication of effort.
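As a purely illustrative sketch of what transforming holdings into a unified format might involve, the Python example below normalizes variant records from several labs, removes exact duplicates, and flags variants on which labs disagree. It is not the project's assessment tool: the record schema, the normalization rules, and the gene, variant and lab names are all invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantRecord:
    """One lab's interpretation of one variant (hypothetical schema)."""
    lab: str
    gene: str
    variant: str          # variant name as supplied by the lab
    classification: str   # e.g. "pathogenic", "benign"


def normalize(record):
    """Map lab-specific spelling onto one shared form (illustrative rules only)."""
    return VariantRecord(
        lab=record.lab.strip(),
        gene=record.gene.strip().upper(),
        variant=record.variant.replace(" ", ""),
        classification=record.classification.strip().lower(),
    )


def unify(records):
    """Group normalized records by variant, drop duplicates, flag disagreements."""
    grouped = defaultdict(set)
    for record in records:
        rec = normalize(record)
        # Storing records in a set removes exact duplicates after normalization.
        grouped[(rec.gene, rec.variant)].add(rec)

    unified = {}
    for key, recs in grouped.items():
        classifications = {r.classification for r in recs}
        unified[key] = {
            "labs": sorted({r.lab for r in recs}),
            "classifications": sorted(classifications),
            "discrepant": len(classifications) > 1,
        }
    return unified


# Invented submissions: the gene names, variants and labs are placeholders.
submissions = [
    VariantRecord("Lab A", "gene1", "c.100A>G", "Pathogenic"),
    VariantRecord("Lab A", "GENE1", "c.100A>G", "pathogenic "),  # duplicate once normalized
    VariantRecord("Lab B", "GENE1", "c.100 A>G", "Likely pathogenic"),
    VariantRecord("Lab C", "GENE2", "c.200C>T", "benign"),
]
result = unify(submissions)
```

In this sketch, a variant marked `discrepant` is one that would be routed to expert review rather than resolved automatically; how such conflicts are actually adjudicated is outside the scope of the example.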

Activity 2. Data extraction and transfer.

The project team will devise methods and operating procedures to support the extraction of the variant data currently held within participating laboratories in Canada. Our bioinformatics team will be responsible for working with each laboratory to ensure that their data are transmitted safely and efficiently to a central repository.

Activity 3. Data access and dissemination.

Methods will be developed to make the data holdings both extremely accurate and readily accessible by all interested parties, including participating labs, clinicians, geneticists and scientists engaged in basic research. The tools necessary to carry out this phase will be developed in close collaboration with the US-based NCBI clinical genetics repository (ClinVar). To maximize the value of this resource to the community at large, our project team will put plans in place to encourage adoption of a unified platform, as well as to train and educate stakeholders as necessary.

We are confident this project will produce a wide range of sustainable benefits affecting many aspects of genomic medicine, clinical and basic research, patient advocacy, routine clinical care, and public health policy. The following points support these expected benefits:

1. Project offers solutions to widely acknowledged problems.

There are many challenges facing medical geneticists and the clinicians and patients who depend on their skill and judgment. These challenges begin with the great complexity and uncertainties surrounding novel variant assessment and genetic interpretation in general.

While genetics and genomics undergo major changes associated with next-generation sequencing technologies, other resources crucial to clinical applications have not kept pace. Similarly, discoveries made about the human genome through both basic and clinical research are not captured and made accessible to the community in a robust and systematic fashion. Yet it is of the very essence of medical research and practice that data be accessioned, stored, interpreted and updated in a consistent manner, based on commonly agreed protocols and operating procedures.

We see a wide consensus in the community that most Canadian and international genetic and genomic databases do not meet this standard, having been developed in silos that effectively prevent sharing and collaboration. Those are exactly the shortcomings we address in this project. The delivery of a robust, comprehensive, and semi-automated variant assessment tool with expert-curated guidelines, together with approaches to extract data from a variety of knowledge bases, will directly help overcome them.

2. Project has the support of over 30 Canadian scientists, geneticists and clinicians.

Our supporters represent a wide cross-section of bioinformatic, scientific and clinical interests. The workgroups are dedicated to Data Collection and Standards, Bioinformatics and IT, and Outreach and Patient Advocacy, respectively. This group represents every region of Canada, and our supporters are familiar with the particular needs and resources of every major clinical laboratory in the country.

3. Project will be carried out in conjunction with a parallel U.S. initiative.

We are very pleased to be collaborating with a large-scale, unprecedented study entitled “A Unified Clinical Genomics Database,” for which funding was obtained from the NHGRI of the U.S. National Institutes of Health. The U.S. group had firm commitments from 48 laboratories to contribute data from their own holdings. Indeed, this group has already established a body dedicated to the creation of a centralized, clinical-grade genetic database (ClinVar) through the ICCG.

The benefits are twofold. On one hand, we will have access to the genetic data, subject-matter experts and experience already accumulated by the U.S. project. That will help ensure our work meets the highest possible standards. On the other hand, this relationship will strengthen our connections to a much broader community of participants and potential end-users. While our first concern is bringing socio-economic benefits to Canadians, we are naturally excited about the prospect of extending benefits to a much wider community, including colleagues in the U.S. and other countries that will be involved in the U.S. project. Indeed, it would make little sense for the Canadian community to build a high-performance database that either duplicated or conflicted with the efforts of our American counterparts.

4. Project will reach and educate a wide cross-section of stakeholders and end-users.

Although this project is being formulated within a highly specialized field (Bioinformatics and Computational Biology), clinical genetics has begun to touch many fields of medicine and biomedical research, which in turn influence public health and, through it, the socio-economic welfare of virtually all of our citizens. We wish to emphasize the main factors affecting how far and how well the benefits of this project will extend. These factors are the range of expertise supporting the project and the plans we are formulating to raise awareness of it, along with vehicles to provide help, education and training for interested parties.

The potential beneficiaries of our project begin with the scientific and medical community, the core group of experts whose work will be directly affected by what we accomplish. We expect many of our collaborators and supporters to act as ambassadors in their geographical regions and among fellow specialists in their fields. The specialties in question include molecular genetics, cytogenetics and diagnostics, bioinformatics and computational biology, all branches of medicine, clinical genomics, and pathology and lab medicine.

5. Project is timed to play a role in the development of personalized medicine.

While this is not the setting for discussion of the prospects for personalized medicine, whose future path remains a subject of lively debate, we see the timing of this project as fortunate. However personalized medicine continues to unfold, it seems certain that clinical genetics and genomics will be crucial in bridging what is still a daunting gap between laboratory findings and their routine clinical applications. Moreover, some consensus has emerged in the community about the role of a unified clinical-grade genetic and genomic database in this respect.

We further maintain our project is well positioned to help remove some of the chief obstacles to advancements in personalized healthcare. One of the most pervasive of these obstacles concerns a well-known problem in the community, namely the lack of standardized resources and protocols for interpreting the ever-increasing volumes of patient data being generated by clinical labs, in part due to next-generation sequencing technologies. The community needs more and better technical resources in clinical genetics – for application after data generation. The COGR team includes members with a wide range of expertise needed to promote resources for personalized medicine through research and development in clinical genetics and genomics.

6. Project will support a variety of deliverables versioned for key audiences.

In addition to training and educational materials and papers intended for peer-reviewed journals, we intend to create versions of our reports for other audiences, including but not limited to health policymakers, patient advocate associations, professional and regulatory bodies, journalists, not-for-profit organizations whose mandate includes public health and/or wellness, and any other end-users who would benefit from learning about our project. Presentations will be made available for workshops and, online, for webinars.