From The Editor | June 21, 2024

Ten Years In, Project Data Sphere Is Snowballing

By Ben Comer, Chief Editor, Life Science Leader

Sean Khozin, MD, MPH

Ten years ago, I reported on the launch of Project Data Sphere, an ambitious clinical trial data-sharing initiative supported by several Big Pharmas, academic research institutions, and government agencies, including the FDA.

Project Data Sphere’s core mission is straightforward: collect de-identified oncology trial data sets from the aforementioned supporters, and make that data available to almost anyone for the purposes of research. SAS, a software and data analytics firm, provides Project Data Sphere users with free tools for analyzing, visualizing, and interrogating trial data.

At launch, Project Data Sphere was focused on collecting and sharing comparator arm trial data exclusively. Comparator arm studies provide a window into disease progression associated with specific cancer types, and biopharmaceutical companies are more willing to share comparator arm data, because it doesn’t expose their IP assets.

Beginning with nine comparator arm data sets contributed by Memorial Sloan Kettering and a handful of Big Pharmas at launch, Project Data Sphere now has data from 252 clinical trials, representing over 250,000 patients. Last February, Sean Khozin, MD, an oncologist, physician scientist, AI enthusiast and founding member of FDA’s Oncology Center of Excellence, became CEO of the CEO Roundtable on Cancer (CEORT), and leader of CEORT’s Project Data Sphere initiative. To find out what Project Data Sphere has accomplished over the last decade, and what Khozin’s priorities are as its new leader, we connected by video in mid-June.

From Data To Deliverables

A growing number of biopharmaceutical companies have contributed data to Project Data Sphere over the last 10 years, and the National Cancer Institute, part of the NIH, has also contributed a significant number of data sets, says Khozin. Insights gleaned from shared data housed within Project Data Sphere have led to hundreds of publications and several key deliverables.

An early demonstration of the power of shared data came in 2015, when Project Data Sphere collaborated with the DREAM Project, Prostate Cancer Foundation, and Sage Bionetworks to launch the global Prostate Cancer DREAM Challenge. That challenge produced, in a matter of months, a new prognostic model for prostate cancer that “actually beat the standard model at the time,” says Khozin.

Project Data Sphere also began collaborating with the FDA when Khozin still worked at the agency’s Oncology Center of Excellence, and had become the founding executive director of FDA’s Information Exchange and Data Transformation (INFORMED) incubator. Together, Project Data Sphere and the FDA began examining immune-related adverse events (irAEs), or adverse events associated with immunotherapies. Those adverse events are “sometimes idiosyncratic, and can be hard for an oncologist to manage, because there’s very little known about how best to manage these adverse events,” says Khozin.

That effort led to new ICD codes, which will facilitate tracking of irAEs in clinical practice, and will allow oncologists to get reimbursed for managing irAEs. The new ICD codes will be implemented across the country on October 1. “We think that’s going to substantially improve patient outcomes,” says Khozin. Additionally, Project Data Sphere cofounded the Immune-Related Adverse Events Consortium, which involves 28 organizations focused on new interventional studies for immunotherapies, biomarker development, and management of irAEs.

Expanding Precompetitive Initiatives

Project Data Sphere continues to provide open access to its data for anyone who registers on the website. But in the quest for increasingly valuable data, Khozin is expanding Project Data Sphere’s precompetitive activities, and restricting access to those data. Biopharma companies “may not feel comfortable contributing as part of an open access data sharing platform,” says Khozin. “But they are comfortable doing that as part of a precompetitive effort, especially now that we have the power of AI below our wings, and we can extract a lot more insights from the data.”

Developing “foundation AI models” is a key aspect of Project Data Sphere’s expanding precompetitive initiatives. For example, the autoRECIST initiative, which began while Khozin was still at FDA, aims to help expedite clinical development and lower costs by using AI and machine learning to automate tumor assessments according to the RECIST criteria. “The autoRECIST tool that we’re developing right now can significantly streamline clinical development programs,” says Khozin. “If we think about what’s happening in drug development, there’s downward pressure on pricing, and there is inflationary pressure at the other end. So companies have to be lean.”

A major theme of the growing precompetitive work, says Khozin, is a better definition of patient response to treatment. Not all patients respond to a given therapy, and those who do often respond differently, because no two patients are the same. A better understanding of those differences can lead to new biomarkers for clinical development, and identify new patient populations and new drug targets. “That’s a value proposition that we’re going to be surfacing in the next few months,” says Khozin.

That work is valuable broadly, since no single biopharma company, regardless of size, will be able to do it alone, says Khozin. “In fact, if you look at what companies are doing right now, which is kind of paradoxical and ironic, they are pulling back on data science and AI…they’ve shrunk their data science teams,” even as AI interest increases. “A lot of what they thought they could do internally is better done in collaboration with others.”

Ten years in, Project Data Sphere’s efforts and initiatives are expanding. And it remains focused on collecting and analyzing human clinical trial data to improve patient outcomes, and to make drug development more successful. “I’m quite confident that the insights we’ve generated at Project Data Sphere have been part of the calculation of companies in designing clinical trials,” says Khozin. “Companies benefit from contributing data to Project Data Sphere as an open access data sharing platform, and in the precompetitive setting.”