
The Hidden Power Source: How Your Cancer Registry Can Fuel Groundbreaking Oncology Research

By Michele Webb, ODS-C

Cancer registry data is like solar energy: it has been used in some form for a long time, but only with recent technological advances has its full potential begun to be realized.

From the first hospital-based registry in 1926 to the establishment of a national registry in 1971 [1], oncology departments have long relied on registries to help them understand cancer incidence and improve patient outcomes and care. But with the rise of natural language processing (NLP) and automation, the value of registry data as a research tool is finally being enhanced.

This article reveals how that shift has occurred and why it is so important for more registries to embrace the untapped power of their data. 

Expect to learn: 

  • The varied ways cancer registries currently support research
  • Why over 55% of cases for the 3 most common forms of cancer in America have incomplete data
  • The hidden value automation can unlock in your registry 

How Do Researchers Use Cancer Registry Data? 

Cancer registries provide a vast source of “real world” data that helps researchers understand patterns in cancer treatment. One academic overview cites three important use cases:  

1. Measuring the Quality of Care 

Registries are a vital tool for understanding evolving patterns of care and assessing the impact of conventional and novel treatments. With large multi-site datasets, it is possible to identify variations in patient outcomes and survivorship and prompt researchers to investigate the source of these variations - ultimately helping to improve care over time.

2. Biomarker and Translational Research 

Researchers increasingly match clinical data collected in registries with biospecimen data to explore and evaluate potential biomarkers that will ultimately help identify cancer cases faster. While interactions between biomarkers are unlikely to be captured in clinical trials, the vast datasets registries supply can reveal meaningful associations - as well as identify clinical trial candidates. 

3. Registry-Based Clinical Trials 

Registry-based clinical trials use existing registries as a means of randomizing patients, offering a solution to the severe shortage of randomized controlled trials (RCTs) currently being run in the United States. They are already being used to assess outcomes for specific demographic groups and may evolve to encompass everything from palliative care to consumer-led trials.

But given the wide-ranging value registries offer, we might ask: why isn’t more research done using registry data? 

3 Challenges for Cancer Data Management 

1. Ethics and Regulation 

Cancer data is extremely sensitive and heavily protected via regulation. But academics and researchers routinely argue that some of these protections get in the way of research efforts. A 2022 article in the Journal of Registry Management states the case bluntly: “State legislation that interferes with data sharing between states, and consequently compromises effective surveillance and research at the national level, is a threat to the common good and therefore unethical.”

Such legislation can also produce a chilling effect that leads registries to resist digitization and novel technologies - even when they are preferable to outdated and unsustainable traditional methods. 

2. Manual Processes 

Manual casefinding, abstracting, and reporting still dominate cancer registries, but these processes are highly inefficient. Registrars tell us they take, on average, 60-120 minutes to complete a single abstracted report, or 4-5 cases per day. Not only is this too slow to manage growing caseloads – it also has implications for data quality and human error.
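
To make those throughput figures concrete, here is a minimal back-of-the-envelope sketch. The 60-120 minute range comes from the registrar reports above; the 8-hour workday, the single-registrar team, and the 30 new cases per week are illustrative assumptions, not figures from any real registry.

```python
def cases_per_day(minutes_per_case: float, hours_per_day: float = 8.0) -> float:
    """Upper bound on abstracts one registrar could finish in a day.

    Assumes the whole workday is spent abstracting (an idealization).
    """
    return hours_per_day * 60 / minutes_per_case


def weekly_backlog_growth(
    new_cases_per_week: float,
    registrars: int,
    cases_per_registrar_day: float,
    workdays: int = 5,
) -> float:
    """Cases added to the backlog each week; 0.0 if the team keeps pace."""
    capacity = registrars * cases_per_registrar_day * workdays
    return max(0.0, new_cases_per_week - capacity)


# At 120 minutes per case, an 8-hour day yields at most 4 cases;
# at 60 minutes, at most 8.
slowest = cases_per_day(120)
fastest = cases_per_day(60)

# Hypothetical registry: 1 registrar at 4.5 cases/day, 30 new cases/week.
growth = weekly_backlog_growth(30, registrars=1, cases_per_registrar_day=4.5)
```

Under these assumptions the backlog grows by 7.5 cases every week. The reported 4-5 cases per day sits near the slow end of the theoretical range, which would be consistent with registrars spending part of each day on tasks other than abstracting, though that is our reading rather than a measured finding.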

A recent analysis found that among the 3 most common forms of cancer in US registries, over 55% of cases had incomplete data. This has serious consequences: 71% of patients with non-small cell lung cancer were missing data for variables of interest - and the 2-year survival rate was 17.6 percentage points lower for patients without this data.

3. Data Volume 

Most registries struggle to handle the volume of cancer data they receive. Researchers expect there to be slightly more than 5,479 new cancer cases each day in 2024, and 58% of registries say that another FTE registrar is a “strong” or “extreme” need. The most up-to-date research reveals that over 60% of registries employ 2 or fewer full-time registrars.* And while we may see this number shift when the NCRA’s new workforce study is published later in the year, it certainly tracks with the fact that 25% of registries now rely on partial outsourcing – even though managers routinely say these workers are less skilled and less invested in the registry’s purpose.

These issues are exacerbated by: 

  • A lack of system interoperability
  • The inability to communicate effectively across disciplines
  • Differing (and frequently changing) data standards 

The net result? Registries face growing backlogs; mounting pressure from accreditation and funding bodies to meet reporting deadlines; and a growing awareness that things will only get worse – unless they find a new way of operating. 

How Leading Organizations Unlock the Hidden Value of Registry Data 

The fundamental challenge for cancer registries is technological: how can they safely and ethically reimagine their abstracting and reporting processes to more efficiently handle massive datasets?  

The answer for a growing number of leading organizations is through oncology informatics.  

Informatics tools such as Inspirata’s AI-driven software empower registrars and remove the barriers that hold them back:

  • Automated workflows can augment the work of a human registrar by casefinding, pre-abstracting, and reporting on cancer cases faster – thus significantly reducing backlogs
  • Artificial intelligence can convert discrete information into clear, usable insights to deliver added value to researchers
  • Extra time that registrars are given back can be reallocated to other activities such as collaborating with researchers and directly supporting clinicians 

So, what does this mean for the next phase of registry-driven research? 

3 Ways Cancer Registries Will Fuel Research in the Future 

1. Natural Language Processing (NLP) 

Cancer registry data is not just underutilized – it is also often hidden within dense, unstructured medical documents such as pathology reports. This makes it particularly time-consuming to extract, and because there is no promise the effort will amount to any significant insight, it is difficult to justify allocating scarce resources to the analysis of these documents.  

However, a growing number of registries are using natural language processing (NLP) to analyze such unstructured data. Because a large proportion of cancer data is now recorded in electronic health records (EHRs), NLP solutions can be applied to vast datasets to fill gaps in registry data and reveal insights that would otherwise be missed.
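
As a rough illustration of the idea, the sketch below pulls a few structured fields out of free-text report language and flags the ones it cannot find. It is a deliberately simple stand-in that uses regular expressions; production NLP systems are far more sophisticated, and nothing here represents any vendor's actual method. The sample report, the field names, and the patterns are all hypothetical.

```python
import re

# Hypothetical, made-up report text for illustration only.
SAMPLE_REPORT = """
FINAL DIAGNOSIS: Invasive ductal carcinoma, left breast.
Histologic grade: 2
Tumor size: 1.8 cm
ER: positive. PR: negative.
"""

# Hypothetical field names mapped to simple patterns (assumptions, not a standard).
PATTERNS = {
    "grade": re.compile(r"histologic grade:\s*(\d)", re.IGNORECASE),
    "tumor_size_cm": re.compile(r"tumor size:\s*([\d.]+)\s*cm", re.IGNORECASE),
    "er_status": re.compile(r"\bER:\s*(positive|negative)", re.IGNORECASE),
    "pr_status": re.compile(r"\bPR:\s*(positive|negative)", re.IGNORECASE),
    "her2_status": re.compile(r"\bHER2:\s*(positive|negative|equivocal)", re.IGNORECASE),
}


def extract_fields(report: str) -> dict:
    """Return each field's first matched value (lowercased), or None if absent."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(report)
        result[name] = match.group(1).lower() if match else None
    return result


def missing_fields(fields: dict) -> list:
    """List the fields a registrar would still need to review by hand."""
    return sorted(name for name, value in fields.items() if value is None)


fields = extract_fields(SAMPLE_REPORT)
gaps = missing_fields(fields)
```

Here the grade, tumor size, and ER/PR status are recovered from the text, while the HER2 field comes back empty and is flagged for manual review. That split, with software pre-filling what it can and a registrar handling the rest, is the pattern the paragraph above describes.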

2. Increased Data Volume 

As registries face growing caseloads, automated solutions will augment registrars’ workflows and enable them to accelerate reporting. This means a larger volume of quality data can be delivered to researchers – and the timeliness of data will be dramatically increased.  

Researchers and clinicians can leverage these larger datasets in a variety of ways, from assessing the economic implications of specific treatments to providing personalized treatment summaries. But most importantly, delivering this data faster will enable greater cooperation between institutions and research teams. 

3. Expanding Collaboration 

With the notable exception of state-level registries, most detailed cancer data is isolated within individual organizations. While researchers can request access to information, they must go through an Institutional Review Board (IRB) and ethics board – often creating a significant lag. 

This is being changed by a combination of digitization, tools like NLP that unlock large datasets, and a growing consensus around the importance of cross-institutional collaboration. The coming years will be a golden age of collaboration – often between institutions across multiple countries – fueled in large part by cancer registry data. 

Leverage Inspirata to Be at the Forefront of Oncology Care 

Cancer registries have clearly come a long way since their inception, but their evolution from data store to research rocket fuel is far from complete. The transformation we’ve described depends on more registries embracing cancer informatics – and opting for the right technology partners.  

With more than 30 million pathology, radiology, and related clinical reports processed each year by over 275 organizations that trust our software, Inspirata helps leading cancer registries leverage cutting-edge informatics to elevate clinical research.  

Want to see it in action? 

Request a Demo


1. https://ascopubs.org/doi/10.1200/CCI.20.00123

*This NCRA report was published in 2011. We are eagerly awaiting the release of the 2024 NCRA report and will update this article with our findings as soon as possible.
