To automate the extraction of data from cancer pathology reports collected by a nationwide network of cancer registry programs, researchers at ORNL have applied new deep learning techniques.
These registry programs gather demographic and clinical information related to the diagnosis, treatment, and history of cancer cases in the US, and physicians also consult them for a broader examination of cancer.
The ORNL researchers used a dataset of 1,976 pathology reports to train a deep-learning algorithm to multitask, meaning that the algorithm concurrently carried out two different but closely related information-extraction tasks. In the first task, the algorithm identified the primary site of the cancer. In the second, it identified the side of the body on which the cancer was found.
In this method, a neural network was designed to capture not only the meaning of words but also the contextual relationships among them. The results showed that the algorithm performed substantially better than other methods that did not make use of the related information.
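The idea of one network serving two related tasks can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the ORNL model: the label sets, the vocabulary, the bag-of-words encoding, and every name in it (`MultiTaskModel`, `joint_loss`, and so on) are hypothetical, and the weights are random rather than trained. What it shows is the structure of multitask learning: a shared layer feeds two task-specific outputs, and the two task losses are summed so that training would update the shared layer from both tasks.

```python
import math
import random

# Hypothetical label sets for illustration; not the label scheme used in the study.
SITES = ["breast", "lung"]
SIDES = ["left", "right"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one weight row and one bias value per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases (untrained, for demonstration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskModel:
    def __init__(self, vocab, hidden=4, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: i for i, word in enumerate(vocab)}
        # One shared layer feeds both tasks; each task has its own output head.
        self.shared = init_layer(rng, hidden, len(vocab))
        self.site_head = init_layer(rng, len(SITES), hidden)
        self.side_head = init_layer(rng, len(SIDES), hidden)

    def encode(self, text):
        """Bag-of-words counts passed through the shared layer (a simplification)."""
        x = [0.0] * len(self.vocab)
        for token in text.lower().split():
            if token in self.vocab:
                x[self.vocab[token]] += 1.0
        return [math.tanh(h) for h in linear(*self.shared, x)]

    def predict(self, text):
        """Return (site probabilities, side probabilities) from one shared encoding."""
        h = self.encode(text)
        return (softmax(linear(*self.site_head, h)),
                softmax(linear(*self.side_head, h)))

    def joint_loss(self, text, site, side):
        """Sum of the two cross-entropy losses, one per task."""
        p_site, p_side = self.predict(text)
        return (-math.log(p_site[SITES.index(site)])
                - math.log(p_side[SIDES.index(side)]))


model = MultiTaskModel(["carcinoma", "breast", "lung", "left", "right", "lobe"])
report = "carcinoma of the left breast"
site_probs, side_probs = model.predict(report)
loss = model.joint_loss(report, site="breast", side="left")
print(dict(zip(SITES, site_probs)), dict(zip(SIDES, side_probs)), loss)
```

Because both heads read the same shared encoding, minimizing the summed loss would push that encoding to capture features useful for both site and side, which is the intuition behind the gain described above. A bag-of-words encoder ignores word order, so a real system that models context would replace `encode` with a richer component.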
Georgia Tourassi, Director of the Health Data Sciences Institute at ORNL, said, "Intuitively this makes sense because carrying out the more difficult objective is where learning the context of related tasks becomes beneficial. Humans can do this type of learning because we understand the contextual relationships between words. This is what we're trying to implement with deep learning."
"Today we're making decisions about the effectiveness of treatment based on a very small percentage of cancer patients, who may not be representative of the whole patient population," she added.
"Our work shows deep learning's potential for creating resources that can capture the effectiveness of cancer treatments and diagnostic procedures and give the cancer community a greater understanding of how they perform in real life."
Oak Ridge National Laboratory (ORNL) is an American multiprogram science and technology national laboratory managed for the United States Department of Energy (DOE) by UT-Battelle. ORNL is the largest science and energy national laboratory in the Department of Energy system by area and by annual budget. ORNL is located in Oak Ridge, Tennessee, near Knoxville. ORNL's scientific programs focus on materials, neutron science, energy, high-performance computing, systems biology, and national security.
The laboratory is home to several of the world's top supercomputers including the world's third most powerful supercomputer ranked by the TOP500, Titan, and is a leading neutron science and nuclear energy research facility that includes the Spallation Neutron Source and High Flux Isotope Reactor. ORNL hosts the Center for Nanophase Materials Sciences, the BioEnergy Science Center, and the Consortium for Advanced Simulation of Light-Water Reactors.