Artificial Intelligence in Cancer Detection

Each year, between 18 and 20 million people are diagnosed with cancer and about 9 million die from it. By 2030, the world is projected to have around 26 million cancer cases and around 17 million fatalities per year. There are more than 100 types of cancer, with breast, lung, colon, and prostate cancers being the most prevalent worldwide.

Figure 1: Number of cancer cases and deaths over the years

Source: World Health Organization

Failure to diagnose cancer in time accounts for half of cancer patients' deaths each year. Most cancer cases are detected after stage II, and several factors contribute to late detection. One major cause is delay on the patient's part. Research studies have also shown that some cancers grow in the body for years yet go unnoticed in laboratory tests. Grail, a healthcare company, has stated that late detection accounts for about 70% of cancer patients' deaths. It is time for the healthcare industry to catch up with the rest of the modern world by adopting cutting-edge technologies such as Machine Learning (ML). To that end, multiple companies are working to implement Artificial Intelligence (AI) technologies in healthcare, particularly in the early diagnosis of cancer.

AI and ML algorithms can aid the healthcare system in multiple ways: examining reports such as X-rays, MRIs, and other scans; predicting and identifying diseases (potentially better than present-day medical tests); drawing up treatment plans based on the patient’s disease and historical data; and assisting in robotic surgeries. Here, we focus on AI- or ML-based analysis of reports and on forecasting the likelihood of developing cancer in the future.

Before exploring how AI and ML technologies can be used, let us understand how cancer is detected with conventional medical methods. Cancer is detected by observing mutations in a gene, which is a unit of DNA (deoxyribonucleic acid). DNA contains the instructions for producing proteins, which are the building blocks of our bodies. If a cell’s DNA is mutated, an aberrant protein may be produced, which can lead to cancer. However, only some mutations are harmful enough to result in cancer. Machine Learning techniques, particularly Deep Learning, process huge amounts of data, such as genomic data, medical images, lab reports, and clinical information, to discover the harmful, cancer-causing mutations developing in the body.

Let’s take a peek at a few of the tech giants that are integrating artificial intelligence into healthcare systems.


Google launched a service called Google Health in 2008, but it was discontinued in 2011. Its initial goal was to create a consolidated database of patients' medical records. Google Health restarted in 2018 and began concentrating on AI-enabled diagnosis. To create Deep Learning programs that can aid in the early diagnosis of cancer, Google also collaborated with DeepMind's health team. Google is focusing primarily on applying AI to genomic analysis, since mutations in genes are largely responsible for cancer. Google also collaborated with Pacific Biosciences (PacBio) to make the best use of its open-source tool, DeepVariant. Thanks to PacBio's work on genome sequencing technology (HiFi long-read sequencing) and the DeepVariant program, the healthcare system is now able to identify several hereditary conditions, such as breast cancer, pulmonary arterial hypertension, and neurodevelopmental abnormalities, accurately and early. DeepVariant, in its simplest form, is based on a Convolutional Neural Network (CNN) that identifies genome variants in the sequencing data provided by PacBio and separates them from sequencing errors. Google has also developed another open-source tool, DeepConsensus, which is used with PacBio’s genome sequencing platforms to obtain more efficient and accurate results.

Secondly, Google is emphasizing the use of Deep Learning in medical imaging and diagnostics to support clinicians and to detect a range of ailments such as eye diseases, skin diseases, anemia, lung cancer, and breast cancer. Lung cancer and breast cancer both have extremely high fatality rates. Google has developed a deep learning algorithm that forecasts the risk of lung cancer quite precisely, even when the tumor is extremely small and undetectable with current diagnostic techniques. Google has also partnered with Northwestern Medicine to apply AI models to breast cancer prediction. Patients’ mammograms are first reviewed by the AI models; if the models indicate a higher risk of malignancy for a particular patient, the radiologist reviews that case with higher priority.
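
The prioritization step described above can be illustrated with a small sketch. This is not Google's or Northwestern Medicine's implementation; the function name `triage`, the risk scores, and the `threshold` value are all assumptions made for illustration. Cases whose model score reaches the threshold move to the front of the radiologist's worklist, highest risk first, while the remaining cases keep their arrival order.

```python
from typing import List, Tuple


def triage(cases: List[Tuple[str, float]], threshold: float = 0.5) -> List[str]:
    """Order a reading worklist from model risk scores.

    cases: (patient_id, risk) pairs in arrival order, where risk is a
    hypothetical model output in [0, 1].
    threshold: assumed cut-off above which a case is treated as high risk.
    """
    # High-risk cases go first, sorted from highest to lowest score.
    flagged = sorted(
        (c for c in cases if c[1] >= threshold),
        key=lambda c: c[1],
        reverse=True,
    )
    # Everything else stays in its original arrival order.
    routine = [c for c in cases if c[1] < threshold]
    return [patient_id for patient_id, _ in flagged + routine]


if __name__ == "__main__":
    worklist = [("A", 0.2), ("B", 0.9), ("C", 0.6), ("D", 0.1)]
    print(triage(worklist))
```

In this sketch the model only reorders the worklist; every case is still read by a radiologist.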

Google has at least a couple of hundred patents or patent applications related to the application of Artificial Intelligence, or specifically Deep Learning, in the medical sector. Around 15-20 of them disclose lung cancer and breast cancer prediction using deep learning techniques, and many relate to the implementation of Deep Learning algorithms in imaging and diagnostics. One of Google's intriguing patent filings uses deep learning models to enhance lung cancer screening and diagnosis. Here, a patient's data set is used to train a three-dimensional Deep Convolutional Neural Network (DCNN) to forecast the likelihood that cancer is present in lung tissue. The same data set is additionally given to a two-stage prediction model. The first stage forecasts the location of one or more three-dimensional cancer candidates. The second stage is a probability model that assigns each of the three-dimensional cancer candidates a cancer probability ('p') value.
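
To make the data flow concrete, here is a minimal sketch of how a global prediction and per-candidate 'p' values might be brought together. The description above does not say how the two outputs are combined, so the noisy-OR rule, the equal weighting, and every name below (`Candidate`, `candidate_level_probability`, `combine`) are assumptions for illustration only, not the patented method.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    center: Tuple[int, int, int]  # voxel coordinates (z, y, x) from stage one
    p: float                      # cancer probability assigned by stage two


def candidate_level_probability(candidates: List[Candidate]) -> float:
    """Noisy-OR: chance that at least one candidate is malignant.

    Assumes the candidates are independent, which is a simplification.
    """
    p_none = 1.0
    for candidate in candidates:
        p_none *= 1.0 - candidate.p
    return 1.0 - p_none


def combine(global_p: float, candidates: List[Candidate], weight: float = 0.5) -> float:
    """Blend the whole-volume prediction with the candidate-level prediction.

    weight is a hypothetical mixing factor, not a value from the source.
    """
    local_p = candidate_level_probability(candidates)
    return weight * global_p + (1.0 - weight) * local_p


if __name__ == "__main__":
    found = [Candidate((10, 42, 37), 0.5), Candidate((22, 80, 15), 0.5)]
    print(combine(global_p=0.25, candidates=found))
```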

Another interesting patent from Google relates to the use of Augmented Reality (AR) in a microscope that a pathologist uses to review slides containing biological samples. A digital camera takes a picture of the biological sample and sends it to a compute unit, which uses a Deep Convolutional Neural Network (DCNN)-based pattern recognizer to find areas of interest in the sample. The original digital image of the sample is then overlaid in real time with an augmentation (such as a boundary or an outline). This makes it easier for the pathologist to determine whether the area of interest includes cancer cells.
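
The overlay step can be sketched in a few lines. The example below assumes the pattern recognizer has already produced a binary mask marking the area of interest; the mask, the grayscale image, and the function names `outline` and `overlay` are hypothetical, and a real system would work on camera frames rather than nested lists.

```python
from typing import List

Grid = List[List[int]]


def outline(mask: Grid) -> Grid:
    """Mark mask pixels that touch the background or the image border."""
    height, width = len(mask), len(mask[0])
    edge = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # A pixel is on the boundary if any 4-neighbour lies outside the mask.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or not mask[ny][nx]:
                    edge[y][x] = 1
                    break
    return edge


def overlay(image: Grid, edge: Grid, marker: int = 255) -> Grid:
    """Return a copy of the image with boundary pixels set to the marker value."""
    return [
        [marker if edge[y][x] else image[y][x] for x in range(len(image[0]))]
        for y in range(len(image))
    ]


if __name__ == "__main__":
    image = [[0] * 5 for _ in range(5)]
    mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
    for row in overlay(image, outline(mask)):
        print(row)
```

Only the boundary of the flagged region is drawn, so the tissue inside it stays visible to the pathologist.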

Figure 2: A prototype of AR microscope system integrating a compute unit, AR display and a camera to image field of view (FoV)

Source: Detecting cancer in real-time with machine learning

Figure 3: The AR microscope is capable of drawing an outline over the digital image of biological sample during observation

Source: Detecting cancer in real-time with machine learning


Grail, founded in 2015, is a biotechnology and healthcare company whose engineers, scientists, and medical professionals are working on solutions for detecting cancer at early stages. Using large-scale datasets, Next-Generation Sequencing (NGS), and Machine Learning algorithms, Grail has developed a blood test named Galleri™ that can detect 50 types of cancer (including the most dangerous ones, such as lung, breast, prostate, and colon cancer) at early stages. Galleri examines abnormalities in the methylation patterns of cell-free DNA (cfDNA). All the cells in our body, including malignant ones, release cfDNA into the blood, and Galleri uses a proprietary technology to detect the cfDNA that comes from cancerous cells. Galleri has the additional benefit of indicating the location of the malignancy in the body. It is intriguing that cancer might be detected early with a straightforward blood test rather than a painful biopsy. According to Grail, however, Galleri is only a first screening test: if the result is positive, additional procedures should be conducted to confirm the presence of cancer.

Grail is a start-up, but it has taken a serious approach to IP. It has filed more than 150 patents or patent applications in this domain, across various jurisdictions. Grail's portfolio includes some very interesting patents, covering cancer detection using cell-free DNA (cfDNA), cancer detection using cell-free RNA (cfRNA), and cancer classification using Deep Learning techniques, among others.


In 2012, IBM began showing interest in the application of artificial intelligence in healthcare and, more specifically, in the prediction of cancer. It launched "Watson for Oncology" and, to train its Deep Neural Network model, started gathering patient data such as blood reports, imaging reports (MRI scans), and genetic mutation reports. The model was trained to recognize many malignancies, including bladder, prostate, breast, and lung cancer. Around 2017, IBM announced that Watson for Oncology was ready for clinical trials. IBM asserted that its AI model was effective in detecting cancer at an early stage and claimed that its algorithm could detect breast cancer from patients' imaging reports. The model estimated the likelihood of breast cancer in patients for whom radiologists had issued a "non-malignant" diagnosis, and breast cancer was later discovered in half of those individuals. On that basis, the model was right in about 50% of these cases, which suggests that there is room for improvement. The model can be strengthened with further training data. Oncologists and radiologists in the healthcare industry, however, disagreed with IBM's assertions, and some radiologists claim that Watson for Oncology failed badly. IBM has since sold off its Watson Health business.

IBM has a small number of patents or patent applications, probably fewer than a hundred, related to the use of Artificial Intelligence in cancer prediction. One interesting patent application from IBM describes an AI system that uses two Deep Learning models (trained with labelled images) to identify breast cancer or the chance of developing breast cancer in the future. The first model is trained on prior mammograms and the second on current mammograms.

Besides these three major companies, many other companies and well-known universities, such as Progenics Pharmaceuticals, Siemens Healthineers, the University of California, Seoul National University, Shanghai Jiao Tong University, Stanford University, the University of Texas, and the Chinese Academy of Sciences, are working to develop Machine Learning algorithms for the early detection of cancer. Researchers at IIT Madras have also recently created a supervised machine learning tool, named “PIVOT”, that can aid in detecting cancer. The tool uses gene mutations, gene expression, copy number variations in genes, and perturbations in the biological network caused by altered gene expression to classify genes as tumor suppressors, neutral genes, or oncogenes.


Although Machine Learning algorithms are assisting doctors and clinicians in identifying cancer in its early stages, their reliability and consistency are still under dispute. Even though these algorithms have identified cancer in people for whom existing lab technologies had failed to do so, they have also produced false positives and false negatives. Because the algorithms were trained on finite data (big but limited), some medical experts continue to question their validity. Additionally, the underlying methodology that these algorithms employ is not disclosed to clinicians, creating a lack of transparency. According to a recent article from the National Cancer Institute, these AI systems can also produce biased results. The National Cancer Institute also acknowledged that some algorithms have been less accurate for Black people than for White people. Because these AI models take patients’ historical data as input, they can end up reflecting bias based on gender, religion, color, and so on. However, some experts believe that Artificial Intelligence is a boon for cancer patients and younger oncologists. All agree that this methodology has not yet reached its full potential. For now, it needs to be supported by conventional lab tests, but it is definitely the way forward.
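
One way to examine both concerns, error rates and uneven performance across patient groups, is to compute sensitivity and specificity separately for each group. The sketch below is only an illustration: the record layout, the group labels, and the function name `rates_by_group` are assumptions, and the toy records are made up rather than taken from any study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Record = Tuple[str, bool, bool]  # (group label, has cancer, model predicted cancer)


def rates_by_group(records: Iterable[Record]) -> Dict[str, Dict[str, Optional[float]]]:
    """Compute sensitivity and specificity separately for each group."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, actual, predicted in records:
        if actual and predicted:
            counts[group]["tp"] += 1   # cancer correctly flagged
        elif actual:
            counts[group]["fn"] += 1   # cancer missed (false negative)
        elif predicted:
            counts[group]["fp"] += 1   # healthy patient flagged (false positive)
        else:
            counts[group]["tn"] += 1   # healthy patient correctly cleared

    result = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        result[group] = {
            # Share of real cancers the model caught.
            "sensitivity": c["tp"] / positives if positives else None,
            # Share of healthy patients the model correctly cleared.
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return result


if __name__ == "__main__":
    toy_records = [
        ("group_a", True, True), ("group_a", True, False),
        ("group_a", False, False), ("group_a", False, True),
        ("group_b", True, True), ("group_b", True, True),
        ("group_b", False, False), ("group_b", False, False),
    ]
    for group, rates in rates_by_group(toy_records).items():
        print(group, rates)
```

A large gap between groups on either measure would be a signal that the model needs more representative training data before it is relied on.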

It appears that factors such as accuracy, transparency, and cost will determine the future of AI in healthcare. In addition, trends indicate that oncology will use AI more and more frequently, perhaps exponentially so. One thing is certain: as AI and ML models become more robust, many more companies will enter this field, and the IP landscape of AI/ML in healthcare will change drastically.