A data challenge fit for AI
Artificial intelligence has the opportunity to cause a huge disruption in the way healthcare operates, but the challenge is gathering enough data. So what if, instead of focusing all our energy on data collection, we could adjust our deep learning algorithms to require less data? That’s what we do with 3D G-CNNs!
Marysia Winkels, Machine Learning Engineer at Aidence, managed to achieve better performance while training our algorithms on 10 times less data using group convolutions. Her research has been presented at leading conferences on deep learning, including the International Conference on Machine Learning (ICML 2018). An in-depth academic article is to be featured in the journal Medical Image Analysis (MIA).
Deep learning, and convolutional neural networks in particular, have rapidly become the methodology of choice for all (medical) image-related tasks. However, these techniques typically require a substantial amount of labeled data to learn from, meaning a human radiologist needs to manually record their findings in a way the computer can understand. Furthermore, if we want the algorithms to generalise well over different patient populations, scanner types, reconstruction techniques and so forth, we need even more data!
This presents us with the challenge of data efficiency: the ability of an algorithm to learn complex tasks without requiring large quantities of data. Instead of spending all our time and energy on gathering more data, we try to make the algorithms more efficient with the data that we already have.
Coming to a solution
To explore how we can improve the data efficiency of convolutional neural networks (CNNs), we first need to understand why CNNs are such an appropriate choice for image tasks in the first place. The great thing about CNNs is that they are roughly invariant to translation. This means that if a model has learned to recognise a structure, such as a dog, it doesn’t matter where in the image the dog appears: the model will still recognise it as a dog.
This is great for images – after all, it rarely matters where exactly in an image a structure occurs, as long as you can see that it’s there. To get technical here for a moment, translations are a type of transformation you can apply to the image, but whether or not you apply one has no influence on the prediction of the model. However, the problem is that there are other types of transformations, such as reflection (mirroring) or rotation, that sadly do influence the prediction of a standard CNN.
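To make that concrete, here is a minimal sketch in plain Python. It is a toy illustration rather than anything from a real model: the helper names (`correlate`, `shift_right`, `rot90`) and the 5×5 image are made up for the example, and the borders wrap around so that shifts are exact. A single vertical-edge filter is run over an image containing a vertical bar. Shifting the input simply shifts the filter's response, whereas rotating the input does not simply rotate the response.

```python
# Toy illustration only: helper names and the 5x5 image are assumptions.

def correlate(image, kernel):
    """Slide a centred square kernel over a square image (wrap-around borders)."""
    n, k = len(image), len(kernel)
    half = k // 2
    return [[sum(image[(r + i - half) % n][(c + j - half) % n] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(n)]
            for r in range(n)]

def shift_right(grid, steps=1):
    """Translate every row to the right, wrapping around at the border."""
    return [row[-steps:] + row[:-steps] for row in grid]

def rot90(grid):
    """Rotate a square grid by 90 degrees counter-clockwise."""
    n = len(grid)
    return [[grid[c][n - 1 - r] for c in range(n)] for r in range(n)]

# A vertical bar in the middle column, and a filter that responds to vertical edges.
image = [[1 if c == 2 else 0 for c in range(5)] for r in range(5)]
kernel = [[1, 0, -1],
          [2, 0, -2],
          [1, 0, -1]]

response = correlate(image, kernel)

# Does transforming the input just transform the output in the same way?
translation_commutes = correlate(shift_right(image), kernel) == shift_right(response)
rotation_commutes = correlate(rot90(image), kernel) == rot90(response)

print(translation_commutes)  # True
print(rotation_commutes)     # False
```

In this toy case the filter fires on both sides of the vertical bar, but gives no response at all once the bar is rotated to lie horizontally.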
This is a problem, because it means that you can’t just present your algorithm with one orientation of an object (such as a dog) and expect it to work with objects that are similar, but rotated or flipped. In practice, however, especially in the medical domain, rotations and reflections of things you want to detect – such as pulmonary nodules – occur on both a small and a large scale. In order for the model to be able to recognise them, you have to present the algorithm with all these orientations separately while training, which – as you can guess – means you need more training data.
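As a rough sketch of what "all these orientations" means, the snippet below lists every right-angle rotation and mirror image of a small 2D patch. The function names and the 2×2 patch are hypothetical, chosen only for illustration. In 2D there are 8 such copies; for a 3D volume, restricting to right-angle rotations, a cube has 24 rotations, or 48 orientations once reflections are included.

```python
# Illustrative only: function names and the tiny patch are assumptions.

def rot90(grid):
    """Rotate a square grid by 90 degrees counter-clockwise."""
    n = len(grid)
    return [[grid[c][n - 1 - r] for c in range(n)] for r in range(n)]

def mirror(grid):
    """Reflect a grid left-to-right."""
    return [row[::-1] for row in grid]

def orientations(grid):
    """Return the 8 rotated and mirrored copies of a square 2D patch."""
    copies = []
    for start in (grid, mirror(grid)):
        current = start
        for _ in range(4):
            copies.append(current)
            current = rot90(current)
    return copies

patch = [[1, 2],
         [3, 4]]
copies = orientations(patch)

print(len(copies))  # 8
```

A standard CNN has to see examples in each of these orientations during training before it can be expected to handle all of them.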
Our solution to this was to create a new type of CNN called the group-equivariant convolutional neural network (G-CNN), which can handle the rotated and reflected versions of images. And not only regular images, but also 3D volumes – the type of images that we get from CT or MRI scans.
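The core idea can be sketched in a few lines of plain Python. This is a simplified 2D illustration under stated assumptions, not the 3D implementation described in this article: it handles only the four right-angle rotations (no reflections), shows only a first layer that applies every rotated copy of one filter, and the names `lifting_layer` and `pooled_response` are made up for the example. Rotating the input now changes the output in a predictable way (each response map is rotated and moves one orientation channel along), so a score pooled over orientations and positions stays the same.

```python
# Simplified 2D sketch with 4 rotations; names are illustrative assumptions.

def correlate(image, kernel):
    """Slide a centred square kernel over a square image (wrap-around borders)."""
    n, k = len(image), len(kernel)
    half = k // 2
    return [[sum(image[(r + i - half) % n][(c + j - half) % n] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(n)]
            for r in range(n)]

def rot90(grid):
    """Rotate a square grid by 90 degrees counter-clockwise."""
    n = len(grid)
    return [[grid[c][n - 1 - r] for c in range(n)] for r in range(n)]

def lifting_layer(image, kernel):
    """Correlate the image with all four 90-degree rotations of one kernel."""
    maps, current = [], kernel
    for _ in range(4):
        maps.append(correlate(image, current))
        current = rot90(current)
    return maps

def pooled_response(image, kernel):
    """Maximum over all orientations and positions."""
    return max(value
               for fmap in lifting_layer(image, kernel)
               for row in fmap
               for value in row)

# Same toy setup: a vertical bar and a vertical-edge filter.
image = [[1 if c == 2 else 0 for c in range(5)] for r in range(5)]
kernel = [[1, 0, -1],
          [2, 0, -2],
          [1, 0, -1]]

maps = lifting_layer(image, kernel)
rotated_maps = lifting_layer(rot90(image), kernel)

# Each map of the rotated input is a rotated map of the original input,
# found one orientation channel further along.
predictable = rotated_maps == [rot90(maps[(m - 1) % 4]) for m in range(4)]
same_score = pooled_response(image, kernel) == pooled_response(rot90(image), kernel)

print(predictable)  # True
print(same_score)   # True
```

Because one learned filter is shared across all orientations, the network does not have to learn a separate filter for each rotated version of a structure.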
Applying to the real world
Yes, recognising dogs in pictures is all fun and games, but how well does this work for real problems, like a medical finding? As a case study, we use pulmonary nodule detection. Pulmonary nodules are small lesions in the lung that may be indicative of lung cancer, which is why radiologists will generally try to detect them so they can track their growth over time. However, looking for these nodules can feel like looking for a needle in a haystack – without the advantage that you can just burn down the haystack to find said needle.
Lung nodules are visible on a chest CT – a 3D scan of the chest, visualising bones, muscles, fat, organs and blood vessels in grayscale. A typical chest CT consists of ~300 images (slices), stacked together to form the whole scan. You can imagine that looking through ~300 black-and-white images to find a small abnormality can be a tedious task, especially considering that nodules can take many shapes and forms.
That’s where AI comes in to help!