  • Deep learning matches performance of radiologists in diagnosis of thyroid nodules

    Interpreting thyroid nodules on ultrasound is time-consuming and suffers from inter-reader variability. In our study, we developed a deep learning algorithm that provides management recommendations for thyroid nodules observed on ultrasound images and compared its performance with that of radiologists. We showed that the performance of the algorithm was similar to that of a consensus of three expert readers.

  • Deep radiogenomics - classification of brain tumor to molecular subtypes based on magnetic resonance images

    Radiogenomics is the challenging task of distinguishing molecular subtypes of a lesion based on radiological imaging data. We applied transfer learning from a different brain MRI dataset containing scans of cases with tumors of a similar type. The results show a notable association between imaging and genomic data. This provides strong evidence that genomic subtypes are reflected in MRI.

  • Genetic optimization of a risk stratification system for thyroid nodules

    Thyroid nodules are estimated to affect as many as 50% of the population. Radiologists triage them for biopsy based on an assessment of ultrasound imaging. The Thyroid Imaging Reporting and Data System (TI-RADS), proposed by the American College of Radiology (ACR), quantifies ultrasound imaging features for this purpose. Using a genetic optimization algorithm, we developed a data-driven Artificial Intelligence (AI) TI-RADS that offers a significant improvement in specificity while maintaining high sensitivity for biopsy recommendation.

  • Python toolset for statistical comparison of machine learning models and human readers

    The most common statistical tools for comparing machine learning models and human readers are the p-value and the confidence interval. Although they have received some criticism recently, both give more insight into results than a raw performance measure when interpreted correctly, and many journals require them. Bootstrapping is a nonparametric method for computing them. This post shows example Python code that uses bootstrapping to compute confidence intervals and p-values when comparing machine learning models and human readers.
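
    As a rough illustration of the idea (not the code from the post itself), the sketch below runs a paired bootstrap over cases to compare two sets of predictions, for instance a model and a reader. The function name `bootstrap_accuracy_diff`, the choice of accuracy as the metric, and the way the two-sided p-value is formed are all assumptions made for this example.

```python
import numpy as np


def bootstrap_accuracy_diff(y_true, pred_a, pred_b, n_boot=2000, alpha=0.05, seed=0):
    """Paired bootstrap of the accuracy difference (A minus B) over cases.

    Hypothetical helper: returns the observed difference, a percentile
    confidence interval, and a two-sided bootstrap p-value.
    """
    y_true, pred_a, pred_b = (np.asarray(v) for v in (y_true, pred_a, pred_b))
    rng = np.random.default_rng(seed)
    n = len(y_true)

    # Per-case correctness, so that each resample keeps A and B paired.
    correct_a = (pred_a == y_true).astype(float)
    correct_b = (pred_b == y_true).astype(float)
    observed = correct_a.mean() - correct_b.mean()

    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        diffs[i] = correct_a[idx].mean() - correct_b[idx].mean()

    # Percentile confidence interval for the difference.
    lower, upper = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])

    # One common choice of two-sided p-value: twice the smaller tail
    # of the bootstrap distribution relative to zero, capped at 1.
    p_value = min(1.0, 2 * min((diffs <= 0).mean(), (diffs >= 0).mean()))
    return observed, (lower, upper), p_value
```

    Resampling whole cases, rather than the two prediction vectors independently, keeps the comparison paired, which is usually what is wanted when a model and a reader are evaluated on the same images.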

  • Uploading manuscript from Overleaf (ShareLaTeX) to arXiv

    How to correctly prepare files for upload. A set of steps and a checklist for successful submission to arXiv of a manuscript written in Overleaf (ShareLaTeX).

  • Thresholding method for imbalanced classification

    Class imbalance refers to an unequal number of training examples between classes in a training set. Neural networks are known to estimate the Bayesian posterior distribution. The number of training examples for a class can be used to approximate its prior probability. Therefore, model output can be adjusted to reflect uneven class priors and improve the accuracy of a classifier. This post provides a simple example together with a Python implementation of the thresholding method.
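
    A minimal sketch of this adjustment, assuming the model outputs per-class probabilities and that priors are estimated from training-set class counts, might look as follows. The function name `threshold_by_priors` is hypothetical and this is not necessarily the implementation given in the post.

```python
import numpy as np


def threshold_by_priors(probs, class_counts):
    """Divide predicted class probabilities by estimated class priors.

    probs: array of shape (n_samples, n_classes) with model outputs.
    class_counts: number of training examples per class (assumed input).
    Returns the adjusted probabilities, renormalized so each row sums to 1.
    """
    probs = np.asarray(probs, dtype=float)
    priors = np.asarray(class_counts, dtype=float)
    priors = priors / priors.sum()  # approximate prior probability per class

    adjusted = probs / priors  # remove the effect of uneven priors
    return adjusted / adjusted.sum(axis=1, keepdims=True)


# A model trained on 900 vs. 100 examples outputs 0.7 for the majority class.
adjusted = threshold_by_priors([[0.7, 0.3]], class_counts=[900, 100])
predicted_class = int(adjusted.argmax(axis=1)[0])  # minority class after adjustment
```

    With balanced class counts the priors are equal, so the adjustment leaves the model output unchanged.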

  • Colaboratory - deep learning on a GPU for free

    Colaboratory is a tool from Google that lets you run a Python notebook in the cloud with GPU support. There are some limitations on available memory and on the duration of a continuous session, yet they should be enough to train machine learning models of a decent scale. Here is an overview of the setup process together with a sample notebook that shows how to use Colaboratory to get started with deep learning on the Labeled Faces in the Wild dataset using Keras.

  • An overview of deep learning methods in radiology

    The most straightforward way of training a convolutional neural network (CNN) is to start with a random set of weights and train it using available data specific to the problem being solved (training from scratch). However, given the large number of parameters (weights) in a network, often above 10 million, and a limited amount of training data, a network may overfit to the available data, resulting in poor performance on test data. Two training methods have been developed to address this issue: transfer learning and off-the-shelf features (a.k.a. deep features).

  • A peer review template

    A set of template questions that I follow to fill each paragraph of a peer review for a technical journal.

  • Segmentation of brain tumor in magnetic resonance images

    We applied the U-Net architecture to the task of whole-tumor segmentation in brain MRI. The dataset used for development was obtained from The Cancer Imaging Archive (TCIA) and included 110 patients with lower-grade glioma. To evaluate the quality of segmentation, we used the Dice similarity coefficient (DSC) with 22-fold cross-validation. The achieved performance was 83.60% mean DSC and 87.33% median DSC. In comparison, the DSC of two expert human readers for this kind of tumor is 84% with a standard deviation of 2%. This puts our method on a par with radiologists.
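
    For reference, the Dice similarity coefficient between a predicted mask and a ground-truth mask can be computed as in the sketch below. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions for this example, not details taken from the study.

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient: 2 * |A and B| / (|A| + |B|) for binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)

    denominator = pred.sum() + true.sum()
    if denominator == 0:
        # Both masks are empty; treating this as perfect agreement is an assumption.
        return 1.0

    intersection = np.logical_and(pred, true).sum()
    return float(2.0 * intersection / denominator)


# Two predicted pixels, one of which overlaps the single ground-truth pixel.
example_dsc = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

    In this toy case the overlap is one pixel and the masks contain three pixels in total, so the coefficient is 2 * 1 / 3.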

  • Class imbalance problem in convolutional neural networks

    In this study, we systematically investigate the impact of class imbalance on the classification performance of convolutional neural networks (CNNs) and perform an extensive comparison of several methods to address the issue. Based on results from our experiments, we conclude that (i) the effect of class imbalance on classification performance is detrimental; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) thresholding should be applied to compensate for prior class probabilities when the overall number of properly classified cases is of interest.
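
    To make the oversampling idea concrete, here is a minimal sketch of random oversampling, which duplicates minority-class examples until every class matches the size of the largest one. The function name `random_oversample` is hypothetical, and the study may have used a different variant or implementation.

```python
import numpy as np


def random_oversample(X, y, seed=0):
    """Randomly duplicate examples so every class has as many as the largest class."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)

    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()

    selected = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Keep all original examples and draw the shortfall with replacement.
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        selected.append(np.concatenate([cls_idx, extra]))

    indices = np.concatenate(selected)
    rng.shuffle(indices)  # avoid blocks of a single class in the output
    return X[indices], y[indices]


# Toy data: 8 examples of class 0 and 2 examples of class 1.
X_toy = np.arange(10).reshape(-1, 1)
y_toy = np.array([0] * 8 + [1] * 2)
X_balanced, y_balanced = random_oversample(X_toy, y_toy)
```

    Oversampling should be applied to the training set only; duplicating examples before a train/test split would leak copies of the same case into evaluation.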