Tuesday, January 29, 2013

Overfitting in Computer Aided Diagnosis

I've just had the honour of having one of my journal papers published by invitation. The paper is on overfitting, a persistent problem for pattern recognition researchers. Overfitting occurs when a learning machine is tuned too closely to the data it was given to learn from. It is particularly troublesome because an overfitted classifier (or other supervised learning algorithm) may yield extremely promising results on the data set being evaluated while performing unreliably on new data sets that it has not yet been exposed to.
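To make that symptom concrete, here is a minimal sketch in Python. It is not taken from the paper: the synthetic two-class data, the 1-nearest-neighbour classifier and every function name below are assumptions chosen purely for illustration. A 1-nearest-neighbour classifier memorizes its training set, so it scores perfectly on the data it has seen, while repeated random training/testing splits show how much worse it does on data it has not.

```python
import random


def make_data(n, rng):
    """Two overlapping 2-D Gaussian classes (synthetic, illustrative only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label else -1.0
        point = (rng.gauss(centre, 1.5), rng.gauss(centre, 1.5))
        data.append((point, label))
    return data


def predict_1nn(train, point):
    """Return the label of the training sample closest to `point`."""
    nearest = min(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
    )
    return nearest[1]


def accuracy(train, samples):
    """Fraction of `samples` that a 1-NN classifier built on `train` gets right."""
    hits = sum(predict_1nn(train, p) == y for p, y in samples)
    return hits / len(samples)


def random_split_scores(data, n_splits, rng, train_fraction=0.5):
    """Repeat a random train/test split and record (train, test) accuracy."""
    scores = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append((accuracy(train, train), accuracy(train, test)))
    return scores


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(200, rng)
scores = random_split_scores(data, n_splits=10, rng=rng)

mean_train = sum(s[0] for s in scores) / len(scores)
mean_test = sum(s[1] for s in scores) / len(scores)
print(f"mean accuracy on training data: {mean_train:.2f}")
print(f"mean accuracy on held-out data: {mean_test:.2f}")
```

The gap between the two printed numbers is the warning sign: the training score alone says nothing about how the classifier behaves on new cases.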

Very few research papers in the literature focus on the problem of overfitting, so I'm very happy to publish on this topic. The paper describes a plotting method for identifying overfitted decision boundaries produced by supervised learning. It is meant to be used as an adjunct to established statistical validation methods (such as a randomized training and testing regimen), not as a replacement for them. The paper demonstrates the visualization technique's potential on breast cancer screening data from MRI examinations. If you would like to read the paper, click on the citation reference below: