Tuesday, June 12, 2018

Sorting Statistic Based Assessment of Group-wise Differences

I recently published an article in the Journal of Healthcare Engineering demonstrating a minor adaptation of a component from my (now rather old) PhD thesis. The technique is based on sorting statistics and is compared with the established Cohen's d statistic for assessing effect size (the amount of difference observed between two groups of samples). Cohen's d assumes that the available measurements follow a normal (Gaussian, or bell-shaped) distribution, whereas the approach I developed makes no such assumption.
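For readers who have not met Cohen's d before, here is a minimal sketch of the textbook pooled-standard-deviation form. The function name and the toy numbers are my own illustrative choices and are not taken from the article.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Toy data (hypothetical values, for illustration only).
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 3.0, 5.0]
d = cohens_d(group_a, group_b)
print(d)
```

Note that d is signed: swapping the two groups flips its sign, and it only reflects a shift in the means relative to the spread.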

The technique was demonstrated on a large clinical autism dataset, where it identified measurements with some potential to discriminate between the two groups of samples even though Cohen's d for those same measurements implies little to no potential. I hope the technique will be useful in data analysis, and it could possibly play a role in assessing feature measurements as part of a machine learning algorithm.

Here is a plot from the article that demonstrates the proposed method's relationship with the established Cohen's d statistic. Note that the proposed statistic is unsigned. Measurements falling in the center of the plot (between the two arms of the V shape) are those for which Cohen's d implies little potential utility in characterizing autism, while the proposed metric indicates that they may be of value. A total of 4,788 measurements were included in the analysis, and many fell in this V-shaped zone, indicating potentially more utility from these measurements than would have been assessed had Cohen's d been relied upon in this situation.
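To give a feel for how a measurement can land between the arms of the V, here is a small sketch on made-up data. It does not implement the statistic from the article. As a stand-in it uses the two-sample Kolmogorov-Smirnov statistic, which is also computed from sorted values, is unsigned, and makes no assumption of normality. The function name and the data are hypothetical.

```python
from statistics import mean


def ks_statistic(group_a, group_b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    thresholds = sorted(set(group_a) | set(group_b))
    n_a, n_b = len(group_a), len(group_b)
    largest_gap = 0.0
    for t in thresholds:
        # Fraction of each group at or below the current threshold.
        frac_a = sum(1 for x in group_a if x <= t) / n_a
        frac_b = sum(1 for x in group_b if x <= t) / n_b
        largest_gap = max(largest_gap, abs(frac_a - frac_b))
    return largest_gap


# Hypothetical data: group_a clusters in the middle, group_b sits at both extremes.
group_a = [4, 5, 6, 5, 4, 6]
group_b = [1, 2, 3, 7, 8, 9]

mean_gap = mean(group_a) - mean(group_b)  # equal means, so Cohen's d would be 0
ks = ks_statistic(group_a, group_b)
print(mean_gap, ks)
```

Both groups have the same mean, so any statistic built on the difference in means reports no effect, yet the sort-based measure is clearly nonzero because the two groups occupy different parts of the value range.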

You can check out the article here.