How to easily visualize groups’ classification?

The Wise son:

I’ve created the best score to classify between two groups. We already talked about ROC and AUC to evaluate a diagnostic tool, but I want an intuitive way to visualize the quality of my score’s classification. Any suggestions?

The Simple son:

I think I have a great graph exactly for that. Simple, clear and easy to create.

library(ggplot2)

# df needs a Group column (the two classes) and a numeric Score column
ggplot(df, aes(Group, Score, color = Score)) +
  geom_jitter(width = 0.1,
              size = 3.5,
              alpha = 0.65) +
  scale_color_viridis_c(option = "magma") +
  ylab("Score") +
  xlab("") +
  theme_dark() +
  theme(axis.text = element_text(size = 15)) +
  # dashed white line marks the cut-off (60 here)
  geom_hline(yintercept = 60, col = "white", linetype = 5)

And here we can see poor, medium and good classification:
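For readers who want to try this without their own data, here is a minimal sketch that simulates three scenarios and draws them side by side. Everything in it is an assumption for illustration: the helper `simulate_scores()`, the group means, the standard deviation, the sample size and the group labels are hypothetical and not taken from the post. Only the cut-off of 60 and the plot layers follow the code above.

```r
# Hypothetical simulated data: means, SD and sample size are illustrative
# assumptions. group-1 is assumed to be the group that scores above the cut-off.
set.seed(1)

simulate_scores <- function(n, mean_high, mean_low, sd = 10) {
  data.frame(
    Group = rep(c("group-1", "group-2"), each = n),
    Score = c(rnorm(n, mean_high, sd), rnorm(n, mean_low, sd))
  )
}

# The further apart the group means are, the better the separation at 60
scenarios <- list(
  poor   = simulate_scores(100, mean_high = 62, mean_low = 58),
  medium = simulate_scores(100, mean_high = 70, mean_low = 50),
  good   = simulate_scores(100, mean_high = 85, mean_low = 35)
)

# Stack the three scenarios into one data frame with a Scenario column
df_all <- do.call(rbind, Map(function(d, name) {
  d$Scenario <- name
  d
}, scenarios, names(scenarios)))
df_all$Scenario <- factor(df_all$Scenario, levels = names(scenarios))

# Draw one panel per scenario (skipped if ggplot2 is not installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df_all, aes(Group, Score, color = Score)) +
    geom_jitter(width = 0.1, size = 3.5, alpha = 0.65) +
    scale_color_viridis_c(option = "magma") +
    geom_hline(yintercept = 60, col = "white", linetype = 5) +
    facet_wrap(~ Scenario) +
    theme_dark()
  print(p)
}
```

In the "poor" panel the two clouds of dots overlap almost completely around the line, while in the "good" panel nearly every dot sits on its own side of it.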

The Wicked son:

Simple, your colors are very nice, but don’t you think you’ve made it sound too simple this time? It is just a scatter plot presenting the distribution of the score in each class!

The Simple son:

Well, how many times does your audience say “Wow!”? This is what happens with this kind of graph. Moreover, you can clearly see the misclassified cases (false positives / false negatives): the dots in group-1 that fall below the horizontal line and the dots in group-2 that fall above it. The horizontal line is the cut-off.
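If you also want the numbers behind the dots, a small sketch like the following counts the cases on the wrong side of the line. The data frame here is made up for illustration, and it assumes that group-1 is the group expected to score above the cut-off. Which of the two counts is called "false positive" and which "false negative" depends on which group you treat as positive.

```r
# Hypothetical example data; group-1 is assumed to be the group expected
# to score above the cut-off.
df <- data.frame(
  Group = c("group-1", "group-1", "group-1", "group-2", "group-2", "group-2"),
  Score = c(82, 71, 55, 40, 58, 66)
)
cutoff <- 60

# Predicted class: a score above the cut-off is classified as group-1
df$Predicted <- ifelse(df$Score > cutoff, "group-1", "group-2")
df$Misclassified <- df$Predicted != df$Group

# Cross-tabulate the actual group against the predicted group
confusion <- table(Actual = df$Group, Predicted = df$Predicted)
print(confusion)

# The dots on the wrong side of the horizontal line
n_below_in_group1 <- sum(df$Group == "group-1" & df$Score <= cutoff)
n_above_in_group2 <- sum(df$Group == "group-2" & df$Score > cutoff)
```

Mapping `Misclassified` to the `shape` aesthetic in the plot would be one way to make these cases stand out even more.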

He who couldn’t ask:

The title of the post about ROC is “ROC: Beyond Sensitivity and Specificity?”, so shouldn’t the current post be called “Beyond ROC”?

And you are all “Beyond me”? 

© IntegriStat 2022: IntegriStat LTD is the sole owner of the copyrights to all the content of this website. You may not reproduce or communicate any of the content on this website, including downloadable files, without the express written consent of IntegriStat LTD.  
