This post is part of my collection of reviews of books I’ve read and found useful for building an identity as a medical data scientist.

You can find other posts here.

 

The literature on machine learning is extremely broad and offers many points of view and suggestions on how to learn AI. It is mostly IT-oriented, written for techies, and often the only discriminating factor is the language one is willing to use (R vs Python). Overall, these books mostly target beginners.

It has been hard to find a textbook that, while covering technical ML and AI, is oriented, written and designed for the medical field: something that may inspire the medical informatics we like so much, or simply a new project, a new feature, or a new way of thinking about everyday medical problems.

That is why I am enthusiastic to talk about Healthcare Analytics Made Simple, by Vikas Kumar.

The author

To fully grasp the mindset the book was written with, it is a good idea to glance at the author’s LinkedIn profile.

After a medical degree at the University of Pittsburgh, he continued his education with a master’s degree in Computational Science and Engineering at the Georgia Institute of Technology.

On his profile he describes himself as “Sr Data Scientist at OMNY Health | Teaching Assistant, Data & Visual Analytics at Georgia Tech | Author | Course Content Creator”.

The book

Book’s target

A book written by an MD for MDs willing to learn the basics of ML, and for data scientists entering the medical field for work. It starts from the basics of both medical terminology and the Python language.

Most of the time these two groups find it hard to communicate with each other; sometimes they do not communicate at all. This is the aim of the book: to give healthcare professionals a solid basis in predictive algorithms and to introduce data scientists to medical terminology.

The first chapters illustrate the basics of the Python language: how to import a dataset with NumPy or pandas, and which approach the reader should take when first handling a dataset (the author does not say it directly, but a short reminder: garbage in, garbage out).
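To give an idea of what importing a dataset looks like, here is a minimal sketch of my own, not taken from the book. The CSV content, the column names and the `load_patients` helper are all hypothetical; the inline text simply stands in for a file on disk. Counting missing values right after loading is one simple way to keep "garbage in, garbage out" in mind.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
CSV_TEXT = """patient_id,age,systolic_bp
1,54,130
2,67,
3,41,118
"""


def load_patients(source):
    """Read a CSV into a DataFrame and count missing values per column."""
    df = pd.read_csv(source)
    missing = df.isna().sum()
    return df, missing


# io.StringIO lets pandas read the string as if it were a file.
df, missing = load_patients(io.StringIO(CSV_TEXT))
print(df.head())
print(missing)
```

With a real file, the same helper would take a path such as `load_patients("patients.csv")`, where the file name is again only an assumption.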

Detailed

The author begins with an excursus on NumPy and pandas, the main tools for cleaning and organizing a tabular dataset, and introduces statistical rules from time to time, whenever they are needed to fully understand the concept about to be explained.
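As an illustration of the kind of cleaning this refers to, the sketch below is my own and makes several assumptions: the table, its column names and the chosen cleaning steps (dropping duplicate rows, normalizing a categorical code, discarding rows with a missing measurement) are hypothetical and not the book's recipe.

```python
import pandas as pd

# Hypothetical raw table with a duplicated row, inconsistent codes
# and a missing measurement.
raw = pd.DataFrame(
    {
        "patient_id": [1, 2, 2, 3],
        "sex": ["F", "m", "m", "M"],
        "weight_kg": [61.0, 70.0, 70.0, None],
    }
)

clean = (
    raw.drop_duplicates()                           # remove the repeated row
    .assign(sex=lambda d: d["sex"].str.upper())     # normalize "m" to "M"
    .dropna(subset=["weight_kg"])                   # drop rows without a weight
    .reset_index(drop=True)
)
print(clean)
```

Whether to drop or impute missing values depends on the clinical question; dropping is used here only because it is the shortest option to show.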

Sensitivity, specificity, negative/positive predictive value and false-positive rate are mentioned just before diving into building an SQLite database modeled on an EMR (Electronic Medical Record).
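For readers who want to see how the diagnostic measures named above relate to each other, here is a short sketch using their standard definitions from the four cells of a confusion matrix. The function name `diagnostic_metrics` and the counts are invented for illustration and do not come from the book.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute common diagnostic measures from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "fpr": fp / (fp + tn),          # false-positive rate = 1 - specificity
    }


# Hypothetical counts for a test evaluated on 200 patients.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the false-positive rate is always one minus the specificity, so the two carry the same information.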

A brief chapter is also dedicated to matplotlib and how to bring it into the workflow.


A complete chapter, however, is dedicated to databases: SQL vs NoSQL, why to choose SQL, and a step-by-step introduction to a series of tables designed to store medical data.

CRUD operations (Create, Read, Update, Delete) are introduced in the context of SQLite databases.
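The four operations can be sketched with Python's built-in `sqlite3` module. This is my own minimal example, not the book's schema: the `patients` table, its columns and the sample names are hypothetical, and an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

# In-memory database: it disappears when the connection is closed.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical, deliberately tiny patient table.
cur.execute(
    "CREATE TABLE patients ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)

# Create: insert two rows, using placeholders instead of string formatting.
cur.execute("INSERT INTO patients (name, age) VALUES (?, ?)", ("Jane Doe", 54))
cur.execute("INSERT INTO patients (name, age) VALUES (?, ?)", ("John Roe", 67))

# Read: fetch everything that was just stored.
query = "SELECT id, name, age FROM patients ORDER BY id"
rows_after_insert = cur.execute(query).fetchall()

# Update: correct one patient's age.
cur.execute("UPDATE patients SET age = ? WHERE name = ?", (55, "Jane Doe"))

# Delete: remove the other patient.
cur.execute("DELETE FROM patients WHERE name = ?", ("John Roe",))
conn.commit()

rows_final = cur.execute(query).fetchall()
conn.close()

print(rows_after_insert)
print(rows_final)
```

The `?` placeholders matter in a medical setting as much as anywhere else: they keep values out of the SQL string and avoid injection problems.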


Once the basics are in place, he introduces the next step: where and how to use them with Jupyter Notebook in order to build a predictive model from scratch, inspired by the most pressing US healthcare needs.


The last chapter contains a fast but complete digression on deep learning and the ethical and moral problems tied to the technology.

Practical

Almost half of the book is dedicated to building healthcare projects that solve practical problems. One example for all: reducing readmissions to the Emergency Department after a patient has been discharged, which indirectly measures the quality of the care provided (with some differences, that model also applies to ESRD, End Stage Renal Disease, patients).

Final opinion

A must-have, or at least a must-read, for all MDs willing to start or boost their data analysis skills, whether to improve department quality or for research, or simply for those looking for solid help while building predictive models based on their own data.

However, already skilled MDs may also find it interesting as a general refresher on the basics.

My final opinion: strongly advised for beginners.