From Data to Diagnosis: A Journey into Machine Learning for Sickle Cell Detection

Oluwatomisin📈📉
5 min read · May 10, 2023

Introduction

Get ready to embark on an exciting journey with me as I share the story of how a team of talented machine learning engineers, including yours truly, worked on a project initiated by the Omdena Benin chapter. Our mission was to develop a super cool YOLO model for sickle cell disease classification and deploy it on a user-friendly web application. And guess what? We absolutely nailed it!

In the next few minutes, I’ll be taking you on a fun-filled ride through all the ups and downs, the twists and turns, and the highs and lows of this awesome project. From brainstorming and building the model to testing and deploying it, you’ll get to experience it all. So sit back, relax, and get ready for some serious fun!

Motivation

  1. Sickle cell disease is a genetic disorder that affects millions of people worldwide, and early detection and treatment can greatly improve the quality of life of those affected.
  2. People in developing countries often have difficulty accessing medical diagnoses.
  3. Machine learning models have the potential to assist medical centers in diagnosing sickle cell disease more accurately and efficiently.
  4. Developing a YOLO model for sickle cell disease classification can contribute to the ongoing efforts to combat this disease.
  5. Working on a project related to sickle cell disease can be a rewarding and meaningful experience, as it has the potential to make a positive impact on people’s lives.

Aim

  1. To develop a YOLO model for sickle cell disease classification.
  2. To deploy the model on a web application accessible to medical practitioners.
  3. To improve the accuracy and efficiency of sickle cell disease diagnosis.
  4. To ultimately contribute to the ongoing efforts to combat sickle cell disease and improve the lives of those affected by it.

Methodology

Data collection and annotation [Role: CoLEAD]

Did you know that there’s an amazing database called erythrocytesIDB that contains images of peripheral blood smear samples taken from patients with Sickle Cell Disease? What makes this database even more special is that its images didn’t require any preprocessing, which made it easy for our team of enthusiastic volunteers to annotate them meticulously using makesense.ai, an open-source web application for image annotation.

The annotations involved marking up the images with bounding boxes and assigning each cell in the samples to one of five categories: Normal, Target, Crystal, Sickle, and Others. This detailed annotation process was crucial to our project because it allowed us to train our machine learning models to accurately identify and classify the different types of cells in the images.

The annotated data was then exported in multiple formats, including YOLO, CSV, and VOC XML, and sent over to our modeling team for some serious model development action. The comprehensive annotations and the variety of formats provided us with the flexibility to choose the most appropriate format for our models and made the training process more streamlined and efficient.
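To make the difference between these export formats a bit more concrete, here is a minimal sketch of how one VOC-style box (pixel corners) maps to a YOLO label line (class index plus normalized center, width, and height). The class order in `CLASSES` and the example box are assumptions made for illustration; the project's actual class indices may differ.

```python
# Assumed class order: the article names the five categories,
# but not the index each one was given in the exported labels.
CLASSES = ["Normal", "Target", "Crystal", "Sickle", "Others"]


def voc_to_yolo(label, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert one VOC-style box (pixel corners) to a YOLO label line."""
    class_id = CLASSES.index(label)
    # YOLO stores the box center and size, normalized by the image size.
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box on a 400x200 image, purely for illustration.
print(voc_to_yolo("Sickle", 100, 50, 200, 150, 400, 200))
# prints: 3 0.375000 0.500000 0.250000 0.500000
```

Each line of a YOLO label file describes one cell, so an image with many cells simply gets many such lines.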

Overall, we are extremely grateful to the volunteers who worked tirelessly to annotate the images in the erythrocytesIDB database, and to the developers of makesense.ai for creating such a powerful tool. Without their hard work and dedication, our project would not have been possible.

Model Development

As we trained our YOLOv5 and YOLOv8 models on our custom dataset, we didn’t just sit back and relax. Oh no, we monitored their progress closely and made the necessary adjustments to fine-tune their hyperparameters and optimize their performance.
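For readers who want to try something similar, training a YOLO model on a custom dataset usually starts with a small dataset description file. The sketch below writes one in the layout used by the Ultralytics tooling. The folder layout, the file name `rbc.yaml`, and the class order are all assumptions for illustration, not the project's actual configuration.

```python
# Assumed class order; the article lists the categories but not their indices.
CLASSES = ["Normal", "Target", "Crystal", "Sickle", "Others"]


def build_dataset_yaml(root, classes):
    """Return the text of a YOLO dataset description file."""
    lines = [
        f"path: {root}",        # dataset root folder (hypothetical)
        "train: images/train",  # training images, relative to path
        "val: images/val",      # validation images
        "test: images/test",    # held-out test images
        "names:",
    ]
    # One "index: name" entry per class.
    lines += [f"  {i}: {name}" for i, name in enumerate(classes)]
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml("datasets/rbc", CLASSES)
with open("rbc.yaml", "w", encoding="utf-8") as f:
    f.write(yaml_text)
print(yaml_text)

# With the `ultralytics` package installed (an assumption about tooling),
# a YOLOv8 training run could then look roughly like this, where the
# epoch count and image size are placeholders rather than our settings:
#   from ultralytics import YOLO
#   YOLO("yolov8n.pt").train(data="rbc.yaml", epochs=50, imgsz=640)
```

Keeping the class list in one place like this helps the label files, the training run, and the web app agree on which index means which cell type.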

It was a wild ride, but with each tweak and modification, we saw our models get better and better at detecting and classifying sickle cells. It was incredibly satisfying to see our hard work pay off as we achieved increasingly high levels of accuracy and reliability.

Of course, our work wasn’t done yet. We still needed to put our models to the test and evaluate their performance on a separate test set of annotated images. With a rigorous analysis of metrics such as precision, recall, and F1 score, we were able to confirm just how accurate and dependable our models were.
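In case these metrics are new to you, here is a small sketch of how precision, recall, and F1 score are computed from counts of true positives, false positives, and false negatives. The counts in the example are made up for illustration and are not results from our models.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from detection counts."""
    # Precision: of everything the model flagged, how much was right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that was really there, how much was found?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up counts for one class: 8 correct detections,
# 2 false alarms, and 2 missed cells.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"{p:.2f} {r:.2f} {f1:.2f}")  # prints: 0.80 0.80 0.80
```

For an object detector, a detection typically counts as a true positive only when its class is right and its box overlaps the annotated box enough, so these counts depend on the overlap threshold you choose.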

All in all, our YOLOv5 and YOLOv8 models proved to be invaluable assets in our fight against Sickle Cell Disease. By training, monitoring, and fine-tuning these models, we were able to create a powerful tool that can make a real difference in the lives of patients and medical practitioners alike.

Model Deployment [Role: LEAD]

Hey there, let me tell you about our amazing deployment strategy! Since our model was relatively small but mighty, we decided to go all out and deploy it in-app using Streamlit, an open-source Python package for building web apps.

We got our creative juices flowing and built a three-page website that is guaranteed to blow your mind. First up, we have the “About Project” page, where you can get all the juicy details about the project’s goals, aims, and motivations. Next, we have the “Project Team” page, where you can meet all the talented individuals who made this project possible.

But wait, there’s more! The most exciting part of our website is the “Diagnose Image” page. This page is where the magic happens, where users can upload images to be diagnosed. It’s like having your own personal virtual doctor! Our model provides a diagnosis, and the page also presents some cool analysis of the image in plots and tables, showing the ratio of the different types of red blood cells (RBCs) alongside the diagnosis made by the model.
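To give a feel for the kind of analysis behind those plots and tables, here is a minimal sketch that turns a list of detected cell labels into per-class ratios. The function name, the class order, and the example labels are hypothetical, and the rule the app uses to reach a diagnosis is deliberately left out because it is not described here.

```python
from collections import Counter

# Assumed class list, taken from the five annotation categories.
CLASSES = ["Normal", "Target", "Crystal", "Sickle", "Others"]


def class_ratios(detected_labels):
    """Return the share of each cell type among the detected cells."""
    # Ignore any label that is not one of the known categories.
    counts = Counter(label for label in detected_labels if label in CLASSES)
    total = sum(counts.values())
    return {name: (counts[name] / total if total else 0.0) for name in CLASSES}


# Hypothetical detections for one uploaded image.
labels = ["Normal", "Normal", "Sickle", "Target"]
for name, ratio in class_ratios(labels).items():
    print(f"{name}: {ratio:.2f}")
```

A dictionary like this is easy to hand to a table or bar chart widget in a web app, which is one plausible way to build the kind of report page described above.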

But we didn’t stop there. We added a bonus feature: an export function that lets users download the report generated from the uploaded image in PDF format. How cool is that? We had a blast building this website and we can’t wait for you to check it out!

Conclusion

The development of a sickle cell disease detection model using machine learning algorithms has the potential to revolutionize the diagnosis and treatment of this debilitating disease. Our project utilized advanced computer vision techniques and powerful YOLO models to train highly accurate sickle cell detection models that catered to different use cases. By utilizing annotated images from the erythrocytesIDB database, we were able to train our models to accurately identify different types of cells in sickle cell disease images.

Through this project, we have demonstrated the immense potential of machine learning in medical research and the importance of collaboration between experts in different fields. With further development and refinement, our sickle cell detection model could become a valuable tool for medical practitioners in the fight against this disease. We hope that our work inspires others to pursue innovative solutions to tackle the most pressing health challenges of our time using the latest advances in artificial intelligence and machine learning.

References

  1. GitHub: OmdenaAI/benin-chapter-red-blood-cells (github.com)
  2. Application: Diagnose Sickle Cell Diseases by Red Blood Cells(RBCs) Classification · Streamlit
  3. Data Source: http://erythrocytesidb.uib.es/
  4. Contact: omdenabenin@gmail.com
