A Study on Machine Learning Algorithms with Different Encoding Techniques for Identifying the Right One for Patients' Big Data

Subrata Kumar Das; Mohammad Zahidur Rahman

Authors

Subrata Kumar Das, Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka, Bangladesh.
Mohammad Zahidur Rahman, Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka, Bangladesh.

Keywords:

Big Data, Encoding Techniques, Healthcare Data, Machine Learning Algorithms, Statistical Metrics

Abstract

In predictive modeling, categorical features often cause problems because most supervised machine learning algorithms accept numerical data as input rather than categorical attributes. Many encoding techniques are therefore used to convert categorical values into a machine-readable format. In addition, different classification algorithms can perform differently on a big dataset. The goal of this study is therefore to find a learning model that is better suited to a large volume of patients' data. The study also checks which encoding techniques help the trained models achieve high accuracy. We applied several encoding techniques to patients' data, both individually and in combination, to train the models. However, not every pairing of encoding technique and classifier performs well or yields better performance. Some models trained with certain encoding techniques do not work at all when facing patients' Big Data. Moreover, training time differed across the machine learning models for the same dataset. This paper should therefore help developers choose reliable machine learning models when designing systems for patients' Big Data.
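As a rough illustration of what converting categorical values into numerical input can look like, the sketch below applies two widely used encodings, label encoding and one-hot encoding, to a small list of values. The abstract does not name which encoding techniques the study evaluated, so the choice of these two is an assumption, and the `blood_group` column and the helper names `label_encode` and `one_hot_encode` are hypothetical and not taken from the paper or its dataset.

```python
# Illustrative sketch only: the values below are hypothetical and are
# not drawn from the study's patient data.

def label_encode(values):
    """Map each distinct category to an integer, using sorted order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Map each value to a 0/1 vector with one slot per distinct category."""
    categories = sorted(set(values))
    vectors = [[1 if v == cat else 0 for cat in categories] for v in values]
    return vectors, categories


blood_group = ["A", "O", "B", "O", "AB"]  # hypothetical categorical feature

labels, mapping = label_encode(blood_group)
vectors, categories = one_hot_encode(blood_group)

print(labels)      # one integer per row
print(categories)  # column order of the one-hot vectors
print(vectors[0])  # one-hot vector for the first row
```

Label encoding keeps a single column but imposes an arbitrary order on the categories, while one-hot encoding avoids that order at the cost of one column per category, which can matter for both accuracy and training time on large datasets.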

Published

12-06-2024

How to Cite

Subrata Kumar Das, & Mohammad Zahidur Rahman. (2024). A Study on Machine Learning Algorithms with Different Encoding Techniques for Identifying the Right One for Patients’ Big Data. Jahangirnagar University Journal of Science, 43(1), 63–78. Retrieved from https://jos.ju-journal.org/jujs/article/view/59

Issue

Vol. 43 No. 1 (2021): December 2020 - December 2021

Section

Articles

License

©2024 Jahangirnagar University Journal of Science. All rights reserved. However, permission is granted to quote from any article of the journal, to photocopy any part or full of an article for education and/or research purpose to individuals, institutions, and libraries with an appropriate citation in the reference and/or customary acknowledgement of the journal.
