Analysis Of Skill Requirements In The Information Technology Job Market On Jobstreet Indonesia Using Machine Learning Algorithms

Authors

  • Muhammad Rifqi Majid, Informatic Engineering, Department of Engineering, Lampung University
  • Hery Dian Septama, Informatic Engineering, Department of Engineering, Lampung University
  • Mahendra Pratama, Informatic Engineering, Department of Engineering, Lampung University

DOI:

https://doi.org/10.25299/itjrd.2025.20594

Keywords:

Jobstreet, Data Mining, CRISP-DM, Skills, Classification

Abstract

With the rapid advancement of information technology, demand for skills in this field is growing significantly. Jobstreet lists vacancies with a wide range of qualification requirements, including positions in information technology (IT). Classifying these vacancies is therefore necessary to identify skill trends, and job vacancy data from Jobstreet can serve as raw data for a comprehensive classification of IT skills. This research explores machine learning algorithms for classification in order to analyze skill trends. It also compares the accuracy of the classification models, visualizes the data mining results, and identifies the job sub-categories and skill trends required by the industry. The study adopts the CRISP-DM framework and employs the k-Nearest Neighbor (KNN), Naïve Bayes Classifier (NBC), and Support Vector Machine (SVM) algorithms. The methodology comprises data collection through web scraping, text processing (tokenization, stopword removal, stemming, n-gram visualization, and word embeddings), and data visualization in Looker Studio. The results show that the SVM model performs best with an accuracy of 86.75%, followed by KNN at 83.33% and NBC at 79.49%. The most in-demand job sub-categories are Business/System Analyst (34.1%), Network & System Administration (22.6%), and Developer/Programmer (8%). In this study, SVM outperformed the other two algorithms, indicating strong performance in text classification tasks.
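
The text-processing and classification steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The training sentences, the stopword list, and all function names are hypothetical. TF-IDF weighting with cosine similarity is an assumption used here in place of the word embeddings mentioned in the abstract, and stemming is omitted because it would require an external library. Only a nearest-neighbor classifier is shown; NBC and SVM are not reproduced.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-written examples; NOT data from the study.
TRAIN = [
    ("python developer build rest api with django", "Developer/Programmer"),
    ("java programmer to develop backend services", "Developer/Programmer"),
    ("business analyst to gather requirements from stakeholders", "Business/System Analyst"),
    ("system analyst to document business requirements", "Business/System Analyst"),
    ("network administrator to configure cisco routers and firewall", "Network & System Administration"),
    ("linux system administrator to maintain servers and network", "Network & System Administration"),
]

# Tiny illustrative stopword list (assumption, not the list used in the paper).
STOPWORDS = {"to", "with", "and", "from", "the", "a", "of", "for"}


def preprocess(text):
    """Lowercase, tokenize on alphanumeric runs, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text):
    """Unigram and bigram counts for one document."""
    tokens = preprocess(text)
    return Counter(tokens + ngrams(tokens, 2))


def fit_idf(docs):
    """Smoothed inverse document frequency over a list of term-count dicts."""
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(counts, idf):
    """Weight counts by IDF; terms unseen during fitting are ignored."""
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def knn_predict(text, train_vecs, labels, idf, k=1):
    """Majority vote among the k most similar training documents."""
    query = tfidf(features(text), idf)
    ranked = sorted(zip(train_vecs, labels), key=lambda p: cosine(query, p[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


train_counts = [features(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(train_counts)
train_vecs = [tfidf(c, idf) for c in train_counts]

prediction = knn_predict("senior python developer for api development", train_vecs, labels, idf)
print(prediction)
```

On real vacancy data, the accuracy of each model would be measured on a held-out test split, which is how figures such as those reported in the abstract are normally obtained; this toy example is too small for that.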

Published

2025-07-21

How to Cite

Majid, M. R., Septama, H. D., & Pratama, M. (2025). Analysis Of Skill Requirements In The Information Technology Job Market On Jobstreet Indonesia Using Machine Learning Algorithms. IT Journal Research and Development, 10(1), 21–34. https://doi.org/10.25299/itjrd.2025.20594

Section

Articles