Classification of Job Skills Using the TF-IDF Method and Decision Tree on LinkedIn Job Vacancy Data

Authors

  • Mohamud Ahmednor
  • Suhartono UIN Maulana Malik Ibrahim Malang
  • Imamudin UIN Maulana Malik Ibrahim Malang

DOI:

https://doi.org/10.31328/js.v8i1.7152

Abstract

This study analyzes skill trends in the Information Technology (IT) sector by applying text mining to synthetic job vacancy data formatted to resemble LinkedIn postings. TF-IDF was used to extract important keyword features from the unstructured job descriptions, and a Decision Tree algorithm classified job types based on those features. The dataset consists of 100 job listings written in a mix of Indonesian and English, which were put through comprehensive text preprocessing to ensure data quality. The results indicate that combining TF-IDF with a Decision Tree is effective for identifying key skills and for categorizing job types accurately and interpretably. Data Engineer emerged as the most sought-after job category, and dominant keyword stems such as “data,” “experi,” “work,” “team,” and “product” reflect demand for both technical and collaborative skills. The Decision Tree model achieved an accuracy of 80.3% and performed particularly well on Data Analyst positions. Visualizations, including a word cloud and feature importance plots, give intuitive insight into skill demand that can benefit job seekers, curriculum developers, and recruitment companies. In conclusion, the study demonstrates that TF-IDF and Decision Tree methods can effectively automate the classification of job skills from vacancy data, supporting data-driven workforce decision-making in the digital era and Industry 4.0.
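
As a rough illustration of the feature-extraction step described in the abstract, the sketch below computes TF-IDF weights for three made-up postings in plain Python. The postings, the `tokenize` and `tf_idf` helpers, and the textbook weighting tf × log(N / df) are all assumptions for illustration; the abstract does not state the exact TF-IDF variant, preprocessing, or tooling the authors used. In a pipeline like the one described, per-document weights of this kind would serve as the input features for a decision tree classifier, which is not shown here.

```python
import math
from collections import Counter

# Hypothetical toy postings; the study's 100 listings are not reproduced here.
docs = {
    "data_engineer": "build data pipelines with sql and spark",
    "data_analyst": "analyze data and build dashboards with sql",
    "web_developer": "build web pages with javascript and css",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop-word removal)."""
    return text.lower().split()


def tf_idf(corpus):
    """Return {doc_id: {term: weight}} using tf * log(N / df)."""
    tokens = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    n_docs = len(tokens)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for words in tokens.values() for term in set(words))
    weights = {}
    for doc_id, words in tokens.items():
        counts = Counter(words)
        weights[doc_id] = {
            term: (count / len(words)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


weights = tf_idf(docs)

# The two highest-weighted terms per posting.
top_terms = {
    doc_id: sorted(w, key=w.get, reverse=True)[:2]
    for doc_id, w in weights.items()
}
```

Terms that occur in every posting (here "build", "with", and "and") receive a weight of zero, while terms unique to one posting rank highest, which is the property that makes TF-IDF useful for surfacing role-specific skill keywords.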

Published

2025-04-30