Improving Spam Detection for German Users: A Machine Learning Approach to German Email Classification

Authors

  • Kashif Iqbal, Computer Science Department, Greenwich University Karachi, Pakistan
  • Muhammad Khalid, Computer Science Department, Greenwich University Karachi, Pakistan
  • Shamim Akhtar, Faculty of Engineering Science and Technology, IQRA University, Karachi
  • Sajid Yasin, Computer Science Department, Greenwich University Karachi, Pakistan
  • Noor Ahmed, Computer Science, SZABIST, Street, Karachi, 10587, Sindh, Pakistan
  • Aqsa Shahid, Department of Computer Science & Software Engineering, Ziauddin University, Karachi, Pakistan

DOI:

https://doi.org/10.71146/kjmr487

Keywords:

German email classification, ham and spam classifier, text classification, spam detection, automatic email sorting

Abstract

The proliferation of unsolicited and potentially harmful emails has necessitated the development of robust email classification systems. This study focuses on the classification of German-language emails using the CODEAALTAG dataset, a comprehensive collection of both legitimate (ham) and unwanted (spam) emails.

Using this dataset, we apply several machine learning algorithms, including Naive Bayes, Support Vector Machines (SVM), Random Forests, and deep learning models, to distinguish ham from spam emails. The CODEAALTAG dataset is curated and covers a wide array of attributes, including content-based features, header information, and technical metadata.
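As a rough illustration of the simplest classifier named above, the following is a minimal sketch of a multinomial Naive Bayes model with add-one smoothing over whitespace tokens. It is not the study's implementation: the helper names (`tokenize`, `train_nb`, `predict_nb`) and the four toy German messages in `TOY_EMAILS` are assumptions made for this example and are not drawn from the CODEAALTAG dataset.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace only.
    return text.lower().split()


def train_nb(samples):
    """Count documents and tokens per label from (text, label) pairs."""
    doc_counts = Counter()   # label -> number of training emails
    token_counts = {}        # label -> Counter of token frequencies
    vocab = set()            # every token seen during training
    for text, label in samples:
        doc_counts[label] += 1
        counts = token_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return doc_counts, token_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior for the text."""
    doc_counts, token_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior from the share of training emails with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_tokens = sum(token_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothed token likelihood.
            likelihood = (token_counts[label][tok] + 1) / (total_tokens + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration only.
TOY_EMAILS = [
    ("jetzt gratis gewinn sichern", "spam"),
    ("gratis kredit sofort gewinn", "spam"),
    ("besprechung morgen um zehn uhr", "ham"),
    ("protokoll der besprechung im anhang", "ham"),
]

model = train_nb(TOY_EMAILS)
print(predict_nb(model, "gratis gewinn"))
print(predict_nb(model, "besprechung morgen"))
```

A realistic pipeline would add German-specific preprocessing (for example compound handling and stop-word removal) and would use the header and metadata attributes as well as the message body.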

We evaluate the performance of these classification techniques using standard metrics such as accuracy, precision, recall, and F1-score. Our findings indicate that advanced feature selection methods and ensemble learning approaches significantly enhance classification accuracy.
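For reference, the four metrics can be computed from the confusion counts of a binary classifier, as in the sketch below. The function name `binary_metrics`, the choice of "spam" as the positive class, and the example labels are assumptions made for illustration; they are not results from the study.

```python
def binary_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels: five emails, three of them classified correctly.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In spam filtering, precision on the spam class reflects how often flagged mail is really spam, while recall reflects how much spam is caught, so the two are usually reported together with F1 rather than accuracy alone.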

The results demonstrate the efficacy of the CODEAALTAG dataset in training and validating high-performance email classifiers, thereby contributing to improved email security and user experience. This study underscores the importance of specialized datasets like CODEAALTAG in advancing the field of email filtering and provides valuable insights for future research and development in spam detection technologies.

Published

2025-06-01

Issue

Vol. 2 No. 06 (2025)

Section

Engineering and Technology

How to Cite

Improving Spam Detection for German Users: A Machine Learning Approach to German Email Classification. (2025). Kashf Journal of Multidisciplinary Research, 2(06), 81-99. https://doi.org/10.71146/kjmr487
