Enhancing Email Spam Detection with LLMs: Practical Experience with Rspamd and GPT

Day 1 | 13:30 | 00:30 | K.4.601 | Vsevolod Stakhov


This talk explores the practical use of Large Language Models (LLMs) in email filtering, using the integration between Rspamd and various LLM services as an example. We'll discuss how LLMs can complement traditional filtering methods and compare supervised (Bayes) and unsupervised (LLM-based) approaches to spam detection.

We'll examine real-world results from different models (GPT-3.5, GPT-4, and alternatives via OpenRouter), analyzing their effectiveness, false positive rates, and cost implications. The presentation will cover advanced features such as content categorization, password extraction from archives, and message anonymization for privacy-preserving learning.

Special attention will be given to practical deployment considerations, including:

  • Cost-effective strategies for different scales of operation
  • Self-hosted models vs. cloud APIs
  • Privacy considerations and message anonymization techniques
  • Integration with existing email infrastructure
  • Extended message analysis capabilities

The talk will conclude with insights into future developments and best practices for implementing LLM-based email filtering in both personal and enterprise environments.

Target Audience: Email administrators, spam filtering specialists, and developers interested in modern email security solutions.