Detecting Spam with Genetic Regular Expressions

by Eric Conrad
Sept. 1, 2017 SANS Institute email issues

Regular expressions (“regex” for short) are strings used to detect patterns in data. They are often used to detect and block various forms of malware, including spam and network-based attacks. This paper describes an approach for detecting spam with automatically generated regular expressions (where regexes are generated according to simple logic), followed by a ‘genetic’ approach (where regexes are generated, and then ‘evolve’ to the final solution via a genetic algorithm).
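To make the ‘genetic’ idea concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes a toy corpus (the `SPAM` and `HAM` lists are invented for illustration), represents each candidate regex as a short list of literal words joined by alternation, and scores a candidate by spam messages matched minus a penalty for legitimate messages matched. The names `to_regex`, `fitness`, `mutate`, `crossover`, and `evolve`, as well as the penalty weight and population settings, are all hypothetical choices.

```python
import random
import re

# Hypothetical toy corpora; a real system would train on far more mail.
SPAM = ["cheap viagra now", "win free money today", "free viagra offer"]
HAM = ["meeting moved to noon", "free parking at the office", "lunch today?"]

# Gene pool: every word seen in the spam samples (sorted for repeatability).
WORDS = sorted({word for msg in SPAM for word in msg.split()})


def to_regex(genes):
    """Join literal words into one alternation regex, e.g. 'viagra|win'."""
    return "|".join(re.escape(g) for g in genes)


def fitness(genes):
    """Spam messages matched, minus a double penalty per false positive."""
    pattern = re.compile(to_regex(genes))
    hits = sum(1 for msg in SPAM if pattern.search(msg))
    false_positives = sum(1 for msg in HAM if pattern.search(msg))
    return hits - 2 * false_positives


def mutate(genes, rng):
    """Replace one randomly chosen word with a random word from the pool."""
    child = list(genes)
    child[rng.randrange(len(child))] = rng.choice(WORDS)
    return child


def crossover(a, b, rng):
    """Single-point crossover of two equal-length parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(generations=30, pop_size=20, n_genes=2, seed=0):
    """Evolve a word list; the fitter half survives each generation."""
    rng = random.Random(seed)
    population = [[rng.choice(WORDS) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(to_regex(best), fitness(best))
```

Because the fitter half of each generation is carried over unchanged, the best score in this sketch can never decrease from one generation to the next. Penalizing false positives more heavily than missed spam is one possible design choice, reflecting that blocking legitimate mail is usually the costlier error.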