Scram is an e-mail scanner that I built for use in our company back in June 2001. You can read about it here. It has been dutifully performing its job for a year now, and I would like to expand its capabilities. I have placed it under the GPL and begun work on a new layout for the system. Scram will still operate on the same basic notion: the easiest and most flexible way to filter e-mail is to save each e-mail in its entirety as a text file, subject it to multiple scans, and then, if it passes, forward it on to the internal mail server.
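To make the save, scan, forward idea concrete, here is a minimal sketch in Python. It is not Scram's actual code: the function names, the two example scans, and the use of "outbound" and "quarantine" directories as a stand-in for forwarding to the internal mail server are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def spool_message(raw: bytes, spool_dir: Path) -> Path:
    """Save one e-mail in its entirety as a text file and return its path."""
    spool_dir.mkdir(parents=True, exist_ok=True)
    fd, name = tempfile.mkstemp(suffix=".eml", dir=spool_dir)
    with open(fd, "wb") as fh:  # open() takes ownership of the descriptor
        fh.write(raw)
    return Path(name)


def no_exe_attachment(text: str) -> bool:
    # Hypothetical scan: fail any message that names a .exe file in quotes.
    return '.exe"' not in text.lower()


def size_limit(text: str) -> bool:
    # Hypothetical scan: fail anything over roughly one million characters.
    return len(text) <= 1_000_000


SCANS = [no_exe_attachment, size_limit]


def process(path: Path, outbound: Path, quarantine: Path) -> bool:
    """Run every scan on the spooled file, then move it according to the result.

    Moving to `outbound` stands in for forwarding to the internal mail server.
    """
    text = path.read_text(errors="replace")
    passed = all(scan(text) for scan in SCANS)
    dest = outbound if passed else quarantine
    dest.mkdir(parents=True, exist_ok=True)
    path.rename(dest / path.name)
    return passed
```

Because each message sits on disk as an ordinary file, adding another scan is just a matter of appending another function to the list; nothing about accepting or forwarding mail has to change.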
This requires multiple pieces to work together. First, you need an SMTP/ESMTP front-end to accept the incoming mail. Scram currently uses the obtuse.com smtpd daemon for this, and building a replacement for it is one of the first things on the list. The back-end also needs to be built; Scram currently uses the obtuse smtpfwdd forwarding daemon for that function. Once the front-end and back-end daemons are done, the main scanning engine will be rewritten, preferably in Perl or C. My original purpose for writing it as a bash script was to make it very flexible and easy for a novice admin to configure. I wanted all the guts to be open for scrutiny without the admin having to be a programmer. This will still be the case, just with a different approach.
One of the downsides of the original scanmail script was that it was hard to read, which defeated the original purpose. It needs to be more modular. There needs to be a processing engine that takes care of all the I/O and keeps track of what's going where. This engine should be multithreaded and written for speed. The new engine will then present each e-mail in its entirety as a text file to a series of modules. It is these modules that will do all the real work of scrutinizing each e-mail. A module can be written in any language desired, as long as it meets the basic module structure requirements. These requirements will cover such things as variable naming conventions, use of STDIN and STDOUT, return value codes, the logging protocol, etc. The scanning engine will not care what a module does or how it does it, as long as it plays by the rules.
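As a rough illustration of how an engine might drive such modules, the sketch below runs each module as a separate process, feeds it the e-mail on STDIN, and reads its verdict from the return code. The module requirements have not been defined yet, so everything specific here is an assumption: the convention that 0 means pass and anything else means reject, the use of STDOUT for a one-line reason, and the names `run_module` and `scan`.

```python
import subprocess
from pathlib import Path

PASS = 0  # hypothetical convention: exit code 0 passes, anything else rejects


def run_module(command: list[str], email_path: Path,
               timeout: float = 30.0) -> tuple[int, str]:
    """Feed one e-mail to a module on STDIN; return (exit code, STDOUT text)."""
    with email_path.open("rb") as fh:
        result = subprocess.run(command, stdin=fh,
                                capture_output=True, timeout=timeout)
    return result.returncode, result.stdout.decode(errors="replace").strip()


def scan(email_path: Path,
         modules: list[tuple[str, list[str]]]) -> tuple[bool, str]:
    """Run modules in order and stop at the first one that rejects."""
    for name, command in modules:
        code, message = run_module(command, email_path)
        if code != PASS:
            return False, f"{name}: {message}"
    return True, "passed all modules"
```

Since the engine only sees a command line, an input stream, and an exit status, a module could equally be a bash script, a Perl program, or a compiled C binary; for example, `scan(Path("msg.eml"), [("virus", ["./virus-check.sh"])])`, where both file names are made up.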
The last piece of the puzzle should be some sort of data interface for realtime stats, web access to the system, and remote configuration. Since this new design will do away with smtpfwdd and its reliance on sendmail, logging and access to info will be very important. This data interface should run on a TCP or UDP port. It could be part of the scanning engine, but really should be in its own process. Here are some preliminary sketches I have done: