Section: World-wide News/Products & News
IBM: A Prototype Anti-Spam System Which Has the Potential to Eliminate Virtually
All Spam
"SpamGuru" - an Enterprise Anti-Spam Filtering System (01.09.04)
Spam is a massive problem: it currently accounts for between one third and one
half of all e-mail and costs companies billions of dollars through lower
productivity, lost legitimate messages and the need for increased
bandwidth and storage. In a bid to solve the problem, IBM has brought
together scientists from different areas of its research division to develop an
enterprise anti-spam filtering system that combines several different filtering
technologies to create the ultimate anti-spam system. For example, one of the
spam filters, Chung-Kwei, is a pattern-discovery-based system that uses an
algorithm developed by life sciences researchers focused on tackling
computational biology challenges such as gene finding and protein annotation.
By itself, Chung-Kwei detected 96.56 percent of spam messages with a false
positive rate of just 0.066 percent during tests conducted in IBM's labs. By
combining Chung-Kwei with the other spam filtering techniques, IBM
researchers have created SpamGuru, a prototype anti-spam system that they
believe has the potential to eliminate virtually all spam.

IBM
Research is developing an enterprise-class anti-spam filter as part of our
overall strategy of attacking the spam problem on multiple fronts. The
anti-spam filter, "SpamGuru", mirrors this philosophy by
incorporating several different filtering technologies and intelligently
combining their output to produce a single spamminess rating or score for each
incoming message. The use of
multiple algorithms improves the system's effectiveness and makes it more
difficult for spammers to attack. While a spammer may defeat any single
algorithm, SpamGuru can rely on
its remaining algorithms to maintain a high degree of effectiveness.

SpamGuru's
filtering architecture uses multiple classification algorithms that are
integrated into a single classification pipeline. This pipeline allows
it to benefit from multiple classifiers at minimal extra computational
cost. SpamGuru's classification technologies include spoof detection,
Bayesian filtering, plagiarism detection, automatically generated white- and
black-lists, and Chung-Kwei, a novel technique that uses advanced
pattern-matching algorithms developed by IBM's bioinformatics group.

Chung-Kwei: a Pattern-discovery-based System for the Automatic
Identification of Unsolicited E-mail Messages (SPAM)

Chung-Kwei is a system that we developed recently for the analysis of
electronic mail and the automatic identification and tagging of unsolicited
messages (spam). The underlying method uses pattern discovery and has its
underpinnings in a generic approach behind successful solutions we developed
for computational biology problems such as gene finding and protein
annotation. Chung-Kwei can be trained very quickly using a body of known spam
and white (legitimate) messages, and can do so without interrupting the
ongoing classification of incoming e-mail. The prototype system, which we
developed by training on a repository of 87,000 spam and white messages,
achieved a sensitivity of 96.56 percent with a false positive rate of 0.066
percent, or one-in-six-thousand messages. In terms of speed, the Chung-Kwei
prototype is capable of classifying approximately 200 messages per second on
a 2.2 GHz Intel Pentium platform. (ma)

IBM Deutschland eServer, Storage, Linux
Contact: Hans Rehm, Tel. (0711) 785-4148, Fax (0711) 785-1078, E-Mail: hansrehm@de.ibm.com
Contact: Kirsten Drechsler, Tel. (0711) 785-4827, E-Mail: drechsler@de.ibm.com
Web: www.de.ibm.com
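
The article describes SpamGuru as combining the output of several filters into a single spamminess score, so that defeating one filter does not defeat the whole system. The article does not say how the scores are combined. The following is only a minimal sketch of the idea, assuming a simple weighted average; the two toy filters, their weights and the example addresses are hypothetical and are not IBM's.

```python
from typing import Callable, List, Tuple

# A classifier maps a raw message to a spam score in [0.0, 1.0].
Classifier = Callable[[str], float]


def blacklist_score(message: str) -> float:
    """Toy stand-in for a blacklist filter (hypothetical sender list)."""
    blocked = {"bulk@example.invalid"}
    first_line = message.splitlines()[0] if message else ""
    return 1.0 if any(sender in first_line for sender in blocked) else 0.0


def keyword_score(message: str) -> float:
    """Toy stand-in for a content filter: fraction of suspicious words present."""
    suspicious = ("free", "winner", "prize")
    text = message.lower()
    return sum(word in text for word in suspicious) / len(suspicious)


def combine(message: str, pipeline: List[Tuple[Classifier, float]]) -> float:
    """Weighted average of all classifier scores (an assumed combination rule)."""
    total_weight = sum(weight for _, weight in pipeline)
    weighted = sum(clf(message) * weight for clf, weight in pipeline)
    return weighted / total_weight


# Hypothetical weights: the blacklist counts twice as much as the keyword filter.
PIPELINE = [(blacklist_score, 2.0), (keyword_score, 1.0)]

spam = "From: bulk@example.invalid\nYou are a WINNER, claim your free prize"
ham = "From: colleague@example.invalid\nMeeting moved to 3 pm"
# Spam that avoids every suspicious word still scores high via the blacklist.
evasive = "From: bulk@example.invalid\nPlease see the attached offer"

for label, msg in (("spam", spam), ("ham", ham), ("evasive", evasive)):
    print(label, round(combine(msg, PIPELINE), 3))
```

In this sketch the "evasive" message slips past the keyword filter but is still rated as likely spam by the blacklist, which mirrors the robustness argument made in the article.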