Guide On How To Install FuzzyOCR On FreeBSD

A step-by-step guide on how to install FuzzyOCR on FreeBSD

This guide shows you how to install FuzzyOcr to catch image spam. OCR stands for Optical Character Recognition.

Important: FuzzyOcr now requires SpamAssassin 3.2.x.

Installing FuzzyOcr is quite easy. A few people seem to have trouble getting it to work, and I have a feeling it’s because they are installing FuzzyOcr from the wrong port. FuzzyOcr is present in two places in the FreeBSD ports collection.

FuzzyOcr is found in /usr/ports/mail/p5-FuzzyOcr and in /usr/ports/mail/p5-FuzzyOcr-devel. It’s the latter we need.

For FuzzyOcr to work properly we need Tesseract installed as well. Tesseract is the optical character recognition engine that FuzzyOcr uses. It does not get installed automatically as a dependency, so we’ll have to do this ourselves. We will install Tesseract first, before installing FuzzyOcr.

As root

cd /usr/ports/graphics/tesseract
make install clean
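
Once the port has finished building, a quick check that the tesseract binary is reachable can save some head scratching later. The following is only a minimal sketch: check_cmd is a hypothetical helper name made up for this example and is not part of the ports tree or of FuzzyOcr.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
# check_cmd is a hypothetical helper used for illustration only.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: NOT found"
        return 1
    fi
}

# Tesseract's command-line program is called "tesseract".
check_cmd tesseract || echo "install graphics/tesseract first"
```

The same helper can be reused for any other program you expect to be installed, for example check_cmd spamassassin.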

Next up is FuzzyOcr.

As root

cd /usr/ports/mail/p5-FuzzyOcr-devel
make install clean

We need to copy the default configuration files to the right place. There is no need to edit them, as they work right out of the box.

As root

cp /usr/local/share/examples/FuzzyOcr/FuzzyOcr.* /usr/local/etc/mail/spamassassin

All we need now is to enable the FuzzyOcr module in SpamAssassin.

As root

vi /usr/local/etc/mail/spamassassin/init.pre

Add the following lines to init.pre, somewhere near the bottom.

# FuzzyOcr
loadplugin Mail::SpamAssassin::Plugin::FuzzyOcr
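
If you would rather not open an editor, the same change can be scripted. This is a sketch under assumptions: enable_fuzzyocr is a hypothetical function name invented here, and it simply appends the two lines above unless the loadplugin line is already present, so running it twice does not duplicate the entry.

```shell
#!/bin/sh
# Append the FuzzyOcr loadplugin line to a SpamAssassin .pre file,
# but only if that exact line is not already there.
# enable_fuzzyocr is a hypothetical helper used for illustration only.
enable_fuzzyocr() {
    pre_file="$1"
    line='loadplugin Mail::SpamAssassin::Plugin::FuzzyOcr'
    # -x matches whole lines only, -F treats the pattern as a fixed string,
    # so a commented-out "#loadplugin ..." line does not count as enabled.
    if grep -qxF "$line" "$pre_file" 2>/dev/null; then
        echo "already enabled in $pre_file"
    else
        printf '\n# FuzzyOcr\n%s\n' "$line" >> "$pre_file"
        echo "enabled in $pre_file"
    fi
}

# Example (as root):
#   enable_fuzzyocr /usr/local/etc/mail/spamassassin/init.pre
```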

And that’s it. All that’s left is to restart SpamAssassin. The commands below assume spamd runs under daemontools in /service/spamd.

As root

svc -t /service/spamd/
svstat /service/spamd/ /service/spamd/log/
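
Around the restart it can be worth confirming that SpamAssassin still parses its configuration with the new plugin line in place. The sketch below uses spamassassin --lint for that. The wrapper lint_spamassassin is a hypothetical name chosen for this example, and the messages it prints are my own, not SpamAssassin output.

```shell
#!/bin/sh
# Check that SpamAssassin can parse its configuration after the plugin
# has been enabled. lint_spamassassin is a hypothetical helper name.
lint_spamassassin() {
    if ! command -v spamassassin >/dev/null 2>&1; then
        echo "spamassassin not found on PATH"
        return 2
    fi
    # --lint parses the configuration and reports syntax problems.
    if spamassassin --lint; then
        echo "configuration OK"
    else
        echo "configuration has problems, see the messages above"
        return 1
    fi
}

lint_spamassassin || echo "fix the configuration before relying on spamd"
```

If lint complains, adding -D (spamassassin -D --lint) prints debug output that usually shows which plugin or file caused the problem.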

If you like, you can test that this is working with some sample mails containing image spam.

As root

cd ~
fetch http://www.xfiles.dk/files/fuzzyocr/fuzzytest.tar.gz
tar -zxvf fuzzytest.tar.gz
rm fuzzytest.tar.gz
cd fuzzytest
spamassassin -t < ocr-gif.eml
spamassassin -t < ocr-jpg.eml
spamassassin -t < ocr-png.eml
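
Reading three full reports by eye gets tedious, so the checks can be wrapped in a small loop. This is a sketch, not a definitive test: fuzzy_hit is a hypothetical helper, and it assumes the FuzzyOcr rules show up in the report with names containing FUZZY_OCR. Adjust the pattern if your rule names differ.

```shell
#!/bin/sh
# Read a "spamassassin -t" report on stdin and say whether any rule whose
# name contains FUZZY_OCR appears in it. The pattern is an assumption.
fuzzy_hit() {
    if grep -q 'FUZZY_OCR'; then
        echo "hit"
    else
        echo "miss"
    fi
}

# Run from inside the fuzzytest directory.
for f in ocr-gif.eml ocr-jpg.eml ocr-png.eml; do
    [ -f "$f" ] || continue   # skip samples that are not present
    printf '%s: ' "$f"
    spamassassin -t < "$f" 2>/dev/null | fuzzy_hit
done
```

A "miss" on every sample usually means the plugin is not being loaded or Tesseract is not being found, so those are the first things to recheck.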