mattdorn.com

Generously funded by Matt Dorn

Final modifications to home network: Squid and DansGuardian

I’ve made some final alterations to the Internet gateway box on my home network: I’ve added Squid for transparent proxying and Web content caching, and DansGuardian for content filtering.

“Transparent proxying” means that clients on the LAN can’t bypass the proxy: every request made to port 80 is redirected to port 8080, where DansGuardian listens, and DansGuardian in turn forwards the request to port 3128, where Squid listens.

In researching how to do this, I encountered constant warnings that I’d have to compile transparent proxying support into the kernel, but apparently Red Hat 7.0 (at least if you do a pre-configured server installation, as I did) already has it enabled. All I needed were the Squid package (which was already installed) and DansGuardian (which can be downloaded as a Red Hat RPM).

SQUID

The Squid config file (/etc/squid/squid.conf) is among the most comprehensive I’ve seen, with comments that provide a lot of good general documentation for Squid itself. After reading up a bit on Squid optimization, I decided to leave the vast majority of the variables at their default settings.

Essentially the only changes I ended up making had to do with 1) making sure the clients on my LAN still have access to the proxy server after I set up a firewall rule to direct all HTTP requests through the proxy, and 2) enabling transparent proxying. To accomplish the access control part, the following lines had to be added to the access controls and http_access sections of the file:

acl lan src 192.168.1.0/24
http_access allow lan

A word about CIDR (Classless Inter-Domain Routing) notation: the “24” in the above line means that the first 24 bits (i.e., the first three octets) identify the network, while the remaining 8 bits (i.e., the last octet) identify hosts. That suits my private “Class C” network, which gives hosts a range of 254 IPs from 192.168.1.1 to 192.168.1.254 (0 in the last position identifies the network, and 255 is the broadcast address, so they’re not available for hosts).
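The arithmetic can be double-checked with a short Python sketch using the standard ipaddress module. Python is not part of the setup described here; it is only a convenient calculator, and the variable names are illustrative.

```python
import ipaddress

# The LAN range used in the squid.conf acl line.
lan = ipaddress.ip_network("192.168.1.0/24")

# hosts() leaves out the network address and the broadcast address.
hosts = list(lan.hosts())

print("prefix length:    ", lan.prefixlen)          # 24 bits identify the network
print("netmask:          ", lan.netmask)            # 255.255.255.0
print("network address:  ", lan.network_address)    # 192.168.1.0
print("broadcast address:", lan.broadcast_address)  # 192.168.1.255
print("usable hosts:     ", len(hosts))             # 254
print("first host:       ", hosts[0])               # 192.168.1.1
print("last host:        ", hosts[-1])              # 192.168.1.254
```

The 8 host bits give 2^8 = 256 addresses, and subtracting the network and broadcast addresses leaves the 254 usable ones.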

To achieve transparent proxying, the “httpd accelerator” needed to be enabled in Squid, with the following changes to squid.conf:

httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on

I’m not entirely clear why an http daemon (however “virtual”) is necessary to accomplish transparent proxying. At any rate, with these settings, I could no longer serve Web pages from Apache on the default HTTP port 80, so I changed that port in /etc/httpd/conf/httpd.conf to 81. While I’m not really using the Web server for anything, DansGuardian uses it to run a CGI script that displays dynamic “access denied” messages when a user attempts access to a blocked resource.
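One plausible explanation for the accelerator settings, offered as a sketch and not as a description of Squid's internals: a browser that has been configured to use a proxy puts the full URL in its request line, while a browser that has been redirected without knowing it believes it is talking to the Web server itself and sends only a path plus a Host header. The proxy then has to rebuild the URL from that header, which is presumably what httpd_accel_uses_host_header switches on. The function name rebuild_url and the example.com request below are made up for illustration.

```python
def rebuild_url(raw_request: str) -> str:
    """Rebuild an absolute URL from the head of an HTTP request."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)

    # A browser configured for a proxy already sends the absolute URL.
    if target.startswith("http://"):
        return target

    # An intercepted browser sends only a path; the site name is in Host.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    host = headers.get("host")
    if host is None:
        raise ValueError("no Host header; cannot tell which site was requested")
    return "http://" + host + target


# What a redirected (unaware) browser would send.
intercepted = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

# What a browser with an explicit proxy setting would send.
proxied = (
    "GET http://www.example.com/index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

print(rebuild_url(intercepted))  # http://www.example.com/index.html
print(rebuild_url(proxied))      # http://www.example.com/index.html
```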

The following FAQ on “interception caching”–which I assume is the same as transparent proxying–was helpful, though I didn’t end up using the recommended ipchains settings for Linux 2.2: http://www.squid-cache.org/Doc/FAQ/FAQ-17.html

Finally, Squid was not set up to start at system startup in my runlevels, so I ran:

/sbin/chkconfig squid on

and then started it manually:

/etc/rc.d/init.d/squid start

DANSGUARDIAN

Installed from an RPM on a Red Hat system, DansGuardian works more or less right out of the box. The main problem is that it works a little too well: virtually every file extension and MIME type other than HTML and images is blocked. The program does not come with a list of blacklisted sites, but you can download one from the site, which I did. (While DansGuardian is GPL’d free software, you in fact have to pay for blacklist subscriptions, though Dan provides a free download of your first file.)

For my needs, the necessary changes to /etc/dansguardian/dansguardian.conf were:

accessdeniedaddress = 'http://192.168.1.1:81/cgi-bin/dansguardian.pl'
weightedphrasemode = 0

The first change points DG to the CGI script that will display the dynamic “access denied” message. You need Apache running for this, though you also have the option of using a simple non-dynamic HTML page with a generic “access denied” message if you don’t want to run Apache. As for the second change, using the “weighted phrases” feature of DG is simply too restrictive for my needs. Besides, I have a hunch that it may be the most resource-intensive feature of DG, and I have doubts about the ability of my 100MHz server to handle it without bringing Web surfing to a crawl.

To reduce the amount of restricted content (I’m only interested in filtering out hard-core porn sites), I edited several files in /etc/dansguardian to make them far less restrictive. In bannedextensionlist and bannedmimetypelist I commented out most of the settings. bannedregexpurllist scans the URL for dirty words; I left it mostly intact but removed the word “sex,” because it would seem to block access to a lot of non-pornographic content (the “Sex” section on Salon.com, for example). bannedsitelist and bannedurllist contain include statements that refer to the blacklist files; after setting up the blacklist, I commented out all of the includes except those that point to the “porn” and “adult” subdirectories. Finally, in the “pics” file, I disabled PICS filtering with the “enablePICS” variable.
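The trouble with matching a short word anywhere in a URL is easy to demonstrate. The Python sketch below uses a made-up pattern and made-up URLs; it is not copied from DansGuardian's bannedregexpurllist and does not reproduce how DG evaluates its lists.

```python
import re

# Hypothetical banned-word pattern, for illustration only.
banned = re.compile(r"sex|xxx", re.IGNORECASE)

# Illustrative URLs; only the first is meant to resemble a real site section.
urls = [
    "http://www.salon.com/sex/",
    "http://www.example.org/health/sexual-health.html",
    "http://www.example.com/essex-county-council/",
    "http://www.example.com/news/",
]

# A URL is blocked if the pattern matches anywhere inside it.
blocked = [u for u in urls if banned.search(u)]

for u in urls:
    print("BLOCKED" if u in blocked else "allowed", u)
```

Three of the four URLs are caught, including a health page and a place name, which is the kind of collateral blocking that removing the word avoids.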

Finally, as per the DG online documentation, I added a log rotation script as a weekly cron job (every Sunday at 11:59 pm, a minute before midnight). I ran crontab -e and added the line:

59 23 * * sun /etc/dansguardian/logrotation
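The five schedule fields in that crontab line are minute (59), hour (23), day of month (any), month (any) and day of week (Sunday), so the job fires at 23:59 each Sunday. As a sanity check, here is a small Python sketch that works out the next run time for that one schedule; the function name is illustrative and it is not a general cron parser.

```python
from datetime import datetime, timedelta


def next_sunday_2359(now: datetime) -> datetime:
    """Next time a '59 23 * * sun' cron entry would fire after `now`."""
    # datetime.weekday(): Monday is 0, Sunday is 6.
    days_ahead = (6 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=23, minute=59, second=0, microsecond=0
    )
    # If this week's slot has already passed, move to next Sunday.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


# The timestamp on this post: Wednesday, August 27th, 2003 at 1:34 pm.
posted = datetime(2003, 8, 27, 13, 34)
print(next_sunday_2359(posted))  # 2003-08-31 23:59:00
```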

To make sure that the firewall would redirect HTTP requests to be filtered by DansGuardian, I executed the following line, and added it to /etc/rc.local to be executed upon subsequent system startups:

/sbin/ipchains -A input -p tcp -d 0/0 80 -j REDIRECT 8080

With the RPM install of DG, it’s added automatically to the runlevels, so chkconfig is not necessary. Starting DG for the first time, though, I typed:

/etc/rc.d/init.d/dansguardian start

It took a while on my old server to process the huge blacklist file, but on subsequent startups, it uses a “processed” version of the blacklist, and starts up more quickly. Amazingly, there’s been no noticeable lag in Web browsing, at least on a LAN where only two users ever use it at the same time.
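Once everything is running, a quick way to confirm that each piece of the chain is listening is to try a TCP connection to each port from a machine on the LAN. The Python sketch below assumes the gateway is at 192.168.1.1 (the address used in accessdeniedaddress) and uses the ports mentioned in this post. It only checks that something accepts connections; it does not verify that filtering or caching actually works.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    gateway = "192.168.1.1"  # assumed gateway address
    services = [("DansGuardian", 8080), ("Squid", 3128), ("Apache", 81)]
    for name, port in services:
        state = "listening" if port_is_open(gateway, port) else "not reachable"
        print(f"{name:13} port {port:5}: {state}")
```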

Written by mdorn

August 27th, 2003 at 1:34 pm
