We have a FreeBSD-based router running IPFW and the in-kernel NAT. It has a Xeon E3-1275v6, which has 4 hyperthreaded cores running at 3.8GHz. It's a pretty beefy router, and it hasn't given us a whole lot of trouble since we got everything configured.
One of the problems we had initially was that the default TCP timeout of 5 minutes was far too short for how we work. We often left connections to PostgreSQL databases open for far longer than that, and it became a pain in the ass, so we dropped the following in /etc/sysctl.conf:
# sets idle lifetime to one hour, was 300=5min
net.inet.ip.fw.dyn_ack_lifetime=3600
net.inet.ip.fw.dyn_udp_lifetime=3600 # this one was 10 seconds
I don't recall exactly if we changed the UDP one at the same time, nor do I recall why.
The next problem that presented itself was that we were running out of space in our state table. This should have been a clue but we just stuck a band-aid on it:
net.inet.ip.fw.dyn_max=100000
This stuck around for almost a year before another problem presented itself. We ran out of space for dynamic state again and our network became unresponsive.
Aug 2 09:32:53 ipfw kernel: ipfw: Cannot allocate dynamic state, consider increasing net.inet.ip.fw.dyn_max
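A warning before the table fills up would have been more useful than that log line. Below is a minimal sketch of such a check, assuming the current number of states can be read from net.inet.ip.fw.dyn_count alongside the net.inet.ip.fw.dyn_max limit. The function names and the 80% default threshold are made up for illustration and are not part of our actual setup.

```sh
#!/bin/sh
# Hypothetical helpers: names and the 80% default are illustrative choices.

# dyn_usage_pct COUNT MAX
# Print the integer percentage of the dynamic state table that is in use.
dyn_usage_pct() {
    count=$1
    max=$2
    # Refuse a missing or non-positive limit so we never divide by zero.
    if [ -z "$max" ] || [ "$max" -le 0 ]; then
        echo "dyn_usage_pct: invalid max '$max'" >&2
        return 1
    fi
    echo $(( count * 100 / max ))
}

# dyn_usage_check COUNT MAX [THRESHOLD]
# Print a status line; return 1 once usage reaches THRESHOLD percent.
dyn_usage_check() {
    threshold=${3:-80}
    pct=$(dyn_usage_pct "$1" "$2") || return 2
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: dynamic state table at ${pct}% ($1 of $2)"
        return 1
    fi
    echo "OK: dynamic state table at ${pct}% ($1 of $2)"
}
```

On the router this could be called from cron as dyn_usage_check "$(sysctl -n net.inet.ip.fw.dyn_count)" "$(sysctl -n net.inet.ip.fw.dyn_max)", so the numbers come from the live kernel while the arithmetic stays testable on its own.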
At this point I started investigating what was actually going on with our dynamic state. I wrote some commands to help with this that I think are generally useful if you use IPFW and in-kernel NAT:
# Count state records:
ipfw -D show | wc -l

# Sum state records by Source and Destination IP Addresses:
ipfw -D show | awk ' { print $7 "->" $10 } ' | sort | uniq -c | sort | awk ' { if ($1 > 15) { print $0 } } '

# Sum state records by Source and Destination IP Addresses and destination port:
ipfw -D show | awk ' { print $7 "->" $10 ":" $11 } ' | sort | uniq -c | sort | awk ' { if ($1 > 15) { print $0 } } '

# Sum by rule number:
ipfw -D show | awk ' { print $1 } ' | sort | uniq -c | sort | awk ' { if ($1 > 15) { print $0 } } '

# Clear dynamic state for a specific rule number:
ipfw -D delete <RULENUMBER>
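The counting part of those pipelines repeats three times, so it can be pulled into a small helper. This is only a sketch: top_counts is a hypothetical name, and it differs from the commands above in one respect, namely that it uses sort -rn so the counts are compared as numbers and the largest ends up first. It assumes the keys contain no whitespace, which holds for the address and port strings built above.

```sh
#!/bin/sh
# top_counts [MIN]
# Read one key per line on stdin and print "count key" for every key that
# occurs more than MIN times (default 15), largest count first.
top_counts() {
    min=${1:-15}
    # sort groups identical keys, uniq -c counts each group,
    # sort -rn orders by that count numerically, highest first,
    # and awk drops the rows at or below the threshold.
    sort | uniq -c | sort -rn | awk -v min="$min" '$1 > min { print $1, $2 }'
}
```

With that in place, the per-address summary becomes ipfw -D show | awk '{ print $7 "->" $10 }' | top_counts, and a different cut-off can be passed as the first argument, for example top_counts 100.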
The -D flag tells the ipfw command to only touch dynamic state rules. The rest is basic UNIX stuff:
- sort | uniq -c | sort counts unique records and sorts them by count. Note that a plain sort puts the highest counts at the bottom of the list; use sort -rn as the last sort to get them on top.
- awk is used to parse out just the values we want, and also to filter the counts to be more than 15 (an arbitrarily chosen number to make the resulting list more manageable)

With these tools I discovered that there were tens of thousands of state records just for DNS. Armed with this knowledge I looked at our firewall rules to see what ports we actually use:
- 53: DNS, extremely short-lived connections
- 1194: OpenVPN, which is configured with a 10 second keep-alive
- 5060: SIP, which is configured with a 30 second keep-alive

At this point it became clear that an hour is far too long for UDP state to hang around. We configured it down to 35 seconds, and cleared all of the dynamic state. We are now hovering at around 1200 dynamic state rules, which seems far more reasonable for our network size.
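The 35 second figure is just the longest keep-alive interval in the list above (30 seconds for SIP) plus a little slack, so state for a live session is refreshed before it expires. A rough sketch of that arithmetic, where the udp_lifetime name and the 5 second margin are my own illustrative choices:

```sh
#!/bin/sh
# udp_lifetime MARGIN INTERVAL...
# Print the longest keep-alive interval plus MARGIN, all in seconds.
udp_lifetime() {
    margin=$1
    shift
    longest=0
    for interval in "$@"; do
        # Keep the largest interval seen so far.
        if [ "$interval" -gt "$longest" ]; then
            longest=$interval
        fi
    done
    echo $(( longest + margin ))
}

# Keep-alives from the list above: OpenVPN 10s, SIP 30s; 5s of slack.
udp_lifetime 5 10 30
```

The result can be applied at runtime with sysctl net.inet.ip.fw.dyn_udp_lifetime=35 and made permanent by changing the same line in /etc/sysctl.conf.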