Setting up a firewall properly using Linux 2.4.x/2.6.x: a solution

Status
Not open for further replies.

fgaliegue

Setting up a firewall using Linux 2.4.x/2.6.x: a solution (and how iptables works)

Hello all,

Here is a solution using iptables in order to have a fully functional firewall. I'll first put in the commands themselves, and will explain in later posts why and how it works.

The setup is the following:
* the machine acts as a router for a local network with IP network/mask of 192.168.1.0/24, on eth0;
* the Internet-facing interface is ppp0, and its address is dynamic.

Here goes, supposing that your iptables rules are blank:

#
# The central part of it - conntrack, ie stateful firewalling
#
iptables -N connstate
iptables -A connstate -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A connstate -m state --state INVALID -j DROP
iptables -A connstate -m state --state NEW -p tcp ! --syn -m limit --limit 2/sec -j LOG --log-prefix "NEWNOTSYN: "
iptables -A connstate -m state --state NEW -p tcp ! --syn -j REJECT --reject-with tcp-reset
iptables -A connstate -m state --state NEW -j RETURN
iptables -A connstate -j LOG --log-level CRIT --log-prefix "CONNTRACK BARF: "
iptables -A connstate -j DROP

#
# For all three filter chains: drop everything by default - first chain to go through is
# connstate
#
for i in INPUT OUTPUT FORWARD; do
    iptables -P "$i" DROP
    iptables -A "$i" -j connstate
done

#
# Deal with the loopback special case
#
iptables -A OUTPUT -o lo -j ACCEPT
iptables -A INPUT -i lo -j ACCEPT

#
# First, let's allow anything from the local machine to the local network
#

iptables -N local_to_eth0
iptables -A local_to_eth0 -j ACCEPT
iptables -A OUTPUT -o eth0 -j local_to_eth0

#
# Also allow anything from the local machine to the Internet
#
iptables -N local_to_ppp0
iptables -A local_to_ppp0 -j ACCEPT
iptables -A OUTPUT -o ppp0 -j local_to_ppp0

#
# Now, only allow some protocols (here, let's say SSH and ping only) from the local
# network to the local machine
#
iptables -N eth0_to_local
iptables -A eth0_to_local -p tcp --dport 22 -j ACCEPT
iptables -A eth0_to_local -p icmp --icmp-type echo-request -j ACCEPT
iptables -A INPUT -i eth0 -j eth0_to_local

#
# For the Internet, first thing is to masquerade our local network... Private IP
# addresses cannot be routed on the Internet!
#
iptables -t nat -A POSTROUTING -o ppp+ -j MASQUERADE

#
# Now, packets that come from our local network and go to the Internet: accept
# everything
#
iptables -N eth0_to_ppp0
iptables -A eth0_to_ppp0 -j ACCEPT
iptables -A FORWARD -i eth0 -o ppp0 -j eth0_to_ppp0

#
# END
#
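The script above supposes blank rules. As a rough sketch of how to get back to that blank state before re-running it (the function name reset_firewall is my own, not part of iptables, and note that the machine is left wide open until the rules are loaded again):

```shell
# Hypothetical helper: bring the filter, nat and mangle tables back to a blank state.
reset_firewall() {
    # Open the default policies first, so that flushing cannot lock us out.
    for chain in INPUT OUTPUT FORWARD; do
        iptables -P "$chain" ACCEPT
    done
    for table in filter nat mangle; do
        iptables -t "$table" -F   # delete every rule in every chain
        iptables -t "$table" -X   # delete the (now empty) user defined chains
        iptables -t "$table" -Z   # reset the packet and byte counters
    done
}
```

Call reset_firewall as root, then run the firewall script again.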
 
Stateful firewalling, what it is

OK, how it works...

First of all, at the center of it all is the connstate chain, which makes use of the ip_conntrack module. What this module does is stateful firewalling.

What this means is that it keeps track of "connections" at large in a table, ie it knows whether a packet it receives belongs to an existing dialog. It puts packets in basically 4 categories:

  • NEW: this packet doesn't belong to any existing dialog, it initiates a new one;
  • ESTABLISHED: this packet belongs to an already existing dialog;
  • RELATED: this packet does not belong to an existing dialog, however it is acknowledged as related to an existing one. The most common example of this is any ICMP error (the payload of an ICMP packet contains the headers of the offending packet, this is why conntrack can relate an error to a dialog).
  • INVALID: this packet wants to be part of an existing dialog but isn't; or, more simply, it is malformed (bad headers, bad checksum, you name it).
 
Packet traversal

Now, it's time to delve a little in the core of the story, the netfilter architecture.

First of all, there are three tables defined as standard: the mangle table, the nat table and the filter table. When you don't specify which table to use (see option -t of iptables), the filter table is used by default.

Each table has a number of chains associated with it. You can create new chains on the fly, however creating new tables requires a kernel module.

The different chains traversed by a packet depend on where the packet comes from, and where it is supposed to go. There are three possible cases:

  • the packet comes through an interface and is destined to the local machine (I);
  • it comes through an interface and is destined to another machine on another interface (F);
  • it originates from the machine itself and is destined to another machine (O).

Note that forwarding packets from one interface to another is precisely what routing is.

ARRIVAL OF A PACKET (I, F only)

When a packet arrives through an interface, first of all it traverses these two chains: mangle/PREROUTING and nat/PREROUTING. After traversing these two chains, it reaches the point where the kernel has to decide where to route it.

Hence the name (PREROUTING) of these two chains. This means that the nat/PREROUTING chain can influence the routing decision by modifying the destination address. We will see some uses of it.

FIRST ROUTING DECISION (I, F only)

From then on, the kernel looks at the destination address of the packet and determines one of two scenarios:

  • The packet is meant to reach the machine itself (I). In this case, it goes through the mangle/INPUT and filter/INPUT chains.
  • The packet is meant to reach another machine, which the kernel knows it can access (F): it will then go through the mangle/FORWARD and filter/FORWARD chains.

PACKETS ORIGINATING FROM THE MACHINE ITSELF (O only)

For packets originating from the machine itself, three chains are traversed, in this order: mangle/OUTPUT, nat/OUTPUT and filter/OUTPUT.

SECOND ROUTING DECISION (O, F only)

From then on, and provided that the packet hasn't been dropped before, the kernel knows that this packet has to go outside. By looking at its routing table, it determines the correct interface and sends the packet there.

AFTER SECOND ROUTING DECISION (O, F only)

Nope, it's not ready to go yet. Before it actually leaves, it first has to go through the two last chains: mangle/POSTROUTING, then nat/POSTROUTING. As their name says, these chains can alter packets just before they are actually sent to the wild and out of reach.

The nat/POSTROUTING chain, in particular, can rewrite the source address of the packet before it is sent out. This is used in our basic firewall. We will see why.
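To keep the three cases straight, here is a small memo function. It is only a sketch: chain_path is a made-up name, and it just prints the table/chain pairs in the order the kernel calls them (at any given point, mangle comes before nat, and nat before filter).

```shell
# chain_path <I|F|O>: print the table/chain pairs a packet traverses, in order.
#   I = incoming, for the local machine
#   F = forwarded from one interface to another
#   O = originating from the local machine
chain_path() {
    case "$1" in
        I) echo "mangle/PREROUTING nat/PREROUTING mangle/INPUT filter/INPUT" ;;
        F) echo "mangle/PREROUTING nat/PREROUTING mangle/FORWARD filter/FORWARD mangle/POSTROUTING nat/POSTROUTING" ;;
        O) echo "mangle/OUTPUT nat/OUTPUT filter/OUTPUT mangle/POSTROUTING nat/POSTROUTING" ;;
        *) echo "usage: chain_path I|F|O" >&2; return 1 ;;
    esac
}
```

For instance, chain_path F lists the six chains a routed packet goes through.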
 
How it really works

Now, let's see what happens when a machine from our local network, say 192.168.1.3, attempts a Web connection to www.google.com (66.102.9.147):

  • the packet arrives through interface eth0; it is untouched by the mangle/PREROUTING and nat/PREROUTING chains until it reaches the first routing decision;
  • by looking at its routing table, the kernel knows that it should go to interface ppp0, therefore it sends it to the filter/FORWARD chain;
  • the first instruction in the filter/FORWARD chain is to send all packets to the connstate chain (-j connstate); therefore it is sent there;
  • the connstate chain determines that the packet, not being part of/related to an existing dialog, is of type NEW. It has also checked that, being TCP, it has its SYN bit set (otherwise it would have been rejected: -m state --state NEW -p tcp ! --syn -j REJECT --reject-with tcp-reset), therefore it returns it to the caller, filter/FORWARD (-m state --state NEW -j RETURN);
  • next on the filter/FORWARD chain, the packet is matched by the fact that it comes from interface eth0 (-i eth0) and goes to interface ppp0 (-o ppp0), therefore it is sent to the eth0_to_ppp0 chain;
  • the eth0_to_ppp0 chain is straightforward: accept everything. Therefore the packet leaves the filter table and goes on its way;
  • second kernel routing decision, the packet is sent to ppp0;
  • however, before it gets out, it is caught by the nat/POSTROUTING chain: the only rule in this chain is that any packet going out through a ppp interface (-o ppp+, which matches ppp0) will see its source address (192.168.1.3 in our example) rewritten to the IP address of this interface at this time (x.y.z.t), and its source port (let's say, 1025) rewritten to another source port (let's say P) (-j MASQUERADE). The kernel remembers this address rewriting along with the connection for later use. Therefore, what www.google.com will see is a packet coming from address x.y.z.t and port P.

It is necessary to do so, as private IP addresses (defined in RFC 1918) CANNOT be routed on the Internet.

OK, now the first packet is gone, let's see what happens when the response from www.google.com comes in:

  • first it comes in through ppp0; it reaches the mangle/PREROUTING which leaves it untouched;
  • at the nat/PREROUTING stage, however, the kernel sees that there's a rewriting entry for this connection: therefore it rewrites the destination address and port (x.y.z.t, port P) to the original ones (192.168.1.3 and 1025);
  • it then reaches the first kernel routing decision: it knows that this packet should go to interface eth0, therefore it is sent to the FORWARD chain;
  • reaching the FORWARD chain, the first instruction is to go through chain connstate;
  • the connstate chain identifies the packet as part of an existing connection, it is therefore an ESTABLISHED packet and is immediately accepted. It then reaches the second kernel routing decision and is delivered. End of story!

Now you can see the power of stateful firewalling: provided you trigger the stateful functionality of your firewall BEFORE anything else, you only have to write the filtering rules ONE WAY. What's more, it's much, much more secure than writing the rules both ways, as ipchains required you to do.
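To illustrate the "one way" point, here is a sketch which tightens the LAN to Internet direction of the sample firewall. The function name and the choice of services (DNS, HTTP, HTTPS) are only my assumptions for the example. Notice that there is not a single rule for the return traffic: connstate takes care of it.

```shell
# Hypothetical tightening of the sample firewall: LAN machines may only do DNS and web.
# Assumes the eth0_to_ppp0 chain from the script above already exists.
restrict_lan_to_web() {
    iptables -F eth0_to_ppp0                               # remove the accept-everything rule
    iptables -A eth0_to_ppp0 -p udp --dport 53 -j ACCEPT   # DNS queries
    iptables -A eth0_to_ppp0 -p tcp --dport 53 -j ACCEPT   # DNS over TCP
    iptables -A eth0_to_ppp0 -p tcp --dport 80 -j ACCEPT   # HTTP
    iptables -A eth0_to_ppp0 -p tcp --dport 443 -j ACCEPT  # HTTPS
    # Anything else reaches the end of FORWARD and meets its DROP policy.
}
```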
 
Source NAT, destination NAT

Now let's explore the NAT stuff a little more.

We've seen one example of source NAT: the MASQUERADE target. There are others, however, such as SNAT for example. Provided you have a fixed IP address for your net connection and the interface used is eth1, you would use it like this:

iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to x.y.z.t

where x.y.z.t is the address of your eth1 interface.

Why do both SNAT and MASQUERADE exist? Well, the MASQUERADE target has two fundamental differences:

* it will look up each time the IP of the outgoing interface;
* if the interface drops, it clears all connections using this interface from the conntrack table, unlike SNAT.

In short, use MASQUERADE if your IP address on the Internet is dynamic, use SNAT otherwise.
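That rule of thumb fits in a few lines of shell. This is just a sketch, and setup_source_nat is a name I made up: give it a fixed address and it uses SNAT, give it none and it falls back to MASQUERADE.

```shell
# setup_source_nat <interface> [static_ip]  (hypothetical helper)
setup_source_nat() {
    iface=$1
    static_ip=${2:-}
    if [ -n "$static_ip" ]; then
        # Fixed address: SNAT, no need to look up the interface address each time.
        iptables -t nat -A POSTROUTING -o "$iface" -j SNAT --to "$static_ip"
    else
        # Dynamic address: MASQUERADE follows whatever address the interface has.
        iptables -t nat -A POSTROUTING -o "$iface" -j MASQUERADE
    fi
}
```

So setup_source_nat ppp0 gives the rule of the sample firewall (for ppp0 only), and setup_source_nat eth1 x.y.z.t gives the SNAT rule above.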

Now, an example of destination NAT. Say you want to play Diablo 2 from one of the machines in the LAN which has address 192.168.1.5. You want to be able to host games. Diablo 2 listens on port TCP/4000. Your net interface is ppp0.

You'll need to do two things here:
  • rewrite the destination address of any TCP packet to port 4000 arriving on your net interface to 192.168.1.5, without changing the destination port,
  • add rules to the filter/FORWARD chain so that any TCP packet coming from ppp0, going to eth0, with destination address 192.168.1.5 and destination port 4000 will be accepted.

Well, let's do it, starting with the filter table changes:
  • first, let's create chain filter/ppp0_to_eth0:
    iptables -N ppp0_to_eth0
  • now, let's fill it with the rule we want:
    iptables -A ppp0_to_eth0 -d 192.168.1.5 -p tcp --dport 4000 -j ACCEPT
  • finally, tell filter/FORWARD that any packet coming from ppp0 and going to eth0 should jump to chain filter/ppp0_to_eth0:
    iptables -A FORWARD -i ppp0 -o eth0 -j ppp0_to_eth0

Done for the filter table changes. Now, the nat table changes: as written above, we want any TCP packet coming in through interface ppp0 (our net interface) with destination port 4000 to have its destination address rewritten to 192.168.1.5. We have to use the PREROUTING chain here, since this rewriting will influence the first kernel routing decision, and use the DNAT target, which does what we want (rewrite the destination address):
iptables -t nat -A PREROUTING -i ppp0 -p tcp --dport 4000 -j DNAT --to 192.168.1.5

Done!
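If you need to forward more than one port, the same steps can be wrapped in a function. This is only a sketch (forward_tcp_port is a name of my own choosing): it creates the <wan>_to_<lan> chain and hooks it into FORWARD on first use, then adds the filter rule and the DNAT rule.

```shell
# forward_tcp_port <wan_if> <lan_if> <lan_ip> <port>  (hypothetical helper)
forward_tcp_port() {
    wan_if=$1
    lan_if=$2
    lan_ip=$3
    port=$4
    chain="${wan_if}_to_${lan_if}"
    # iptables -L fails if the chain does not exist: create and hook it only once.
    if ! iptables -L "$chain" -n >/dev/null 2>&1; then
        iptables -N "$chain"
        iptables -A FORWARD -i "$wan_if" -o "$lan_if" -j "$chain"
    fi
    # filter table: let the forwarded connection through...
    iptables -A "$chain" -d "$lan_ip" -p tcp --dport "$port" -j ACCEPT
    # ...nat table: rewrite the destination before the first routing decision.
    iptables -t nat -A PREROUTING -i "$wan_if" -p tcp --dport "$port" -j DNAT --to "$lan_ip"
}
```

The Diablo 2 example then becomes: forward_tcp_port ppp0 eth0 192.168.1.5 4000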
 
Wow.. Someone actually bothers with this stuff in our days of point-and-click? :p

A good read to anyone interested in the murky internals of iptables :)
 
Nodsu said:
Wow.. Someone actually bothers with this stuff in our days of point-and-click? :p

I'm kind of an old school guy :p

This kind of firewall setup is used in production on all the servers I have to manage and it never failed on me.

It failed ONCE however, at home, using [some kind of file sharing protocol], because the TCP connection lifespan is set to 5 days by default - my conntrack table was full (more than 120k entries!). Setting the lifespan to 12 hours instead cured the problem... But it's a limit to remember for very big systems.
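For those who want to check this on their own machine, here is roughly how it can be done. Take it as a sketch: the two paths are the ip_conntrack names of 2.4/2.6 kernels (newer kernels have nf_conntrack equivalents under /proc/sys/net/netfilter/), so adjust them to what your system has, and the function names are mine.

```shell
# Assumed locations (ip_conntrack era); override the variables if your kernel differs.
CONNTRACK_TABLE=${CONNTRACK_TABLE:-/proc/net/ip_conntrack}
TIMEOUT_FILE=${TIMEOUT_FILE:-/proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_timeout_established}

# Number of connections currently tracked (the table has one line per entry).
conntrack_entries() {
    awk 'END { print NR }' "$CONNTRACK_TABLE"
}

# Set the lifespan of established TCP connections, given in hours (needs root).
set_established_timeout_hours() {
    echo $(( $1 * 3600 )) > "$TIMEOUT_FILE"
}
```

The 5 days default is 432000 seconds; set_established_timeout_hours 12 writes 43200 instead.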
 
Why the two last lines in the connstate chain - and about fragments

You could very well ask, why on earth are these two last lines here in the connstate chain:


iptables -A connstate -j LOG --log-level CRIT --log-prefix "CONNTRACK BARF: "
iptables -A connstate -j DROP

After all, the rules above cover all the possible states, don't they?

Well yes they do, but these two lines are here "just in case". They are logged with log level CRIT so that the logged packet(s) are visible however you're connected to the machine (remote or local connection). If you ever see such a packet, you have found a bug in netfilter!

These two rules have never actually caught anything on any of my systems which use this kind of firewall. But who knows, it may very well happen one day.
If it does, such packets are simply dropped: you don't want them to reach anything, do you?

Also, about fragments: a frequently used type of attack is sending packets in fragments - ie, the first fragment contains only part of the headers of the underlying protocol, and the rest of the headers is in another; this kind of attack is frequently used to bypass firewalls. But you need not worry! The use of conntrack itself REQUIRES that packets be reassembled before they reach any of the chains. Should a partial packet be seen and not completed within a given delay (the net.ipv4.ipfrag_time sysctl, 30 seconds by default), it will be silently discarded by the kernel and its skbuffs (that's the structure holding network packets in Linux) freed, therefore avoiding a memory DoS.

Linux is actually very clever, come to think of it :p
 
iptables command syntax: selecting the table, chain, position

Now, on to the syntax of the iptables command.

Note that this post only deals with how to select the correct table and chain you want to act upon. The way to define a rule (what packets should be matched and what to do with them) will be explained later.

First: which table

Theoretically, the first argument to iptables should be the table it is asked to act upon. This is the -t option. If not specified, it is the filter table by default. In the syntax descriptions below, all arguments between [...] are optional, and all arguments between <...> are MANDATORY.

Hence the command:

iptables -A OUTPUT blah blah

is the equivalent of:

iptables -t filter -A OUTPUT blah blah

If you want to act on ANOTHER table than filter (say, nat), then the -t argument becomes MANDATORY.

Second: how to act on the table

What to do on the table is determined by the following option. This can be:

  • -L [chain] [options]: list the contents of chain chain; if chain is omitted, list the contents of all chains in the selected table. Options [options] include:

    • -v: be verbose, that is, also print the number of packets matched and the number of bytes in these packets;
    • -n: do NOT resolve IP/network addresses or port numbers to names - speeds up the printing immensely, especially if your DNS system is down;
    • --line-numbers: also print the rule's line number, useful with the -I command as highlighted below.
    Hence the command:
    iptables -t filter -L -vn
    will print the contents of all chains in the filter table, along with the number of packets matched by each rule in all chains. VERY useful!
  • -A <chain> <rule>: append rule rule to chain chain.
  • -I <chain> [position] <rule>: prepend (put in first position) rule rule to chain chain. If position is present, will insert the new rule before rule number position (ie, iptables -t filter -I foo 2 <rule> will insert the rule rule before rule number 2 of chain foo of table filter). Note that rule numbers start at 1, not 0.
  • -Z: reset all packet counters (the ones printed by the -L -v set of options) in all chains for the selected table.
  • -F [chain]: delete all rules from chain chain; if chain is omitted, attempts to flush all rules from all chains for the given table.
  • -X <chain>: deletes chain chain. Note that you CANNOT delete predefined chains (ie, for instance, you cannot delete the PREROUTING chain of the nat table, nor the FORWARD chain from the filter table), you can only delete user defined chains. Note that iptables will refuse to delete a chain which either:
    • is still referenced by another chain in the same table, or
    • still has rules in it (use the -F option first to clear the rules in this case).
  • -D <chain> <position|rule>: delete rule defined by rule or rule at position position from chain chain. Of course, most of the time, the position will be specified instead of the rule to delete.
  • -R <chain> <position> <newrule>: replace rule in chain chain at position position by rule newrule.
  • -N <chain>: creates a new chain with name chain.
  • -E <oldchain> <newchain>: rename chain oldchain to newchain. Existing references are preserved - ie, if a chain referenced oldchain before, it will reference newchain afterwards. It is therefore purely a cosmetic change.
  • -P <chain> <target>: only valid with builtin chains, determines the default target (see below) for packets that reach the end of the chain without being sent out of the table by any rule before it. The only valid values for target are DROP and ACCEPT; REJECT cannot be used as a policy.
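As an example of how these commands combine, here is a sketch for getting rid of a user defined chain in the filter table, respecting the two conditions imposed by -X. The name remove_chain is made up; the position is the number of the rule which jumps to the chain, as printed by iptables -L <parent> -n --line-numbers.

```shell
# remove_chain <parent_chain> <position> <user_chain>  (hypothetical helper)
remove_chain() {
    parent=$1
    position=$2
    chain=$3
    iptables -D "$parent" "$position" || return 1   # 1. remove the rule referencing the chain
    iptables -F "$chain" || return 1                # 2. empty the chain
    iptables -X "$chain"                            # 3. only now can it be deleted
}
```

For instance, if the jump to eth0_to_ppp0 is rule number 2 of FORWARD: remove_chain FORWARD 2 eth0_to_ppp0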
 
iptables command syntax, continued: matching packets

So, we now know how to select the table and chain we want to alter. Let's see now all the <rule> stuff. It is in two parts, as the title of the post states:

  • how to match packets;
  • what to do with them.

How to match packets

iptables has a pretty comprehensive set of criteria for matching packets. As a result, only some of them are described here.

Of course, you can combine several of the criteria mentioned here, in certain limits however - these will be explained too.

Matching on incoming/outgoing interface

At the lowest possible level, it is possible to match packets based on the incoming or outgoing interface.

To match the incoming interface, use the option -i <interface> (for example, -i eth0). Note that if the number besides the interface class is replaced by a +, it will match all such interfaces (eth+ will match eth0, eth1, eth2 and so on).

To match the outgoing interface, use the option -o <interface>. The same holds true wrt the + character.

IMPORTANT!
  • the -o option can not be used in the */PREROUTING and */INPUT chains;
  • the -i option can not be used in the */OUTPUT and */POSTROUTING chains.

The reason is pretty straightforward: at PREROUTING time, you cannot tell which interface a packet is going to (it's up to the kernel to determine this - and nat/PREROUTING can influence the routing decision, remember), and at INPUT time, the packet being destined to the local machine, there's just no outgoing interface. Similarly, the POSTROUTING chains cannot tell where a packet comes from (similarly, only the kernel knows that), and OUTPUT packets, having been initiated from the machine itself, don't have a source interface either.

For */FORWARD chains, however, both options can be used.

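You can NEGATE this option by using ! before the interface name: for example, -i ! eth0 will match all packets NOT coming from interface eth0. (Recent versions of iptables prefer the ! placed before the option itself, as in ! -i eth0, and deprecate the older form; the same goes for the other negations shown below.)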

Matching the source/destination address(es)

You can match source addresses using the -s <address>[/netmask] option, and in a similar vein, match destination addresses using the -d <address>[/netmask] option. The mask is optional and can either follow the "classic" notation (255.255.255.0) or the CIDR notation (24). For instance:

  • -s 192.168.1.0/24 will match all packets whose source address is in the range 192.168.1.0-192.168.1.255, whatever the protocol (in practice, machines use 192.168.1.1 to 192.168.1.254, 192.168.1.255 being the broadcast address); this could have been written -s 192.168.1.0/255.255.255.0;
  • -d 100.201.4.19 will match all packets whose destination address is 100.201.4.19. This could also have been written -d 100.201.4.19/255.255.255.255 or -d 100.201.4.19/32, but then why you would write any of those is beyond the scope of my understanding :p

Similarly, you can negate the argument by preceding it with a !, for example, -s ! 192.168.1.0/24.

You can also put machine names instead of IP addresses, but this is STRONGLY DISCOURAGED (try and imagine what happens if your current rules block DNS traffic...).

Matching against the protocol

You can match the protocol with the -p <protocol> [protocol-options] option (and negate the match with ! as well). We will see the three commonly used protocols here, which are TCP, UDP and ICMP, however others exist as well, you can see a non exhaustive list in file /etc/protocols on any Linux system.

So, let's see what flags you can specify with protocols TCP (-p tcp) and UDP (-p udp):

  • --sport <port>[:port2]: match any UDP or TCP packet whose source port is port (NOTE: this option is ILLEGAL for any other protocols than TCP or UDP!). If port2 is specified, will match packets whose source ports are between port and port2, inclusive (for example, --sport 1024:65535 is a classical way of matching all "user" packets - only root can bind to a source port less than 1024). You can negate with ! before the argument too (--sport ! 1:1023 for instance).
  • --dport <port>[:port2]: match any UDP/TCP packet with destination port port. As above, the port2 and negation are also possible.
  • --syn (TCP ONLY): match TCP packets which have the SYN bit set. Negatable with ! before the option as well (! --syn), as is used in our sample firewall.

With the ICMP protocol, you can specify the type of message you want with the --icmp-type <type> option - and negate it too (for instance --icmp-type ! timestamp-request). As a side note, conntrack knows how to deal with ping request packets (ICMP type echo-request): it will expect ping answers (ICMP type echo-reply) as results... Smart, I tell you!

Generally, you can see all available flags for a protocol by typing iptables -p <protocol> -h (for instance, iptables -p tcp -h).

Extended matching

Many, many extensions exist for iptables which are called with the -m option, and each of them has its specific option. An extensively used one in the sample firewall is of course the state extension. As a general rule, an extension is invoked with -m <extension_name> [extension arguments and options].

You can see all arguments of an extension by invoking iptables -m <extension_name> -h. Which you will want to do anyway, since I won't be mentioning all possible options :p

Only some of them are listed here, but you will certainly find them useful ;)

  • the state extension, which you are probably familiar with now. Its only option, negatable, is the --state <state>. Matching all non NEW packets for example will be written -m state ! --state NEW.
  • the time extension. Yes, it actually exists! Want to allow outgoing connections to the net only between 8am and 10pm? Match with -m time --timestart 08:00 --timestop 22:00... Similarly, you can specify days, too! (see iptables -m time -h...)
  • the limit extension. Especially useful when logging packets, so that you don't flood the logs. -m limit --limit 2/sec will only match 2 packets a second, for instance.
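Here is a sketch combining several of these criteria in one rule: source network, protocol, ports and the time extension. The function name is invented, and do check iptables -m time -h on your system, since how the clock is interpreted (local time or UTC) is something I would not assume to be the same across versions.

```shell
# allow_daytime_web <chain> <lan_network>  (hypothetical helper)
# Accept HTTP connections from the given network between 8am and 10pm only.
allow_daytime_web() {
    chain=$1
    lan_net=$2
    iptables -A "$chain" -s "$lan_net" -p tcp --sport 1024:65535 --dport 80 \
        -m time --timestart 08:00 --timestop 22:00 -j ACCEPT
}
```

For example: allow_daytime_web eth0_to_ppp0 192.168.1.0/24 (which only makes sense if that chain does not accept everything already).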
 
iptables command syntax, the end: what to do with packets - targets

Now that we have specified what packets we want to match, we must decide what to do with them, and that is the goal of targets. You specify the target by using the -j option, which should ALWAYS be the last one.

You can either jump to a user defined chain (but you CANNOT jump to a built-in chain!), or to one of the predefined targets.

Like protocols and extensions, you can see all possible options of a target by invoking iptables -j <target> -h.

Some targets are legal only in certain tables, some can be used in several tables but are preferably used in one of them in particular.

Targets only legal in the nat table

  • -j DNAT --to <address>[-address2][:port[-port2]]: only legal in the nat/PREROUTING and nat/OUTPUT chains, alters the destination address and/or port of a packet. For instance, -j DNAT --to 192.168.1.2-192.168.1.8:1024-2022. If the port(s) is(are) unspecified, only alters the destination address. If several addresses and/or ports are specified, they are used in a round robin manner.
  • -j SNAT --to <address>[-address2][:port[-port2]]: only legal in the nat/POSTROUTING chain, alters the source address and/or port of an outgoing packet. Similar in principle (ports, addresses, round robin) to DNAT.
  • -j MASQUERADE: alters the source address and port of the packet before sending it. See above for the difference with SNAT. Only valid in nat/POSTROUTING.
  • -j REDIRECT --to-ports <port>[-port2]: only valid in nat/PREROUTING and nat/OUTPUT. Changes the destination address to the IP address of the interface the packet is coming from and rewrites the destination port to port. If port2 is specified, will use each port in the range in a round robin manner.

Targets to be used in the filter table

The targets below can be used in any table, however they are best used in the filter table since they decide on the fate of the packet.

All of these targets are final, that is, when such a target is reached, the treatment of the packet by the current chain stops there. With ACCEPT, the packet goes out of the filter table and on to the next one; with DROP and REJECT, it goes nowhere.

  • -j DROP: silently drop the packet;
  • -j ACCEPT: accept the packet;
  • -j REJECT [--reject-with reject-type]: reject the packet by sending an error packet to the requester. By default, rejects with the ICMP error port-unreachable. Other targets are available, like tcp-reset (send a TCP RST packet, only valid with TCP packets of course), see iptables -j REJECT -h for a complete list;
  • -j RETURN: meant for user defined chains, returns the packet back to the calling chain.

The LOG target

The LOG target will do just that, log packets using syslog. Where it will be logged will depend on the configuration of your sysklogd/syslog-ng daemon. The two most interesting options are these:

  • --log-level <level>: determine which level the packets should be logged at, default being WARNING;
  • --log-prefix <prefix>: put prefix before packet details in the logs. Helpful since you can just grep for this prefix to see the packets you want.

The LOG target is NOT a final target: the packet will keep going down the rules until either a final target or the default policy of the chain is reached.
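Since LOG is not final, the usual pattern is a LOG rule immediately followed by the final target. A sketch of it (log_and_drop is my own name; the length check is there because iptables limits the prefix to 29 characters):

```shell
# log_and_drop <chain> <prefix>  (hypothetical helper)
# Append a rate limited LOG rule, then a DROP rule, to the given chain.
log_and_drop() {
    chain=$1
    prefix=$2
    if [ "${#prefix}" -gt 29 ]; then
        echo "log_and_drop: prefix longer than 29 characters" >&2
        return 1
    fi
    iptables -A "$chain" -m limit --limit 2/sec -j LOG --log-prefix "$prefix"
    iptables -A "$chain" -j DROP
}
```

With the sample firewall, log_and_drop eth0_to_local "LAN-DENIED: " would log (at most twice a second) whatever the local network sends to the machine other than SSH and ping, then drop it.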
 
Putting it all together...

So now, for example, you can decrypt this line:

iptables -A connstate -m state --state NEW -p tcp ! --syn -m limit --limit 2/sec -j LOG --log-prefix "NEWNOTSYN: "

Let's examine it in detail:

  • -A connstate: append a rule to the connstate chain; as no -t option is specified, this will be the connstate chain of the filter table; therefore, what table and chain to act upon, and what to do are specified with this option.
  • -m state --state NEW -p tcp ! --syn -m limit --limit 2/sec: this is our packet matching rule:
    • -m state --state NEW: the packet should be seen as NEW by conntrack,
    • -p tcp ! --syn: it should be TCP and NOT have its SYN bit set,
    • -m limit --limit 2/sec: we must not have seen more than 2 of these in the last second.
  • -j LOG --log-prefix "NEWNOTSYN: ": this is the target, we log the packet with level WARNING (the default) and prefix it with "NEWNOTSYN: " in the logfile. The packet will then go on through the other rules of the chain.
 
Nice tutorial on how iptables works :)
For mere mortals, there are a few other choices, in order of complexity:
1) firestarter (UI based...kinda like typical Windows firewalls, but not very flexible)
2) firehol (text config, but more configurable and relatively simple to understand)
3) shorewall (text config, but more complicated and also has traffic shaping support)

I'm only at the (2) level, but have gotten it to do what I need. It apparently generates 400 iptables rules out of a 20-30 line config file, so it's a lot easier to read and understand how the firewall is configured, for this mere mortal...
 