nftables
nftables is the successor to iptables. It replaces the existing iptables, ip6tables, arptables, and ebtables frameworks. It is implemented in the Linux kernel and configured through a new userspace utility called nft. nftables also provides a compatibility layer for the ip(6)tables framework.
Introduction
As with the iptables framework, nftables is built upon rules which specify actions. These rules are attached to chains. A chain can contain a collection of rules and is registered in the netfilter hooks. Chains are stored inside tables. A table is specific for one of the layer 3 protocols. One of the main differences with iptables is that there are no predefined tables and chains anymore.
Tables
A table is nothing more than a container for chains. With nftables there are no predefined tables (filter, raw, mangle...) anymore. You are free to recreate an iptables-like structure, but any other layout works just as well.
Currently there are six families of tables:
- ip: Used for IPv4 related chains.
- ip6: Used for IPv6 related chains.
- arp: Used for ARP related chains.
- bridge: Used for bridging related chains.
- inet: Mixed IPv4/IPv6 chains (kernel 3.14 and up).
- netdev: Used for chains that filter early in the stack (kernel 4.2 and up).
It is not hard to recognize the old iptables-era frameworks in these families. The inet and netdev families have no equivalent in the iptables world, though. The inet family is used for both IPv4 and IPv6 traffic, which makes firewalling dual-stack hosts easier because the rules for both protocols live in one table. The netdev family sees packets as soon as the driver has passed them up to the networking stack, which makes it suitable for very efficient ingress filtering. It makes no assumptions about L2 and L3 protocols and sees all traffic for a given interface.
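As a minimal sketch of the inet family described above, the following ruleset fragment keeps IPv4 and IPv6 rules in a single table. The table name example_dual and the documentation address ranges are assumptions chosen for illustration only.

```nft
# Sketch only: "example_dual" and the address ranges are made-up examples.
table inet example_dual {
    chain input {
        type filter hook input priority 0; policy accept;

        # Protocol-independent match: applies to IPv4 and IPv6 alike.
        tcp dport 23 counter drop

        # Family-specific matches can still be used inside an inet table.
        ip saddr 192.0.2.0/24 counter drop
        ip6 saddr 2001:db8::/32 counter drop
    }
}
```

Without the inet family, the same policy would need one table in family ip and a second one in family ip6.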
Chains
Chains group rules together. As with tables, nftables does not have any predefined chains. There are two kinds of chains: base and non-base. Base chains are registered in one of the netfilter hooks; non-base chains are not. Thus, a base chain has a hook it is registered with, a type, and a priority. In contrast, non-base chains are not attached to a hook and do not see any traffic by default. They can be used to arrange a ruleset in a tree of chains.
There are currently three types of chains:
- filter: for filtering packets.
- route: for rerouting packets.
- nat: for performing Network Address Translation. Only the first packet of a flow hits this chain, making it impossible to use it for filtering.
The hooks that can be used are:
- prerouting: Comes before the routing decision. All packets entering the machine hit this hook.
- input: All packets for the local system hit this hook.
- forward: Packets that are not for the local system and need to be forwarded hit this hook.
- output: Packets that originate from the local system hit this hook.
- postrouting: Comes after the routing decision has been made. All packets leaving the machine hit this hook.
The ARP address family only supports the input and output hooks.
The bridge address family only seems to support the input, forward and output hooks.
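The type, hook and priority of a base chain can be illustrated with a small fragment. This is a sketch under assumptions: the table name example_hooks is hypothetical, and the policies shown are only one possible choice.

```nft
# Sketch only: "example_hooks" is a made-up table name.
table ip example_hooks {
    # Base chain on the input hook: traffic addressed to this host.
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iifname "lo" accept
    }

    # Base chain on the forward hook: traffic routed through this host.
    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    # Base chain on the output hook: traffic generated by this host.
    chain output {
        type filter hook output priority 0; policy accept;
    }
}
```

Each chain carries its own type, hook and priority declaration; a chain without that line would be a non-base chain and would see no traffic on its own.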
Rules
Rules specify which action has to be taken for which packets. Rules are attached to chains. Each rule can have an expression to match packets and one or more actions to perform when it matches. One main difference from iptables is that a single rule can specify multiple actions. Another is that counters are off by default: a counter must be specified explicitly in each rule for which you want packet and byte counters.
Each rule has a unique handle number by which it can be distinguished.
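To illustrate multiple actions per rule and explicit counters, here is a short sketch in nft script syntax. It assumes that a table ip filter with a chain input already exists; the port numbers and the log prefix are arbitrary examples.

```nft
# Assumes table "ip filter" with chain "input" already exists.

# One rule, three statements: count the packet, log it, then drop it.
add rule ip filter input tcp dport 23 counter log prefix "telnet-drop: " drop

# No counter statement here, so no packet/byte counters are kept for this rule.
add rule ip filter input tcp dport 22 accept
```

Listing the table afterwards should show packet and byte counts only for the first rule.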
The following matches are available:
- ip: IP protocol.
- ip6: IPv6 protocol.
- tcp: TCP protocol.
- udp: UDP protocol.
- udplite: UDP-lite protocol.
- sctp: SCTP protocol.
- dccp: DCCP protocol.
- ah: Authentication Header.
- esp: Encapsulating Security Payload header.
- ipcomp: IPComp headers.
- icmp: ICMP protocol.
- icmpv6: ICMPv6 protocol.
- ct: Connection tracking.
- meta: Meta properties such as interfaces.
Matches
Match | Arguments | Description/Example |
ip | version | IP header version |
ip | hdrlength | IP header length |
ip | tos | Type of Service |
ip | length | Total packet length |
ip | id | IP ID |
ip | frag-off | Fragmentation offset |
ip | ttl | Time to live |
ip | protocol | Upper layer protocol |
ip | checksum | IP header checksum |
ip | saddr | Source address |
ip | daddr | Destination address |
ip6 | version | IP header version |
ip6 | priority | |
ip6 | flowlabel | Flow label |
ip6 | length | Payload length |
ip6 | nexthdr | Next header type (upper layer protocol number) |
ip6 | hoplimit | Hop limit |
ip6 | saddr | Source address |
ip6 | daddr | Destination address |
tcp | sport | Source port |
tcp | dport | Destination port |
tcp | sequence | Sequence number |
tcp | ackseq | Acknowledgement number |
tcp | doff | Data offset |
tcp | flags | TCP flags |
tcp | window | Window |
tcp | checksum | Checksum |
tcp | urgptr | Urgent pointer |
udp | sport | Source port |
udp | dport | Destination port |
udp | length | Total packet length |
udp | checksum | Checksum |
udplite | sport | Source port |
udplite | dport | Destination port |
udplite | cscov | Checksum coverage |
udplite | checksum | Checksum |
sctp | sport | Source port |
sctp | dport | Destination port |
sctp | vtag | Verification tag |
sctp | checksum | Checksum |
dccp | sport | Source port |
dccp | dport | Destination port |
ah | nexthdr | Next header protocol (upper layer protocol) |
ah | hdrlength | AH header length |
ah | spi | Security Parameter Index |
ah | sequence | Sequence number |
esp | spi | Security Parameter Index |
esp | sequence | Sequence number |
ipcomp | nexthdr | Next header protocol (upper layer protocol) |
ipcomp | flags | Flags |
ipcomp | cpi | Compression Parameter Index |
icmp | type | ICMP packet type |
icmpv6 | type | ICMPv6 packet type |
ct | state | State of the connection |
ct | direction | Direction of the packet relative to the connection |
ct | status | Status of the connection |
ct | mark | Connection mark |
ct | expiration | Connection expiration time |
ct | helper | Helper associated with the connection |
ct | l3proto | Layer 3 protocol of the connection |
ct | saddr | Source address of the connection for the given direction |
ct | daddr | Destination address of the connection for the given direction |
ct | protocol | Layer 4 protocol of the connection for the given direction |
ct | proto-src | Layer 4 protocol source for the given direction |
ct | proto-dst | Layer 4 protocol destination for the given direction |
meta | length | Length of the packet in bytes: meta length > 1000 |
meta | protocol | Ethertype protocol: meta protocol vlan |
meta | priority | TC packet priority |
meta | mark | Packet mark |
meta | iif | Input interface index |
meta | iifname | Input interface name |
meta | iiftype | Input interface hardware type |
meta | oif | Output interface index |
meta | oifname | Output interface name |
meta | oiftype | Output interface hardware type |
meta | pkttype | Packet type: unicast, multicast or broadcast |
meta | skuid | UID associated with originating socket |
meta | skgid | GID associated with originating socket |
meta | rtclassid | Routing realm |
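A few of the matches from the table can be combined as in the following sketch. It assumes an existing table ip filter with a chain input; the interface name eth0 and the addresses are placeholders, not values from this article.

```nft
# Assumes table "ip filter" with chain "input" exists.
# "eth0" and 192.0.2.0/24 are placeholders.

# ip + tcp + ct matches in one rule: new SSH connections from one subnet.
add rule ip filter input ip saddr 192.0.2.0/24 tcp dport 22 ct state new accept

# meta + udp matches: DNS queries arriving on a given interface.
add rule ip filter input meta iifname "eth0" udp dport 53 accept

# icmp match: allow ping requests.
add rule ip filter input icmp type echo-request accept

# meta match with a comparison: count large packets without a verdict.
add rule ip filter input meta length > 1000 counter
```

All matches in a rule must be true for its statements to run, so the first rule only fires for packets that satisfy the ip, tcp and ct conditions together.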
Statements
Statements represent the action to be performed when the rule matches. There are two kinds: terminal statements, which unconditionally terminate the evaluation of the current rule, and non-terminal statements, which either conditionally or never terminate it. A rule can contain an arbitrary number of non-terminal statements, but only a single terminal statement. The following verdicts and terminal statements are available:
- accept: Accept the packet and stop the ruleset evaluation.
- drop: Drop the packet and stop the ruleset evaluation.
- reject: Reject the packet with an ICMP message.
- queue: Queue the packet to userspace and stop the ruleset evaluation.
- continue: Continue the ruleset evaluation with the next rule.
- return: Return from the current chain and continue at the next rule of the last chain. In a base chain it is equivalent to the base chain policy.
- jump <chain>: Continue at the first rule of <chain>. It will continue at the next rule after a return statement is issued.
- goto <chain>: Similar to jump, but after the new chain the evaluation will continue at the last chain instead of the one containing the goto statement.
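The difference between jump and goto may be easier to see in a small tree of chains. The following is a sketch only; the table and chain names (example_verdicts, ssh_checks, log_and_drop) and the address range are hypothetical.

```nft
# Sketch only: all table and chain names here are made up.
table ip example_verdicts {
    # Non-base chain ending in a terminal statement.
    chain log_and_drop {
        counter log prefix "dropped: " drop
    }

    # Non-base chain without a final verdict: packets that do not match
    # fall off the end and return to the chain that jumped here.
    chain ssh_checks {
        ip saddr 192.0.2.0/24 accept
    }

    chain input {
        type filter hook input priority 0; policy accept;

        # jump: evaluation comes back to the next rule if ssh_checks returns.
        tcp dport 22 jump ssh_checks
        tcp dport 22 counter

        # goto: evaluation does not come back to this chain.
        tcp dport 23 goto log_and_drop
    }
}
```

Here the counter rule only sees SSH packets that ssh_checks did not accept, while Telnet packets never reach any rule placed after the goto.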
Sets
nftables allows defining anonymous and named sets (dictionaries and maps). For example, the following nft script defines the fullbogons set, adds elements to it, and drops packets whose source address is in the set.
rules.nft
#!/sbin/nft -f
add set filter fullbogons { type ipv4_addr; flags interval; }
add element filter fullbogons {0.0.0.0/8}
add element filter fullbogons {10.0.0.0/8}
add element filter fullbogons {41.62.0.0/16}
add element filter fullbogons {41.67.64.0/20}
add rule filter input iifname eth0 ct state new ip saddr @fullbogons counter drop comment "drop from blacklist"
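For comparison, the following sketch puts a named set, an anonymous set and an anonymous verdict map side by side. The table name example_sets, the set name trusted_hosts and all addresses and ports are assumptions made for this illustration.

```nft
# Sketch only: names, addresses and ports are made-up examples.
table ip example_sets {
    # Named set: lives in the table and can be updated later
    # with "add element" / "delete element".
    set trusted_hosts {
        type ipv4_addr
        elements = { 192.0.2.10, 192.0.2.11 }
    }

    chain input {
        type filter hook input priority 0; policy accept;

        # Named set, referenced with @.
        ip saddr @trusted_hosts accept

        # Anonymous set, written inline and fixed once the rule is added.
        tcp dport { 80, 443 } accept

        # Anonymous verdict map: maps a port directly to a verdict.
        tcp dport vmap { 22 : accept, 23 : drop }
    }
}
```

A named set is the better fit when the elements change over time; an anonymous set is enough when the list is short and static.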
Installation
Kernel
Example config: a bare minimum for basic IPv4 firewalling with NAT:
[*] Networking support  --->
    Networking options  --->
        [*] Network packet filtering framework (Netfilter)  --->
            Core Netfilter Configuration  --->
                <M> Netfilter connection tracking support
                <M> Netfilter nf_tables support
                <M>   Netfilter nf_tables conntrack module
                <M>   Netfilter nf_tables counter module
                <M>   Netfilter nf_tables log module
                <M>   Netfilter nf_tables limit module
                <M>   Netfilter nf_tables masquerade support
                <M>   Netfilter nf_tables nat module
            IP: Netfilter Configuration  --->
                <M> IPv4 nf_tables support
                <M> IPv4 packet rejection
                <M> IP tables support (required for filtering/masq/NAT)
                <M>   Packet filtering
                <M>     REJECT target support
                <M>   iptables NAT support
                <M>     MASQUERADE target support
Additional Common Config: For mixed IPv4 and IPv6 rules combined into one table: CONFIG_NF_TABLES_INET
(If family inet is not enabled, only families ip and ip6 can be used individually)
[*] Networking support  --->
    Networking options  --->
        [*] Network packet filtering framework (Netfilter)  --->
            Core Netfilter Configuration  --->
                <M> Netfilter nf_tables support
                [*]   Netfilter nf_tables mixed IPv4/IPv6 tables support
Additional Optional Config: Early filtering based on network device requires netdev tables support: CONFIG_NF_TABLES_NETDEV
[*] Networking support  --->
    Networking options  --->
        [*] Network packet filtering framework (Netfilter)  --->
            Core Netfilter Configuration  --->
                <M> Netfilter nf_tables support
                [*]   Netfilter nf_tables netdev tables support
nftables is very modular and has many more options than mentioned here. Certain software likely requires additional features.
Depending on the intended purpose, either keep to the bare minimum listed here or enable further options as modules; the kernel loads them as needed.
Note: This part of the kernel configuration changes often, so the option names above may differ between kernel versions. Building more options as modules gives more compatibility, and modules are only loaded when requested.
Emerge
Install net-firewall/nftables:
root # emerge --ask net-firewall/nftables
Configuration
OpenRC
The init script supports the following actions:
- /etc/init.d/nftables save stores the currently loaded ruleset in /var/lib/nftables/rules-save
- /etc/init.d/nftables reload loads the ruleset saved in /var/lib/nftables/rules-save
- /etc/init.d/nftables stop is intended to be called on system shutdown; it checks whether SAVE_ON_STOP is enabled in /etc/conf.d/nftables and, if so, saves the ruleset
- /etc/init.d/nftables start is intended to be called on system boot and loads the last saved ruleset
- /etc/init.d/nftables clear flushes the currently loaded ruleset
- /etc/init.d/nftables list lists the currently loaded ruleset
Don't forget to add the nftables service to the default runlevel:
root # rc-update add nftables default
It is suggested to invoke /etc/init.d/nftables save manually after altering the ruleset. Otherwise, if there is an issue during system shutdown and saving the ruleset fails, the system might boot up with an older ruleset.
systemd
After first setup:
root # touch /var/lib/nftables/rules-save
root # systemctl enable --now nftables-restore
Usage
All nftables commands are run with the nft utility from net-firewall/nftables.
Tables
Creating tables
The following command adds a table called filter for the IPv4 (ip) family:
root # nft add table ip filter
Likewise, a table for ARP can be created with:
root # nft add table arp filter
The name "filter" used here is completely arbitrary; any other name would work as well.
Listing tables
The following command lists all tables for the IPv4 (ip) family:
root # nft list tables ip
table filter
The contents of the table filter can be listed with:
root # nft list table ip filter
table ip filter {
    chain input {
        type filter hook input priority 0;
        ct state established,related accept
        iifname "lo" accept
        ip protocol icmp accept
        drop
    }
}
Using -a with the nft command shows the handle of each rule. Handles are used for various operations on specific rules:
root # nft -a list table ip filter
table ip filter {
    chain input {
        type filter hook input priority 0;
        ct state established,related accept # handle 2
        iifname "lo" accept # handle 3
        ip protocol icmp accept # handle 4
        drop # handle 5
    }
}
Deleting tables
The following command deletes the table called filter for the IPv4 (ip) family:
root # nft delete table ip filter
Chains
Adding chains
The following command adds a chain called input to the ip filter table and registers it with the input hook at priority 0. It is of the type filter.
root # nft add chain ip filter input { type filter hook input priority 0 \; }
When running this command from Bash, the semicolon needs to be escaped.
A non-base chain can be added by not specifying the chain configurations between the curly braces.
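A non-base chain could be created and used as in the following sketch, written in nft script syntax. It assumes the table ip filter and the base chain input from the commands above; the chain name tcp_services and the ports are hypothetical.

```nft
# Assumes table "ip filter" and base chain "input" already exist.

# No curly braces with type/hook/priority: this is a non-base chain.
add chain ip filter tcp_services

# Rules inside the non-base chain.
add rule ip filter tcp_services tcp dport { 80, 443 } accept

# The chain only sees traffic once a base chain jumps to it.
add rule ip filter input ip protocol tcp jump tcp_services
```

The same lines can be typed on the command line by prefixing each with nft; in Bash the curly braces of the anonymous set should then be quoted or escaped.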
Removing chains
The following command deletes the chain called input:
root # nft delete chain ip filter input
Chains can only be deleted if there are no rules in them.
Rules
Adding rules
The following command adds a rule to the chain called input, on the ip filter table, dropping all incoming traffic to port 80:
root # nft add rule ip filter input tcp dport 80 drop
Deleting rules
To delete a rule, you first need to get the handle number of the rule. This can be done by using the -a flag on nft:
root # nft -a list table ip filter
table ip filter {
    chain input {
        type filter hook input priority 0;
        tcp dport http drop # handle 2
    }
}
It is then possible to delete the rule with:
root # nft delete rule ip filter input handle 2
Management
Atomic rule loading
nft supports atomic rule replacement by using nft -f. Thus it is possible to conveniently manage the rules in a text file. Comments may be added to the file by prefixing them with #, as with shell scripts; they can also be appended to the end of rules as comment "<arbitrary string>", and these will be preserved as-is in nft list output.
Compared to building a ruleset with multiple nft calls in a shell script, this also ensures that failures in such a script do not end with an only partially applied ruleset.
/etc/nftables-local (skeleton nftables config file)
#!/sbin/nft -f
# this is a skeleton file for an nftables ruleset
# load it with nft -f /etc/nftables-local
# it is supported to define variables here, that can later on be
# expanded in rule definitions
define http_ports = {80, 443}
flush ruleset
table inet local {
    chain input {
        type filter hook input priority 0; policy drop;
        tcp dport $http_ports counter accept comment "incoming http traffic";
    }
    chain output {
        type filter hook output priority 0; policy drop;
    }
}
Backup
You can also back up your rules:
root # echo "flush ruleset" > backup.nft
root # nft list ruleset >> backup.nft
If you are loading your ruleset with nft -f from a file, do not overwrite this file with the nft list ruleset output. The listing does not contain the # comments and variable definitions of the original file, so they would be lost.
Logging
Start logging
Logging of, for example, dropped packets is possible by adding a rule with the keyword log at the end of the relevant chain, e.g. log prefix "nft.dropinput".
With this prefix, log entries such as the following are written to /var/log/messages:
/var/log/messages (example entry in messages file)
Jun 07 13:35:19 host kernel: nft.dropinput IN=eno1 OUT= MAC=...
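A chain that logs whatever it is about to drop could be sketched as follows. The table name example_log and the accepted port are assumptions; only the log prefix is taken from the text above.

```nft
# Sketch only: "example_log" and the accepted port are made-up examples.
table inet example_log {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iifname "lo" accept
        tcp dport 22 accept

        # Last rule: every packet that reaches this point is logged and
        # then dropped by the chain policy.
        log prefix "nft.dropinput "
    }
}
```

The log statement is non-terminal, so the packet continues to the end of the chain, where the drop policy applies.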
Configure syslog-ng
By default these log entries are written to the messages file and clutter it. Because they all carry the prefix, syslog-ng filters can redirect them to their own file, nft.log.
/etc/syslog-ng/syslog-ng.conf (entries for logging nft messages to their own file)
# add destination for nft messages
destination netfilter { file("/var/log/nft.log"); };
# add filters for nft (netfilter) messages and for all other messages
filter f_netfilter { message("nft"); };
filter f_messages { not message("nft"); };
# use filter for nft logging
log { source(src); filter(f_netfilter); destination(netfilter); };
# modify original lines to filter out nft messages
log { source(src); filter(f_messages); destination(messages); };
log { source(src); filter(f_messages); destination(console_all); };
Examples
See the Nftables examples article.
Troubleshooting
Before loading new or edited rules, check them with nft:
user $ nft -c -f ruleset
No such file or directory
If this error is printed for every chain of a table definition, make sure that the table's family is available in the kernel. This happens, for example, if the table uses family inet and the kernel configuration did not enable mixed IPv4 and IPv6 rules (CONFIG_NF_TABLES_INET).
Conflicting intervals
A set definition with IP ranges causes this error if ranges overlap. For example, 240.0.0.0/5 lies completely within 224.0.0.0/3. Either add auto-merge to the set's options, drop the range that is fully included, or change the syntax to 224.0.0.0-255.255.255.255.
table netdev filter {
    set blocked_ipv4 {
        type ipv4_addr
        flags interval
        # fix the last two ranges overlapping
        auto-merge
        elements = { 169.254.0.0/16, 224.0.0.0/3, 240.0.0.0/5 }
    }
}
Restart of nftables or reboot causes blocked connections
The default configuration of the save and restore functions uses numeric mode to store the ruleset. The persisted ruleset can therefore differ from the manually written file it was originally loaded from, and such a transformation might break things. Check that:
- /etc/conf.d/nftables contains the parameter -n in SAVE_OPTIONS,
- loading your ruleset as root yields a working configuration,
- and the save and restore cycle triggered by restarting the nftables service causes the issue.
If all three conditions are met, remove the -n parameter from SAVE_OPTIONS in /etc/conf.d/nftables. Then load your ruleset again from the manually written file and restart the service. This cycles through save and restore and should create a fully working ruleset.
This affected at least version 0.9.9, see bug #819456.
Family netdev and ingress hook
Broken packets should be rejected early, which requires an ingress hook in family netdev. This sets up a chain that acts on a dedicated network device before packets enter further processing, which improves performance. The configuration looks like this:
table netdev filter {
    chain ingress {
        type filter hook ingress device enp4s0 priority -500;
        # Drop all fragments.
        ip frag-off & 0x1fff != 0 counter drop
        # Drop XMAS packets.
        tcp flags & (fin|syn|rst|psh|ack|urg) == fin|syn|rst|psh|ack|urg counter drop
        # Drop NULL packets.
        tcp flags & (fin|syn|rst|psh|ack|urg) == 0x0 counter drop
        # Drop uncommon MSS values.
        tcp flags syn tcp option maxseg size 1-535 counter drop
    }
}
Mind the device name enp4s0. If it changes, for example after a hardware change or an upgrade that alters device naming, this table fails to load, and as a result none of the rules are loaded. The error looks like this (filename and line numbers differ depending on the host configuration):
/etc/nftables.conf:94:9-15: Error: Could not...
  chain ingress {
        ^^^^^^^
/etc/nftables.conf:94:9-15: Error: Could not...
  chain ingress {
        ^^^^^^^
/etc/nftables.conf:94:9-15: Error: Could not...
  chain ingress {
        ^^^^^^^
/etc/nftables.conf:94:9-15: Error: Could not...
  chain ingress {
        ^^^^^^^
/etc/nftables.conf:94:9-15: Error: Could not...
  chain ingress {
        ^^^^^^^
Check that the device name is correct and that the device exists, e.g. with ip addr list.
See also
- Iptables: a program used to configure and manage the kernel's netfilter modules; the article contains a section about migration to nftables.
External Resources
- https://kernelnewbies.org/nftables_examples
- https://wiki.archlinux.org/index.php/Nftables
- https://wiki.nftables.org/wiki-nftables/index.php/Main_Page
- https://wiki.nftables.org/wiki-nftables/index.php/Quick_reference-nftables_in_10_minutes
- https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to_nftables