It’s probably about time I broke tradition and brought this blog daringly close to being on-topic. This time it’s all about securing public network infrastructure. This in itself could feel like a bit of an oxymoron; why bother securing a router or switch whose job it is to pass all traffic, both good and bad?
There are all sorts of reasons, from bad guys causing service interruptions or, worse, hijacking IP addresses, through to smaller, less noticeable abuses such as using port mirroring or sflow to sniff or summarise network traffic. Thankfully, there are a number of key measures we can enact to ensure traffic always goes where it should. This blog is focussed primarily on Cisco infrastructure, but the principles should apply to all campus-to-carrier-grade network equipment.
It’s worth noting that although I have gone into a reasonable level of detail here, this is not an exhaustive list, and further security measures are usually required, depending on the structure and design of the network in question.
Update, update, update
The first thing to remember is that security vulnerabilities are discovered in public routing and switching infrastructure all the time. I strongly advise subscribing to mailing lists run by organisations such as the Cisco Product Security Incident Response Team (PSIRT) as well as some of the excellent ones documented at SecLists.Org. The idea here is that by keeping the software running on devices up-to-date, you’ll stay on top of known backdoors and holes. Bear in mind that many manufacturers (Cisco, Brocade and Juniper come to mind) will backport security updates, so even if your network is built on a slightly older train, you can often still keep on top of security issues.
Make use of AAA
No, not batteries: AAA stands for Authentication, Authorisation and Accounting, and it’s the way that access to devices is controlled and monitored. Let’s break it down:
Authentication is the acceptance of a user as being valid on the system. Usually the user’s credentials (a public SSH key or a password) are checked either against a local user database or against a central service such as TACACS+ or RADIUS. Once a user is accepted onto a system, the process of Authorisation begins. It is this process that determines the level of access the particular user has been granted. This could be as little as ‘show’ commands in the case of frontline support agents, or in the case of a senior engineer, full configure, reset and clear commands could be available. Lastly, Accounting is used to keep an audit of what the user has done during their session. In the case of access networks, this can be information on traffic throughput; in the case of core and edge devices, it can be used to keep a log of commands executed by the user.
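To make the split between Authorisation and Accounting a little more concrete, here is a minimal sketch in Python. It is a toy model only: the role names, the command lists and the in-memory audit log are all made up for illustration, and a real deployment would hand this work to a TACACS+ or RADIUS server.

```python
# Hypothetical roles and the first command word each role may use.
ROLE_COMMANDS = {
    "support": ("show",),
    "senior": ("show", "configure", "clear"),
}

# Accounting: a simple in-memory audit trail of (user, command, allowed).
audit_log = []


def is_authorised(role, command):
    """Authorisation: is the command's first word permitted for this role?"""
    words = command.split()
    if not words:
        return False
    return words[0] in ROLE_COMMANDS.get(role, ())


def run_command(user, role, command):
    """Check authorisation and record the attempt, whether allowed or not."""
    allowed = is_authorised(role, command)
    audit_log.append((user, command, allowed))
    return allowed
```

Note that denied attempts are logged too; a refused ‘configure’ from a support account is often exactly the thing you want to see in the audit trail.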
In an estate of tens, hundreds or even thousands of devices, it is imperative that centrally controlled AAA is used. On the day that a member of staff joins or leaves a team, or gets promoted to a more senior role, modifying the AAA configuration on hundreds or thousands of devices can be painful and prone to human error. If your password policy dictates regular password changes, this could present a similar nightmare of having to change the configuration on every device.
Using centrally managed AAA systems means you avoid the worrying situation where an ex-member of staff’s account is left live with the old password, which would allow them to tamper with your infrastructure long after they have left your organisation.
Once the above has been configured, it’s a good idea to remove all default or system accounts and change default SNMP communities to ensure that, although the back door is locked, the front door is not left wide open.
All management interfaces will benefit from having an Access Control List (ACL) attached to them that permits only the IP addresses of your management stations. This, along with ACLs on SSH and SNMP access, should prevent an unauthorised workstation from accessing the configuration interface of the equipment.
In the event of an issue, quick access to information can mean the difference between a security incident and securely saving the day! As such, it’s great to have the logs in one place and thus make the information easily accessible. Log-collating software packages such as Splunk are great for this, allowing fast searches using regular expressions (regexp) to match exacting criteria for problem solving.
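As a small illustration of the kind of regexp search I mean, here is a Python sketch that counts failed logins per source address. The log lines are invented samples loosely modelled on a Cisco login-failure message, and the pattern is an assumption that you would need to adjust to whatever your devices actually emit.

```python
import re
from collections import Counter

# Assumed message layout: "... LOGIN_FAILED ... [user: X] [Source: Y] ...".
LOGIN_FAILED = re.compile(
    r"LOGIN_FAILED.*?\[user: (?P<user>[^\]]*)\].*?\[Source: (?P<source>[^\]]+)\]"
)

# Invented sample lines using documentation address ranges.
SAMPLE_LOG = [
    "%SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: admin] [Source: 203.0.113.7] [localport: 22]",
    "%SYS-5-CONFIG_I: Configured from console by alice on vty0 (192.0.2.10)",
    "%SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: root] [Source: 203.0.113.7] [localport: 22]",
    "%SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: admin] [Source: 198.51.100.23] [localport: 22]",
]


def failed_logins_by_source(lines):
    """Count log lines that match the failed-login pattern, per source IP."""
    counts = Counter()
    for line in lines:
        match = LOGIN_FAILED.search(line)
        if match:
            counts[match.group("source")] += 1
    return counts


counts = failed_logins_by_source(SAMPLE_LOG)
```

With the logs from every device in one place, the same search runs across the whole estate rather than one router at a time.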
An often-overlooked element of network security is the monitoring, which should be nothing short of comprehensive. In addition to the usual availability and traffic throughput monitoring, it’s also worthwhile ensuring that interface status, routing status, temperature, power (and many other things) are graphed alongside threshold alerting. This will bring potential issues to the attention of the Network Operations Centre as soon as possible. The information should also be archived to allow heuristic calculations based on historical data.
In order to avoid death by over-information, it’s important to ensure that the thresholds for alerts are meticulously set. Getting too many alerts can be just as bad as getting none, as these are often missed by staff who receive them in the hundreds each day. That alert for the gradually increasing power supply temperature in a core device should not be hidden among alerts for ports that were decommissioned six months ago.
It’s also a good idea to keep logs of changes to router configurations. The Really Awesome New Cisco confIg Differ (RANCID; yes, really) is great for this and can automatically log in to network devices periodically to check for status and configuration changes. It can then store these in a repository of your choice, such as SVN or CVS, which can be configured to e-mail changes to the network teams, who will review them as necessary.
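The core idea, comparing the latest configuration snapshot with the previous one, can be sketched in a few lines of Python using the standard difflib module. This is not how RANCID itself is implemented; the function name, device name and configuration text are placeholders for illustration.

```python
import difflib


def config_diff(old, new, name="router1"):
    """Return a unified diff between two configuration snapshots.

    An empty string means the snapshots are identical.
    """
    diff = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=name + " (previous)",
        tofile=name + " (current)",
        lineterm="",
    )
    return "\n".join(diff)


# Placeholder snapshots: someone has re-enabled the HTTP server.
previous = "hostname r1\nno ip http server\n"
current = "hostname r1\nip http server\n"
report = config_diff(previous, current, name="r1")
```

A non-empty report is what would be e-mailed to the network team for review.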
It almost goes without saying these days, but routers still support not only telnet and http but also ftp and tftp, among other unencrypted services. These protocols can transmit passwords over public networks in plain text, which could feasibly let an unscrupulous party retrieve login credentials along with any information exchanged between the admin terminal and the device. They are simple to disable: on Cisco IOS, ‘no ip http server’ turns off the web interface, and ‘transport input ssh’ on the vty lines restricts remote logins to SSH, which shuts out telnet. Make these changes only after enabling and testing the secure equivalents, otherwise the legitimate admin may be locked out!
Know your flows
In the event of unwanted traffic such as DDoS attack packets passing your borders, it’s especially useful to know where they’re coming from and where they’re going to, as well as their type, protocol, size, etc. Happily, mechanisms such as sflow and netflow exist, which export sampled packet headers or per-flow summaries to a reporting server. The reporting server then uses this information to present data on the flows passing through the device. Reports can be generated based on such metrics as top talkers by packets, flows, bits per second, etc., all of which is great for tracking down the source of the DDoS.
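A “top talkers” report is, at heart, just an aggregation over flow records. Here is a minimal Python sketch of that idea; the record layout (a dict with ‘src’, ‘dst’ and ‘bytes’ keys) and the sample values are assumptions for illustration, not the export format of any particular collector.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Return the n source addresses that sent the most bytes."""
    totals = Counter()
    for flow in flows:
        totals[flow["src"]] += flow["bytes"]
    return totals.most_common(n)


# Invented flow records using documentation address ranges.
FLOWS = [
    {"src": "203.0.113.9", "dst": "192.0.2.80", "bytes": 4000},
    {"src": "198.51.100.4", "dst": "192.0.2.80", "bytes": 1200},
    {"src": "203.0.113.9", "dst": "192.0.2.80", "bytes": 5000},
    {"src": "198.51.100.77", "dst": "192.0.2.25", "bytes": 300},
]

worst = top_talkers(FLOWS, n=2)
```

Swapping the ‘bytes’ key for a packet or flow count gives the other rankings mentioned above.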
BCP38 is one of the IETF’s Best Current Practice documents and describes a method for preventing IP spoofing by blocking spoofed packets before they enter the Internet. It’s published as RFC 2827 and is aimed at defeating DoS attacks that rely on forged source addresses, yet it’s quite easy to implement. For example, BCP38 suggests that if a customer network uses 192.0.2.0/24, only packets with a source address within that subnet are allowed to pass inbound on the customer-facing interface. This is done with an ACL statement, the syntax of which depends on the router software in use.
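The check itself is simple enough to express in a few lines of Python. This is only a sketch of the decision an ingress ACL makes, not router configuration; the function name is made up and the prefix is a documentation range standing in for the customer’s real allocation.

```python
import ipaddress

# Assumed customer allocation (a documentation prefix, for illustration).
CUSTOMER_PREFIXES = ["192.0.2.0/24"]


def permit_ingress(source, customer_prefixes=CUSTOMER_PREFIXES):
    """Permit a packet only if its source address is inside a customer prefix."""
    address = ipaddress.ip_address(source)
    return any(
        address in ipaddress.ip_network(prefix) for prefix in customer_prefixes
    )


legitimate = permit_ingress("192.0.2.45")    # inside the customer's subnet
spoofed = permit_ingress("198.51.100.1")     # claims an address it cannot own
```

Anything that fails the check is dropped at the customer-facing interface, so a spoofed packet never gets as far as the Internet.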
Lookout and lockout
Another often-overlooked but simple method for securing infrastructure is instantly blocking further access attempts from a source IP address after a number of failed access attempts within a predefined timeframe. A good example could be three failed attempts within 60 seconds. Some network engineers implement tighter controls, perhaps blocking an entire /24 after three attempts from within that /24 in the allowed timeframe. The exact policy implemented will depend on many external factors, such as compliance standards. Some security standards such as PCI DSS specify minimum settings in these areas.
It’s worth mentioning TCP keepalives at this point. The idea is to keep your device’s TCP stack clean and tidy rather than hanging on to stale, half-open TCP sessions. It’s a one-liner on most devices; Cisco use the commands ‘service tcp-keepalives-in’ and ‘service tcp-keepalives-out’ for sessions originating from outside the router and inside the router respectively. With these enabled, the device will send TCP keepalives and drop any sessions that remain orphaned.
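To show the logic behind a lockout policy like “three failures in 60 seconds”, here is a minimal Python sketch. The class name and the five-minute block duration are my own assumptions for illustration, and timestamps are passed in explicitly so the behaviour is easy to follow; on real equipment this is a built-in feature rather than something you write yourself.

```python
from collections import defaultdict, deque


class LoginGuard:
    """Block a source after too many failed logins inside a time window."""

    def __init__(self, max_failures=3, window=60.0, block_for=300.0):
        self.max_failures = max_failures
        self.window = window          # seconds in which failures are counted
        self.block_for = block_for    # assumed block duration, in seconds
        self.failures = defaultdict(deque)  # source -> failure timestamps
        self.blocked_until = {}             # source -> time the block ends

    def is_blocked(self, source, now):
        return now < self.blocked_until.get(source, 0.0)

    def record_failure(self, source, now):
        """Record a failed attempt; return True if the source is now blocked."""
        attempts = self.failures[source]
        attempts.append(now)
        # Forget failures that have fallen outside the window.
        while attempts and now - attempts[0] > self.window:
            attempts.popleft()
        if len(attempts) >= self.max_failures:
            self.blocked_until[source] = now + self.block_for
            attempts.clear()
        return self.is_blocked(source, now)
```

Blocking a whole /24 instead of a single address would only mean keying the two dictionaries on the enclosing network rather than the individual source.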
Small service, big impact
Many DDoS attacks utilise small UDP services such as NTP, DNS and even Chargen because a small request can trigger a much larger response. Definitely look at switching these off, or at least blocking access to all but your own hosts that require it. Most routers will never need to run such services in real-world deployments, and exposed services are very often abused, as bots scan for them regularly.
Lastly, and probably most painfully for many network engineers, there is forgetting to save the configuration of network devices. The simple (and almost subconsciously habitual) command ‘write memory’ can save a world of pain after making configuration changes on a device. If the device is rebooted without running this, any changes made since the last ‘write memory’ will be lost. If this includes configuration that covers the remote administration of the device in question, it can often mean a long drive to the data centre before the device can be recovered. See my previous blog on “Out of Band Networks” for more on that scenario.
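The arithmetic behind that “small request, large response” problem is worth spelling out. The Python sketch below uses invented packet sizes and bandwidth purely to show the calculation; real amplification factors vary widely by protocol and configuration.

```python
def amplification_factor(request_bytes, response_bytes):
    """How many bytes the victim receives per byte the attacker sends."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


def reflected_mbps(attacker_mbps, factor):
    """Traffic arriving at the victim for a given attacker send rate."""
    return attacker_mbps * factor


# Invented figures: a 60-byte spoofed query drawing a 3000-byte reply.
factor = amplification_factor(60, 3000)
victim_load = reflected_mbps(10, factor)
```

With those made-up numbers the factor is 50, so 10 Mbps of spoofed queries becomes 500 Mbps aimed at the victim, which is why an open service on your router is so attractive to attackers.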
I’ve raised a few points here, but in reality this is barely scratching the surface when it comes to router security. In addition to the above, it’s worth ensuring that your BGP configuration is secure, your OSPF is not advertising LSAs from the wrong interfaces, your EIGRP keys are long and hard to crack, your prefixes have the correct RPKI origin validation and much, much more.
I think the key message here should be to write your network security policy and make it relevant to the organisation it serves. Stick to it like glue for every device and do not make one device the weak link that breaks the security chain. And importantly, don’t be hesitant to evolve your policy as and when needed.