Chapter 2. Recommendations

Table of Contents

1. Hardware Issues
1.1. Time
1.2. Failing disks
1.3. Hardware monitoring
2. Physical Security
2.1. Safes and Data organization
2.2. Buildings
3. Network Issues
4. Certificate Issues
4.1. CDPs
4.2. Application specific problems
5. Organizational Aspects
5.1. Dual Access Control
5.2. Privacy vs. Security
5.3. Enforcement of Access Control
5.4. Privacy Officer Integration
5.5. Enterprise Integration
5.6. Parallel use of several end user PKIs

This chapter gives you a number of hints to bear in mind when you design your first PKI. Please don't skip it even if you are an experienced PKI administrator. We try to list the big traps here, so if you know of another major problem, please add it.

1. Hardware Issues

This section lists hardware issues that have caused problems for some PKIs in production. It does not discuss performance issues.

1.1. Time

One of the biggest problems of PKI systems is time. There are two kinds of computers: online and offline. The usual administrator's logic is that a network-connected computer can use a timeserver. The question is whether you can trust this timeserver. A timeserver raises two questions. First, does the timestamp really come from the timeserver? Second, is the time source of the timeserver trustworthy? The connection to a timeserver can be secured with tunnel technologies like IPsec, but the real problem is the time source. Most timeservers ultimately rely on a radio clock which receives an unsigned time signal from a radio station. This signal can easily be faked because it is very weak. So network time sources are really insecure.

After discussing the disadvantages of online computers we can turn to offline technologies. Radio clocks are problematic for the reason given above, so we need not discuss them twice. In addition, many buildings made from good ferroconcrete have problems receiving radio signals because they act as really nice Faraday cages! So what are the alternatives? First, you need a trustworthy time source. Simply take a digital watch and compare its time with several other clocks on the Internet, the video text of your TV, a radio clock, the sun ;-), GPS and any other source you can find. Second, you transfer the time from your watch to the computer. Last but not least, you have to check the drift and the clock itself on the computer. The drift is a small and easy to handle problem. The clock itself is a much bigger problem. If your computer is always connected to a power outlet then the clock should only drift. Please remember this if you put your CA on a notebook and the notebook into a safe, because without external power the clock depends on the CMOS battery. Several new notebooks have really bad CMOS batteries, which results in a wrong time at every reboot. So as you can see, time is not trivial.
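The comparison and the drift check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names, the measured offsets and the 2 second tolerance are made up for the example, and the offsets are assumed to have been read off by hand (source time minus watch time, in seconds).

```python
from statistics import median

def consensus_offset(offsets_s, tolerance_s=2.0):
    """Return the median offset and the sources that disagree with it.

    offsets_s maps a source name to (source time - watch time) in seconds.
    The tolerance is an assumed value, not a recommendation.
    """
    mid = median(offsets_s.values())
    outliers = sorted(
        name for name, off in offsets_s.items() if abs(off - mid) > tolerance_s
    )
    return mid, outliers

def drift_per_day(offset_start_s, offset_end_s, elapsed_days):
    """Average clock drift in seconds per day between two manual checks."""
    return (offset_end_s - offset_start_s) / elapsed_days

# Hypothetical readings; the radio clock value imitates a faked signal.
readings = {
    "internet_clock": 0.4,
    "video_text": -0.8,
    "radio_clock": 120.0,
    "gps": 0.0,
}
mid, outliers = consensus_offset(readings)
print("median offset:", mid, "suspicious sources:", outliers)
print("drift per day:", drift_per_day(0.0, 3.5, 7))
```

The median is used instead of the mean so that a single faked or broken source cannot pull the result far away from the other sources.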

1.2. Failing disks

The most common hardware failures involve cooling fans and disks. You should back up all your important data, especially ALL issued certificates. Never lose a certificate, or you must revoke the complete CA. Backups are a nice thing, but recovering from a backup costs time. This has two consequences. First, you must have a detailed plan, including a timetable, for how to recover from a backup. Second, you should be able to tolerate disk crashes. RAIDs are sometimes expensive but they help a lot (ask your SAN admins :) ).
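One way to check that a backup really contains all issued certificates is to compare checksums of the live files with those of the backup copy. The following Python sketch assumes that the certificates are stored as individual files in a directory and that the backup is reachable as a second directory; both assumptions and the function names are hypothetical and do not describe how any particular PKI software stores its data.

```python
import hashlib
from pathlib import Path

def file_digests(directory):
    """Map each file's path (relative to directory) to its SHA-256 digest."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }

def missing_or_changed(live_dir, backup_dir):
    """List live files that are absent from the backup or differ from it."""
    live = file_digests(live_dir)
    backup = file_digests(backup_dir)
    return sorted(name for name, digest in live.items() if backup.get(name) != digest)
```

An empty result means that every file in the live directory has an identical copy in the backup. It says nothing about whether the backup medium itself can be read after a disk crash, so a test restore is still needed.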

1.3. Hardware monitoring

Usually you notice by simple visual observation when your laptop crashes. A crashed offline computer can be detected by visual monitoring too. However, a crashed online component of a PKI is a problem because important information, such as newly issued certificates and CRLs, is not available. Such a situation also brings down services like SCEP. If a public interface of the security infrastructure is down, you are bound to have a trust problem with your users in the future. It is also worth noting that a simple ping is not enough to detect a failure. You cannot detect a crashed web server or OpenCA Perl server with a ping. Software bugs can also cause 100 percent load. I know this problem really well from our web mail programs :)
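A check at the application level could look like the following Python sketch. It requests a page and only reports success if the server answers in time, returns HTTP status 200 and the page contains an expected text. The URL, the expected text and the timeout are assumptions chosen for the example; they are not taken from a real installation.

```python
import http.client
import urllib.request

def _http_fetch(url, timeout_s):
    """Fetch url and return (status code, body text)."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")

def service_is_healthy(url, expected_text, timeout_s=5.0, fetch=_http_fetch):
    """Return True only if the service answers in time with the expected page.

    A host that still answers a ping but has a crashed or overloaded web
    server fails this check, either through an error or through the timeout.
    """
    try:
        status, body = fetch(url, timeout_s)
    except (OSError, http.client.HTTPException):
        # Connection refused, DNS failure, timeout, HTTP error status, ...
        return False
    return status == 200 and expected_text in body

# Hypothetical usage; replace the URL and text with values from your setup:
# service_is_healthy("https://pki.example.org/pub/", "Certificate")
```

The fetch parameter only exists so that the check can be tried out without a network connection. A monitoring system would call such a check periodically and raise an alarm after one or more failures.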