It's great that there's a lot more attention these days to making sure websites are fast. No one likes waiting for a page to load. At Hunch, we're constantly trying to reduce page load times to under a second. We don't always get there, but it's what we're aiming for.
There are two parts to meeting these goals. First, your servers have to be able to generate a page in well under a second. Second, they have to keep doing this consistently even as more load is applied, errors occur, flash crowds appear, and so on. The raw number of pages per second each server can handle isn't really part of the equation (assuming your app is reasonably horizontally scalable over the performance range of interest); that's more a question of economics, trading hardware expense against programmer time spent optimizing code.
Unfortunately, building applications that deliver low-variance performance is really hard. The most common cause of variable performance is simple overload, and the simplest defense is headroom: run all infrastructure at no more than 50-75% utilization. Beyond that, configure Apache to limit the number of client connections it will accept (the MaxClients setting), see if your load balancers can reject traffic above a certain level, and design your application with an "emergency mode" where it simplifies its operation (no writing, no advanced features, etc.).
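An "emergency mode" switch can be as simple as a flag the app checks on every request. This is a minimal sketch, not code from Hunch's stack; the function names and the environment-variable trigger are illustrative assumptions:

```python
import os

def emergency_mode():
    # Hypothetical trigger: an operator (or a load monitor) sets this
    # environment variable when the site is under duress.
    return os.environ.get("EMERGENCY_MODE") == "1"

def handle_request(page, user_id, db):
    """Serve a page, shedding expensive work when in emergency mode."""
    if emergency_mode():
        # Simplified operation: no writes, no advanced features.
        return {"page": page, "personalized": False}
    db.append(("visit", user_id, page))  # the write path we skip under load
    return {"page": page, "personalized": True}
```

The point is that the degraded path does strictly less work (in particular, no writes), so one flag flip immediately lowers both load and variance.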
In my experience, the next most common cause of variable performance is unpredictable locking problems, or code errors that leave critical locks held for too long. For example, if a web server leaks a db connection without ending an open transaction, that transaction's locks can make the entire db inaccessible to other clients. Ideally, databases would have a way to automatically abort any transaction running longer than a certain amount of time. Failing that, you can write scripts that monitor your db and kill transactions that have run for too long.
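Such a watchdog script might look like the sketch below. The `information_schema.innodb_trx` view and the `KILL` statement are real MySQL/InnoDB features, but the surrounding plumbing (connection handling, the 30-second threshold) is an assumption you'd tune for your own setup:

```python
MAX_TRX_SECONDS = 30  # threshold is a tuning choice, not a magic number

def transactions_to_kill(trx_rows, max_seconds=MAX_TRX_SECONDS):
    """Given (thread_id, seconds_running) pairs, pick the ones to kill."""
    return [tid for tid, secs in trx_rows if secs > max_seconds]

def reap_long_transactions(conn, max_seconds=MAX_TRX_SECONDS):
    """Run periodically (e.g. from cron) against a MySQL connection."""
    cur = conn.cursor()
    cur.execute(
        "SELECT trx_mysql_thread_id, "
        "       TIMESTAMPDIFF(SECOND, trx_started, NOW()) "
        "FROM information_schema.innodb_trx"
    )
    for thread_id in transactions_to_kill(cur.fetchall(), max_seconds):
        # KILL drops the connection, which rolls back its open transaction
        # and releases the locks it was holding.
        cur.execute("KILL %d" % thread_id)
```

Keeping the selection logic in a pure function (`transactions_to_kill`) makes the threshold policy easy to test without a live database.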
Another source of variable performance is writes that form a hot spot somewhere in your data, leading to contention as clients synchronize access to it. One solution is to not write data from your web app to your db at all, but instead log it to a file or shared queue and have a backend process collect the data and lazily write it as time permits. Of course, this only works if you don't need immediate access to the data. Another solution is to use data structures that don't require complicated synchronization: for example, instead of updating statistics about your site in your db, update them in memcache using atomic increment operations and periodically sync them back to the db.
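The memcache counter approach might look like this sketch. The cache client is assumed to follow the `add`/`incr`/`decr`/`get` API of python-memcached; the table name and db handle are illustrative:

```python
def bump_stat(cache, stat, n=1):
    """Count an event with an atomic server-side increment (no db locks)."""
    if cache.incr(stat, n) is None:
        # Key didn't exist yet; add() seeds it without clobbering a
        # concurrent writer, then we retry the increment.
        cache.add(stat, 0)
        cache.incr(stat, n)

def flush_stats(cache, db, stats):
    """Run periodically from a backend process, not per-request."""
    for stat in stats:
        n = cache.get(stat) or 0
        if n:
            # Decrement rather than delete, so counts accrued during
            # the flush aren't lost.
            cache.decr(stat, n)
            db.execute("UPDATE stats SET value = value + %s WHERE name = %s",
                       (n, stat))
```

Web processes only ever call `bump_stat`, so they never contend on a db row lock; the db sees one batched `UPDATE` per stat per flush interval.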
Finally, caching can lead to highly variable performance. This can happen because requests hit objects that weren't previously cached, or because your cache evicted an important piece of data to make room for a less important one. Worst case, you discover that your app can't run without a warm cache at all, and so you can't restart it under load.