The first public release of Multics, the ancestor of Unix, was 40 years ago this month....
http://www.multicians.org/chrono.html
The source is available too. Maybe someone wants to port it to x86?
http://web.mit.edu/multics-history/source/Multics_Internet_Server/Multics_sources.html
Saturday, October 24, 2009
Tuesday, October 20, 2009
Consistent performance
It's great that there's a lot more attention these days to making sure websites are fast. No one likes waiting for a page to load. At Hunch, we're constantly trying to reduce page load times to under a second. We don't always get there, but it's what we're aiming for.
There are two parts to meeting these goals. First, your servers have to be able to generate a page in well under a second. Second, your servers have to do this consistently even as more load is applied to them, errors occur, flash crowds appear, etc. The number of pages per second each server can handle isn't part of the equation (assuming your app is reasonably horizontally scalable over the performance range of interest). That is mostly a question of economics: trading hardware expense against the programmer time needed to optimize code.
Unfortunately, building applications that provide low-variance performance is really hard. The simplest thing to do is run all infrastructure at no more than 50-75% utilization. Configure Apache to limit the number of client connections it will accept (the MaxClients setting), see if your load balancers will reject traffic above a certain level, and design your application with an "emergency mode" where it simplifies its operation (no writing, no advanced features, etc.).
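As a rough illustration of the "emergency mode" idea, here is a minimal sketch of admission control inside the app itself. The class name `LoadShedder`, its two thresholds, and the three return values are all made up for this example; in a real deployment the hard cap would more likely live in Apache or the load balancer.

```python
import threading


class LoadShedder:
    """Hypothetical admission control: degrade first, then reject."""

    def __init__(self, max_in_flight, emergency_at):
        if not 0 < emergency_at <= max_in_flight:
            raise ValueError("need 0 < emergency_at <= max_in_flight")
        self.max_in_flight = max_in_flight
        self.emergency_at = emergency_at
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_enter(self):
        """Return 'normal', 'emergency' or 'reject' for a new request."""
        with self._lock:
            if self._in_flight >= self.max_in_flight:
                return "reject"  # caller should answer with an error page
            self._in_flight += 1
            if self._in_flight > self.emergency_at:
                return "emergency"  # serve a simplified, read-only page
            return "normal"

    def leave(self):
        """Call once for every request that was admitted."""
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1
```

A request handler would call `try_enter()` first, skip writes and expensive features when it returns "emergency", and call `leave()` in a `finally` block for every admitted request.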
In my experience, the next most common reason for variable performance is unpredictable locking problems or code errors that cause critical locks to be held for too long. For example, if a web server leaks a db connection and doesn't end an open transaction, the entire db may be inaccessible to other clients because of that transaction's locks. Ideally, databases would have a way to automatically abort any transaction that has been running for more than a certain amount of time. Alternatively, you can write scripts that monitor your db and kill transactions that have run for too long.
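The core of such a monitoring script is small. The sketch below shows only the selection step, under the assumption that the script has already pulled a snapshot of open transactions from the database; `OpenTransaction`, `find_stale` and the field names are hypothetical, and both fetching the snapshot and killing a transaction are database-specific and left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenTransaction:
    """One row of a (hypothetical) snapshot of open transactions."""
    txn_id: int
    started_at: float  # seconds since the epoch


def find_stale(transactions, now, max_age_seconds):
    """Return the ids of transactions open for longer than max_age_seconds."""
    return [
        t.txn_id
        for t in transactions
        if now - t.started_at > max_age_seconds
    ]
```

A cron job could run this every few seconds and pass the resulting ids to whatever kill command the database provides.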
Another source of variable performance is writes that form a hot spot somewhere in your data, leading to contention over synchronized access to that data. One solution is to not write data from your web app to your db at all, but instead log it to a file or shared queue and have a backend process collect the data and lazily write it as time permits. Of course, this only works if you don't need immediate access to the data. Another solution is to use data structures that don't require complicated synchronization. For example, instead of updating statistics about your site in your db, update them in memcache using atomic increment operations and periodically sync them back to the db.
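The increment-then-sync pattern can be sketched without any external services. In the example below, `CounterBuffer` is an in-process stand-in for memcache's atomic increment, and a plain dict stands in for the db; both are assumptions made for the sake of a runnable illustration, not a memcache client.

```python
import threading
from collections import Counter


class CounterBuffer:
    """In-process stand-in for atomic increments in a cache."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = Counter()

    def incr(self, key, delta=1):
        """Cheap hot-path update: no database access at all."""
        with self._lock:
            self._pending[key] += delta

    def drain(self):
        """Atomically take all pending deltas and reset the buffer."""
        with self._lock:
            drained, self._pending = self._pending, Counter()
        return dict(drained)


def flush(buffer, store):
    """Apply buffered deltas to the slow store (a dict stands in for the db)."""
    for key, delta in buffer.drain().items():
        store[key] = store.get(key, 0) + delta
```

Web requests only ever call `incr()`, while a background job calls `flush()` every so often, so the db sees one write per key per interval instead of one per page view.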
Finally, caching can lead to highly variable performance. This can happen because requests hit objects that weren't previously cached, or because your cache evicted an important piece of data to make room for a less important one. In the worst case, you discover that your app can't run without a warm cache, which means you can't restart it under load.
Friday, October 16, 2009
Systems programming in "scripting" languages
Whenever someone thinks about building databases, web servers or other high-performance pieces of server software, the natural assumption is that it has to be written in C or C++ (or maybe Java). The argument is usually that only C or C++ provides the necessary performance, scalability, etc. I think this fundamentally misses the point of what makes server software fast and scalable.
Most server software that is performance bound is limited by I/O, poor multithreading concurrency, or some finite kernel resource (threads, open files, etc.). None of these has much to do with the implementation language. Efficient I/O can be initiated just as easily from Python as from C++. Libevent and other ways of demultiplexing concurrent I/O can be used just as well from Python as from C++.
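To make the demultiplexing point concrete, here is a minimal sketch of a single-threaded echo loop using Python's standard `selectors` module, which wraps the same kernel facilities (epoll, kqueue, select) that libevent uses. The helper names `register_echo` and `run_once` are made up for this example.

```python
import selectors
import socket


def register_echo(sel, conn):
    """Watch a connected socket for readability."""
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ)


def run_once(sel, timeout=1.0):
    """Service every socket that is ready now; return the event count."""
    events = sel.select(timeout)
    for key, _mask in events:
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            # Fine for small payloads. A real server would buffer output
            # and wait for EVENT_WRITE instead of calling sendall here.
            conn.sendall(data)
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
    return len(events)
```

A server would create one `selectors.DefaultSelector()`, register each accepted connection with `register_echo`, and call `run_once` in a loop. The thread only ever blocks inside `select`, in the kernel, which is exactly where a C++ server would be waiting too.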
Now consider the common problems with server software: security, maintainability and portability. These are exactly the three things that higher-level languages like Python can make a lot easier.
Now, there's one potential downside. Most of these alternative languages are garbage collected and can potentially expose your application to arbitrary and long pauses while the garbage collector runs. In practice, this is less of a problem than you might think. Concurrent garbage collectors work well. Python's combination of reference counting and a backup cycle collector results in very little latency volatility in practice.
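If you would rather measure than take this on faith, CPython exposes a hook for timing its cycle collector. The sketch below uses the standard `gc.callbacks` list; the `GcPauseRecorder` class is a hypothetical helper, and it only sees cyclic collections, not the cost of ordinary reference-count deallocation.

```python
import gc
import time


class GcPauseRecorder:
    """Record how long each run of CPython's cycle collector takes."""

    def __init__(self):
        self.pauses = []  # seconds per collection
        self._started = None

    def __call__(self, phase, info):
        # CPython calls this with phase "start" before a collection
        # and "stop" after it.
        if phase == "start":
            self._started = time.perf_counter()
        elif phase == "stop" and self._started is not None:
            self.pauses.append(time.perf_counter() - self._started)
            self._started = None

    def install(self):
        gc.callbacks.append(self)

    def uninstall(self):
        gc.callbacks.remove(self)
```

Install one recorder at startup, let the server take traffic for a while, and then look at `max(recorder.pauses)` to see the worst pause the collector actually caused.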
If I were building a server application designed to handle billions of requests across a large farm of servers I'd definitely be thinking of building it in Python instead of C/C++.
Monday, October 12, 2009
Yahoo Open Hack NYC
This past Friday and Saturday I had a fun 24 hours coding at Yahoo's NYC Hack Day. The idea is that Yahoo provides a big space, lots of junk food, and a bunch of Yahoo'ers to answer questions, and you get to code with a bunch of interesting folks. At the end of it all, everyone demos their stuff and there are prizes in various categories. You were encouraged to build stuff using Yahoo APIs, but by no means had to.
I ended up building a sort of mobile craigslist using the new geolocation support built into iPhone Safari and Firefox 3.5. Suffering from the MIT problem of building your own tools instead of just focusing on the application, I also had to build my own Python app server, yaaps (Yet Another App Server).
My friends from Hunch built some cool TV widgets. I had no idea that Yahoo had managed to get most of the major TV vendors to embed a Linux-based computer running widgets from the Yahoo app store. The Amazon widget, for example, lets you stream TV and movies on demand from the Amazon store.
The other interesting thing I learned about was YQL. YQL is Yahoo's effort to provide standardized adapters that load data from many different sources and then expose it through a SQL interface. Once someone writes an adapter for some source, anyone else can query that data without having to learn the peculiarities of the source's API. Plus, the YQL backend has lots of smarts, so you can join different sources against each other, use aggregate operations like max(), and so on, which makes complicated data access much easier than using the raw underlying APIs.