I'll start the series with some thoughts on queues. I think queues are one of the great under-appreciated abstractions in software. I constantly find that they're the basis of simplifying and solving many hard processing problems.
For example, the simple queue allows a producer to be separated from a consumer of data. Fundamentally, a producer-consumer queue is powerful because is synchronizes and decouples two independent parts of a system and lets them be written and run in isolation from each other. Anything that lets us decouple, synchronize and isolate a complicated system into simpler pieces is a good thing!
The producer and consumer might be in the same thread and simply be used as a convenience to store work that still needs to be done (as in remembering nodes that still need to be visited for a breadth-first search). Alternatively, the producer and consumer could be in the same thread but asynchronous events such as a disk controller generating an interrupt that is quickly queued for later handling by a disk driver in an operating system. Finally, the producer and consumer may be in different threads and serve as an explicit loosely coupled way for the two threads to coordinate their actions -- perhaps one thread generates log records and the other thread writes them out to a file.
A natural extension of a producer-consumer queue is to have multiple producers and consumers arranged linearly to form a processing pipeline. N different processing steps can be de-coupled from each other and still be properly synchronized through the very simple to understand idea of putting work into an output queue and reading work from an input queue.
Naturally, each stage of a pipeline can run concurrently with every other stage without requiring subtle reasoning about the correctness of locking schemes. In fact, multiple instances of each stage of the pipeline can run concurrently in order to increase throughput of the system without requiring additional synchronization effort.
If the queues are transactional and persistent, then the stages of the pipeline can be run on independent machines where machines and processes can fail without affecting the correctness of the system. Again, all without subtle failure recovery logic in the applications using the queuing service.
If the queues are of finite size, they can be used to schedule processes. When an output queue fills up a process must wait for space to free up, thus letting whatever process that is reading from that queue a chance to catch up.
Imagine being able to write a program that scales to large numbers of machines/processors, runs reliably even in the face of machine and process failures, properly synchronizes itself, inherently applies back-pressure to slow down parts of the system that are running too fast and which requires nothing more complicated than understanding the idea of being able to put() to an output queue and get() from an input queue?