Free Programming Books
Free download ebooks on computer and programming
Free ebook "Squid: The Definitive Guide" Sample Chapter
Free download Chapter 8: Advanced Disk Cache Topics

Squid is the most popular Web caching software in use today, and it works on a variety of platforms including Linux, FreeBSD, and Windows. Written by Duane Wessels, the creator of Squid, Squid: The Definitive Guide will help you configure and tune Squid for your particular situation. Newcomers to Squid will learn how to download, compile, and install code. Seasoned users of Squid will be interested in the later chapters, which tackle advanced topics such as high-performance storage options, rewriting requests, HTTP server acceleration, monitoring, debugging, and troubleshooting Squid.

Advanced Disk Cache Topics

Performance is one of the biggest concerns for Squid administrators. As the load placed on Squid increases, disk I/O is typically the primary bottleneck. This limitation stems from the importance that Unix filesystems place on consistency after a system crash.

By default, Squid uses a relatively simple storage scheme (ufs). All disk I/O is performed by the main Squid process. With traditional Unix filesystems, certain operations always block the calling process. For example, calling open() on the Unix Fast Filesystem (UFS) causes the operating system to allocate and initialize certain on-disk data structures. The system call doesn't return until these I/O operations complete, which may take longer than you'd like if the disks are already busy with other tasks. Under heavy load, these filesystem operations can block the Squid process for small, but significant, amounts of time.

The point at which the filesystem becomes a bottleneck depends on many different factors, including:
Do I Have a Disk I/O Bottleneck?

Web caches such as Squid don't usually come right out and tell you when disk I/O is becoming a bottleneck. Instead, response time and/or hit ratio degrade as load increases. The tricky thing is that response time and hit ratio may be changing for other reasons, such as increased network latency and changes in client request patterns.

Perhaps the best way to explore the performance limits of your cache is with a benchmark, such as Web Polygraph. The good thing about a benchmark is that you can fully control the environment and eliminate many unknowns. You can also repeat the same experiment with different cache configurations. Unfortunately, benchmarking often takes a lot of time and requires spare systems that aren't already being used.

If you have the resources to benchmark Squid, begin with a standard caching workload. As you increase the load, at some point you should see a significant increase in response time and/or a decrease in hit ratio. Once you observe this performance degradation, run the experiment again but with disk caching disabled. You can configure Squid never to cache any response (with the null storage scheme, see the later section "The null Storage Scheme"). Alternatively, you can configure the workload to have 100% uncachable responses. If the average response time is significantly better without caching, you can be relatively certain that disk I/O is a bottleneck at that level of throughput.

If you're like most people, you have neither the time nor resources to benchmark Squid. In this case, you can examine Squid's runtime statistics to look for disk I/O bottlenecks.
The cache manager General Runtime Information page (see Chapter 14) gives you median response times for both cache hits and misses:

Median Service Times (seconds)  5 min    60 min:
        HTTP Requests (All):   0.39928  0.35832
        Cache Misses:          0.42149  0.39928
        Cache Hits:            0.12783  0.11465
        Near Hits:             0.37825  0.39928
        Not-Modified Replies:  0.07825  0.07409

For a healthy Squid cache, hits are significantly faster than misses. Your median hit response time should usually be 0.5 seconds or less. I strongly recommend that you use SNMP or another network monitoring tool to collect periodic measurements from your Squid caches (see Chapter 14). A significant (factor of two) increase in median hit response time is a good indication that you have a disk I/O bottleneck.

If you believe your production cache is suffering in this manner, you can test your theory with the same technique mentioned previously. Configure Squid not to cache any responses, thus avoiding all disk I/O. Then closely observe the cache miss response time. If it goes down, your theory is probably correct.

Once you've convinced yourself that disk throughput is limiting Squid's performance, you can try a number of things to improve it. Some of these require recompiling Squid, while others are relatively simple steps you can take to tune the Unix filesystems.
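The rules of thumb given above for the Median Service Times report (hits faster than misses, a median hit time of 0.5 seconds or less, and no factor-of-two jump in median hit time) can be checked mechanically. The following Python sketch is only an illustration, not part of Squid: the function names, the regular expression, and the idea of passing in a previously recorded baseline are all assumptions, and it expects the report text in the two-column layout of the example above.

```python
import re

# Sample report text, using the values from the example in this chapter.
SAMPLE = """\
Median Service Times (seconds)  5 min    60 min:
        HTTP Requests (All):   0.39928  0.35832
        Cache Misses:          0.42149  0.39928
        Cache Hits:            0.12783  0.11465
        Near Hits:             0.37825  0.39928
        Not-Modified Replies:  0.07825  0.07409
"""

# One data row: a label ending in a colon, then the 5-minute and
# 60-minute medians. The header line does not match this pattern.
ROW = re.compile(
    r"^\s*(?P<name>[A-Za-z][^:]*):\s+(?P<m5>\d+\.\d+)\s+(?P<m60>\d+\.\d+)\s*$"
)


def parse_medians(text):
    """Return {row label: (5-minute median, 60-minute median)} in seconds."""
    medians = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            medians[match["name"].strip()] = (
                float(match["m5"]),
                float(match["m60"]),
            )
    return medians


def check_hit_latency(medians, baseline_hit_seconds, factor=2.0):
    """Return a list of warnings based on the 5-minute medians.

    baseline_hit_seconds is a hypothetical earlier measurement of the
    median hit time, for example one collected by your monitoring tool.
    """
    hit = medians["Cache Hits"][0]
    miss = medians["Cache Misses"][0]
    warnings = []
    if hit >= miss:
        warnings.append("cache hits are not faster than cache misses")
    if hit > 0.5:
        warnings.append("median hit time exceeds 0.5 seconds")
    if hit >= factor * baseline_hit_seconds:
        warnings.append("median hit time is at least %gx the baseline" % factor)
    return warnings


if __name__ == "__main__":
    report = parse_medians(SAMPLE)
    # Here the 60-minute hit median stands in for a stored baseline.
    for warning in check_hit_latency(report, report["Cache Hits"][1]):
        print("WARNING:", warning)
```

A warning from such a script is only a hint. As noted earlier, response times also change for reasons unrelated to disk I/O, so the test with caching disabled remains the way to confirm the theory.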