Apache Configuration for better performance

Jan 4, 2011

Configuring your Apache server correctly and getting the most out of it can make a huge difference to how your website performs and the impression it makes on its users. On dynamic sites especially, even a fraction of a second matters. In this article, let's look at how we can measure and improve Apache server performance.

Apache is designed to be as fast as possible, and even with the default configuration it works quite well for a normal website. However, as sites become more complex, getting the best out of your Apache installation becomes more important.

Before spending time on improving performance, check how fast your current server is running and what performance level you can realistically achieve. Understand the web server's requirements and experiment with the various available options. It is important to figure out which parts of your web application are causing the problem; use tools like ab, httperf or WAST to measure them.

Make sure you don't have any unnecessary background applications running on your server, for example printing services on UNIX, or sendmail if it is not required.

The first important thing is to build Apache correctly, with only the extensions and modules your website requires. The more modules, the more memory is used. Some modules are included automatically, so you will have to explicitly enable the modules you want and disable the ones you don't.
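
As a rough sketch of what this looks like when modules are built as shared objects, unused modules can be switched off by commenting out their LoadModule lines in httpd.conf. The selection of modules and the modules/ path shown here are only assumptions for illustration; check which ones your own site actually needs before disabling anything.

# Keep only what the site needs (illustrative selection)
LoadModule mime_module modules/mod_mime.so
LoadModule dir_module modules/mod_dir.so
LoadModule rewrite_module modules/mod_rewrite.so

# Disabled because this hypothetical site does not use them
#LoadModule status_module modules/mod_status.so
#LoadModule autoindex_module modules/mod_autoindex.so
#LoadModule userdir_module modules/mod_userdir.so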

Dynamic components

Dynamic components are the most time-consuming components of any web server. Plain CGI, for example, adds to the response time because the application has to be loaded and executed for every request, even a simple one. A better option is to use a script based solution, for example mod_perl, Python or the Jakarta interface for Java.

The main advantage of the script based solutions is that they embed the interpreter into the Apache executable, which removes the initial loading cost of dynamic scripts. The disadvantage, however, is that configuration can be complex, and getting the exact setup right can be time consuming. Some solutions don't even work quite as one would expect with virtual hosts, and you will need to change certain scripts to take full advantage of the speed enhancements on offer. The improvements, however, can be significant, with as much as 60 to 70 percent of execution time being knocked off a Perl script simply by using mod_perl in place of CGI.
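
A minimal sketch of running existing Perl CGI scripts under mod_perl might look like the following. It assumes mod_perl 2 is installed as a shared module; the /perl/ URL prefix and the /var/www/perl/ directory are hypothetical names chosen for the example.

# Load the embedded Perl interpreter (the module path is an assumption)
LoadModule perl_module modules/mod_perl.so

# Map a URL prefix to the directory holding the scripts (hypothetical paths)
Alias /perl/ /var/www/perl/

<Location /perl/>
    # Hand requests to mod_perl instead of spawning a CGI process
    SetHandler perl-script
    # ModPerl::Registry compiles each script once and caches it in the child
    PerlResponseHandler ModPerl::Registry
    PerlOptions +ParseHeaders
    Options +ExecCGI
</Location>

Scripts that rely on global variables being reset on every run may need changes before they behave correctly under this kind of setup.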

Static components

If your website uses a lot of static components, you can use a tiny Apache (with a minimum of modules, statically compiled) as the front end server to serve static content. Requests for dynamic content are forwarded to the heavy Apache (compiled with all required modules). The tiny Apache serves static content fast without much memory usage. This can be achieved by using mod_proxy and mod_rewrite. Suppose you have a lightweight Apache server listening on port 80 and the heavyweight Apache server listening on port 8080, with the following configuration in the lightweight Apache configuration file:

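# Adjust redirect headers coming back from the heavyweight server
ProxyPassReverse / http://%{HTTP_HOST}:8080/
RewriteEngine on
# Anything that is not a static file ...
RewriteCond %{REQUEST_URI} !.*\.(gif|png|jpg|js|ico|css)$
# ... is proxied ([P]) to the heavyweight server on port 8080
RewriteRule ^/(.*) http://%{HTTP_HOST}:8080/$1 [P]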

Apache configuration directives

There are lots of configuration directives available in the standard Apache distribution. To get the best out of your Apache server, it is important to understand what these directives are and how they work. Let's have a look at a few important ones.

AllowOverride

One incredibly useful way of extending the server's configuration without editing the main configuration file is to use .htaccess files. The problem is that the use of .htaccess files also slows down the server.

If AllowOverride is not set to 'None', then Apache will attempt to open a .htaccess file in each directory that it visits. For example, suppose your web document root is set as:

DocumentRoot /var/www/public_html
<Directory />
    AllowOverride all
</Directory>

If a request is made for the URI /index.html, then Apache will attempt to open:
/.htaccess
/var/.htaccess
/var/www/.htaccess
/var/www/public_html/.htaccess

Apache not only has to check whether a .htaccess file exists in the requested directory (and parse and process it if it does), it also has to do the same for every parent directory, and then apply the combined settings from all of those files.

To get maximum performance you should disable the use of .htaccess files. Any directory specific configuration can go in the main configuration file, where it is parsed once by Apache. However, if directory specific configuration is in the main configuration file, you will have to restart Apache for any change to take effect.
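
As a sketch of that approach, .htaccess processing can be switched off globally and the per-directory settings moved into the main configuration file. The Options line below is only a stand-in for whatever your own .htaccess files contained.

DocumentRoot /var/www/public_html

# Never look for .htaccess files anywhere
<Directory />
    AllowOverride None
</Directory>

# Settings that previously lived in /var/www/public_html/.htaccess
# (the directive shown is a placeholder for your own rules)
<Directory /var/www/public_html>
    Options -Indexes
</Directory>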

MaxRequestsPerChild

The MaxRequestsPerChild directive sets the limit on the number of requests that an individual child server process will handle. After MaxRequestsPerChild requests, the child process will die. A value of 0 means the child process will never expire; the default differs between Apache versions, so check the value your installation uses. It is appropriate to set this to a value of a few thousand. This limits the damage from memory leaks, since a process dies after serving a certain number of requests and its memory is returned to the system. Do not set this too low, since creating new processes does have overhead.
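
For instance, a setting in the range suggested above could look like this. The exact number is an illustrative assumption, not a measured recommendation.

# Recycle each child process after it has served 4000 requests
MaxRequestsPerChild 4000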

KeepAlive

The KeepAlive directive allows multiple requests to be sent over the same TCP connection. This is particularly useful when serving HTML pages with lots of images. If KeepAlive is set to Off, then each image request requires a separate TCP connection. Turning KeepAlive On avoids the overhead of establishing a new TCP connection for every request.

KeepAliveTimeout

KeepAliveTimeout determines how long the server waits for the next request on a persistent connection. Set this to a low value, perhaps between two and five seconds. If it is set too high, child processes are tied up waiting for idle clients when they could be serving new ones.
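
Put together, the two keep-alive settings might be written as follows. The timeout of 3 seconds is simply one value picked from the two to five second range mentioned above.

# Reuse one TCP connection for several requests
KeepAlive On
# Free the child process if the client stays idle for 3 seconds
KeepAliveTimeout 3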

HostnameLookups

The HostnameLookups directive enables DNS lookups so that hostnames can be logged instead of IP addresses. This adds latency to every request, since the DNS lookup has to be completed before the request is finished. HostnameLookups is Off by default in Apache 1.3 and above. Leave it Off and use a post-processing program such as logresolve to resolve the IP addresses in Apache's access log files. logresolve ships with Apache.

When using Allow from or Deny from directives, use an IP address instead of a domain name or a hostname. Otherwise a double DNS lookup (a reverse lookup followed by a forward lookup) is performed to make sure that the domain name or the hostname is not being spoofed.
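
A short sketch combining both points is shown below. The /var/www/public_html/private directory and the 192.0.2.0/24 network (a range reserved for documentation) are hypothetical values used only for illustration.

# Log IP addresses; resolve them later with logresolve if needed
HostnameLookups Off

<Directory /var/www/public_html/private>
    Order deny,allow
    Deny from all
    # An address range instead of a hostname, so no DNS lookup is triggered
    Allow from 192.0.2.0/24
</Directory>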

FollowSymLinks and SymLinksIfOwnerMatch

If the FollowSymLinks option is set, then the server will follow symbolic links in this directory. If SymLinksIfOwnerMatch is set, then the server will follow symbolic links only if the target file or directory is owned by the same user as the link.

If SymLinksIfOwnerMatch is set, then Apache will have to issue additional system calls to verify whether the ownership of the link and the target file match. Additional system calls are also needed when FollowSymLinks is NOT set. For example:

DocumentRoot /var/www/public_html
<Directory />
   Options SymLinksIfOwnerMatch
</Directory>

For a request made for URI /index.html, Apache will perform lstat() on /var, /var/www, /var/www/public_html, and /var/www/public_html/index.html. These additional system calls will add to the latency. The lstat results are not cached, so they will occur on every request.

For maximum performance, set FollowSymLinks everywhere and never set SymLinksIfOwnerMatch. If SymLinksIfOwnerMatch is required for a particular directory, set it for that directory alone.
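
One way to express this is sketched below: FollowSymLinks is enabled at the root, and the stricter check is applied only where it is needed. The /var/www/public_html/users directory is a hypothetical example of such a location.

DocumentRoot /var/www/public_html

# Fast path: follow symbolic links without any ownership checks
<Directory />
    Options FollowSymLinks
</Directory>

# Only this subtree pays for the extra lstat() calls
<Directory /var/www/public_html/users>
    Options -FollowSymLinks +SymLinksIfOwnerMatch
</Directory>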

MaxClients

The MaxClients directive sets the limit on the number of simultaneous requests that the server can handle. No more than this number of child processes are spawned. It shouldn't be set so low that new connections are put in a queue and eventually time out while server resources are left unused. Setting it too high will cause the server to start swapping, and response time will degrade drastically. To set MaxClients above 256, you must also increase ServerLimit.
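
As an illustration for the prefork MPM, raising the limit above 256 could look like the following. The figure of 400 is an assumption for the example only; a reasonable starting point is often estimated by dividing the memory you can spare for Apache by the typical size of one child process, and then verified under load.

<IfModule mpm_prefork_module>
    # ServerLimit has to be raised before MaxClients can exceed 256
    ServerLimit 400
    MaxClients  400
</IfModule>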



Hi! My name is Zeeshan Muhammad Khan (nick name Shan) and I am a software engineer, database developer, web developer, programming geek, statistics geek, mathematics geek, system analyst and maintainer of this site.
