We hope this workshop enables you to install the Apache web server on a RedHat Linux computer in a fairly secure way. The workshop assumes you have RedHat Linux installed and working on a computer directly attached to a network (LAN). It also assumes you have a basic understanding of working with Linux.
Single line directives consist of just the directive keyword followed by one or more words (arguments) that may or may not be predefined. Whether or not the arguments are predefined depends on the directive. The Apache server ignores any whitespace (spaces, tabs, etc.) after the initial whitespace separating directives and corresponding arguments. For example:
ServerName localhost
You may find multiple-line directives very similar to HTML tags. A multiple-line directive begins with an opening angle bracket, the directive name (plus any arguments), and a closing angle bracket. Any number of directive-specific sub-directives may follow the opening directive. As with single-line directives, the Apache server ignores any whitespace (spaces, tabs, etc.) after the initial whitespace separating directives and corresponding arguments. You close the multiple-line directive with an opening angle bracket, a forward slash, the directive name, and a closing angle bracket. For example:
<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>
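Because the stock configuration file is mostly comments, it can help to look at only the active directives. The following is a minimal sketch, assuming a POSIX shell with `sed` and `mktemp`. The function name `active_directives` and the sample file are made up for illustration; on a real system you would pass the path to your own configuration file instead.

```shell
#!/bin/sh
# Print only the active (non-comment, non-blank) lines of an Apache
# configuration file, with leading whitespace removed.
active_directives() {
    # $1 = path to a configuration file
    sed -e 's/^[[:space:]]*//' -e '/^#/d' -e '/^$/d' "$1"
}

# Demonstration on a small, made-up sample (not a complete httpd.conf).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# ServerName lets you set a host name for the server.
#ServerName localhost
ServerType standalone

<Directory />
    Options FollowSymLinks
    AllowOverride None
</Directory>
EOF

active_directives "$sample"
rm -f "$sample"
```

The commented-out `#ServerName` line and the blank line are dropped, which leaves the single-line directive and the multiple-line `<Directory />` block.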
By default, each directive is commented well so you can usually read the comment to figure out the intended purpose. The table below lists the directives in the sequence they appear in the configuration file. Make only the changes listed in the "Workshop" column and read the information covered in the "Production" column.
| Directive | Workshop | Production (with Explanation) |
|---|---|---|
| ServerType standalone | No change | Traditionally, Unix servers used a program called "inetd" that "listened" for client requests (Telnet, mail, FTP, web, etc.). This program saved system memory resources because it only loaded the corresponding server program as it was needed (at a small load-time penalty). However, thanks to the reduced cost of primary memory (RAM), we rarely implement inetd anymore. Also, web servers tend to see quite a bit of traffic so since it will be frequently loaded anyway, we might as well leave it running. |
| ServerRoot "/etc/httpd" | No change | The directory structure is already in place to support the error and log files, and you probably should not change this unless your log files outgrow the root (/) partition. Note that although the directive specifies "/etc/httpd," the log directory under "/etc/httpd" is actually a symbolic link that points to "/var/log/httpd." You can verify this by performing an `ls -l /etc/httpd` to see where things actually end up. |
| LockFile /var/lock/httpd.lock | No change | See file's comments. |
| PidFile /var/run/httpd.pid | No change | Each program that runs on a computer is assigned a number known as the "process identifier" (aka PID, hence the "pid" filename extension). As the server's manager, you may find this number handy for managing the server (to `kill` it, for example). By leaving the PidFile in the default location, you can use more automated routines to manage the server (e.g., `apachectl`). |
| ScoreBoardFile /var/run/httpd.scoreboard<br>Timeout 300 | No change | See file's comments. |
| KeepAlive On<br>MaxKeepAliveRequests 100<br>KeepAliveTimeout 15 | No change | Before NCSA introduced the "KeepAlive" directive, every request to the server had to be set up separately. As you can imagine, there is significant overhead in that process (allocating handles, authorizing/authenticating, etc.). The "KeepAlive" directive enables browsers to reuse one connection for several requests. This saves time for the server and the client. |
| MinSpareServers 5<br>MaxSpareServers 20<br>StartServers 8<br>MaxClients 150<br>MaxRequestsPerChild 100 | No change | If you ever bring up a web server in production you need to seriously consider the hardware resources available on your web server. You should consider: |
| LoadModule<br>AddModule | No change | Modules embrace and extend the functionality of the Apache core. Many modules come with Apache to provide basic expected web services. However, because Apache is an open source product, many third parties create modules for use with Apache. This includes scripting languages like Perl, PHP, and others. It also includes database interface modules. You can [un]comment modules in this section. It is recommended that you just comment out the modules you don't want because later it will be easier for you to add them back in as necessity dictates. If you do change a "LoadModule", however, be sure you change the corresponding "AddModule." Season to taste here. You can find a good description of the default set of modules at the Apache web site. NB: the config, agent, and referer log modules are actually documented as mod_log_config, mod_log_agent, and mod_log_referer. For production, you should consider commenting out the following modules: |
| Port 80 | No change | This is the default web server port location. Change this if you want to prevent casual surfers from accessing your web site, or you want to add another, totally separate server to your web site. For example, if you change this to 8080, then you would specify the URL as http://www.somedept.unt.edu:8080/. |
| User nobody<br>Group nobody | No change | By default, Unix systems create the "nobody" user and group accounts. Since people access the web server remotely, running the server under this account means that if someone were able to hack into the server, their success would bring them only the privileges of the "nobody" account. The "nobody" account and group do not get many privileges, and you should avoid giving them any! |
| ServerAdmin root@localhost | Use your email address | Change this to a valid email address. If you set up mail services on the Linux computer, then you should set up appropriate email aliases in "/etc/aliases" (and run `newaliases`) to forward email to a regular user account. In either case, replace "root@localhost" with a valid email address. Aliases (on this or another computer) prove useful because people only need to remember one email address (and not who is managing the system). Good choices include one or all of the following: www, webmaster, operator, sysadm, and abuse. |
| #ServerName localhost | Change to your hostname | Uncomment this line and replace "localhost" with the name of this web server. |
| DocumentRoot "/home/httpd/html" | No change | If you want to replace "/home/httpd/html," replace it with a directory accessible to the "nobody" account. For example, you could follow these steps: |
| `<Directory />`<br>`Options FollowSymLinks`<br>`AllowOverride None`<br>`</Directory>` | No change | See the directory directive documentation for extensive commentary on this directive. Note the "/" argument following the beginning "Directory" term. That means this directive applies to the root directory (not the document root directory) of the entire server. Apache can service any directory on the computer, so this is a general overall restriction. I recommend that you change the "Options" sub-directive to read "Options None." Reasons follow: |
| `<Directory "/home/httpd/html">`<br>`Options Indexes Includes FollowSymLinks`<br>`AllowOverride None`<br>`Order allow,deny`<br>`Allow from all`<br>`</Directory>` | No change | Note this "Directory" directive affects only the "/home/httpd/html" directory (assuming you configured "/home/httpd/html" as your DocumentRoot earlier in the configuration file). On a production server, to be absolutely safe, I recommend replacing the default options with "None" as in the previous directive. This directive includes some new sub-directives. The "Order" sub-directive specifies the order in which the "Allow" and "Deny" sub-directives are evaluated. In this case, we check the "Allow" entries first and then check whether the client should be denied. This is fine for most public access web servers, but there are other approaches. |
| UserDir public_html | No change | This enables users with login IDs on the system to have their own web directory space. If my login name is zaphod, then a request to http://www.somedept.unt.edu/~zaphod/ would correspond to my ~/public_html directory. If you choose to allow users to have these kinds of directories, you should ensure they are locked down as specified in the comments in the default httpd.conf file. |
| DirectoryIndex index.html index.htm index.shtml index.cgi | No change | This directive specifies the filenames the web server sends to the client by default when the client request refers to a server directory. The web server searches for names starting from the left and going to the right. This means that, to improve efficiency, you should enter the most frequently used name left-most in the listing. |
| AccessFileName .htaccess | No change | Access files can appear in any directory and affect all subdirectories. They follow the same conventions as the DirectoryIndex filenames. |
| `<Files ~ "^\.ht">`<br>`Order allow,deny`<br>`Deny from all`<br>`</Files>` | No change | See the important comments in the httpd.conf file. |
| UseCanonicalName On | No change | See file's comments. |
| TypesConfig /etc/mime.types | No change | By default, when a client asks the server for a document, it sends plain text. The "mime.types" file sets up associations based on filename extensions. For example, the last line in the "/etc/mime.types" file includes a mime type to specify text/html for files with the extension "html" or "htm." If you add support for some other program or browser plug-in, you would want to ensure the "mime.types" document contained a corresponding entry. The mime-type tells the web browser the type of information and the web browser then determines a corresponding application. |
| DefaultType text/plain | No change | See file's comments. |
| `<IfModule mod_mime_magic.c>`<br>`MIMEMagicFile share/magic`<br>`</IfModule>` | No change | The "IfModule" directive tests whether or not a given module (extension to the basic Apache server) is loaded. If the module is loaded, the enclosed sub-directives are processed. If the module isn't loaded, they are skipped, so the Apache server won't generate errors about sub-directives it does not recognize. |
| HostnameLookups Off | No change | One side-effect of leaving these off is that your logs will not display the canonical name of the IP address. Many log programs, however, will look up the names for you when you run them. |
| ErrorLog /var/log/httpd/error_log<br>LogLevel warn<br>LogFormat<br>CustomLog | No change | See file's comments. |
| ServerSignature On | No change | See file's comments. |
| `Alias /icons/ "/home/httpd/icons/"`<br>`<Directory "/home/httpd/icons">`<br>`Options Indexes MultiViews`<br>`AllowOverride None`<br>`Order allow,deny`<br>`Allow from all`<br>`</Directory>` | No change | Aliases are handy for when directories get renamed or moved or are deep in a directory tree. In this case, the icons alias sets up a directory for use by the autoindex module to provide icons to the client. |
| `ScriptAlias /cgi-bin/ "/home/httpd/cgi-bin/"`<br>`<Directory "/home/httpd/cgi-bin">`<br>`AllowOverride None`<br>`Options ExecCGI`<br>`Order allow,deny`<br>`Allow from all`<br>`</Directory>` | No change | You can place CGI programs in the directory specified by the second argument. |

Read the comments for the remaining directives to understand how they work; they should be straightforward now that you have a taste for how things work. The only peculiar one remaining is the `<Location>` directive. The `<Location>` directive is similar to the `<Directory>` directive except that it functions entirely outside the filesystem. In other words, it functions purely through the URL specified by the client. This is useful when you have some modules (such as the status module) that specify a "virtual" directory and corresponding URL. For example, in my opinion, the status module provides information you do not want to share with anybody. You could use the `<Location>` directive to limit access to the status module's URL. In fact, the configuration file has an example of just how to do that.
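Returning to the PidFile entry in the table above: the sketch below shows one way the recorded process ID can be used to check on the server by hand. It is only an illustration, assuming a POSIX shell; the function name `pid_status` is made up, and the path in the usage comment is the default from the table.

```shell
#!/bin/sh
# Report whether the process recorded in a PID file is still running.
# "kill -0" sends no signal; it only tests whether the process exists
# and whether we are allowed to signal it. Run this as root: for an
# ordinary user, kill -0 fails on root-owned processes even when they
# exist, so the result would be misleading.
pid_status() {
    # $1 = path to a PID file (for Apache, the PidFile setting)
    [ -r "$1" ] || { echo "no pid file"; return 1; }
    pid=$(cat "$1")
    if kill -0 "$pid" 2>/dev/null; then
        echo "running (pid $pid)"
    else
        echo "not running (stale pid $pid)"
        return 1
    fi
}

# Example with the default location from the table:
#   pid_status /var/run/httpd.pid
```

A missing PID file usually means the server was shut down cleanly, while a stale one suggests it died without cleaning up after itself.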
/etc/rc.d/init.d/httpd start
Starting httpd: [ OK ]
If you do not see any errors, try accessing the web site. You should get the default test page. Try replacing the default page in "/home/httpd/html" with your own and then access the web site again. You should get your web page. To stop the web server, you can enter `/etc/rc.d/init.d/httpd stop` and you should see something similar to the following:
/etc/rc.d/init.d/httpd stop
Shutting down http: [ OK ]
A summary of the commands to manage the web server follows:
| Command | Explanation |
|---|---|
| start | Starts the web server if it isn't started. |
| stop | Stops the web server if it is running. If it isn't running, it will say "FAILED" after it finishes running. |
| restart | Restarts the web server if it is running or starts it if it isn't running. |
| reload | Causes the web server to reload its configuration file. NB: I've had mixed results with this actually working, so now I always stop/start (or restart) the web server to ensure the changes are in effect. |
| status | Displays the process IDs of the running web servers. |
You can view the log by entering `less /var/log/httpd/error_log`. As you may notice, each time you stop/start the server, it gets logged to this file (as a notice). Try generating some errors on the web server by accessing web pages that don't exist. Now view the log again. Why are those considered errors?
It is important to understand that if a page has been moved without an appropriate redirector put in place, then this is where you would expect to find the error listed. So even though you may have intentionally moved the page, it is still considered an error if someone fails to find it.
If you want to do some testing and would like a real-time display of the log file, you can enter `tail -f /var/log/httpd/error_log`. The `tail` command, by default, displays the last lines of a file. By specifying "-f," the `tail` command continuously refreshes the end of file displayed. Start the command now and cause some errors to see this in action.
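Once the log has a few entries, a short summary can be easier to read than the raw file. The sketch below counts "File does not exist" errors per missing path. It assumes a POSIX shell and that the error log uses the message text shown in the sample; the sample lines, client addresses, and the function name `missing_files` are made up for illustration.

```shell
#!/bin/sh
# Count "File does not exist" errors per missing path, most frequent first.
missing_files() {
    # $1 = path to an Apache error log
    sed -n 's/.*File does not exist: //p' "$1" |
        sort | uniq -c | sort -rn | sed 's/^ *//'
}

# Demonstration on made-up log lines (the exact format is an assumption).
log=$(mktemp)
cat > "$log" <<'EOF'
[Wed Jun 14 10:12:01 2000] [error] [client 10.0.0.5] File does not exist: /home/httpd/html/old.html
[Wed Jun 14 10:12:09 2000] [error] [client 10.0.0.7] File does not exist: /home/httpd/html/old.html
[Wed Jun 14 10:13:30 2000] [error] [client 10.0.0.5] File does not exist: /home/httpd/html/typo.html
EOF
missing_files "$log"
rm -f "$log"
```

On a real server you would run `missing_files /var/log/httpd/error_log`. A path that shows up often is a good candidate for a redirect.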
Webalizer requires a PNG library for producing graphics. To install the PNG library, enter the `rpm -v --install ftp://rpmfind.net/linux/contrib/libc6/i386/gd-1.8.2-1.i386.rpm` command. Now you can install the webalizer program.
To install the webalizer programs, enter the `rpm -v --install ftp://rpmfind.net/linux/contrib/libc6/i386/webalizer-2.00.12-2.i386.rpm` command. You now have everything you need to run webalizer.
As you may remember from the section regarding the configuration file, we do not, by default, resolve client IP addresses. This saves time for the client and server. Typically, web utilization programs are run when system utilization is low. The Webalizer program has two primary components. One component, webazolver, builds a DNS cache of all the IP addresses in the access_log file. The other component, webalizer, works with both the access_log and DNS cache to create the access log summary. When Linux installed the software, it set it up to run automatically, on a regular basis, by updating the "/etc/crontab" file. You can edit this file to suit your needs.
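To get a feel for how many addresses the DNS cache step has to look up, you can list the distinct clients in the access log yourself. This sketch does not show how webazolver works internally. It only assumes that the log is in Common Log Format, where the client host is the first field on each line; the sample entries and the function name `unique_clients` are made up.

```shell
#!/bin/sh
# List the distinct client addresses in a Common Log Format access log.
# In that format the client host is the first whitespace-separated field.
unique_clients() {
    # $1 = path to an access log
    awk '{ print $1 }' "$1" | sort -u
}

# Demonstration on made-up entries.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.5 - - [14/Jun/2000:10:12:01 -0500] "GET / HTTP/1.0" 200 1456
10.0.0.7 - - [14/Jun/2000:10:12:09 -0500] "GET /old.html HTTP/1.0" 404 210
10.0.0.5 - - [14/Jun/2000:10:13:30 -0500] "GET /typo.html HTTP/1.0" 404 210
EOF
unique_clients "$log"
echo "distinct clients: $(unique_clients "$log" | wc -l | tr -d ' ')"
rm -f "$log"
```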
Edit the "/etc/webalizer.conf" file and make the following changes:
webalizer
Webalizer V2.00-12 (Linux 2.2.14-5.0) English
Using logfile /var/log/httpd/access_log (clf)
DNS Lookup (5): 1 addresses in 0.22 seconds
Using DNS cache file dns_cache.db
Creating output in /home/httpd/html/webalizer
Hostname for reports is 'jiggy.cas.unt.edu'
History file not found...
Generating report for June 2000
Generating summary report
Saving history information...
35 records in 1.15 seconds, 30/sec
If your output is similar to the above, you should be able to go to your web site to see the results. They should not appear as anything fantastic since you only have had the web server on-line for a few hours.
It turns out that clients accessing a web server tend to come from various geographic sites. This can cause the system administrator a great deal of trouble when tracking down hackers and crackers. For this reason, it is best not to use any sort of scripting and, when you must, to lock the scripts down very tightly. Poor programming has proven to be such a nuisance that the W3C has published several clear, concise, and well-written documents on the subject of CGI script security.
#!/bin/sh
echo "Content-type: text/plain"
echo ""
echo "This is the printenv script"
echo ""
NOW=`/bin/date`
echo "This script last ran on $NOW"
echo ""
echo "The environment:"
/usr/bin/printenv | /bin/sort
After you save the file, make sure it is executable by "everyone." Because the web server runs as user "nobody," we need to enable the user "nobody" to run this script. You can make it writable by you, but readable and executable by everyone, by entering the `chmod 755 printenv` command.
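Once the script is executable, you can run it directly from the shell as a first check, since a CGI script is just a program that writes to standard output. The sketch below checks that the output starts with a "Content-type:" header followed by a blank line. It assumes a POSIX shell and `awk`; the function name `check_cgi_header` is made up, and the check is deliberately narrow (it does not validate the MIME type or any other headers).

```shell
#!/bin/sh
# Run a CGI program directly and check that its output begins with a
# Content-type header line followed by a blank line.
check_cgi_header() {
    # $1 = command or path of an executable CGI script
    "$1" | awk '
        NR == 1 { if (tolower($0) ~ /^content-type:/) hdr = 1; else exit }
        NR == 2 { if ($0 == "") blank = 1; exit }
        END {
            if (!hdr)   { print "missing Content-type header"; exit 1 }
            if (!blank) { print "missing blank line after header"; exit 1 }
            print "header looks OK"
        }'
}

# Example, using the script location from this workshop:
#   check_cgi_header /home/httpd/cgi-bin/printenv
```

Passing this check does not guarantee the script works through the web server, because the server runs it as "nobody" with a different environment, but it catches the most basic header mistakes early.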
Now test the CGI script by going to a URL similar to "http://www.somedept.unt.edu/cgi-bin/printenv." If you get an error, review the server's error log to find a clue to the problem's cause.
You can actually write a program in any language to achieve similar results. For example, you could write a Perl script similar to the following:
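#!/usr/bin/perl
# Print a plain-text page listing the CGI environment.
$now = localtime();
print <<EOD;
Content-type: text/plain

This is the printenv script written in Perl

This script last ran on $now

The environment:
EOD
# One NAME=value line per environment variable, sorted by name.
foreach $key (sort keys %ENV) {
    print "$key=$ENV{$key}\n";
}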
Notice the "Content-type:" line followed by the blank line? It specifies a MIME type. For static files, the web server sets this for you automatically according to the HTTP specification. However, by default, CGI scripts have no MIME types. Therefore, you must define one. If, for example, you want to send HTML to the client, then you should specify the "text/html" MIME type.
Some of the most common mistakes made in CGI programming include:
Wouldn't it be nice if you could add just a little more functionality to your web pages without having to weather the penalty of load/execution time every time a client makes a CGI request? That is where server-side scripting languages step in, and Apache has a very simple one called "SSI," or Server Side Includes.
The penalty, of course, is some speed. However, the SSI scripting language is simple enough that it should not impact performance too greatly. The real problem is that, by default, the exec() function is allowed.
The exec() function enables you to execute programs on the server and feed the output to the client's web browser. This also gives hackers one more tool to try. If they find writable space on your filesystem, they can create a small SSI script to further their hacking into your computer. However, SSI scripts are useful for providing that extra local touch and also can make managing pages easier.
For example, if all your pages in a series of files need the same header and footer, then SSI can help you there by enabling you to "include" an external file. For this, you can safely turn off the exec() functionality (Apache provides the "IncludesNOEXEC" option for this purpose). Add the following to the file "/home/httpd/html/header.html":
<!-- Header document begins -->
<HTML><HEAD>
<TITLE>SSI Test</TITLE>
</HEAD><BODY>
<!-- Header document ends -->
Add the following to the file "/home/httpd/html/footer.shtml" (note shtml extension):
<!-- Footer document begins -->
<HR>
Authored by thewiz@oz.org.<BR>
Current time is <!--#echo var="DATE_LOCAL" -->.<BR>
</BODY></HTML>
<!-- Footer document ends -->
Add the following to the file "headcase1.shtml" (note shtml extension):
<!--#include file="header.html" -->
This is the file <!--#echo var="DOCUMENT_NAME" -->.<BR>
It was last updated on <!--#echo var="LAST_MODIFIED" -->.<BR>
Want to see <A HREF="headcase2.shtml">headcase2</A>?<BR>
<!--#include file="footer.shtml" -->
And, finally, add the following to the file "headcase2.shtml" (note shtml extension):
<!--#include file="header.html" -->
This is the file <!--#echo var="DOCUMENT_NAME" -->.<BR>
It was last updated on <!--#echo var="LAST_MODIFIED" -->.<BR>
Want to see <A HREF="headcase1.shtml">headcase1</A>?<BR>
<!--#include file="footer.shtml" -->
Now load the "headcase1.shtml" page to see the results. Notice that when you view the source you don't see everything you entered, because the server replaced each SSI command with the desired output.
If you find this valuable, you should visit the BigNoseBird web site for complete details including tutorials. We cover SSI conceptually here to prepare you for future scripting workshops.
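If an included file is missing, the page is served with an error message in place of the include, so it can be worth checking the include targets before you load the page. The following is a rough sketch, assuming a POSIX shell, at most one `include file=` command per line, and targets that are relative to the page's own directory. The function name `check_includes` is made up for illustration.

```shell
#!/bin/sh
# For every <!--#include file="NAME" --> in an .shtml page, report whether
# NAME exists relative to the directory that holds the page.
# Assumes at most one include command per line.
check_includes() {
    # $1 = path to an .shtml page
    dir=$(dirname "$1")
    sed -n 's/.*<!--#include file="\([^"]*\)".*/\1/p' "$1" |
    while read -r name; do
        if [ -f "$dir/$name" ]; then
            echo "ok: $name"
        else
            echo "MISSING: $name"
        fi
    done
}

# Example, using the files from this section:
#   check_includes /home/httpd/html/headcase1.shtml
```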
One way that you can limit damage relating to the use of scripts is to limit the people who can access them, or require some sort of authorization for accessing them. That is the purpose of the access control directives, which we cover next.
Please note that no authentication method is 100% secure. Hackers can "spoof" IP addresses and sniff clear text passwords. The method you choose should depend on your environment. Once you implement a protected website, you should periodically peruse the error log file for failed logins. In our environment, I periodically process the log files and send error reports to web site maintainers for them to monitor.
<Directory /home/httpd/html/iprestricted>
order deny,allow
Deny from all
Allow from yourPC.yourDomain.edu .yourDomain.edu
</Directory>
Replace "yourPC.yourDomain.edu" with your PC's domain name. Replace ".yourDomain.edu" with your department's domain name. Notice you can separate host names (or IP addresses) with spaces. Even though we have DNS lookups turned off, the server will resolve the above names to ensure the client's validity. You can also have multiple Allow lines to make reading easier. Now you are ready to test your work.
Make the "/home/httpd/html/iprestricted" directory and restart the web server. Only your host and the hosts in your domain can access the iprestricted directory (and subdirectories, if any). If you try it from a computer that does not match those entries, you should receive a "403 Forbidden" error. You should always test the security to ensure it does what you expect.
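To reason about which clients such a block lets in, it can help to model the rule by hand. The sketch below is a simplified model, not Apache's own code: it assumes the rules are exactly "order deny,allow" with "Deny from all" plus a list of "Allow from" host-name patterns, that a pattern matches a host that equals it or ends in it on a whole domain component, and that case is ignored. IP addresses and network masks are not handled. The function names are made up.

```shell
#!/bin/sh
# Simplified model of host-name matching for "Allow from" patterns:
# a host matches if it equals the pattern, or ends in the pattern on a
# whole domain component. Comparison ignores case.
in_domain() {
    # $1 = pattern (for example .yourDomain.edu), $2 = client host name
    pat=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    host=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    [ "$host" = "$pat" ] && return 0
    case $pat in
        .*) suffix=$pat ;;      # pattern already starts at a dot
        *)  suffix=.$pat ;;     # require a dot before the pattern
    esac
    case $host in
        *"$suffix") return 0 ;;
    esac
    return 1
}

# Model of: order deny,allow / Deny from all / Allow from <patterns>.
# Everyone is denied first; a matching Allow pattern then overrides that.
check_access() {
    # $1 = client host name, remaining arguments = Allow patterns
    client=$1
    shift
    for p in "$@"; do
        if in_domain "$p" "$client"; then
            echo "allowed"
            return 0
        fi
    done
    echo "403 Forbidden"
}

check_access yourPC.yourDomain.edu yourPC.yourDomain.edu .yourDomain.edu
check_access other.example.com     yourPC.yourDomain.edu .yourDomain.edu
```

The model is no substitute for testing the real server from a host outside your domain, as recommended above.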
<Directory /home/httpd/html/pwrestricted>
AuthUserFile /etc/httpd/conf/passwords
AuthGroupFile /dev/null
AuthName "Enter your login name"
AuthType Basic
require valid-user
</Directory>
Make the "/home/httpd/html/pwrestricted" directory and then create the password file. To create the password file, do the following:
htpasswd -c /etc/httpd/conf/passwords myuserid
New password:
Re-type new password:
Adding password for user myuserid
You no longer need to use the "-c" switch after you create the initial password file. Now, restart the web server and test your work. Try replacing valid-user with the user name you specified. Try giving several different users access to the directory.
Words of caution: Do not store the password file in any location that the web server can serve to clients, such as a directory under the DocumentRoot. Otherwise, someone could download the password file and learn valid login names to try. You should use different login names and directories than you see here since we plan to publish this document on the web (where everyone could learn your layout).
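A quick way to double-check the first caution is to compare the password file's path with the DocumentRoot. The sketch below does a plain string comparison only, so it does not resolve symbolic links or ".." components, and it does not know about Alias or ScriptAlias directories, which you would need to check the same way. The function name `is_under` is made up; the two paths are the ones used in this workshop.

```shell
#!/bin/sh
# Succeed if FILE lies inside directory ROOT. This is a plain string
# comparison on the paths as given; symbolic links and ".." components
# are not resolved.
is_under() {
    # $1 = directory (for example the DocumentRoot), $2 = file path
    root=${1%/}
    case $2 in
        "$root"/*) return 0 ;;
        *)         return 1 ;;
    esac
}

docroot=/home/httpd/html
pwfile=/etc/httpd/conf/passwords
if is_under "$docroot" "$pwfile"; then
    echo "WARNING: $pwfile can be served by the web server"
else
    echo "OK: $pwfile is outside $docroot"
fi
```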