2003 01 23

                    LINUX NEWS
          January 23, 2003 -- Issue #116


1) Sean’s Notes

2) Linux News

SCO Licencing?
Why I'd Miss Mandrake
.NET is a Marketing Program; Java is a Market
I Guess It Ran Windows

3) Linux Resources

CFEngine in Perl
It's All About Efficiency
Is It Really Gone?
RPM Really Broken?
VNC Not Working?

4) App o’ the Week

1) Sean's Notes

I can't think of an application more synonymous with Linux than
the Apache web server. It's powerful and flexible. It's fast.
It's free.

I've always found the origins of this software to be
interesting. Way back when, the most popular web server software
was made by the National Center for Supercomputing Applications
(NCSA), and was simply called the NCSA web server. After the guy
who did most of the work on it left the NCSA (to go to a startup
company called Netscape, if memory serves), the product started
to fall behind.

In true community fashion, people were still producing bugfixes.
Taking the initiative, a couple of guys forked the NCSA code
branch and applied the user-contributed patches. The result was
"A Patchy" web server, which we now know as Apache. It quickly
became the most popular server on the Internet. (Remember, we're
talking 1995 here, so the web was quite small).

In this series of articles, I'll walk through the workings of a
web server, and then we'll get into Apache configuration.

The job of a web server isn't really that hard. It generally
listens on TCP port 80 for incoming connections, and upon
receiving one, reads in a request and a series of headers,
followed by a blank line signifying that the request is
complete. The response is similar: a status line, some headers,
a blank line, and then the content.
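
To make the shape of a response concrete, here is a minimal
Python sketch that assembles one by hand. The function name
build_response and the sample page are made up for illustration;
a real server does a great deal more than this.

```python
def build_response(status_code, reason, headers, body):
    """Assemble a raw HTTP/1.1 response: status line, headers,
    a blank line, and then the content."""
    lines = ["HTTP/1.1 %d %s" % (status_code, reason)]
    for name, value in headers.items():
        lines.append("%s: %s" % (name, value))
    # HTTP lines end in CRLF; the empty line marks the end of the headers.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Hypothetical page content, just to have something to send.
body = b"<html><body>Hello</body></html>"
response = build_response(
    200,
    "OK",
    {"Content-Type": "text/html", "Content-Length": str(len(body))},
    body,
)
print(response.decode("ascii"))
```

The Content-Length header is computed from the body rather than
typed in, so the two can't drift apart.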

A request for a URL such as http://www.cramsession.com/default.asp
might look like this:

GET /default.asp HTTP/1.1
Host: www.cramsession.com

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Thu, 23 Jan 2003 03:20:54 GMT
Content-Length: 30134
Content-Type: text/html
Expires: Thu, 23 Jan 2003 03:20:54 GMT
Set-Cookie: ctm%5Fa=0; path=/
Set-Cookie: IPRO=UV&5098929; expires=Sat, 10-Aug-2002 04:00:00
            GMT; domain=.brainbuzz.com; path=/
Cache-control: private


The first three lines are what the client sends. The first word
specifies the method, most often GET (i.e., to get a page). After
that comes the full path to the resource being requested, in
this case /default.asp. The leading '/' is important; without it
we have what's known as a relative URL, which will generate an
error. After the path comes the version of HTTP we're speaking.
Pretty much everything these days is version 1.1.
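
As a rough illustration, the request line can be pulled apart
with a few lines of Python. The function parse_request_line is
my own name for this sketch, and insisting on a leading '/' is a
simplification: it only handles the ordinary case described
above.

```python
def parse_request_line(line):
    """Split an HTTP request line into (method, path, version)."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError("expected 'METHOD path HTTP/x.y', got %r" % line)
    method, path, version = parts
    if not path.startswith("/"):
        # Assumption for this sketch: only paths with a leading '/' are valid.
        raise ValueError("path must start with '/': %r" % path)
    if not version.startswith("HTTP/"):
        raise ValueError("not an HTTP version: %r" % version)
    return method, path, version


print(parse_request_line("GET /default.asp HTTP/1.1"))
# prints ('GET', '/default.asp', 'HTTP/1.1')
```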

Between the request line and the blank line are a series of
headers. I could pass my browser, the languages I'll accept, my
telephone number, anything. Some are used by the web server, and
some are passed along to the pages. In this case, I'm telling
the web server that I'm requesting the page within the
www.cramsession.com site. Since we're connected to
www.cramsession.com anyway, this may seem redundant. However, as
we'll see when we make our way over to virtual domains, we can
have thousands of domains served off of one IP address. One
thing that HTTP 1.1 brings in is the Host header, which the
client sends along to tell the remote web server which virtual
domain is being called.
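
Here is a small Python sketch of how a server might use the Host
header to choose between sites on the same IP address. The
DOCUMENT_ROOTS table, its directory names, and both function
names are invented for the example; this is the general idea,
not how any particular server implements it.

```python
def parse_headers(raw_request):
    """Return the request headers as a dict keyed by lowercase name."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


# Hypothetical mapping of virtual domains to directories on disk.
DOCUMENT_ROOTS = {
    "www.cramsession.com": "/var/www/cramsession",
    "www.example.org": "/var/www/example",
}


def pick_document_root(headers, default="/var/www/html"):
    """Choose a site based on the Host header, ignoring any :port suffix."""
    host = headers.get("host", "").split(":")[0].lower()
    return DOCUMENT_ROOTS.get(host, default)


raw = "GET /default.asp HTTP/1.1\r\nHost: www.cramsession.com\r\n\r\n"
headers = parse_headers(raw)
print(pick_document_root(headers))
# prints /var/www/cramsession
```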

Since I'm not interested in giving Cramsession any more
information about me, I'll just send along a blank line to
signify that my request is done. I get back a line saying that
the response is in HTTP 1.1 format, and the status code is 200.
A short English phrase, here simply "OK", is returned to help
with debugging. In general, response codes are 3 digits. The
first digit gives the general class of the response:

2 - Things are good

3 - Not an error, but you're not getting content back either.
This could mean a redirect to another URL, or that your local
copy of the document is still valid, so there is no need to
send a copy. The RFC calls this "Redirection".

4 - Client error.  It's your problem.

5 - Server error.  It's my problem.
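
The same breakdown, as a short Python sketch. The table only
covers the four classes listed above, and the names
STATUS_CLASSES and classify are mine.

```python
# First digit of the status code -> general class, as listed above.
STATUS_CLASSES = {
    "2": "success",
    "3": "redirection",
    "4": "client error",
    "5": "server error",
}


def classify(status_line):
    """Return (code, class) for a status line such as 'HTTP/1.1 200 OK'."""
    version, code, reason = status_line.split(" ", 2)
    return int(code), STATUS_CLASSES.get(code[0], "other")


print(classify("HTTP/1.1 200 OK"))
# prints (200, 'success')
print(classify("HTTP/1.1 404 Not Found"))
# prints (404, 'client error')
```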

As you get down within each class of responses, the other digits
tell you more. You're familiar with 404 (file not found) and
possibly 500 (server error). As usual, the RFC tells you the
rest (status codes are in section 6.1.1).

Some of the headers are of interest. A Content-Type header such
as "Content-Type: text/html" comes with every response that
carries content; it tells you what MIME type the content is in.
If we had requested a GIF, we'd see "image/gif". This is
for the web browser. If you ever see text when you expected an
image or an executable, chances are your web server is mucking
up the MIME type.

The Date and Expires headers, when combined, tell us that the
remote server doesn't want us to cache the page (the current
date and "Expires" are the same).

Of particular interest are the "Set-Cookie" lines. You've heard
of cookies before, right? Well, that's how they look. If we
already had a cookie to present to the site, we'd have sent it
in our headers. As it stands, the Cramsession web site is
handing us three cookies; it looks like the latter two are for
ad banner tracking and ASP sessions respectively. On our next
request to the site, we'd be expected to send those back to the
server.
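
To show the round trip, this Python sketch takes the two
Set-Cookie values visible in the response above and builds the
Cookie header a client could send back. It deliberately ignores
the expires, domain, and path attributes, which a real browser
has to honor before deciding whether to send a cookie at all.
The function name cookie_pair is my own.

```python
def cookie_pair(set_cookie_value):
    """Return the (name, value) pair from the value of a Set-Cookie header."""
    # Everything after the first ';' is an attribute (path, expires, ...).
    first = set_cookie_value.split(";", 1)[0]
    name, _, value = first.partition("=")
    return name.strip(), value.strip()


# The Set-Cookie values from the sample response.
received = [
    "ctm%5Fa=0; path=/",
    "IPRO=UV&5098929; expires=Sat, 10-Aug-2002 04:00:00 GMT; "
    "domain=.brainbuzz.com; path=/",
]

pairs = [cookie_pair(value) for value in received]
cookie_header = "Cookie: " + "; ".join("%s=%s" % pair for pair in pairs)
print(cookie_header)
# prints Cookie: ctm%5Fa=0; IPRO=UV&5098929
```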

Of course, you noticed the server header, letting us know that
Cramsession is a proud user of Microsoft IIS 5.0. Oh well,
nobody's perfect.

Once the server gets the request, how does it know what to
return? On a static content server (i.e. nothing ever changes),
the web server can append the request to the root of the web
site and return the file. So, if I'm asked for /index.html, and
the root of my web server is /var/www/html, I'll return the file
called /var/www/html/index.html. Of course, the web server has
to do some sanity checking, because if I asked it for
../../../etc/passwd, it had better not hand out
/var/www/html/../../../etc/passwd (which maps to the system
password file). Don't laugh, this is a common problem,
especially once you get into dynamic content!
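
One way to do that sanity check, sketched in Python: join the
request onto the root, collapse the ".." parts, and refuse
anything that ends up outside the root. The function name
resolve is mine, and this is only the core idea; it doesn't deal
with symbolic links or with URL-encoded characters in the
request.

```python
import posixpath


def resolve(document_root, request_path):
    """Map a request path to a file under document_root.

    Returns None if the request would escape the document root.
    """
    root = posixpath.normpath(document_root)
    # Join, then collapse any '.' and '..' components.
    candidate = posixpath.normpath(
        posixpath.join(root, request_path.lstrip("/"))
    )
    if candidate != root and not candidate.startswith(root + "/"):
        return None
    return candidate


print(resolve("/var/www/html", "/index.html"))
# prints /var/www/html/index.html
print(resolve("/var/www/html", "../../../etc/passwd"))
# prints None
```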

Now, it could be that the web server wants to generate dynamic
pages. The original way to do this was through CGI, the Common
Gateway Interface. The client would request something like:

GET /cgi-bin/form.pl

...which would run a script called form.pl. form.pl would do
its stuff, like emailing the results of a form, and then return
HTML or a status code indicating a redirect.

Dynamic content brings up an interesting question... How do we
get data from the client to the server? With the GET method,
it's tacked on to the request:

GET /cgi-bin/form.pl?fname=sean&lname=walberg HTTP/1.1

The question mark separates the path from the query string.
Within the query string, the key=value pairs are separated by an
ampersand. Thus, I'm sending two variables to form.pl. CGI
specifies that the web server puts the query string into the
script's environment (as QUERY_STRING), where the script can
decode it as necessary.
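
Here is what that decoding might look like, using Python's
standard urllib.parse module. The cgi_env dictionary is a
stand-in I made up for the environment a CGI script is started
with; QUERY_STRING and REQUEST_METHOD are standard CGI variable
names.

```python
from urllib.parse import urlsplit, parse_qs

target = "/cgi-bin/form.pl?fname=sean&lname=walberg"

# The question mark separates the path from the query string.
parts = urlsplit(target)
print(parts.path)
# prints /cgi-bin/form.pl

# Stand-in for the environment handed to the CGI script.
cgi_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": parts.query}

# The script decodes the key=value pairs; each value is a list,
# because a key is allowed to appear more than once.
params = parse_qs(cgi_env["QUERY_STRING"])
print(params)
# prints {'fname': ['sean'], 'lname': ['walberg']}
```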

When the GET isn't enough, we've got the POST method. It looks
just like the GET method, except that the data goes in the body
of the request, after the blank line that ends the headers:

POST /cgi-bin/form.pl HTTP/1.1
Header: blah

fname=sean&lname=walberg

A POST is generally used when there is a lot of data to send,
as the server is not obligated to accept long query strings.
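
For comparison with the GET example, this Python sketch builds a
complete POST request for the same two variables. In HTTP 1.1
the form data travels in the body, after the blank line that
ends the headers, and a Content-Length header says how many
bytes to expect. The function name build_post is my own, and the
host name is simply the one used earlier.

```python
from urllib.parse import urlencode


def build_post(path, host, fields):
    """Build a raw HTTP/1.1 POST request carrying form fields in its body."""
    body = urlencode(fields).encode("ascii")
    head = "\r\n".join([
        "POST %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Length: %d" % len(body),
    ]) + "\r\n\r\n"  # the blank line ends the headers
    return head.encode("ascii") + body


request = build_post(
    "/cgi-bin/form.pl",
    "www.cramsession.com",
    {"fname": "sean", "lname": "walberg"},
)
print(request.decode("ascii"))
```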

Between static pages and CGI lies another technology, called
server-parsed pages. Static content is interspersed with tags
that the web server executes as code. With the popular PHP
language, I could put in a tag like <?=$fname?> which would be
replaced by "sean" given the URL above.
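
The idea of a server-parsed page can be imitated in a few lines
of Python. To be clear, this is not PHP and not how PHP works
internally; it's a toy that swaps tags of the form <?=$name?>
for values taken from the query string. The render function is
my own invention, and escaping the value with html.escape is an
extra safety step I've added so that user input can't inject
markup.

```python
import html
import re

# Matches tags such as <?=$fname?> and captures the variable name.
TAG = re.compile(r"<\?=\$(\w+)\?>")


def render(page, variables):
    """Replace each <?=$name?> tag with the escaped value of that variable."""
    def substitute(match):
        # Unknown variables become an empty string in this sketch.
        return html.escape(variables.get(match.group(1), ""))
    return TAG.sub(substitute, page)


page = "<p>Hello, <?=$fname?>!</p>"
print(render(page, {"fname": "sean", "lname": "walberg"}))
# prints <p>Hello, sean!</p>
```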

Web servers, especially Apache, are capable of all this and
more. As we get into the configuration, we'll see how this is
implemented, and some of the tools available to the
administrator. Until then, we've looked a bit at the HTTP
protocol, and how the web browser translates a request into a

Long live the Penguin,


2) Linux News

SCO Licencing?

A week or so ago, there were some musings about SCO thinking
about trying to make some money off of their UNIX patents by
applying them to Linux users. While there is a grain of truth
(it's not Linux specifically, but certain vendors that
redistribute SCO libraries on Linux), this article has some
great comments about claims such as these, and points out the
difference between some of the Unix trademarks.


Why I'd Miss Mandrake

This is an interesting piece on one person's opinion of
Mandrake's recent bankruptcy filing. The user comments at the
end are also interesting.


.NET is a Marketing Program; Java is a Market

Every so often Sun comes out with a good blast at one of its
competitors. This time around, their sights are set on
Microsoft's .NET product, or whatever it's being called these
days. In the same breath, of course, they tout Java as the answer.


I Guess It Ran Windows

It's the only explanation...


3) Linux Resources

CFEngine in Perl

CFEngine is a piece of software that lets you centrally manage
configuration files across your enterprise. This article is a
look at a Perl fork of it, which promises to enhance what
CFEngine can do. The author is writing this as a series of
articles on how to write an application, so if you're looking to
become a developer (or just hack around like I do) then it's
worth a read.


It's All About Efficiency

Sure, your PHP Nuke site might hold up under light load. But
when your hits increase tenfold in a matter of minutes, what
happens? This article looks at the design of the aceshardware
web site, which uses efficient caching to avoid database hits,
among other things. Even if you're not a developer, it's an
interesting look at what works and what doesn't.


Is It Really Gone?

While I'm sure many people get rid of hard drives without even
so much as a formatting, I was surprised at how much can be
recovered even after this has been done. If security of the data
is important, and you need it wiped, remember to do a pass or
two of "dd if=/dev/zero of=/dev/DEVICE" before you give it away.
(/dev/urandom might also be a good source of data to write to
the drive.)


RPM Really Broken?

I've seen RPM really blow it a couple of times, requiring that
you rebuild the system RPM database. It's not too hard, as long
as you know the command (the commands to fix it and to destroy
it are usually presented together in the documentation, which
leads to confusion). My hero, Shadowman, lends his thoughts
on the issue, along with a couple of other handy ones.


VNC Not Working?

Cramsession reader Cody (who is really getting the hang of this
Linux thing) had a bit of trouble with VNC not showing him the
display he was used to. Looks like some config files were out of
order. Read the thread to learn more.


4) App o' the Week

I'm not sure why I haven't mentioned AbiWord before. It's a
great word processor, and is very lightweight. I use it mainly
for viewing Word attachments within email, but it's a
fully-functional piece of software.


(C) 2003 BrainBuzz.com, Inc. All Rights Reserved.
