Jan 23 2003


                    LINUX NEWS
            http://www.Cramsession.com
          January 23, 2003 -- Issue #116


TABLE OF CONTENTS

1) Sean’s Notes

2) Linux News

SCO Licencing?
Why I'd Miss Mandrake
.NET is a Marketing Program; Java is a Market
I Guess It Ran Windows

3) Linux Resources

CFEngine in Perl
It's All About Efficiency
Is It Really Gone?
RPM Really Broken?
VNC Not Working?

4) App o’ the Week



1) Sean’s Notes

I can’t think of an application more synonymous with Linux than the Apache web server. It’s powerful and flexible. It’s fast. It’s free.

I’ve always found the origins of this software to be interesting. Way back when, the most popular web server software was made by the National Center for Supercomputing Applications (NCSA), and was known as NCSA HTTPd. After the guy who did most of the work on it left the NCSA (to go to a startup company called Netscape, if memory serves), the product started to fall behind.

In true community fashion, people were still producing bugfixes. Taking the initiative, a couple of guys forked the NCSA code branch and applied the user-contributed patches. The result was “A Patchy” web server, which we now know as Apache. It quickly became the most popular server on the Internet. (Remember, we’re talking 1995 here, so the web was quite small).

In this series of articles, I’ll walk through the workings of a web server, and then we’ll get into Apache configuration.

The job of a web server isn’t really that hard. It generally listens on TCP port 80 for incoming connections, and upon receiving one, reads in a request and a series of headers, followed by a blank line signifying that the request is complete. The response is similar, a status line, some headers, a blank line, and then the content.
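To make the shape of a response concrete, here is a minimal sketch in Python. The function name build_response and the choice of headers are my own for illustration; a real server sends many more headers and handles far more cases.

```python
def build_response(body: bytes, status: int = 200, reason: str = "OK",
                   content_type: str = "text/html") -> bytes:
    """Assemble a status line, some headers, a blank line, then the content."""
    head = [
        f"HTTP/1.1 {status} {reason}",       # status line
        f"Content-Type: {content_type}",     # headers
        f"Content-Length: {len(body)}",
    ]
    # Each line ends in CRLF; an empty line separates the headers from the content.
    return ("\r\n".join(head) + "\r\n\r\n").encode("ascii") + body


response = build_response(b"<html>hello</html>")
print(response.decode("ascii"))
```

The only structural rule to remember is the blank line: everything before it is metadata, everything after it is the document.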

A request for a URL such as http://www.cramsession.com/default.asp might look like this:

GET /default.asp HTTP/1.1
Host: www.cramsession.com

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Thu, 23 Jan 2003 03:20:54 GMT
Content-Length: 30134
Content-Type: text/html
Expires: Thu, 23 Jan 2003 03:20:54 GMT
Set-Cookie: ctm%5Fa=0; path=/
Set-Cookie: IPRO=UV&5098929; expires=Sat, 10-Aug-2002 04:00:00 GMT; domain=.brainbuzz.com; path=/
Set-Cookie: ASPSESSIONIDCCDQACQA=LNGFLPEAPNPMNPOOMFLEEBDA; path=/
Cache-control: private

<html>…

The first three lines (the request line, the Host header, and the blank line) are what the client sends. The first word specifies the method, most often GET (i.e., to get a page). After that comes the full path to the resource being requested, in this case /default.asp. The leading ’/’ is important; without it we have what’s known as a relative URL, which will generate an error. After the path comes the version of HTTP we’re speaking. Pretty much everything these days is version 1.1.

Between the request line and the blank line is a series of headers. I could pass my browser, the languages I’ll accept, my telephone number, anything. Some are used by the web server, and some are passed along to the pages. In this case, I’m telling the web server that I’m requesting the page within the www.cramsession.com site. Since we’re connected to www.cramsession.com anyway, this may seem redundant. However, as we’ll see when we make our way over to virtual domains, we can have thousands of domains served off of one IP address. One thing that HTTP 1.1 brings in is the Host header, which the client sends along to tell the remote web server which virtual domain is being called.
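Here is roughly how a server might pull a request apart, as a sketch in Python. The function name parse_request is made up for this example, and it assumes a well-formed request with no body; real servers have to cope with much messier input.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into method, path, version and headers."""
    head, _, _ = raw.partition("\r\n\r\n")    # everything before the blank line
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ", 2)   # the request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


raw = "GET /default.asp HTTP/1.1\r\nHost: www.cramsession.com\r\n\r\n"
method, path, version, headers = parse_request(raw)
print(method, path, version, headers["host"])
```

A server doing virtual hosting would look at headers["host"] to decide which site's files to serve.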

Since I’m not interested in giving Cramsession any more information about me, I’ll just send along a blank line to signify that my request is done. I get back a line saying that the response is in HTTP 1.1 format, and the status code is 200. A short English phrase is also returned to help with debugging, in this case simply “OK”. In general, response codes are 3 digits. The first digit has significance:

2 - Things are good

3 - Not an error, but you’re not getting content back either. This could mean a redirect to another URL, or that your local copy of the document is still valid, so there is no need to send a copy. The RFC calls this “Redirection”.

4 - Client error. It’s your problem.

5 - Server error. It’s my problem.
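The first-digit rule is easy to express in code. This is just a sketch; the names STATUS_CLASSES and status_class are mine, and the labels are shorthand for the list above.

```python
# Map the first digit of a status code to its general meaning.
STATUS_CLASSES = {
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Classify a three-digit HTTP status code by its first digit."""
    return STATUS_CLASSES.get(code // 100, "other")


for code in (200, 304, 404, 500):
    print(code, status_class(code))
```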

As you get down within each class of responses, the other digits tell you more. You’re familiar with 404 (file not found) and possibly 500 (server error). As usual, the RFC tells you everything:

http://www.ietf.org/rfc/rfc2616.txt

(status codes are in section 6.1.1)

Some of the headers are of interest. “Content-type: text/html” is on every response; it tells you what MIME type the response is in. If we had requested a GIF, we’d see “image/gif”. This is for the web browser. If you ever see text when you expected an image or an executable, chances are your web server is mucking up the MIME type.

The Date and Expires headers, when combined, tell us that the remote server doesn’t want us to cache the page (the current date and the expiry date are the same).
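A client could make that comparison along these lines. This is only a sketch of the Date versus Expires check; the function name allows_caching is made up, and real caches also look at Cache-Control and other headers.

```python
from email.utils import parsedate_to_datetime


def allows_caching(headers: dict) -> bool:
    """True only if the page expires later than the time it was served."""
    served = parsedate_to_datetime(headers["Date"])
    expires = parsedate_to_datetime(headers["Expires"])
    return expires > served


headers = {
    "Date": "Thu, 23 Jan 2003 03:20:54 GMT",
    "Expires": "Thu, 23 Jan 2003 03:20:54 GMT",
}
print(allows_caching(headers))
```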

Of particular interest are the “Set-Cookie” lines. You’ve heard of cookies before, right? Well, that’s how they look. If we already had a cookie to present to the site, we’d have sent it in our headers. As it stands, the Cramsession web site is handing us three cookies; it looks like the latter two are for ad banner tracking and ASP sessions respectively. On our next request to the site, we’d be expected to send those back to the server.
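To show the round trip, here is a sketch that takes Set-Cookie values and builds the Cookie header a client would send back. The function names are my own, and the sketch deliberately ignores expiry, domain and path matching, which a real browser has to honour.

```python
def parse_set_cookie(value: str):
    """Return (name, value, attributes) from one Set-Cookie header value."""
    first, *rest = [part.strip() for part in value.split(";")]
    name, _, val = first.partition("=")
    attrs = {}
    for part in rest:
        if part:
            key, _, attr_val = part.partition("=")
            attrs[key.lower()] = attr_val
    return name, val, attrs


def cookie_header(set_cookie_values) -> str:
    """Build the Cookie header to send on the next request."""
    pairs = []
    for value in set_cookie_values:
        name, val, _ = parse_set_cookie(value)
        pairs.append(f"{name}={val}")
    return "Cookie: " + "; ".join(pairs)


received = [
    "ctm%5Fa=0; path=/",
    "ASPSESSIONIDCCDQACQA=LNGFLPEAPNPMNPOOMFLEEBDA; path=/",
]
print(cookie_header(received))
```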

Of course, you noticed the server header, letting us know that Cramsession is a proud user of Microsoft IIS 5.0. Oh well, nobody’s perfect.

Once the server gets the request, how does it know what to return? On a static content server (i.e. nothing ever changes), the web server can append the request to the root of the web site and return the file. So, if I’m asked for /index.html, and the root of my web server is /var/www/html, I’ll return the file called /var/www/html/index.html. Of course, the web server has to do some sanity checking, because if I asked it for ../../../etc/passwd, it had better not hand out /var/www/html/../../../etc/passwd (which maps to the system password file). Don’t laugh, this is a common problem, especially once you get into dynamic content!
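One way to do that sanity check is to normalize the path and then confirm it still sits under the document root. The sketch below uses a hypothetical resolve function; it assumes the request path has already been URL-decoded and it does not deal with symbolic links.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"


def resolve(request_path: str, root: str = DOCUMENT_ROOT):
    """Map a request path to a file under root, or None if it escapes the root."""
    joined = posixpath.join(root, request_path.lstrip("/"))
    candidate = posixpath.normpath(joined)   # collapses any ".." components
    if candidate == root or candidate.startswith(root + "/"):
        return candidate
    return None


print(resolve("/index.html"))
print(resolve("../../../etc/passwd"))
```

The important part is checking the path after normalization, not before; a simple search for a leading "../" in the raw request is easy to get around.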

Now, it could be that the web server wants to generate dynamic pages. The original way to do this was through CGI, the common gateway interface. The client would request something like:

GET /cgi-bin/form.pl

…which would run a script called form.pl. form.pl would do its stuff, like emailing the results of a form, and then return HTML or a status code indicating a redirect.

Dynamic content brings up an interesting question… How do we get data from the client to the server? With the GET method, it’s tacked on to the request:

GET /cgi-bin/form.pl?fname=sean&lname=walberg HTTP/1.1

The question mark separates the path from the query string. Within the query string, the key=value pairs are separated by an ampersand. Thus, I’m sending two variables to form.pl. CGI specifies that the web server puts the query string into the script’s environment (in the QUERY_STRING variable), where the script can decode it as necessary.
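Decoding a query string looks something like this sketch, which uses Python's standard urllib.parse module rather than anything specific to form.pl.

```python
from urllib.parse import urlsplit, parse_qs

target = "/cgi-bin/form.pl?fname=sean&lname=walberg"

parts = urlsplit(target)          # splits at the question mark
params = parse_qs(parts.query)    # splits on "&" and "=", decoding each value

print(parts.path)
print(params)
```

Each value comes back as a list, because the same key is allowed to appear more than once in a query string.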

When the GET isn’t enough, we’ve got the POST method. It looks just like the GET method, except that the data goes in the body of the request, after the headers and the blank line:

POST /cgi-bin/form.pl HTTP/1.1
Header: blah
…

fname=sean&lname=walberg

A POST is generally used when there is a lot of data to send, as the server is not obligated to accept long query strings.
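Here is a sketch of how a client might assemble such a POST. The function name build_post and the host www.example.com are placeholders of mine; the point is that the form data travels in the body, after the blank line, with a Content-Length header telling the server how much to read.

```python
from urllib.parse import urlencode


def build_post(path: str, host: str, fields: dict) -> str:
    """Build a form POST: request line, headers, blank line, then the data."""
    body = urlencode(fields)
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",
        "",      # the blank line that ends the headers
        body,
    ]
    return "\r\n".join(lines)


request = build_post("/cgi-bin/form.pl", "www.example.com",
                     {"fname": "sean", "lname": "walberg"})
print(request)
```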

Between static pages and CGI lies another technology, called server parsed pages. Static content is interspersed with tags that the web server executes as code. With the popular PHP language, I could put a tag like <?=$fname?> which would be replaced by “sean” in the URL above.
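The idea of a server parsed page can be imitated in a few lines. To be clear, this sketch is not how PHP works internally; it is a toy render function of my own that swaps <?=$name?> tags for values, just to show static content and substituted values living in one file.

```python
import html
import re

# Matches tags of the form <?=$name?> and captures the variable name.
TAG = re.compile(r"<\?=\$(\w+)\?>")


def render(page: str, variables: dict) -> str:
    """Replace each <?=$name?> tag with the (HTML-escaped) value of name."""
    def substitute(match):
        return html.escape(variables.get(match.group(1), ""))
    return TAG.sub(substitute, page)


page = "<p>Hello, <?=$fname?>!</p>"
print(render(page, {"fname": "sean"}))
```

Escaping the value before it goes into the page matters, since the value came from the client and can't be trusted.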

Web servers, especially Apache, are capable of all this and more. As we get into the configuration, we’ll see how this is implemented, and some of the tools available to the administrator. Until then, we’ve looked a bit at the HTTP protocol, and how the web server translates a request into a response.

Long live the Penguin,

Sean mailto:swalberg@cramsession.com


2) Linux News


SCO Licencing?

A week or so ago, there were some musings about SCO thinking about trying to make some money off of their UNIX patents by applying them to Linux users. While there is a grain of truth (it’s not Linux specifically, but certain vendors that redistribute SCO libraries on Linux), this article has some great comments about claims such as these, and points out the difference between some of the Unix trademarks.

http://www.practical-tech.com/business/b01162003.htm


Why I’d Miss Mandrake

This is an interesting piece on one person’s opinion of Mandrake’s recent bankruptcy filing. The user comments at the end are also interesting.

http://newsforge.com/newsforge/03/01/17/238248.shtml?tid=23


.NET is a Marketing Program; Java is a Market

Every so often Sun comes out with a good blast at one of its competitors. This time around, their sights are set on Microsoft’s .NET product, or whatever it’s being called these days. In the same breath, of course, they tout Java as the answer.

http://wwws.sun.com/software/dotnet_statement.html


I Guess It Ran Windows

It’s the only explanation…

http://www.dailycamera.com/bdc/city_news/article/0,1713,BDC_2422_1661447,00.html


3) Linux Resources


CFEngine in Perl

CFEngine is a piece of software that lets you centrally manage configuration files across your enterprise. This article is a look at a Perl fork of it, which promises to enhance what CFEngine can do. The author is writing this as a series of articles on how to write an application, so if you’re looking to become a developer (or just hack around like I do) then it’s worth a read.

http://www-106.ibm.com/developerworks/linux/library/l-road6.html


It’s All About Efficiency

Sure, your PHP Nuke site might hold up under light load. But when your hits increase tenfold in a matter of minutes, what happens? This article looks at the design of the aceshardware web site, which uses efficient caching to avoid database hits, among other things. Even if you’re not a developer, it’s an interesting look at what works and what doesn’t.

http://www.aceshardware.com/read.jsp?id=50000347


Is It Really Gone?

While I’m sure many people get rid of hard drives without even so much as a formatting, I was surprised at how much can be recovered even after this has been done. If security of the data is important, and you need it wiped, remember to do a pass or two of “dd if=/dev/zero of=/dev/DEVICE” before you give it away. (/dev/urandom might also be a good source of data to write to the drive.)

http://sfgate.com/cgi-bin/article.cgi?f=/news/archive/2003/01/15/national1617EST0765.DTL


RPM Really Broken?

I’ve seen RPM really blow it a couple of times, requiring that you rebuild the system RPM database. It’s not too hard, as long as you know the command (the commands to fix the database and to destroy it are usually presented together in the documentation, which leads to confusion). My hero, Shadowman, lends his thoughts on the issue, along with a couple of other handy ones.

http://www.redhat.com/advice/ask_shadowman.html


VNC Not Working?

Cramsession reader Cody (who is really getting the hang of this Linux thing) had a bit of trouble with VNC not showing him the display he was used to. Looks like some config files were out of order. Read the thread to learn more.

http://boards.cramsession.com/boards/vbm.asp?m=68127


4) App o’ the Week

I’m not sure why I haven’t mentioned AbiWord before. It’s a great Word Processor, and is very lightweight. I use it mainly for viewing Word attachments within email, but it’s a fully-functional piece of software.

http://sourceforge.net/projects/abiword/


(C) 2003 BrainBuzz.com, Inc. All Rights Reserved.

