How Web Documents Work

Submitted by Syscrusher on Mon, 2005/06/06 - 22:56.

Before you begin creating web pages, it's a good idea to understand how the Web itself works, at least superficially.

A web page is (generally) a file that is stored on the hard disk of a computer called a web server. The person (you!) who wants to view that particular page, generally from a personal computer (PC), uses a program called a web browser or simply a browser. (Actually, the technical term is a "user agent" but this tutorial will stick with the common terminology instead.) Examples of browsers are programs like Netscape Navigator, Microsoft Internet Explorer, Sun Microsystems' HotJava browser, Opera, Amaya, Lynx, and Charlotte.

In a browser, you choose a particular Uniform Resource Locator (URL), which is also sometimes called a web address, by manually typing it into the browser's address line or by selecting a hyperlink from a previous page. The URL specifies four things:

  1. The protocol (communication method) that will be used to retrieve the page. Most web pages use the Hypertext Transfer Protocol (HTTP), abbreviated http: in web addresses.
  2. The name of the computer (server) where the web page resides. This is also called the host name of the computer. Each host name on the Internet is unique worldwide. Many companies give their web servers a name that begins with "www" so that potential customers will easily remember it, but this is only a convention, not a requirement. For example, Netscape's main web server is named "www.netscape.com" but there is a different computer called "developer.netscape.com". In the URL "http://www.netscape.com" it is the "http:" and not the "www" that tells your browser that this is a web site and not a newsgroup or e-mail destination. In the simplest case, the part of the host name before the first period is the name of the computer, and the rest of the host name is the domain name of the company, organization, or individual who owns that computer. In the URL "http://www.netscape.com", there is a computer named "www" that is owned by Netscape Communications, as indicated by the "netscape.com" domain.
  3. The directory and/or filename of the page on the server. Sometimes what appears to be a "filename" is really the name of a program that is running on the server each time someone requests that "page". In other words, the "page" doesn't really exist, but is created especially for each user, as needed. This is very common with interactive pages such as online shopping carts and search engines.
  4. If the "page" is really a program, as explained above, then the URL may also contain instructions (parameters) to tell the program what to do. These parameters are optional, and their format and meaning depend on the design of the particular web site or program.

Here's an example: http://www.example.com/marketing/products.pl?topic=vcr. In this URL, HTTP will be used to request a program called "products.pl", which is stored in the "marketing" folder (directory) on a computer called "www". That computer is owned by whatever company owns the domain example.com. The question mark separates the program's name from its parameters. When the server runs the program, the parameter "topic=vcr" tells it what kind of equipment (in this case, VCRs) you would like to examine and, perhaps, purchase online.
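To see the four parts for yourself, here is a minimal sketch in Python that takes the example URL apart using the standard urllib.parse module. The variable names are chosen only for illustration, and splitting the host name at the first period follows the simple "computer name plus domain" picture described above, which does not hold for every host name on the Internet.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.example.com/marketing/products.pl?topic=vcr"
parts = urlsplit(url)

protocol = parts.scheme          # 1. the protocol
host = parts.hostname            # 2. the host name of the server
path = parts.path                # 3. the directory and filename
params = parse_qs(parts.query)   # 4. the optional parameters

# Simplified view: the text before the first period is the computer's
# name, and the rest is the owner's domain name.
computer, _, domain = host.partition(".")

print("Protocol:  ", protocol)
print("Host name: ", host, "(computer:", computer + ", domain:", domain + ")")
print("Path:      ", path)
print("Parameters:", params)
```

Notice that parse_qs returns each parameter's value as a list, because a URL is allowed to repeat the same parameter name more than once.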

For both normal web pages, such as the ones you will learn how to create, and these complex programs used by large corporations, the next step is the same. The web server either creates the document (for a program) or reads it from a disk file (for regular pages), and sends it over the Internet back to your browser.
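The server's choice between "read a file" and "run a program" can be sketched in a few lines of Python. This is only an illustration, not how any particular web server is built: the STATIC_FILES and PROGRAMS tables, the products_program function, and the handle_request function are all hypothetical names invented for this example, and a dictionary stands in for the server's hard disk.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical stand-in for regular pages stored on the server's disk.
STATIC_FILES = {
    "/index.html": "<html><body>Welcome!</body></html>",
}

def products_program(params):
    """Hypothetical program that creates a page for each request."""
    topic = params.get("topic", ["everything"])[0]
    return "<html><body>Products about " + topic + "</body></html>"

# Hypothetical table of paths that are really programs, not files.
PROGRAMS = {
    "/marketing/products.pl": products_program,
}

def handle_request(url):
    """Return the document for a URL, either generated or read from 'disk'."""
    parts = urlsplit(url)
    if parts.path in PROGRAMS:
        # The "page" is a program: run it with the URL's parameters.
        return PROGRAMS[parts.path](parse_qs(parts.query))
    if parts.path in STATIC_FILES:
        # The page is a regular file: send it back unchanged.
        return STATIC_FILES[parts.path]
    return "<html><body>404 Not Found</body></html>"

print(handle_request("http://www.example.com/marketing/products.pl?topic=vcr"))
print(handle_request("http://www.example.com/index.html"))
```

Either way, the browser receives a finished document and cannot tell from the document alone whether it came from a file or was created on the spot.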

Finally, your browser reads the document and decides how to display it on your computer screen. Perhaps the document contains some graphic images, such as icons or photographs, which are to be displayed along with the text. In truth, the document itself does not "contain" these, but rather specifies which images are to be displayed and where. The browser makes an internal list of all the images that it needs, and then requests each one separately, usually from the same server that sent the document.
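The "internal list of images" step might look roughly like the following Python sketch, which uses the standard html.parser and urllib.parse modules. The ImageLister class, the sample document, and the page address are made up for this example; a real browser does far more, but the idea of collecting image addresses first and requesting them afterwards is the same.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImageLister(HTMLParser):
    """Collects the full URL of every image a document refers to."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Image addresses are often relative to the page's own URL.
                self.image_urls.append(urljoin(self.page_url, src))

# A made-up document: it names two images but does not contain them.
document = """<html><body>
<h1>Our Products</h1>
<img src="logo.gif" alt="Company logo">
<p>Here is our newest VCR:</p>
<img src="/images/vcr.jpg" alt="A VCR">
</body></html>"""

lister = ImageLister("http://www.example.com/marketing/products.html")
lister.feed(document)

# Each URL in this list would become its own separate request.
for image_url in lister.image_urls:
    print(image_url)
```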

It is very important to understand that the web server can only control what information (the document) is sent to the browser -- it has no control over exactly how the browser will display that information. As the web designer, you create the documents that come from the server, but your site's visitors control their browsers. In this tutorial, you will learn how to work within this constraint and even to use it constructively to make your pages better and easier to use.
