Guide to HTML
Basic Document Structure

Basic HTML Document

Each and every HTML document, regardless of its contents, should match the following structure.

<html>
<head>
  <title>Document title</title>
</head>
<body>
  Document contents
</body>
</html>

More information can be found in the Enhanced Document Structure.


Document Type Definition

The document type statement can be placed before the <html> tag, as the first line of the document, to identify the document as HTML and to specify which version of HTML it uses. If no browser-specific extensions are used (i.e. HTML levels 0, 1, and 2), the line looks like this:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
<head>
  ...
</head>
<body>
  ...
</body>
</html>

The part between quotes is the public identifier.

For HTML 2.0 you can use the identifier of the Internet Engineering Task Force (IETF):

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

For the (expired) HTML 3 proposal use:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.0//EN">

However, the W3C identifier is recommended for HTML 3.0 and up:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3 1995-03-24//EN">

Or, if you are using features from HTML 3.2:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

or

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

Or for HTML 4.0:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">

The doc type statement is not mandatory; your pages will work without it. For more information on the Document Type Definition (DTD), you can visit http://www.w3.org.

Some editors insert a DOCTYPE automatically. However, it does not always reflect the HTML the editor actually generates. (A bad example is MS Publisher.)


Common Remarks

It does not matter whether you write HTML tags in capitals or not; the result is the same, so it is a matter of taste. However, keep in mind that URLs used in links are, in general, case sensitive.
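
As a small illustration of this point, the first two lines below should render identically, while the two links may well lead to different files on a case-sensitive server. The file names are invented for this example.

<!-- Tag names are not case sensitive: both lines give bold text -->
<B>Important</B>
<b>Important</b>

<!-- Hypothetical file names: on many servers these are two different files -->
<a href="Index.html">Home</a>
<a href="index.html">Home</a>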

Hard returns in the source do not show up as line breaks in the browser. Using plenty of line breaks in the code, and adding remarks here and there, simply makes the page easier to maintain.

Two or more consecutive spaces are displayed as a single space. If you need more than one space in a row, use non-breaking space symbols: &nbsp;.

If a sentence in the source code of your HTML document is divided over more than one line, the browser will automatically insert a space between the last word of the line and the first word of the next.
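
The whitespace rules above can be sketched with a short fragment. The text is made up for illustration. The first paragraph should be displayed as "This sentence is split over three lines." with single spaces throughout, and only the second paragraph keeps a wider gap.

<!-- Line breaks and runs of spaces in the source collapse to a single space -->
<p>
This sentence is
split over      three
lines.
</p>

<!-- Non-breaking spaces are not collapsed: three spaces appear after the colon -->
<p>Name:&nbsp;&nbsp;&nbsp;value</p>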

Some browsers do the same with images: placing a series of images on one line or placing each image on a separate line may give different results depending on the browser.

When composing HTML, you can use either a WYSIWYG editor or a text-based editor. The drawback of an editor that hides the actual HTML code from you is that the code it generates might not always comply with the rules of syntax.

Check some of your documents with an HTML verifier. This will give you a pretty good idea of your authoring skills. A very good verifier can be found at:

Most modern browsers are quite tolerant about the code they receive and errors will not always be noticed. By using a verifier on the Web, you can find out about forgotten closing tags, improperly used tags, and other common mistakes.

When specifying attribute values in HTML tags, surround them with double quotes ("); some browsers may choke on single quotes ('). When a value consists only of letters, digits, hyphens, and periods, quotes are not mandatory, but using them is a good habit.

URLs always have to be in quotes.

Once you have started with an opening quote, do not forget the closing quote, or the browser might misinterpret your code.
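
A brief sketch of the quoting advice follows. The file names and the image size are assumptions chosen only for the example. The last line shows the mistake to avoid and is not meant to be copied.

<!-- Quoted values: the recommended habit -->
<img src="logo.gif" alt="Company logo" width="120">

<!-- A value made only of digits may go unquoted, though quoting is safer -->
<img src="logo.gif" alt="Company logo" width=120>

<!-- Mistake: the closing quote after contact.html is missing, -->
<!-- so the browser may treat the text that follows as part of the URL -->
<a href="contact.html>Contact us</a>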

Avoid interlocked elements, i.e. always close tags in the reverse order in which you opened them: <b><i>...</i></b>. With a simple line of text the browser will usually still render the result properly, but it can get nasty when you make that mistake in nested lists or tables.
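
To make the nesting rule concrete, here is a minimal sketch with invented text: a correct and an interlocked version of the same line, followed by a nested list in which every inner element is closed before its parent.

<!-- Properly nested: the element opened last is closed first -->
<b><i>bold and italic</i></b>

<!-- Interlocked: avoid this -->
<b><i>bold and italic</b></i>

<!-- Nested list: the inner list is closed before the item that contains it -->
<ul>
  <li>First item
    <ul>
      <li>Nested item</li>
    </ul>
  </li>
  <li>Second item</li>
</ul>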



Paragraph separator