This is a reference manual for producing documents in HTML, the hypertext markup language used on the World Wide Web. This manual conforms to the standard accepted by most common browsers. I gathered the information mainly to keep it all in one convenient place for my own reference, but you're welcome to use it if you like. It should give you enough information to write source-level HTML that can be used on the Web. You can develop working Web pages using a simple text editor. The information here was gleaned from many sources. Although I've tried to keep it up to date, new features are always being introduced and may not have been included.
To denote the various elements in an HTML document, you use tags. HTML tags consist of a left angle bracket (<), a tag name, and a right angle bracket (>). Tags are usually paired (e.g., <H1> and </H1>) to start and end the tag instruction. The end tag looks just like the start tag except a slash (/) precedes the text within the brackets. HTML tags are listed below.
Some elements may include an attribute, which is additional information that is included inside the start tag. For example, you can specify the alignment of images (top, middle, or bottom) by including the appropriate attribute with the image source HTML code. Tags that have optional attributes are noted below.
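As a small illustration of where an attribute goes, the sketch below adds an ALIGN attribute to a heading's start tag. The heading text is made up for the example, and ALIGN on headings is assumed to be supported by the reader's browser.

```html
<!-- The attribute sits inside the start tag only; the end tag stays bare. -->
<H1 ALIGN=CENTER>Quarterly Report</H1>
```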
NOTE: HTML is not case sensitive. <title> is equivalent to <TITLE> or <TiTlE>. There are a few exceptions noted in Escape Sequences below.
Not all tags are supported by all World Wide Web browsers. If a browser does not support a tag, it will simply ignore it. Any text placed between a pair of unknown tags will still be displayed, however.
Required elements are shown in this sample bare-bones document:
<html>
<head>
<TITLE>A Simple HTML Example</TITLE>
</head>
<body>
<H1>HTML is Easy To Learn</H1>
<P>Welcome to the world of HTML.
This is the first paragraph. While short it is still a paragraph!</P>
<P>And this is the second paragraph.</P>
</body>
</html>

The required elements are the <html>, <head>, <title>, and <body> tags (and their corresponding end tags). Because you should include these tags in each file, you might want to create a template file with them. (Some browsers will format your HTML file correctly even if these tags are not included. But some browsers won't! So make sure to include them.)
Click to see the formatted version of the example. A longer example is also available but you should read through the rest of the guide before you take a look. This longer-example file contains tags explained in the next section.
This is an excellent way to see how HTML is used and to learn tips and constructs. Of course, the HTML might not be technically correct. Once you become familiar with HTML and check the many online and hard-copy references on the subject, you will learn to distinguish between "good" and "bad" HTML.
Remember that you can save a source file with the HTML codes and use it as a template for one of your Web pages or modify the format to suit your purposes.
The syntax of the heading element is:
<Hy>Text of heading</Hy>
where y is a number between 1 and 6 specifying the level of the heading.
Do not skip levels of headings in your document. For example, don't start with a level-one heading (<H1>) and then next use a level-three (<H3>) heading.
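A minimal sketch of heading levels used in order, with each level nested under the one above it. The headings and paragraph text are placeholders invented for the illustration.

```html
<H1>Fruit Guide</H1>
<P>Introductory text for the whole document.</P>

<H2>Citrus</H2>
<P>Text about citrus fruit in general.</P>

<H3>Grapefruit</H3>
<P>Text about one kind of citrus.</P>
```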
In the bare-bones example shown in the Minimal HTML Document section, the first paragraph is coded as
<P>Welcome to the world of HTML.
This is the first paragraph. While short it is still a paragraph!</P>

In the source file there is a line break between the sentences. A Web browser ignores this line break and starts a new paragraph only when it encounters another <P> tag.
Important: You must indicate paragraphs with <P> elements. A browser ignores any indentations or blank lines in the source text. Without <P> elements, the document becomes one large paragraph. (One exception is text tagged as "preformatted," which is explained below.) For example, the following would produce the same output as the first bare-bones HTML example:
<H1>Level-one heading</H1> <P>Welcome to the world of HTML. This is the first paragraph. While short it is still a paragraph! </P> <P>And this is the second paragraph.</P>

To preserve readability in HTML files, put headings on separate lines, use a blank line or two where it helps identify the start of a new section, and separate paragraphs with blank lines (in addition to the <P> tags). These extra spaces will help you when you edit your files (but your browser will ignore the extra spaces because it has its own set of rules on spacing that do not depend on the spaces you put in your source file).
NOTE: The </P> closing tag may be omitted. This is because browsers understand that when they encounter a <P> tag, it means that the previous paragraph has ended. However, since HTML now allows certain attributes to be assigned to the <P> tag, it's generally a good idea to include it.
Using the <P> and </P> as a paragraph container means that you can center a paragraph by including the ALIGN=alignment attribute in your source file.
<P ALIGN=CENTER> This is a centered paragraph. [See the formatted version below.] </P>
It is also possible to align a paragraph to the right instead, by including the ALIGN=RIGHT attribute. ALIGN=LEFT is the default alignment; if no ALIGN attribute is included, the paragraph will be left-aligned.
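For comparison, here is a short sketch of a right-aligned paragraph followed by one that relies on the default alignment. The paragraph text is invented for the example.

```html
<P ALIGN=RIGHT>
This paragraph is set against the right margin.
</P>

<P>
This paragraph has no ALIGN attribute, so it is left-aligned.
</P>
```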
Unnumbered Lists
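To make an unnumbered, bulleted list, start with an opening <UL> tag, begin each item with an <LI> tag, and end the list with a closing </UL> tag:
<UL>
<LI> apples
<LI> bananas
<LI> grapefruit
</UL>

The output is: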
Numbered Lists
A numbered list (also called an ordered list, from which the tag name derives) is identical to an unnumbered list, except it uses <OL> instead of <UL>. The items are tagged using the same <LI> tag. The following HTML code:
<OL>
<LI> oranges
<LI> peaches
<LI> grapes
</OL>

produces this formatted output:
A definition list (coded as <DL>) usually consists of alternating a definition term (coded as <DT>) and a definition description (coded as <DD>). Web browsers generally format the definition on a new line and indent it.
The following is an example of a definition list:
<DL>
<DT> ITEMS
<DD> the National Center for Supercomputing Applications, is located on the campus of the University of Illinois at Urbana-Champaign.
<DT> Cornell Theory Center
<DD> CTC is located on the campus of Cornell University in Ithaca, New York.
</DL>

The output looks like:
The COMPACT attribute can be used routinely in case your definition terms are very short. If, for example, you are showing some computer options, the options may fit on the same line as the start of the definition.
<DL COMPACT>
<DT> -i
<DD>invokes Netscape or other browser Windows using the initialization file defined in the path
<DT> -k
<DD>invokes Netscape or other browser Windows in kiosk mode
</DL>

The output looks like:
Lists can be nested. You can also have a number of paragraphs, each containing a nested list, in a single list item.
Here is a sample nested list:
<UL>
<LI> A few New England states:
    <UL>
    <LI> Vermont
    <LI> New Hampshire
    <LI> Maine
    </UL>
<LI> Two Midwestern states:
    <UL>
    <LI> Michigan
    <LI> Indiana
    </UL>
</UL>

The nested list is displayed as

Text between <PRE> and </PRE> tags keeps the spacing and line breaks of the source file. For example, the following lines:
<PRE>
#!/bin/csh
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
cfs get myinfile:mycfsdir/myinfile
fc -02 -o mya.out mysrc.f
mya.out
cfs save myoutfile:mycfsdir/myoutfile
rm *
</PRE>

display as:
#!/bin/csh
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
cfs get myinfile:mycfsdir/myinfile
fc -02 -o mya.out mysrc.f
mya.out
cfs save myoutfile:mycfsdir/myoutfile
rm *

The <PRE> tag can be used with an optional WIDTH attribute that specifies the maximum number of characters for a line. WIDTH also signals your browser to choose an appropriate font and indentation for the text.
Hyperlinks can be used within <PRE> sections. You should avoid using other HTML tags within <PRE> sections, however.
Note that because <, >, and & have special meanings in HTML, you must use their escape sequences (&lt;, &gt;, and &amp;, respectively) to enter these characters. See the section Escape Sequences for more information.
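The sketch below shows escape sequences inside a preformatted section. The code fragment being displayed is made up for the illustration; a browser would show the first line as: if (a < b && b > c) {

```html
<PRE>
if (a &lt; b &amp;&amp; b &gt; c) {
    swap(a, c);
}
</PRE>
```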
In the example:
<P>Omit needless words.</P>
<BLOCKQUOTE>
<P>Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. </P>
<P>--William Strunk, Jr., 1918 </P>
</BLOCKQUOTE>

the result is:
Omit needless words.
Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts.
--William Strunk, Jr., 1918

National Center for Supercomputing Applications<BR>
605 East Springfield Avenue<BR>
Champaign, Illinois 61820-5518<BR>

The output is:
National Center for Supercomputing Applications
605 East Springfield Avenue
Champaign, Illinois 61820-5518
You can vary a rule's size (thickness) and width (the percentage of the window covered by the rule). Experiment with the settings until you are satisfied with the presentation. For example:
<HR SIZE=4 WIDTH="50%">

displays as:
In the ideal SGML universe, content is divorced from presentation. Thus SGML tags a level-one heading as a level-one heading, but does not specify that the level-one heading should be, for instance, 24-point bold Times centered. The advantage of this approach (it's similar in concept to style sheets in many word processors) is that if you decide to change level-one headings to be 20-point left-justified Helvetica, all you have to do is change the definition of the level-one heading in your Web browser. Indeed, many browsers today let you define how you want the various HTML tags rendered on-screen using what are called cascading style sheets, or CSS. CSS is more advanced than HTML, though, and will not be covered in this Primer. (You can learn more about CSS at the World Wide Web Consortium CSS site.)
Another advantage of logical tags is that they help enforce consistency in your documents. It's easier to tag something as <H1> than to remember that level-one headings are 24-point bold Times centered or whatever. For example, consider the <STRONG> tag. Most browsers render it in bold text. However, it is possible that a reader would prefer that these sections be displayed in red instead. (This is possible using a local cascading style sheet on the reader's own computer.) Logical styles offer this flexibility.
Of course, if you want something to be displayed in italics (for example) and do not want a browser's setting to display it differently, you should use physical styles. Physical styles, therefore, offer consistency in that something you tag a certain way will always be displayed that way for readers of your document.
Try to be consistent about which type of style you use. If you tag with physical styles, do so throughout a document. If you use logical styles, stick with them within a document. Keep in mind that future releases of HTML might not support certain logical styles, which could mean that browsers will not display your logical-style coding. (For example, the <DFN> tag -- short for "definition", and typically displayed in italics -- is not widely supported and will be ignored if the reader's browser does not understand it.)
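To make the distinction concrete, here is a brief sketch that marks up similar text once with logical styles and once with physical styles. The sentences are placeholders; <EM>, <STRONG>, <I>, and <B> are standard character tags, and how the logical ones are rendered is up to the browser.

```html
<!-- Logical styles: say what the text is -->
<P><EM>Emphasized</EM> and <STRONG>strongly emphasized</STRONG> text.</P>

<!-- Physical styles: say how the text should look -->
<P><I>Italic</I> and <B>bold</B> text.</P>
```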
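Three characters, <, >, and &, have special meanings in HTML. To use one of them in an HTML document, you must enter its escape sequence instead:
&lt; is the escape sequence for <
&gt; is the escape sequence for >
&amp; is the escape sequence for &
NOTE: Unlike the rest of HTML, the escape sequences are case sensitive. You cannot, for instance, use &LT; instead of &lt;.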
HTML's single hypertext-related tag is <A>, which stands for anchor. To include an anchor in your document:
<A HREF="MaineStats.html">Maine</A>

This entry makes the word Maine the hyperlink to the document MaineStats.html, which is in the same directory as the first document.
<A HREF="AtlanticStates/NYStats.html">New York</A>

These are called relative links because you are specifying the path to the linked file relative to the location of the current file. You can also use the absolute pathname (the complete URL) of the file, but relative links are more efficient in accessing a server. They also have the advantage of making your documents more "portable" -- for instance, you can create several web pages in a single folder on your local computer, using relative links to hyperlink one page to another, and then upload the entire folder of web pages to your web server. The pages on the server will then link to other pages on the server, and the copies on your hard drive will still point to the other pages stored there.
It is important to point out that UNIX is a case-sensitive operating system where filenames are concerned, while DOS and the MacOS are not. For instance, on a Macintosh, "DOCUMENT.HTML", "Document.HTML", and "document.html" are all the same name. If you make a relative hyperlink to "DOCUMENT.HTML", and the file is actually named "document.html", the link will still be valid. But if you upload all your pages to a UNIX web server, the link will no longer work. Be sure to check your filenames before uploading.
Pathnames use the standard UNIX syntax. The UNIX syntax for the parent directory (the directory that contains the current directory) is "..". (For more information consult a beginning UNIX reference text such as Learning the UNIX Operating System from O'Reilly and Associates, Inc.)
If you were in the NYStats.html file and were referring to the original document US.html, your link would look like this:
<A HREF="../US.html">United States</A>

In general, you should use relative links whenever possible because:
scheme://host.domain[:port]/path/filename
where scheme is one of
For example, to include a link to this document, enter:
<A HREF="http://PCExpress.TheOldDub.com/pub/"> My Reference Manual to HTML</A>

This entry makes the text My Reference Manual to HTML a hyperlink to this document.
There is also a mailto scheme, used to hyperlink email addresses, but this scheme is unique in that it uses only a colon (:) instead of :// between the scheme and the address. You can read more about mailto below.
For more information on URLs, refer to:
This guide is a good example of using named anchors in one document. The guide is constructed as one document to make printing easier. But as one (long) document, it can be time-consuming to move through when all you really want to know about is one bit of information about HTML. Internal hyperlinks are used to create a "table of contents" at the top of this document. These hyperlinks move you from one location in the document to another location in the same document. (Go to the top of this document and then click on the Links to Specific Sections hyperlink in the table of contents. You will wind up back here.)
You can also link to a specific section in another document. That information is presented first because understanding that helps you understand linking within one document.
Enter the HTML coding for a link to a named anchor:
documentA.html:
In addition to the many state parks, Maine is also home to <a href="MaineStats.html#ANP">Acadia National Park</a>.

Think of the characters after the hash (#) mark as a tab within the MaineStats.html file. This tab tells your browser what should be displayed at the top of the window when the link is activated. In other words, the first line in your browser window should be the Acadia National Park heading.
Next, create the named anchor (in this example "ANP") in MaineStats.html:
<H2><A NAME="ANP">Acadia National Park</a></H2>

With both of these elements in place, you can bring a reader directly to the Acadia reference in MaineStats.html.
NOTE: You cannot make links to specific sections within a different document unless either you have write permission to the coded source of that document or that document already contains in-document named anchors. For example, you could include named anchors to this primer in a document you are writing because there are named anchors in this guide (use View Source in your browser to see the coding). But if this document did not have named anchors, you could not make a link to a specific section because you cannot edit the original file on the server.
For example, to link to the ANP anchor from within MaineStats, enter:
...More information about <A HREF="#ANP">Acadia National Park</a> is available elsewhere in this document.

Be sure to include the <A NAME=> tag at the place in your document where you want the link to jump to (<A NAME="ANP">Acadia National Park</a>).
Named anchors are particularly useful when you think readers will print a document in its entirety or when you have a lot of short information you want to place online in one file.
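Putting the pieces together, a table of contents built from named anchors might be sketched as follows. The ANP anchor comes from the example above; the BSP anchor, the second park, and the page heading are hypothetical additions for the illustration.

```html
<H1>Maine Parks</H1>

<!-- Table of contents: each link jumps to a named anchor in this same file -->
<UL>
<LI><A HREF="#ANP">Acadia National Park</A>
<LI><A HREF="#BSP">Baxter State Park</A>
</UL>

<H2><A NAME="ANP">Acadia National Park</A></H2>
<P>Information about Acadia goes here.</P>

<H2><A NAME="BSP">Baxter State Park</A></H2>
<P>Information about Baxter goes here.</P>
```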
<A HREF="mailto:emailinfo@host">Name</a>

For example, enter:
<A HREF="mailto:radsquared@aol.com"> RADČ</a>

to create a link that opens a mail window already addressed to the RADČ alias. (You, of course, will enter another mail address!)
To include an inline image, enter:
<IMG SRC=ImageName>

where ImageName is the URL of the image file.
The syntax for <IMG SRC> URLs is identical to that used in an anchor HREF. If the image file is a GIF file, then the filename part of ImageName must end with .gif. Filenames of X Bitmap images must end with .xbm; JPEG image files must end with .jpg or .jpeg; and Portable Network Graphic files must end with .png.
For example, to include a self portrait image in a file along with the portrait's dimensions, enter:
<IMG SRC=SelfPortrait.gif HEIGHT=100 WIDTH=65>

NOTE: Some browsers use the HEIGHT and WIDTH attributes to stretch or shrink an image to fit into the allotted space when the image does not exactly match the attribute numbers. Not all browser developers think stretching/shrinking is a good idea, so don't plan on your readers having access to this feature. Check your dimensions and use the correct ones.
Aligning Text with an Image
By default the bottom of an image is aligned with the following text, as shown in this paragraph. You can align images to the top or center of a paragraph using the ALIGN= attributes TOP and CENTER.
This text is aligned with the top of the image (<IMG SRC = "BarHotlist.gif" ALT="[HOTLIST]" ALIGN=TOP>). Notice how the browser aligns only one line and then jumps to the bottom of the image for the rest of the text.
And this text is centered on the image (<IMG SRC = "BarHotlist.gif" ALT="[HOTLIST]" ALIGN=CENTER>). Again, only one line of text is centered; the rest is below the image.
Images without Text
To display an image without any associated text (e.g., your organization's logo), make it a separate paragraph. Use the paragraph ALIGN= attribute to center the image or adjust it to the right side of the window as shown below:
<p ALIGN=CENTER> <IMG SRC = "BarHotlist.gif" ALT="[HOTLIST]"> </p>

which results in:
The ALT attribute lets you specify text to be displayed instead of an image. For example:
<IMG SRC="UpArrow.gif" ALT="Up">

where UpArrow.gif is the picture of an upward pointing arrow. With graphics-capable viewers that have image-loading turned on, you see the up arrow graphic. With a text-only browser or if image-loading is turned off, the word Up is shown in your window in place of the image.
You should try to include alternate text for each image you use in your document, which is a courtesy for your readers -- or, for users who might be visually impaired, a necessity.
The TITLE attribute lets you specify text to be displayed as a bubble when the cursor is placed over an image. For example:
<IMG SRC="Auto.gif" TITLE="The RadČ Car!" ALIGN=CENTER>
You should include TITLE text for any image that you want to show a pop-up message.
<A HREF="hotlist.html"><IMG SRC="BarHotlist.gif" ALT="[HOTLIST]"></A>

produces the following result:
(Note that this link doesn't actually go anywhere.) The blue border that surrounds the image indicates that it's a clickable hyperlink. You may not always want this border to be displayed, though. In this case you can use the BORDER attribute of the IMG tag to make the image appear as normal. Adding the BORDER attribute and setting it to zero:
<A HREF="hotlist.html"><IMG SRC="BarHotlist.gif" BORDER=0 ALT="[HOTLIST]"></A>

produces the following result:
The BORDER attribute can also be set to non-zero values, whether or not the image is used as a hyperlink. In this case, the border will appear using the default text color for the web page. For instance, if you wanted to give your image a plain black border to help it stand out on the page, you might try this:
<IMG SRC="BarHotlist.gif" BORDER=6 ALT="[HOTLIST]">

and get the following result:
Background images can be a texture (linen finished paper, for example) or an image of an object (a logo possibly). You create the background image as you do any image.
However you only have to create a small piece of the image. Using a feature called tiling, a browser takes the image and repeats it across and down to fill your browser window. In sum you generate one image, and the browser replicates it enough times to fill your window. This action is automatic when you use the background tag shown below.
The tag to include a background image is included in the <BODY> statement as an attribute:
<BODY BACKGROUND="filename.gif">
Always preview changes like this to make sure your pages are readable. (For example, many people find red text on a black background difficult to read!) In general, try to avoid using high-contrast images or images that use the color of your text anywhere within the graphic.
You change the color of text, links, visited links, and active links (links that are currently being clicked on) using further attributes of the <BODY> tag. For example:
<BODY BGCOLOR="#000000" TEXT="#FFFFFF" LINK="#9690CC">

This creates a window with a black background (BGCOLOR), white text (TEXT), and silvery hyperlinks (LINK).
The six-digit number and letter combinations represent colors by giving their RGB (red, green, blue) value. The six digits are actually three two-digit numbers in sequence, representing the amount of red, green, or blue as a hexadecimal value in the range 00-FF. For example, 000000 is black (no color at all), FF0000 is bright red, 0000FF is bright blue, and FFFFFF is white (fully saturated with all three colors).
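As a worked illustration of reading the RRGGBB values, the sketch below sets every color attribute mentioned above, including VLINK (visited links) and ALINK (active links). The particular color choices and the paragraph text are arbitrary examples, not recommendations.

```html
<!-- BGCOLOR 000080: no red, no green, half-strength blue (dark blue)   -->
<!-- TEXT    FFFF00: full red plus full green, no blue (yellow)         -->
<!-- LINK    FFFFFF: all three colors at full strength (white)          -->
<!-- VLINK   C0C0C0: equal amounts of all three colors (light gray)     -->
<!-- ALINK   FF0000: full red only (bright red)                         -->
<BODY BGCOLOR="#000080" TEXT="#FFFF00" LINK="#FFFFFF" VLINK="#C0C0C0" ALINK="#FF0000">
<P>Yellow text on a dark blue page, with a <A HREF="MaineStats.html">white link</A>.</P>
</BODY>
```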
These number and letter combinations are generally rather cryptic. Fortunately an online resource is available to help you track down the combinations that map to specific colors and there is software available for you to do this on your workstation:
For some basic colors -- typically those in the standard sixteen-color Windows 3.1 palette -- you can also use the name of the color instead of the corresponding RGB value. For example, "black", "red", "blue", and "cyan" are all valid for use in place of RGB values. However, while not all browsers will understand all color names, any browser that can display colors will understand RGB values, so use them whenever possible.

To include a reference to an external image, enter:
<A HREF="MyImage.gif">link anchor</A>

You can also use a smaller image as a link to a larger image. Enter:
<A HREF="LargerImage.gif"><IMG SRC="SmallImage.gif"></A>

The reader sees the SmallImage.gif image and clicks on it to open the LargerImage.gif file.
Use the same syntax for links to external animations and sounds. The only difference is the file extension of the linked file. For example,
<A HREF="AdamsRib.mov">link anchor</A>
specifies a link to a QuickTime movie. Some common file types and their extensions are:
Think of your tabular information in light of the coding explained below. A table has heads where you explain what the columns/rows include, rows for information, cells for each item. In the following table, the first column contains the header information, each row explains an HTML table tag, and each cell contains a paired tag or an explanation of the tag's function.
<TABLE>                                   <!-- start of table definition -->
<CAPTION> caption contents </CAPTION>     <!-- caption definition -->
<TR>                                      <!-- start of header row definition -->
  <TH> first header cell contents </TH>
  <TH> last header cell contents </TH>
</TR>                                     <!-- end of header row definition -->
<TR>                                      <!-- start of first row definition -->
  <TD> first row, first cell contents </TD>
  <TD> first row, last cell contents </TD>
</TR>                                     <!-- end of first row definition -->
<TR>                                      <!-- start of last row definition -->
  <TD> last row, first cell contents </TD>
  <TD> last row, last cell contents </TD>
</TR>                                     <!-- end of last row definition -->
</TABLE>                                  <!-- end of table definition -->

You can cut-and-paste the above code into your own HTML documents, adding new rows or cells as necessary. The above example looks like this when rendered in a browser.
The <TABLE> and </TABLE> tags must surround the entire table definition. The first item inside the table is the CAPTION, which is optional. Then you can have any number of rows defined by the <TR> and </TR> tags. Within a row you can have any number of cells defined by the <TD>...</TD> or <TH>...</TH> tags. Each row of a table is, essentially, formatted independently of the rows above and below it. This lets you easily display tables like the one above with a single cell, such as Table Attributes, spanning columns of the table.
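A cell that spans several columns can be sketched with the COLSPAN attribute, as below. The cell contents are illustrative, and the BORDER attribute is added only so that the cell boundaries are visible.

```html
<TABLE BORDER>
<TR>
  <!-- One header cell stretched across both columns -->
  <TH COLSPAN=2>Table Attributes</TH>
</TR>
<TR>
  <TD>BORDER</TD>
  <TD>draws a border around the table cells</TD>
</TR>
<TR>
  <TD>COLSPAN=n</TD>
  <TD>makes a cell span n columns</TD>
</TR>
</TABLE>
```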
Using table borders with images can create an impressive display as well. Experiment and see what you like.
This processing of incoming data is usually handled by a script or program written in Perl or another language that manipulates text, files, and information. If you cannot write a program or script for your incoming information, you need to find someone who can do this for you.
The forms themselves are not hard to code. They follow the same constructs as other HTML tags. What could be difficult is the program or script that takes the information submitted in a form and processes it. Because of the need for specialized scripts to handle the incoming form information not everything can be detailed in Fill-Out Form Support specification. Check the Additional Online Reference section for more information.
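To give a sense of how little HTML a form needs, here is a minimal sketch of a comment form. The ACTION path /cgi-bin/comments.pl and the field names are hypothetical; the script that receives the submission must be written and installed separately on your server.

```html
<FORM METHOD="POST" ACTION="/cgi-bin/comments.pl">
<P>Your name: <INPUT TYPE="text" NAME="visitor" SIZE=30></P>
<P>Comments:<BR>
<TEXTAREA NAME="comments" ROWS=4 COLS=40></TEXTAREA></P>
<P><INPUT TYPE="submit" VALUE="Send"> <INPUT TYPE="reset" VALUE="Clear"></P>
</FORM>
```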
<B>This is an example of <I>overlapping</B> HTML tags.</I>

The word overlapping is contained within both the <B> and <I> tags. A browser might be confused by this coding and might not display it the way you intend. The only way to know is to check each popular browser (which is time-consuming and impractical).
In general, avoid overlapping tags. Look at your tags and try pairing them up. Tags (with the obvious exceptions of elements whose end tags may be omitted, such as paragraphs) should be paired without an intervening tag in between. Look again at the example above. You cannot pair the bold tags without another tag in the middle (the first italic tag). Try matching your coding up like this to see if you have any problem areas that should be fixed before you release your files to a server.
<H1><A HREF="Destination.html">My heading</A></H1>

Do not embed HTML tags within an anchor:
<A HREF="Destination.html"> <H1>My heading</H1> </A>

Although most browsers currently handle this second example, the official HTML specifications do not support this construct and your file will probably not work with future browsers. Remember that browsers can be forgiving when displaying improperly coded files. But that forgiveness may not last to the next version of the software! When in doubt, code your files according to the HTML specifications (see For More Information below).
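One way to untangle the overlapping example while keeping the same intended formatting is sketched below: the italic element is closed before the bold element ends, and a second italic element is opened for the remaining text.

```html
<!-- Each element now closes before the element that contains it -->
<B>This is an example of <I>overlapping</I></B><I> HTML tags.</I>
```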
Character tags modify the appearance of the text within other elements:
<UL>
<LI><B>A bold list item</B>
<LI><I>An italic list item</I>
</UL>

Avoid embedding other types of HTML element tags. For example, you might be tempted to embed a heading within a list in order to make the font size larger:
<UL>
<LI><H1>A large heading</H1>
<LI><H2>Something slightly smaller</H2>
</UL>

Although some browsers handle this quite nicely, formatting of such coding is unpredictable (because it is undefined). For compatibility with all browsers, avoid these kinds of constructs. (The Netscape <FONT> tag, which lets you specify how large individual characters will be displayed in your window, is not currently part of the official HTML specifications.)
What's the difference between embedding a <B> within a <LI> tag as opposed to embedding a <H1> within a <LI>? Within HTML the semantic meaning of <H1> is that it's the main heading of a document and that it should be followed by the content of the document. Therefore it doesn't make sense to find a <H1> within a list.
Character formatting tags also are generally not additive. For example, you might expect that:
<B><I>some text</I></B>

would produce bold-italic text. On some browsers it does; other browsers interpret only the innermost tag.
You can run your coded files through one of several on-line HTML validation services that will tell you if your code conforms to accepted HTML. If you are not sure your coding conforms to HTML specifications, this can be a useful teaching tool. Fortunately the service lets you select the level of conformance you want for your files (i.e., strict, level 2, level 3). If you want to use some codes that are not officially part of the HTML specifications, this latitude is helpful.
Updating is particularly important when the file contains information such as a weekly schedule or a deadline for a program funding announcement. Remove out-of-date files or note why something that appears dated is still on a server (e.g., the program requirements will remain the same for the next cycle so the file is still available as an interim reference).
You could spend a lot of time making your file "look perfect" using your current browser. If you check that file using another browser, it will likely display (a little or a lot) differently. Hence these words of advice: code your files using correct HTML. Leave the interpreting to the browsers and hope for the best.
Comments such as the name of the person updating a file, the software and version used in creating a file, or the date that a minor edit was made are the norm.
To include a comment, enter:
<!-- your comments here -->

You must include the exclamation mark and the hyphens as shown.
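A comment of the kind described above might look like the sketch below. The name and the edit it records are invented for the illustration; a browser displays only the paragraph, not the comment.

```html
<!-- Last updated by J. Smith using a plain text editor; fixed a typo in the schedule -->
<P>The workshop schedule appears below.</P>
```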