HomeSite Help

Validation & Validation

 

On this page:
Types of validation
Classification matrix
Off-line products: short reviews Updated
On-line products: short reviews Updated
Conclusions
 
 

Types of validation

There's validation and then there is validation. In general, the term validation refers to the process of checking that your source for a web page is OK. But what is "OK"? That can mean many things, and the different things it can mean are one way of classifying validation.

Validating programs can be classified in different ways. The first is by what the program is validating.

DTD
The first thing validation can mean in this context, and the meaning that the "purists" attach to it, is conformance to a particular DTD (Document Type Definition), and more specifically to one of the DTDs for HTML that are standards of the World Wide Web Consortium (W3C).
Syntax
Syntax validation generally does not take any particular DTD into account but checks whether the syntax of tags and their attributes is correct, generally only for those tags that the program actually knows about. Some programs are configurable and can be taught new rules.
Spelling
Spelling checkers for web pages are able to skip markup-code and just spell-check the actual text of a page.
Links
A links checker will check all your local links (links within pages, in-line images, and links to other pages of your site) and usually also all external links.
Compatibility
Compatibility validation means checking whether the code of your web pages is compatible with a particular type of browser. That can mean any type of browser, not just the two main graphical browser brands, but also things like WebTV or speech browsers.
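
To make the difference between these types a bit more concrete, here is a minimal sketch in Python (my choice of language for illustration; none of the programs reviewed below is claimed to work this way). It splits a page into the two things that a spelling checker and a links checker each care about: the text between the tags, and the targets of HREF and SRC attributes. The names PageScanner and is_external are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageScanner(HTMLParser):
    """Collects text (for spell checking) and link targets (for link checking)."""

    def __init__(self):
        super().__init__()
        self.text = []    # text found between tags; the markup itself is skipped
        self.links = []   # values of HREF and SRC attributes

    def handle_starttag(self, tag, attrs):
        # The parser hands over tag and attribute names in lower case.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def is_external(link):
    """A link that names a host points outside the local site."""
    return bool(urlparse(link).netloc)


sample = (
    '<P>Read the <A HREF="faq.html">FAQ</A> or visit '
    '<A HREF="http://www.w3.org/">W3C</A>.'
    '<IMG SRC="logo.gif" ALT="logo"></P>'
)
scanner = PageScanner()
scanner.feed(sample)
scanner.close()

for link in scanner.links:
    print("external" if is_external(link) else "local", link)
```

A real spelling checker would still have to deal with things like script and style blocks, and a real links checker would go on to test each target; the sketch only shows where the two kinds of input come from.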

A second way of classifying validation programs and functions is by how they can be used.

Off-line
These are programs that you can run on your own computer without needing a live web connection.
On-line
These are validators run as a service on a web site. Some will let you wait for the result while others will e-mail it to you. Most require that the page or site you want to validate is already on-line somewhere; but some can also accept a page as a file upload or a chunk of code that you can paste into an on-line form.
 
 

Classification

The matrix below classifies some validation programs and services according to the two classification types outlined above. The links are to a short review below.

  Off-line On-line
DTD
Syntax
Spelling
Links
Compatibility  



 

Product reviews: Off-line products

Spyglass HTML Validator

This is a nice program, and the only off-line one I've been able to find that actually checks against a DTD. You choose which DTD you want to validate against from a drop-down box; it ignores the DOCTYPE declaration in your code. The interface is clean and easy to use, and the results window has a simple editor for quick repairs. However, it can only check against four pre-compiled DTDs and is not able to read new ones. I wish they'd publish a more up-to-date one: the latest official DTD they have is for HTML 3.2, based on what was the draft at that time. Still, when I tested it, it found a real error in this page that CSE HTML Validator simply cannot see because of the way its configuration file is constructed.

Be aware that the error messages are not always very clear (but this is the case for most validators): I needed some knowledge of HTML to figure out what the error message it came up with actually meant. And if you're using anything that is either a browser extension or actually HTML 4.0, it will of course flag that as an error. As with all validators: interpret with care and common sense.

One note: this program is by now so old that it has apparently never heard of Windows NT 4.0. It refuses to run under NT and tells you it can only run under Windows 95. I can see no technical reason why it couldn't also run on NT; it seems the OS test is simply too restrictive. If you do have Windows 95 and are still using HTML 3.2 (actually, I don't recommend doing either!), you should consider this program: if used with care, you may find errors that CSE HTML Validator can't see. Best of all: it's freeware.
(Note: Spyglass was taken over by OpenTV. The validator was still available for download from the OpenTV site for a while, although they kept moving it around; by now it has disappeared, and it is no longer supported anyway.)

CSE HTML Validator

If you are a HomeSite or ColdFusion Studio user you may already know this program: it comes bundled with HomeSite 3.01 and ColdFusion Studio 3.11 (but not with version 4.0 of these programs!). It uses a configuration file that contains the rules it checks against, in combination with some internal rules that you can't get at. It doesn't check against a DTD (and ignores the DOCTYPE declaration if you have one) but it does come close. Still, as mentioned under Spyglass (above), it does not find all real errors. I also found a few real errors in its configuration file. For instance, it will insist on closing tags for <TR> and <TD> though these closing tags really are optional. But the nice thing about this program is that such errors are easily repaired: the configuration file can be edited by the user. Tags and their attributes are organized into categories, and this makes configuration on a high level possible. The interface for editing the configuration file is at first a bit cryptic but everything you need is there. And because it's so highly configurable, I was able to create a configuration file for VTML validation (VTML is the language that the visual editors in HomeSite and ColdFusion Studio are written in).
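
The idea of checking against an editable rule table rather than a DTD can be sketched in a few lines of Python. This is only an illustration: the RULES table and the function check_end_tags are invented for this example and have nothing to do with the actual format of the CSE configuration file. It does show why a wrong rule, such as demanding a closing tag for <TD>, is easy to repair when the rules are plain data.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical rule table: is the end tag of each known tag required or optional?
RULES = {"table": "required", "tr": "optional", "td": "optional", "b": "required"}


class TagCounter(HTMLParser):
    """Counts how often each tag is opened and closed."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def check_end_tags(source, rules):
    """Report unknown tags and tags whose required end tag is missing."""
    counter = TagCounter()
    counter.feed(source)
    counter.close()
    problems = []
    for tag, count in sorted(counter.opened.items()):
        if tag not in rules:
            problems.append(f"<{tag.upper()}>: unknown tag")
        elif rules[tag] == "required" and counter.closed[tag] < count:
            problems.append(f"<{tag.upper()}>: missing end tag")
    return problems


page = "<TABLE><TR><TD>one<TD><B>two</TABLE>"
print(check_end_tags(page, RULES))                        # only <B> is reported
print(check_end_tags(page, dict(RULES, td="required")))   # now <TD> is reported too
```

Counting tags is of course much cruder than what a real syntax checker does (it knows nothing about nesting or attributes), but changing one entry in the table is all it takes to make the checker stricter or more lenient.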

Because the program is updated regularly and now the configuration contains all the HTML 4.0 stuff, people often think that if a page validates with CSE 3310, it will be correct HTML 4.0. Alas, as it comes out of the electronic box, every category is active; to validate against HTML 4.0 you'll have to deactivate a number of them. More on this in Tougher validation. What many people who use this program exclusively from within their HTML editor often overlook is that it contains a number of other powerful functions as well: there's a Template function, and it can strip all HTML tags from your files. Start up the program by itself, and explore the help files!

Although version 4.0 of HomeSite and ColdFusion Studio now has an internal validator, it's less capable than CSE (it misses quite a few real errors that CSE flags) and less configurable to boot. Luckily, these programs can also make use of CSE as an external validator (even the one that was bundled with the previous versions of these programs) so you can still get the best of both worlds.

Weblint

NOTE: it seems the original Weblint site no longer lives; there is just a dummy "portal" right now. I'll update this when I find a new address (if any) or remove this section if I find it has truly disappeared.

HomeSite and ColdFusion Studio

Since this site is meant for HomeSite and ColdFusion Studio users I'm not going to wax lyrical here about these programs' many qualities as an HTML editor. This section of HomeSite Help is about validation, so I'll highlight three of their validation functions: spell checking, link verification and HTML validation.

The internal spell checker is very capable and comes not only with a (US) English dictionary but also with alternative ones for other languages (British English, Dutch, French, German, Italian and Spanish) as well as two specialized dictionaries for Legal and Medical terms. Unless you have a document that contains multiple languages, I'd advise you to put a check mark only for the dictionary that you need to check against on the Spelling tab of HomeSite's settings, or you'll be confused by too many alternatives when the checker finds something it doesn't recognize.

There is also an HTML dictionary, but this doesn't seem to have support for HTML 4.0 tags, and if I use "<HTRML>" it will flag this as an error but not come up with the correct tag as an alternative. It's better to set up HomeSite to skip tags when doing spell checking, and leave the checking of tags to CSE 3310. Leave the HTML dictionary active though: if you don't, the spell checker will stumble over all entities like "&quot;" since these are not tags that can be skipped.

You can add your own terms to a user dictionary (USERDCT.TXT) from within the spell checker; this is a simple text file with one word to a line, like many spell check programs use. If you use a spell checker with another program, check out the format of its user dictionary: you may be able to share it with HomeSite.
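
For those who like to see such things spelled out, here is a rough Python sketch of the two points above: a user dictionary that is just one word per line, and the difference between tags (which can be skipped) and entities (which have to be translated first). It is not how HomeSite's spell checker works internally; the function names are made up, and the tag-skipping pattern is deliberately simplistic.

```python
import html
import re


def load_user_dictionary(lines):
    """Read a user dictionary with one word per line; blank lines are ignored."""
    return {line.strip().lower() for line in lines if line.strip()}


def unknown_words(source, known_words):
    """Return the words in the page text that are not in known_words."""
    text = re.sub(r"<[^>]*>", " ", source)   # skip the tags (naive pattern)
    text = html.unescape(text)               # turn &quot; and friends into characters
    words = re.findall(r"[A-Za-z']+", text)
    return [word for word in words if word.lower() not in known_words]


# Stand-in for the contents of a user dictionary file plus a tiny main dictionary.
known = load_user_dictionary(["HomeSite\n", "validator\n", "\n"]) | {"the", "is", "a"}
page = '<P CLASS="x">The &quot;HomeSite&quot; validater is a tool</P>'
print(unknown_words(page, known))
```

With a real file you would pass the open file object to load_user_dictionary instead of a list. If the html.unescape step is left out, "quot" turns up as an unknown word twice, which is the same kind of stumbling over entities described above.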

Something that's actually worse in version 4.0 than in 3.x: you can no longer spell-check a marked section of your text.

Version 4.0 can now also make use of an MS Word spell checker if one is installed on your system; in principle this is useful for those languages that are not available in HomeSite's own suite of dictionaries. However, the current version (4.0) cannot correctly handle any language that makes use of high-ASCII characters: if you don't write English, the usefulness of this option is therefore very limited.

The links checker has grown up since its first incarnation in version 2.5. You can now check either a single page or a whole project, and it's become quite fast, too. Unless you maintain a site with a huge number of links, this is probably all you ever need for a links checker, with the advantage that you never need to leave HomeSite's environment. If you do have many external links (more than a few hundred or so), LinkBot is still the best tool, and it integrates nicely with HomeSite (or whatever your favorite HTML editor is).

Although version 4.0 of HomeSite and ColdFusion Studio now has an internal HTML validator, it's less capable than CSE (it misses quite a few real errors that CSE flags) and less configurable to boot. Luckily, these programs can also make use of CSE as an external validator (even the one that was bundled with the previous versions of these programs) so you can still get the best of both worlds.

HTML PowerAnalyzer

HTML PowerAnalyzer is a module in Talicom's suite HTML PowerTools (available in 32 and 16 bit versions). It does two types of validation, syntax checking and link checking.

For syntax checking, like CSE HTML Validator, it does not actually check against a DTD but uses a configuration file; Talicom calls these Rule Bases and they are freely downloadable from their web site. An HTML Rulebase Editor is included in the suite. They now also have a rule base available for HTML 4.0 as the latest official standard, as well as for Netscape and Microsoft extensions. But unlike CSE HTML Validator, there is no support provided for other markup languages and extensions like ColdFusion tags. Conclusion: recommended only if you work on a Win 3.x platform. Otherwise CSE HTML Validator is a lot more powerful.

As a links checker it will confirm that all of your web site's internal links are valid. It has only limited support for validating external links.

HTML PowerSpell

HTML PowerSpell is a module in Talicom's suite HTML PowerTools (available in 32 and 16 bit versions). Like HomeSite, it can spell-check a whole site and has support for multiple languages and a user dictionary. It also has some nice options to skip words in CAPS and words with numbers, and to include the text in IMG ALT attributes.

If you can't use HomeSite because you still work on a Win 3.x platform, this may be the spell checker you need since it understands HTML. If you do use HomeSite, don't bother: HomeSite comes with more dictionaries than PowerSpell (which doesn't have Dutch, Spanish or the dictionaries with legal and medical terms) and PowerSpell can only work with two dictionaries simultaneously.

LinkBot

If you maintain a site with many links, especially many external links, LinkBot from Tetranet Software is the tool to use. It quickly produces a complete site map and notes what's wrong with links there, then goes on-line and checks external links, with up to ten sockets going at the same time. It's simply blazingly fast! It checks not only HTTP links but also some other protocols like FTP and MAILTO (but not NEWS). It produces extensive reports that can immediately be viewed in your favorite browser; if you have Microsoft Internet Explorer, that can also be installed as an internal browser. It does a lot more than just link validation and will also produce reports pointing out what's new and what's old (you set the criteria), and orphaned files. Highly recommended!
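
Why checking with several connections at once is so much faster can be shown with a small Python sketch. I have no idea how LinkBot does it internally; this is just one way to run up to ten checks side by side. The functions http_status and check_links are my own invention, and the use of HEAD requests is an assumption (some servers don't answer them properly).

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or a short failure description."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code                      # e.g. 404 for a dead link
    except (urllib.error.URLError, OSError) as err:
        return f"failed: {err}"


def check_links(urls, check=http_status, workers=10):
    """Check a list of URLs with up to `workers` checks running at the same time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the results in the same order as the input list.
        return dict(zip(urls, pool.map(check, urls)))
```

Calling check_links(list_of_urls) goes out on the net; for a dry run you can pass your own function as check, for instance one that looks the status up in a table. Most of the time spent on a link check is waiting for slow servers, which is why doing ten at a time pays off.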

 

Product reviews: On-line products

W3C HTML Validation Service

This should be regarded as the "official" on-line validating service since it's run by the World Wide Web Consortium itself. It needs a URL as input and can validate against a large number of DTDs, which are listed on-line; it looks at the DOCTYPE declaration in your document to decide which DTD to use, so there really must be a valid DOCTYPE declaration at the top of your source. Optionally it also includes Weblint (no support for HTML 4.0 yet!) which is really helpful in interpreting the results: the messages from the W3C Validator are not always very clear. By matching the two, you'll get a much better idea of what's wrong.
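
What "looking at the DOCTYPE" amounts to can be sketched as follows in Python. This is my own illustration, not the W3C's code: it simply pulls the public identifier out of the DOCTYPE declaration, which is the string a validator can use to pick a DTD. The names DoctypeFinder and find_public_id are invented for the example.

```python
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Remembers the public identifier of the DOCTYPE declaration, if any."""

    def __init__(self):
        super().__init__()
        self.public_id = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. 'DOCTYPE HTML PUBLIC "..."'
        parts = decl.split('"')
        keywords = parts[0].upper().split()
        if keywords[:1] == ["DOCTYPE"] and "PUBLIC" in keywords and len(parts) >= 2:
            self.public_id = parts[1]


def find_public_id(source):
    """Return the public identifier from the page's DOCTYPE, or None."""
    finder = DoctypeFinder()
    finder.feed(source)
    finder.close()
    return finder.public_id


page = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">\n'
    "<HTML><HEAD><TITLE>Test</TITLE></HEAD><BODY><P>Hello</P></BODY></HTML>"
)
print(find_public_id(page))
```

A page without a DOCTYPE gives None here, and that is exactly the situation in which a service that depends on the declaration has nothing to go on.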

Weblint Gateways

See the NOTE under Weblint above!

Doctor HTML

While the services above basically are HTML parsers that check against a given DTD, Doctor HTML is more a general web site analysis program. It can do a lot of different tests which have clear on-line descriptions; an easy interface makes choosing the tests a snap. If you have JavaScript enabled, you can also get a display of your page in a separate browser window for reference. It includes a spell checker (English only), and an option to check all your images to see if they contain HEIGHT, WIDTH and ALT attributes. It also serves as a link checker though its output is a bit strange: it does not list the full URLs of the links. Output is clearly grouped by type of test, and line numbers are listed with everything, neatly grouped if a particular element occurs on multiple lines.

As you can see in the classification table, Doctor HTML v5 covers exactly the same types of validation that the HomeSite/CSE HTML Validator combination does, so if you have HomeSite, you don't really need this. If you don't, this is a nice general-purpose validation service with a friendly interface.
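
The image test is simple enough to try at home. Below is a minimal Python sketch of such a check; it is my own approximation and says nothing about how Doctor HTML is actually written. The class name ImageAudit and the helper audit_images are made up.

```python
from html.parser import HTMLParser

WANTED = ("height", "width", "alt")


class ImageAudit(HTMLParser):
    """Lists IMG tags that lack a HEIGHT, WIDTH or ALT attribute."""

    def __init__(self):
        super().__init__()
        self.findings = []   # (line number, SRC value, names of missing attributes)

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        present = dict(attrs)
        missing = [name for name in WANTED if name not in present]
        if missing:
            line, _column = self.getpos()    # position of the tag in the source
            self.findings.append((line, present.get("src"), missing))


def audit_images(source):
    """Return the findings for all IMG tags in the page source."""
    audit = ImageAudit()
    audit.feed(source)
    audit.close()
    return audit.findings


page = (
    "<P>\n"
    '<IMG SRC="a.gif" ALT="A" WIDTH=10 HEIGHT=10>\n'
    '<IMG SRC="b.gif">\n'
    "</P>"
)
for line, src, missing in audit_images(page):
    print(f"line {line}: {src} lacks {', '.join(missing).upper()}")
```

Reporting the line number with each finding, as the service does, makes it a lot easier to go back to the source and repair things.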

There is also a subscription service called RxHTMLpro available, which can check a whole site rather than a single page; you can even schedule checks.


End User Computing Limited - Hyperlink Verification

This is purely a link checker; it's a free service, and the result will be e-mailed to you. But like all the other on-line services it works one page at a time: it can't check a whole site. The advantage over the link checking function of Doctor HTML is that you don't have to sit and wait on-line for your results: just give your e-mail address and a URL, and log off or go somewhere else.

Bobby

Bobby is completely different from all of the other programs and services listed here. But it's an important difference. To quote: "Bobby is a web-based service that will help you make web pages accessible to people with disabilities." It actually does more than that and can also test for browser compatibility. Both are important in checking that your pages are not only correct but actually usable by a wide audience. Of course it can not look at all aspects of usability but if you fail Bobby's tests, you do probably have some work to do. All tests are clearly described and reasons given for why an aspect may be important.

Output can be either text-only (not very clear) or in the form of your own page displayed graphically with markers indicating the problems found, and a summary report at the end. The report lists the problems found in order of importance so you can start at the top and work your way downwards for repairing things. Highly recommended for a different look at your pages and building some awareness of accessibility problems.

Lynx Viewer

This service simply shows you your page as it would look in a Lynx (text-only) browser. It's not the latest version of Lynx, but still a recent one that knows how to send the host name along to the server, so you don't get stuck on a different page when testing pages hosted on a virtual domain (earlier Lynx versions didn't support this, and if you tried those on HomeSite Help you'd end up at Java Woman). Alas, the somewhat more flexible Lynx-View I mentioned here previously has disappeared. But since Lynx is now available for many platforms, why not get your own version? The output will also give you some idea about the usability with speech browsers and especially the effect of your IMG ALT attributes (or the lack of them).

Web Page Purifier

A different approach to validating against a particular DTD: Here you can give a URL and choose one of the six DTDs provided, and the program will show you your page "purified" i.e., everything that doesn't belong in your code according to that DTD is simply filtered out and the result sent to your browser for display. You can also check "text-only": in that case all images are filtered out as well. No reports, just what the page would look like if everything that doesn't belong is filtered out.
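
The filtering idea itself is easy to imitate. The Python sketch below is only my guess at the principle, not the Purifier's actual code: tags that are not on an allowed list are dropped while the text between them is passed through. The ALLOWED set is a made-up, heavily shortened stand-in for the element list of a real DTD, and attributes are not checked at all.

```python
from html.parser import HTMLParser

# Hypothetical, heavily shortened stand-in for the elements a DTD allows.
ALLOWED = {"html", "head", "title", "body", "p", "a", "b", "img"}


class Purifier(HTMLParser):
    """Copies a page, leaving out every tag that is not in the allowed set."""

    def __init__(self, allowed, text_only=False):
        super().__init__(convert_charrefs=False)   # pass entities through unchanged
        self.allowed = set(allowed) - ({"img"} if text_only else set())
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.allowed:
            self.out.append(self.get_starttag_text())   # the tag as written

    def handle_startendtag(self, tag, attrs):
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.out.append(f"</{tag.upper()}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def purify(source, allowed=ALLOWED, text_only=False):
    """Return the page source with all disallowed tags filtered out."""
    purifier = Purifier(allowed, text_only)
    purifier.feed(source)
    purifier.close()
    return "".join(purifier.out)


print(purify("<P>Hello <BLINK>new</BLINK> <B>world</B> &amp; more</P>"))
```

As with the service itself, you get no report of what was removed: the <BLINK> tags in the example simply vanish and the text inside them stays.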

This does not sound very useful since you can't see what was actually filtered, but a few of the choices actually are worthwhile in some cases. One of the choices is HTML 3.2; using this, you'd see what a browser with a pretty clean implementation of this standard would show. One such browser is Opera, which of course you can (should) buy if you're working on Windows. If you happen to work on a Mac, this on-line service will give you a pretty good idea what Opera users will be seeing. Another option is WebTV, and this is about the only way to see what your page would look like on that, short of buying the box (which you can't do if you don't happen to live in the USA). There's a special setup page to resize your browser to the WebTV window size before using the filter.

Web Page Backward Compatibility Viewer

Another approach to browser compatibility. Here you can enter your URL and check or uncheck a number of features that the supposed browser can or cannot handle. Optionally you can enter a browser ID string such as the browser sends to a server (an extensive but not very well laid out list is provided), but this does not have any effect on the filtering. It will then show you your own page as it would look with the unchecked features filtered out. By the same programmer as the Web Page Purifier.

 
 

Conclusions

HomeSite together with CSE HTML Validator

As you can see from the classification matrix, HomeSite together with CSE HTML Validator covers three of the five types of validation. It also does that quite capably. That's quite a lot to get with your HTML editor! But it's also obvious by now that this is not all there is. If you create pages for a controlled environment like an Intranet, this combo may be all you need; but if your pages are to be on the open Internet, read on. Also read on if you use any kind of server-side scripting that generates HTML!

Validation against a standard DTD

What's really missing in the HomeSite/CSE combo is actual validation against a standard DTD. While CSE HTML Validator can come close, especially when configured to be tougher than it comes out of the electronic box (see Tougher validation), it can't do it all. There are some types of HTML errors that CSE simply cannot see (unless you do a lot of editing of its configuration file). I'll add a page later explaining why this is so; for now, just accept it as a fact. While the most usual browsers probably won't have a problem with such errors, you can never be sure what a browser will make of them: browsers are written to do their best when they encounter an error and try to guess what the intention was. But they are guessing and they all do that differently. You'll have much more certainty that your page will display and work as intended if it doesn't keep the browsers guessing. So if you want that assurance, you still need to do a real DTD validation.

If you need to minimize on-line time, either the Spyglass program (if you have Windows 95) or Weblint (if you can run Perl) would be a useful addition to your toolbox. While both only recognize HTML 3.2 and Weblint does not really work like a DTD validator, they can help weed out some errors before going on-line for a final check.

For real DTD validation against all standards, there is now only one good choice left: the W3C HTML Validation Service, which has the same friendly output as KGV but which will only check pages that are already on-line. Both include the option to use Weblint (which does not yet support HTML 4.0). There's an extra advantage to using the W3C service if you use a server-side scripting language like PHP or ColdFusion: the program sees what a browser sees, so you can use it to validate the generated HTML!

If you need to check more than a single page, keep a list of URLs handy to cut and paste, and save or print the report pages. If you're checking pages with the W3C service you can put them in a "hidden" directory if you don't want them to be publicly viewable before you're done checking.

Update: The source code for the W3C HTML validation service is now available (under the terms of the W3C Software Copyright)!

Compatibility checking

Depending on the intended audience for your site, some type of compatibility checking can also be quite useful. Apart from gathering as many browsers as you can find, there's no realistic off-line option.

A Win32 version of the latest version of Lynx can be downloaded from the Lynx for Win32 download page or from a Japanese site; this Win32 port uses the standard Winsock and is the best way to check how your pages would look when browsed with a text-only browser. The ZIP archive includes a DOS version for 386 and higher processors. It can be installed easily as one of the external browsers for HomeSite. A text-only view of your pages can also give you a fair idea of what speech browsers would make of them. Check the Lynx homepage for versions of Lynx for other platforms.

I'd also advise you to run at least some of your pages through Bobby occasionally; it will build your awareness of accessibility issues. There is also a beta version available for checking your pages off-line with Bobby!

The Web Page Purifier is useful to get an idea of what your page would look like with WebTV.