HTML: reading between the lines
HTML is a technology for marking up content. An HTML document is sent over HTTP and interpreted by user-agents. Marking a document with HTML suggests that the information within is hypertext, i.e., a Web document. HTML can refer to other resources using hyperlinks (i.e., the `a` element) and other request types. This is also known as a pull technology, where other resource types (e.g., CSS, JavaScript, images, Flash) are fetched and integrated into the HTML document by user-agents.
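
As a rough illustration of the pull model described above, the sketch below shows a small document that refers to other resources. The file names (`/example.css`, `/example.js`, `/example.png`, `/other-page`) are hypothetical, and the HTML 4.01 Strict doctype is only one possible choice.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Example page</title>
  <!-- Fetched by the user-agent: a stylesheet and a script (hypothetical paths) -->
  <link rel="stylesheet" type="text/css" href="/example.css">
  <script type="text/javascript" src="/example.js"></script>
</head>
<body>
  <!-- Fetched by the user-agent and integrated into the rendered page -->
  <p><img src="/example.png" alt="Example diagram"></p>
  <!-- Not fetched automatically: a hyperlink the reader may choose to follow -->
  <p><a href="/other-page">Another document</a></p>
</body>
</html>
```

The stylesheet, script, and image are typically requested as the page loads, while the target of the `a` element is only requested when the reader follows the link.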
HTML implicitly suggests:
- Validity
- A valid document follows the grammar rules that are outlined in a DTD. Consequently, parsers (e.g., an HTML validator) know how to analyze the document and report syntax errors and warnings. Machine validation is only concerned with syntactical correctness. A predictable syntax also helps user-agents construct the DOM of the document.
- Conformance
- Conformance, in contrast to validity, requires both machine and human checking. HTML documents should conform to a specification because a valid document is not necessarily compliant (e.g., a validator cannot judge whether the `alt` attribute is used appropriately).
- Semantics
- Semantics has to do with meaning. Data is encapsulated in the closest corresponding (or most appropriate) HTML element. Some data can be paired with alternative or replacement data (e.g., the `title` attribute on the `abbr` element), while other attributes supplement the content of the element (e.g., the `title` attribute on the `span` element). It is also appropriate to follow existing, widely adopted, standard outlines (e.g., vCard, calendar data exchange, geo, Atom, tagging). Both visible data and hidden data (metadata) can be captured here.
- Order
- The structure of the document is provided by arranging the information in a logical sequence. Internal references point from one piece of data (e.g., `<a href="#foo">`) to another location in the document. External references provide exit points from its structure.
- Relevancy
- Information in the document that is closely related can be grouped and categorized using containers, or placed near related data. Relationships are also expressed through the other resources that the HTML document refers to and that the user-agent may request, e.g., `<link rel="stylesheet" href="/foo.css">`, `<img src="/foo.png">`, `<a rel="tag" href="/tag/foo">`.
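
The points in the list above can be sketched in a single body fragment. This is only an illustration: the `/foo.png`, `#foo`, and `/tag/foo` references come from the examples in the text, while the visible text, the `alt` and `title` values, and the `entry` class name are made up for the example.

```html
<div class="entry">
  <!-- Order: an internal reference pointing to another location in the document -->
  <p><a href="#foo">Skip to the Foo section</a></p>

  <!-- Semantics: title on abbr gives the expansion of the abbreviation -->
  <p>Validity is checked against a <abbr title="Document Type Definition">DTD</abbr>.</p>

  <!-- Semantics: title on span only supplements the content it wraps -->
  <p>Last checked <span title="Hypothetical supplementary note">recently</span>.</p>

  <!-- Conformance: a validator can see that alt is present,
       but only a person can judge whether the text is appropriate -->
  <p><img src="/foo.png" alt="Hypothetical description of what foo.png shows"></p>

  <!-- Order: the target of the internal reference above -->
  <h2 id="foo">Foo</h2>

  <!-- Relevancy: related data grouped in one container, with a typed link -->
  <p>Filed under <a rel="tag" href="/tag/foo">foo</a>.</p>
</div>
```

A validator would accept this fragment with or without meaningful `alt` and `title` text, which is why the conformance and semantics points still need a human reader.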
- Tags