HTML – An Unauthorised Biography

What is HTML?
Where did HTML come from?
What HTML variants are there?

In this post I will answer all these questions and more.

What is HTML?

HTML is one of the languages (in fact the main one) used to build modern-day web pages, but it’s been around for over 20 years in a number of different forms. A number of other languages are used in conjunction with HTML, such as CSS, JavaScript and server-side technologies such as PHP, Classic ASP and ASP.NET with C#. I will cover those and their relationships in another post though.

HTML’s role is to define the structure of the page presented to you by your web browser while other languages define the styling and movement involved to a lesser or greater extent.

Tim Berners-Lee, the inventor of the world wide web

Before there was HTML

In 1989 a man named Tim Berners-Lee proposed this cool thing called the World Wide Web whilst working at CERN in Switzerland. Originally designed to link together articles on particle physics, its aim was to effectively create a “web” of documents across the world.

He didn’t realise that what he was building would become the biggest change to hit the globe since the industrial revolution.

Tim’s “web browser” prototype was created in 1990 and from there things really began to take off.

There were lots of developments before HTML was really created (in fact the idea of hypertext dates back to the 1940s) but I will leave those for another article, should enough people want to hear it. I could also go into things like DNS (the Domain Name System) but I want to focus this post purely on HTML so we’ll also skip that for now.

The Birth of HTML

Tim needed something very simple to link all these files together and so he came up with the language HTML.

HTML stands for HyperText Markup Language.

Tim’s HTML was influenced greatly by SGML (Standard Generalized Markup Language), a method for marking up text into structural units such as headings, paragraphs and lists.

SGML is where the use of paired tags comes from in HTML but we will get to that in a moment.

What SGML lacked, though, was a way to link documents together; it was designed to be self-contained.

The hypertext link (shortened to hyperlink) as used on the web was Tim’s work, as was the URL for addressing a specific document on a specific machine (http://www.name.domain/page), and those two things are what really began the World Wide Web.

The Basics of HTML

Before we go any further with the life story of HTML I thought I should give you a quick insight into exactly what HTML is and how it works.

HTML is actually a fairly simple language to get to grips with.

Text on the page is surrounded by various pairs of “tags” which take the format of <tag>text</tag>.

For instance, to make something the main header for your page you would do the following:

<h1>Page Heading</h1>

Notice the closing tag has a forward slash at the start. Simple, right?

To make this even more useful, HTML tags can be nested, which allows multiple tags to be applied to a single piece of text.

To comply with HTML standards though, nested tags should be closed in the reverse order to that in which they were opened, like so:

<h1><em>Page Heading</em></h1>

“em” is short for emphasis, and browsers normally display it as italic text, just like italics in the word processing package of your choice.
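To show what the closing order looks like in practice, here is a small sketch of the same heading nested correctly and incorrectly. Browsers will usually try to cope with the second version, but it does not follow the nesting rule.

```html
<!-- Correct: em was opened last, so it is closed first -->
<h1><em>Page Heading</em></h1>

<!-- Incorrect: the tags overlap instead of nesting -->
<h1><em>Page Heading</h1></em>
```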

As with most things in life though there are always exceptions to the rule. In HTML there are some tags which do not naturally have a closing tag because they do not surround anything.

A prime example of this is the tag used for inserting a horizontal rule into your page: <hr>

Browsers deal with this because they know those tags need no closing tag. With the arrival of XHTML, though, a new way of writing these elements was introduced, which puts a space and a closing slash on the end of the tag like so: <hr />

This was necessary because ALL tags MUST be closed in XML (eXtensible Markup Language) and XHTML for them to be interpreted correctly by relevant software such as browsers.
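As a quick illustration of the two styles, here is the same fragment written first the HTML way and then the XHTML way. The line break (<br>) and image (<img>) tags are two more elements that never surround anything; the file name photo.jpg is just a placeholder.

```html
<!-- HTML style: no closing slash needed -->
<p>First line<br>Second line</p>
<hr>
<img src="photo.jpg" alt="A placeholder photo">

<!-- XHTML style: each empty element closes itself -->
<p>First line<br />Second line</p>
<hr />
<img src="photo.jpg" alt="A placeholder photo" />
```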

All HTML documents are plain text files. Yes! Really! You can edit ANY HTML document with a program as simple as Notepad; in fact, for a long time it was the editor of choice among purists.

All HTML documents should take the following format:

<!DOCTYPE html>
<html>
  <head>
    <title>Hello HTML</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>

The doctype declaration in this page is the one for HTML5. If it is omitted, the browser will operate in quirks mode for rendering purposes.

To explain this further, each page starts with a doctype declaration, followed by <html> tags surrounding the whole page.

Each page is then split into a <head> and a <body> section.

The easiest way to look at it is that the head contains elements that are not seen, such as links to stylesheets and JavaScript files. The body contains the elements actually displayed by the browser.

  • head = settings
  • body = display

The only real exception to this is the <title></title> tags, which surround the text displayed in the title bar or tab at the top of most browsers.
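To make the head/body split concrete, here is a minimal sketch of a page whose head pulls in a stylesheet and a JavaScript file. The file names styles.css and script.js are made up for illustration, so swap in whatever your own site uses.

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Settings: apart from the title, nothing in here is shown to the visitor -->
    <title>Hello HTML</title>
    <link rel="stylesheet" href="styles.css">  <!-- hypothetical stylesheet -->
    <script src="script.js"></script>          <!-- hypothetical script -->
  </head>
  <body>
    <!-- Display: everything in here is rendered by the browser -->
    <h1>Page Heading</h1>
    <p>Hello World!</p>
  </body>
</html>
```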


HTML The Early Years

From 1990 to 1993 there was a lot of excitement over HTML and new formats were developed, including HTML+, which sought to standardise features that browsers had already implemented, such as tables and fill-out forms.

The first real web browsers were created (Lynx, Arena and Mosaic, whose developers went on to create Netscape, remember that?), but all of them introduced different ways of interpreting the HTML code.

It wasn’t until 1994 that the IETF (Internet Engineering Task Force) created an HTML Working Group, which in 1995 finalised “HTML 2.0”.

Dan Connolly of the IETF also wrote a Document Type Definition for HTML 2, a kind of mathematically precise description of the language which is still used to this day in various forms.

The doctype declaration, which points to the Document Type Definition (or DTD for short), is the first thing you see in any properly structured web page and looks something like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

This has been greatly simplified in HTML5 but yet again I’m getting ahead of myself.

Further development of HTML was derailed by competing interests. In fact, many people at the time mistakenly thought that Netscape had created the internet, as they were driving forward the development of HTML without talking to anyone else.

So to address this, in 1994 the World Wide Web Consortium (W3C) was formed with Tim Berners-Lee at its head, and including people like Dan Connolly and Dave Raggett, who were both also very instrumental in the creation of the web as it is today. From 1996 until now they have been responsible for the maintenance of web standards including HTML.

Over the next couple of years the “browser wars” began with Microsoft jumping into the fray in 1995 and deciding that they, like Netscape, also knew better on how to correctly render HTML.

Then in 1997 came another new version.


HTML The Teenage Years

In January of 1997 the W3C released the HTML 3.2 Recommendation, which dropped math formulas, sorted out the overlap between various proprietary extensions and included most of the visual markup tags from Netscape.

What does that actually mean? Well, it means that HTML had started to take the shape it would have for the majority of its lifespan and was attempting to be a single set of rules to be used by all browsers. As anyone who was around in web development back then knows, however, the truth was somewhat different.

Both Netscape and Microsoft still had different ideas about what should have been included in the HTML recommendation and by mutual consent actually left two elements out of it: Netscape’s “blink” element and the Microsoft “marquee” element (both of which still send shivers down my spine).

1996 saw Microsoft’s Internet Explorer (known as IE in the industry) released for Windows 3.1 and for the Mac, and also saw a certain developer you may know (yes, me) creating his first website.

As with humans, though, the teenage years were short-lived, and in December of 1997 the HTML 4.0 Recommendation was published by the W3C.

HTML or HyperText Markup Language

HTML Grows Up

HTML 4.0 came in three different variants:

  • Strict, in which deprecated elements are forbidden
  • Transitional, in which deprecated elements are allowed
  • Frameset, in which mostly only frame-related elements are allowed

One of the biggest changes in HTML 4 was support for CSS (Cascading Style Sheets) to define the look and feel of a page (although CSS had been taking shape since 1995). The flip side to this was the deprecation of presentational markup such as the <font> and <center> tags, the visual features originally taken from Netscape. Styling on a tag-by-tag basis is still possible through the style attribute, like so: <h1 style="font-size: 10px">hello</h1>, but the idea was to move those rules out of the markup.

CSS is a way to tell the browser how certain structural elements should be presented, using an external file to hold all the rules for an entire site. Effectively, it tries to separate structure from presentation.
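Here is a rough sketch of that separation. To keep the example in one file the rule sits in a <style> block in the head; on a real site it would normally live in an external file pulled in with a <link> tag. The 10px size is simply carried over from the inline example above.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Structure vs presentation</title>
    <style>
      /* One rule styles every h1 on the page */
      h1 { font-size: 10px; }
    </style>
  </head>
  <body>
    <!-- No style attributes needed: the markup only describes structure -->
    <h1>hello</h1>
    <h1>hello again</h1>
  </body>
</html>
```

Change the one rule and every heading changes with it, which is the whole point of keeping presentation out of the tags.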

CSS is also a subject for another post really, but suffice it to say, modern web design frowns heavily upon inline styling and it can even affect your SEO (Search Engine Optimisation).

Is it me or are there just too many acronyms in development as a whole?? 🙂

Another big change was the introduction of the <object> tag, which allowed for the embedding of other technologies such as applets and eventually Flash, which we mention a little further on.

It also saw features introduced for the disabled, international language support, extensions to forms, scripting and more.

It was a much more complicated beast than its predecessor.

HTML 4.0 had a number of minor edits but it was the flavour of choice up until December 1999.

HTML Gets Complicated But Flexible

In December of 1999 HTML 4.01 was published (with a later update in May 2001).

Over the course of the next few years various things came about.

DHTML (Dynamic HTML) really took off with CSS and JavaScript helping to make previously very static pages come alive.

Server-side technologies grew in popularity, creating not only dynamic pages, but dynamically driven content as well.

Also in 2000 XHTML was born, effectively treating HTML as XML (yes, another post).

This meant developers had to be much stricter with their code in order for it to parse correctly, but that also led to tighter standards in code and therefore a much better chance of pages being rendered in the same way by different browsers.

At the time of writing, HTML 4.01 is still the current published standard for HTML and it has been stretched to its limits.

This leads us nicely into the much more recent past and helps us look to the future.

HTML Spreads Its Wings

HTML5 is being used now but won't be a standard until 2014

In January 2008 a Working Draft of HTML5 was published by the W3C, with a Last Call published in May of 2011. A Last Call is an invitation to people inside and outside the W3C to look at the specification and confirm that it is sound.

W3C aim to publish the recommendation for HTML5 in 2014, but browser support for a lot of HTML5 is growing and many developers are already implementing HTML5 features.

So what’s going to be new in HTML5?

Well the short answer is tonnes of stuff!!

It’s the first real update to the specification in 10 years so a lot is being done to make sure HTML5 will have a similar sort of shelf life.

Here’s a glimpse at what new elements are being included:

The <video>, <audio> and <canvas> elements, and the integration of Scalable Vector Graphics (SVG).

All of these are designed to make multimedia and graphical content easier to handle without having to resort to external plugins.
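As a rough sketch of how little markup these elements need, here is one of each. The file names movie.mp4 and song.mp3 are placeholders, and the text inside the video and audio tags is only shown by browsers that do not understand them.

```html
<!-- File names below are placeholders -->
<video src="movie.mp4" controls width="320">
  Your browser does not support the video element.
</video>

<audio src="song.mp3" controls>
  Your browser does not support the audio element.
</audio>

<!-- A blank drawing surface; JavaScript is needed to draw on it -->
<canvas id="drawing" width="320" height="160"></canvas>
```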

It was these new elements that inspired the late Steve Jobs to write a public letter titled “Thoughts on Flash”, where he pretty much nailed the lid on the coffin of Flash, stating that “new open standards created in the mobile era, such as HTML5, will win”.

Combine that with the lack of Flash support on the iPhone and even Adobe (the people behind Flash) had to call it quits on Flash for mobile devices and are now focusing instead on HTML5 tools.

Flash is a multimedia system, originally from Macromedia and now owned by Adobe, used to add animation, video and interactivity to web pages.

I should point out though that HTML5 on its own will not create animations; CSS3 or JavaScript (or both) are required.

Other new elements such as <article>, <section> and <nav> are designed to replace many uses of the generic <div> and <span> tags, making the code of your page easier for machines and search engines to read.
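A before-and-after sketch may help here. The class names in the first half (menu, post, part) are invented for the example; the second half says the same thing using the new elements.

```html
<!-- Before: generic containers tell a machine nothing about the content -->
<div class="menu">
  <a href="/">Home</a>
</div>
<div class="post">
  <div class="part">
    <p>Some text about HTML.</p>
  </div>
</div>

<!-- After: the element names describe what each part of the page is -->
<nav>
  <a href="/">Home</a>
</nav>
<article>
  <section>
    <p>Some text about HTML.</p>
  </section>
</article>
```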

There are new attributes for some elements, and legacy elements such as <center> and <font> have been removed in favour of CSS.

They are even bringing back the math lost all those years ago with MathML for mathematical formulae.

So what does this all mean for you?

Well, it means that rather than needing additional applications on your iPad or phone, or even your Mac or PC, HTML5 will bring rich websites and apps that are consistent and pleasing to visit. Interfaces should become more intuitive and user experience on the whole should get better.

Summary

HTML is the language used to build websites.

It’s over 20 years old and is constantly being updated.

It uses (generally paired) hypertext <tags> to mark up the pages of a site to give them structure and form.

Tags can be nested and should be closed in the opposite order to that in which they are opened.

Now and going forward HTML is and will be everywhere: on your laptop, tablet and even your phone.

When combined with CSS and JavaScript it can now produce animation and intuitive UIs (User Interfaces).

 

I hope you have found this article useful or at least educational. It was written on request for members of a business group I am in on Facebook and hopefully it has answered their questions.


3 Comments Add yours

  1. Great post Simon! I’m not knowledgeable about coding at all so this helps me make quite a lot more sense of all that gobbledygook in !
    Thank you!

  2. Bee Collins says:

    Engaging and informative article that has certainly spiked my interest in all things computer language (just don’t tell my husband!).

  3. Glad you both liked it and hope we can keep your interest spiked Bee 🙂
