Tasty Technology! By Tim Scarfe.

innerHTML VS DOM.

  • Written: 17/04/2002
  • Document Updated: 16th August 2002 (Owen van Dijk Comment)
  • Document Updated: 14th November 2002 (Tweaks and new comment added).

If you would like to comment or add to this debate, please email Tim Scarfe or Alex Russell.

Preface

Some time ago, Mr Alex Russell and Mr Tim Scarfe got talking about the use of innerHTML on the web. Our thoughts were not parallel by any means, but they crossed at certain points.

At that time, the aforementioned gentlemen decided to write an online debate covering the pros and cons of using innerHTML as opposed to the DOM methods.

What do innerHTML and its counterparts do?

innerHTML

innerHTML is a read/write property (it is read-only on certain elements) that allows you to retrieve or set the HTML content of a node.

Example 1.a
 
<p>hello my name is <em>Tim</em>!</p>
 
<input 
 type="button" 
 value="get Value" 
 onclick="alert(this.previousSibling.innerHTML)" />
 

Example 1.a would (in IE) yield an alert box with "hello my name is <EM>Tim</EM>!" (Note the messy capital HTML tags.)

Conceptually, reading innerHTML walks the children of the node and serializes them back into a single string: the data of every text node is concatenated, and the HTML tags of each element are inserted as delimiters around the text they contain.
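To make that description concrete, here is a minimal sketch of what reading innerHTML amounts to. The function name serializeChildren is our own invention, not a browser API, and the sketch deliberately ignores attributes, entity escaping and empty elements such as br. It only relies on the standard nodeType, nodeName, nodeValue and childNodes properties, so it can be tried on real DOM nodes or on plain objects shaped like them.

```javascript
// Node type constants as defined by the W3C DOM.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical helper (not a browser API): serialize the children of
// `node` back to markup, roughly what reading innerHTML does.
// Assumption: attributes, escaping and empty elements are ignored.
function serializeChildren(node) {
  var out = "";
  for (var i = 0; i < node.childNodes.length; i++) {
    var child = node.childNodes[i];
    if (child.nodeType === TEXT_NODE) {
      // Text nodes contribute their data as-is.
      out += child.nodeValue;
    } else if (child.nodeType === ELEMENT_NODE) {
      // Elements contribute their tags around their own serialized children.
      var tag = child.nodeName.toLowerCase();
      out += "<" + tag + ">" + serializeChildren(child) + "</" + tag + ">";
    }
  }
  return out;
}
```

Called on the paragraph from Example 1.a, this returns "hello my name is <em>Tim</em>!" (lowercase, because the sketch lowercases nodeName).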

outerHTML

outerHTML is identical to innerHTML except that it also returns the HTML for the node you are calling the property on (note: this includes any attributes that exist on the node).

Example 1.b
 
<p id="p_1">hello my name is <em>Tim</em>!</p>
 
<input 
 type="button" 
 value="get Value" 
 onclick="alert(this.previousSibling.outerHTML)" />
 

Example 1.b would (in IE) yield an alert box with "<P id="p_1">hello my name is <EM>Tim</EM>!</P>" (Note the messy capital HTML tags again.)

Owen van Dijk Comments (August 2002):

Hello Tim,

I just read your article about innerhtml and i have a little note. In the first paragraph you talk about the messy uppercase html tags when alerting the innerhtml of a node.

However, when you go to this spec, the DOM1 level recommendations, you would read the following:

2.5.3. Exposing Element Type Names (tagName) The element type names exposed through a property are in uppercase. For example, the body element type name is exposed through the "tagName" property as "BODY".

Also in the DOM Level 3 working draft is the following:

Note that this is case-preserving in XML, as are all of the operations of the DOM. The HTML DOM returns the tagName of an HTML element in the canonical uppercase form, regardless of the case in the source HTML document.

So it should be returning the nodes in uppercase.

Regards,

Owen van Dijk

innerText and outerText

These properties return the textual contents of a node. When read, they return exactly the same as innerHTML, but with the HTML tags stripped out.

The difference between innerText and outerText shows when you set them: setting innerText replaces the contents of the node, whereas setting outerText replaces the node itself.

Support of innerHTML in browsers.

The innerHTML property is supported in these browsers:

  • Microsoft Internet Explorer 4+
  • Microsoft Internet Explorer 5 Mac
  • Opera 7+ (November 2002)
  • Mozilla M17+
  • Netscape 6+
  • Konqueror 2.2+
  • IceStorm 5.
  • iCab 2.x+
  • MS Pocket IE 3.x+

Thanks to Jim Ley for pointing out a few browsers we missed (01/05)

innerHTML was first introduced in 1997 with the advent of Explorer 4 for the PC. Explorer 4 also added other similar properties:

  • innerText
  • outerText
  • outerHTML

Note that these properties are generally not supported in other browsers and should not be relied on.

Objections to innerHTML

         The first type of objection is that of those who feel that innerHTML pollutes the purity of the DOM and shouldn't be allowed on the basis that it does not serve all possible DOM implementations. It is DTD specific, and is useful almost exclusively in browsers.

Alex Russell Comments:

It is this line of objection that I feel may merit the most attention, since it's most compelling to the standards warrior in me.

The fact that the innerHTML property is only useful in documents that correspond to the HTML DTD is indeed its most troubling aspect. Were the feature more aptly named (and implemented as) innerXML my reservations would be much allayed, but as it is, innerHTML doesn't make the grade on this account. But this may not be enough to disqualify it from our toolkit just yet. A set of methods, perhaps getInnerXML and setInnerXML, that took both a string and a DTD URI as arguments would be a step in the right direction for replacing innerHTML.

         The second common type of objection to innerHTML is that the browser shouldn't be creating DOM nodes out of strings when perfectly good methods for creating them already exist via the DOM.

Alex Russell Comments:

My response to this point is "why shouldn't the browser do that for us?"

While it is true that this may seem unclean when called from inline code and can lead to very stringy code, I don't feel that there is anything intrinsically wrong about requesting that the browser expose to scripters a service that it already performs when a page is initially parsed. Put another way, the browser converts strings to nodes when the page loads, so why shouldn't we be able to count on this after the page loads as well?

Tim Scarfe Argues:

I can't think of any situations where this would (or should) become an issue if the project was correctly thought out in the first place (See my later comments about structure in the behavioural element of the application).

         innerHTML leads many a well-intentioned newbie astray.

Alex Russell Comments:

This is a valid point, but I don't think it's an open-and-shut discussion.

There are parallels in other languages (and disciplines) of powerful techniques that are too easily abused. However, we tend to require these powerful techniques when certain thorny problems arise and would be at a great loss were they not available when we need them most. My personal feeling on the topic is that innerHTML should not be taught to newbies. It should be excised from sample code and ignored when proposed as a solution to simple problems. That said, there are some times when innerHTML is the only good answer. In those situations, I feel that its use should be discussed among the competent, and generally acceptable use situations laid out by those in teaching positions in the community. A similar approach to the way the C/C++ community deals with the goto directive may be in order.

Tim Scarfe Agrees:

innerHTML can allow new users to develop without learning about text nodes and how they are used with element nodes to form document fragments. This may hinder their development ability long-term.

         innerHTML is not standard.

Tim Scarfe Comments:

It is very unlikely that innerHTML will ever become a web standard, certainly not in the DOM core.

Alex Russell Comments:

Due to some of the failings listed above, I doubt very much that innerHTML will ever become a standard property, even of DOM level 0. Despite this, I believe that it fills a useful purpose and that functionality should not be abandoned entirely. Different packaging (innerXML, say) may be the key to moving forward in this respect, but for the time being, innerHTML will have to do in those sticky situations where one really needs it.

How is innerHTML abused?

Almost always, the thing that gets me with developers is when they have a load of HTML in a string in JavaScript. Then they write it into the document.

HTML is structured data; it doesn't belong in the Behaviour element (JavaScript).

Examples of this?

Let's take a look at Scott le Pera's "DOM Windows" (Article here).

The data that is supposed to go inside the windows is in the JavaScript section of the application instead of the HTML. At the moment, if the browser does not happen to be one of the correct browsers (yes, he should be using standards detection as well), the user will just see an empty screen. However, if the HTML for the windows were in a division in the page, it would degrade gracefully: any browser or device could view the content.

If the required W3C standards were supported, he could then kick in, create the windows, clone the HTML content into them and remove the originals.
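As a rough illustration of that approach (our own sketch, not code from the DOM Windows article), the two steps might look like this. The names supportsW3CDom and moveIntoWindow are hypothetical, and which methods you test for depends entirely on what your script goes on to use.

```javascript
// Hypothetical check: does this document object expose the W3C DOM
// methods that the enhancement relies on? (Object detection, not
// browser-name sniffing.)
function supportsW3CDom(doc) {
  return !!(doc && doc.getElementById && doc.createElement &&
            doc.createTextNode);
}

// Hypothetical enhancement step: clone the plain HTML content into a
// scripted "window" container, then remove the original from the page.
// `source` and `windowBody` are assumed to be element nodes.
function moveIntoWindow(source, windowBody) {
  windowBody.appendChild(source.cloneNode(true)); // deep copy
  source.parentNode.removeChild(source);
}
```

In a browser you would guard the enhancement with something like if (supportsW3CDom(document)) { ... }, so that anything failing the check simply keeps the plain HTML content.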

While an old example, it is unfortunately pretty typical. The truth of the matter is that many developers have decided that since Microsoft has conclusively won the browser war, learning the standard is a waste of time. Microsoft has a track record of providing backwards compatibility at all costs and many DHTML authors are willing to count on this when developing their pages.

From Microsoft:

Microsoft believes very strongly in Internet standards and the standards process, and is committed to implementing appropriate standards when driven by customer demand. However, standards compliance is part of a larger effort that includes many constituencies. By innovating, and driving customer requirements into Internet Explorer and then into the standards groups, we'll make the Internet a richer platform for all users.

The position is very clear: because a standard exists, that does not mean Microsoft will automatically implement it. Microsoft will implement appropriate standards that we believe are useful to our customers.

I believe that this is an appropriate position for any company to take. Some may take the position that because it's a standard, it must be good, and if it isn't in a standard, then it must be bad. This isn't a position I can subscribe to myself. I find it both refreshing and interesting that Netscape/Mozilla recognized the usefulness of innerHTML functionality and included it in their latest efforts, despite the fact that this is not part of any standard recommendation.

What is the DOM way to do it?

HTML and Nodes

Let's talk about nodes.

There are 12 types of nodes as described by the W3C, but we are going to talk about only 2 of those types: element nodes and text nodes. The first thing one needs to be able to do is take some HTML and transform it into an image like Diagram 1.c in your head.

Consider this HTML:

<body>
 <div><p>Hello<em>Tim</em>How Are You?</p></div>
 <div>Developer-x.com</div>
</body>

Can you turn it into an image in your mind like this?

Diagram 1.c

You are probably wondering why there are 2 text nodes after the em HTML element in my diagram. I have put them there in that fashion because it is quite possible for that to happen if you are manipulating the nodes with the DOM (or if your parser is especially poor).

How do I make the two odd text nodes become one?

To clean up adjacent text nodes, one can use a DOM method called .normalize( ).

body.firstChild.firstChild.normalize( )

This would then produce a diagram like this;

Diagram 1.d

It would also be possible to do the same thing by using basic DOM principles:

var str = "";
var p = body.firstChild.firstChild;
str += p.childNodes[2].nodeValue + " "
 + p.childNodes[3].nodeValue;
p.removeChild( p.childNodes[2] );
// childNodes is live: "Are You?" has now shifted down to index 2.
p.removeChild( p.childNodes[2] );
p.appendChild( document.createTextNode( str ) );

Step by step we;

  1. Define a string.
  2. Define a reference to the paragraph element.
  3. Build up a string with data from both text nodes.
  4. Remove the How text node.
  5. Remove the Are You? text node.
  6. Append to the paragraph element a new text node containing the data from both originals.
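The steps above can be generalised. The following is a minimal sketch (our own, with the hypothetical name mergeAdjacentText) that merges every run of adjacent text nodes among the direct children of a node. Unlike the real .normalize( ), it handles only one level, does not descend into child elements, and does not remove empty text nodes. Like .normalize( ), and unlike the hand-written example, it does not insert a space between the merged strings.

```javascript
var TEXT_NODE = 3; // nodeType of a text node in the W3C DOM

// Hypothetical helper: merge adjacent text nodes that are direct
// children of `parent`. Assumes `parent.childNodes` is live, i.e. it
// shrinks as soon as removeChild is called (true of real DOM nodes).
function mergeAdjacentText(parent) {
  var i = 0;
  while (i < parent.childNodes.length - 1) {
    var current = parent.childNodes[i];
    var next = parent.childNodes[i + 1];
    if (current.nodeType === TEXT_NODE && next.nodeType === TEXT_NODE) {
      // Fold the second node's data into the first, then drop it.
      // Do not advance: the new neighbour may be a text node too.
      current.nodeValue += next.nodeValue;
      parent.removeChild(next);
    } else {
      i++;
    }
  }
}
```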

Common misconceptions

Some people who are used to innerHTML are blissfully unaware that all data inside a document is contained in a type of node. This of course includes textual data. Let's imagine we could represent this in HTML.

<body>
 <div><p><CDATA:text>Hello
 </CDATA:text><em>
 <CDATA:text>Tim</CDATA:text></em>
 <CDATA:text>How</CDATA:text><CDATA:text>
 Are You?</CDATA:text></p></div>
 <div><CDATA:text>Developer-x.com</CDATA:text></div>
</body>

For this reason, when using the DOM properties .nodeValue or .data, tell-tale mistakes constantly come up:

var string = body.firstChild.firstChild.nodeValue; // WRONG

body.firstChild.firstChild is a p element node. It is NOT a text node. The DOM specification defines the nodeValue of an element node as null, so asking for the node value of p will return null (or undefined, or perhaps an error, depending on the browser you use).

If we just wanted the whole contents of the paragraph as a single string, innerHTML seems really tempting right now. Note that the string includes the markup; in IE, innerText returns the text alone.

var string = body.firstChild.firstChild.innerHTML;
// string == "Hello<EM>Tim</EM>How Are You?" (in IE)
  • innerHTML is helpful if you don't know how many text nodes there are.
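For comparison, collecting the text with the DOM does not require knowing the number of text nodes either, provided you are willing to walk the tree. This is a minimal sketch using only nodeType, nodeValue and childNodes; the name collectText is ours, not a DOM method.

```javascript
var TEXT_NODE = 3; // nodeType of a text node in the W3C DOM

// Hypothetical helper: concatenate the data of every text node found
// beneath `node`, in document order, however deeply nested.
function collectText(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  var text = "";
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    text += collectText(children[i]);
  }
  return text;
}
```

For the paragraph above this yields "HelloTimHowAre You?", since the source HTML has no spaces around the em element.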

.cloneNode

Coders with a couple of large projects under their belt almost instinctively separate structural data from behavioural code, and for good reason. Coupling of behaviour and data can create quite a mess, and innerHTML is the embodiment of this mess inside of browsers.

.cloneNode allows developers to pull data from a node (including sub nodes if one chooses). Coders can use this to copy replicas of nodes elsewhere.

Let's consider an example where we would like to copy a division on a page.

// .html
 
<body>
 <div>
 <p><em>This</em> is <strong>a</strong> 
<em>test</em>!</p>
 </div>
</body>
 
// .js
 
// innerHTML way:
 
body.innerHTML += body.innerHTML
 
// DOM way:
 
body.appendChild( body.firstChild.cloneNode( true ) )

Which way is better: direct, standard DOM manipulation, or messy strings with capital HTML tags in them?

Why do people use innerHTML?

People just getting into DHTML will immediately see innerHTML as an easy-to-use property, especially if they have attempted and failed to use .nodeValue.

innerHTML means developers do not need to consider text nodes and how they work to a large extent.

innerHTML saves a lot of time when writing data, for example:

// the innerHTML way:
 
var str = "<p>Tim<strong>says <em>hello</em> "
 +"to</strong> Kez and <em>Dan</em></p>"
div.innerHTML = str;
 
// the DOM way:
 
var p = document.createElement("p")
var em_1 = document.createElement("em")
var em_2 = document.createElement("em")
var strong = document.createElement("strong")
 
 p.appendChild( document.createTextNode("Tim") )
  strong.appendChild( document.createTextNode("says ") )
   em_1.appendChild( document.createTextNode("hello") )
  strong.appendChild( em_1 )
  strong.appendChild( document.createTextNode(" to") )
 p.appendChild( strong )
 p.appendChild( document.createTextNode(" Kez and ") )
   em_2.appendChild( document.createTextNode("Dan") )
 p.appendChild( em_2 )
div.appendChild( p )

However, this makes us think of a couple of points.

  • If you ever get into this situation, you need to re-think the way you are attacking your application anyway.
  • Structural data like this should be in your HTML or XML. Use .cloneNode.
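A minimal sketch of that second point, assuming the markup you want to reuse already sits in the page under a known id. The function name cloneTemplate and the id "window-template" are hypothetical; getElementById, cloneNode and removeAttribute are standard DOM methods.

```javascript
// Hypothetical helper: copy a piece of structure that lives in the
// HTML instead of rebuilding it from strings or createElement calls.
// Returns null when no element with that id exists.
function cloneTemplate(doc, templateId) {
  var template = doc.getElementById(templateId);
  if (!template) {
    return null;
  }
  var copy = template.cloneNode(true); // true = copy all descendants
  copy.removeAttribute("id");          // avoid two elements sharing one id
  return copy;
}
```

In a browser this might be used as div.appendChild( cloneTemplate( document, "window-template" ) ), after checking that the result is not null.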

At this time, we are starting to see DHTML done by a lot of individuals the way it should be. We are talking about having a structured, degradable HTML foundation to start with in the first place, before we kick in with the CSS and JavaScript.

This, however, doesn't mean the death of innerHTML. Certain situations which require parsing of text strings (provided by any number of data sources) are most certainly handled best by innerHTML. In fact, having the ability to call the parser at will is not such a bad thing; however, the DOM is not a definition of a parser interface, but rather of a data structure. Until the W3C decides to publish a standard parser interface for DOM implementations, innerHTML will be with us.

Arguments for and against

For

  • Convenient and fast.
  • Easy to use and understand.
  • Works in general situations where some factors are unknown.
  • Great for writing data to a node. This could take a long time with the DOM.
  • In situations where you need innerHTML, nothing else will do.
  • innerHTML is significantly faster than the DOM in Gecko and IE (See Aaron's and Stephen's comments below).

Against

  • It is NOT a W3C DOM standard. It won't likely become one either.
  • The DOM is more powerful.
  • Its name contains HTML, although it could be used for SGML/XML documents.
  • It is lazy and can produce unstructured, lazy code.
  • It can lead new developers away from learning about text nodes.
  • Code will become hard to port to XML apps and won't be future proof.
  • Should the browser be parsing HTML strings and creating nodes?!
  • innerHTML can mean structural data inside the behavioural element of your app.
  • innerHTML is very buggy (to say the least) in the Gecko browsers.

User comments

Dr. Tom Trenka:

Reminds us that Microsoft have added an InnerXml property in the System.Xml namespace of .NET.

Thinks we should have a standard parsing system for XML strings and files, perhaps a method that can take a string or a URI and return a valid XML node document fragment.

Tim Scarfe Responds:

The W3C DOM Activity Statement currently points to a load and save system in DOM level 3 just like this, Tom.

Dave Schontzler Adds

There are JavaScript parsing tools for this sort of thing available at http://jsxml.sourceforge.net/.

Aaron Boodman Comments

In JavaScript, string manipulation is much more efficient than DOM wrangling.

So if you are in a situation where large chunks of DOM must be created and/or edited at runtime in a very responsive manner, creating XML/HTML strings and plugging them into the DOM with .innerHTML or .xml will probably be your best bet.
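A minimal sketch of the string-building approach Aaron describes (our illustration, not his code): assemble the markup as an array of pieces, join it once, and hand the result to the parser in a single assignment. The names escapeHtml and buildListHtml are hypothetical. The escaping step is our own addition, on the assumption that the data may contain characters that are special in markup.

```javascript
// Escape text so it can be placed safely inside markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build the markup for a whole list as one string.
function buildListHtml(items) {
  var parts = ["<ul>"];
  for (var i = 0; i < items.length; i++) {
    parts.push("<li>" + escapeHtml(items[i]) + "</li>");
  }
  parts.push("</ul>");
  return parts.join("");
}
```

In a browser the result would then be written with one assignment, for example div.innerHTML = buildListHtml( data ), rather than one DOM call per node.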

Paul Sowden Comments

I've thought about this issue a bit more, and I can safely say that this property has no place in the DOM Core. The load/save part of the specification isn't compulsory and the innerHTML property goes nowhere near the core.

However, I think that innerHTML does have a place in the mammoth web browsers of today. There are good reasons for it.

Something that people don't seem to remember is that the DOM is not just for the web. It is for any application that uses and needs to manipulate XML. It provides a standard API. The web is only half of it.

Stephen W. Cote Comments

In both IE and Mozilla, I've noticed that innerHTML is faster at rendering large blocks of content. Refer to the following test for an example:

http://www.developer-x.com/content/innerhtml/dom_vs_innerHTML_perf_test.html

Note: this will take several seconds to run.

Moved your example to my server for URI persistence, Stephen - TRS

Conclusion

Tim Scarfe Comments:

innerHTML, even if it bugs me, is here to stay, at least for the foreseeable future.

Tim Scarfe Comments (November 2002):

innerHTML is a tool, and a tool can be abused like anything else. I believe it has no place in the DOM. I do believe it has a place in functional DOM web browsers, perhaps under a different name. There is no point saying that web browsers should not incorporate it simply because some people might not "get it", or might break the semantic model of DHTML. Hey. It's useful.

We have outlined a basic scope of the situation, and gone into some hands-on examples of its use. The time has come to make your own decision on whether or where you will choose to use it.

Would you like to comment on this article? Would you like to add to this article? Please e-mail us.

Tim Scarfe is an IT Consultant/Web Developer from West London, England. Tim works for the EnCana Corporation, the world's largest independent Oil and Gas company. Tim is very keen on W3C Standards and Accessibility. Tim runs http://www.developer-x.com

Alex Russell is a Web and security application developer. He is the primary developer and maintainer of the netWindows.org project, an Open Source component framework for DOM DHTML interfaces. Alex is currently looking for full time work in computer security engineering or back-end Web development. You can visit his weblog at http://alex.netWindows.org or reach him via email at alex at netWindows.org.


Copyright Tim Scarfe © 1999-2006. All rights reserved.