Using </SCRIPT> In A JavaScript Literal

Today I got bitten by a very interesting bug involving the </SCRIPT> tag. If you’re writing code that generates code, you want to know about this.

I’m currently working on an application that takes content from various web resources, munges the content, stores it in a database, and on demand generates interactive web pages, including the ability to annotate content in a web editor. Things were humming along great for weeks until we got a stream of data that made the browser burp with a JavaScript syntax error.

Problem was, when I examined the automatically generated JavaScript, it looked perfectly good to my eyes.

So, I reduced the problem down to a very trivial case.

What would you suppose the following code block does in a browser?

<HTML>
<BODY>
  start
  <SCRIPT>
    alert( "</SCRIPT>" );
  </SCRIPT>
  finish
</BODY>
</HTML>

Try it and see.

To my eyes, this should produce an alert box with the simple text </SCRIPT> inside it. Nothing special.

However, in all browsers (IE 7, Firefox, Opera, and Safari) on all platforms (XP/Vista/OS X) it didn’t. The close tag inside the quoted literal terminated the scripting block, and the leftover closing punctuation, " );, was printed on the page.

Change </SCRIPT> to just <SCRIPT>, and you get the alert box as expected.

So, I did more reading and more testing. I looked at the hex dump of the file to see if perhaps there was something strange going on. Nope, plain ASCII.

I looked at the JavaScript documentation online, and the only things they suggest escaping are the single and double quotes, as well as the backslash, which does the escaping. (Note we’re using forward slashes, which require no escapes in a JavaScript string.)

I even got the 5th Edition of JavaScript: The Definitive Guide from O’Reilly, and on page 27, which lists the comprehensive escape sequences, there is nothing magical about the forward slash, nor this magic string.

In fact, if you start playing with other strings, you get these results:
  <SCRIPT> …works
  <A/B> …works
  </STRONG> …works
  <\/SCRIPT> …displays </SCRIPT>, and while I suppose you can escape a forward slash, there should be no need to. Ever. See prior example.
  </SCRIPT> …breaks
  </SCRIPTX> …works (note the extra character, an X)

With JavaScript, what’s in quotes is supposed to be flat, literal, uninterpreted, meaningless text.

It was after this that I turned to several security and web experts for help.

Security Concerns


Why security experts?

The primary concern is obviously cross-site scripting. We’re taking untrusted sites and displaying portions of the data stream. Should an attacker be able to insert </SCRIPT> into the stream, followed by a few comment characters and a freshly reopened <SCRIPT> block, he’d be able to mess with cookies, twiddle the DOM, dink with AJAX, and do things that compromise the trust of the server.

The Explanation


The explanation came from Phil Wherry.

As he puts it, the <SCRIPT> tag is content-agnostic. Which means the HTML Parser doesn’t know we’re in the middle of a JavaScript string.

What the HTML parser saw was this:

<HTML>
<BODY>
  start
  <SCRIPT>alert( "</SCRIPT>
  " );
  </SCRIPT>
  finish
</BODY>
</HTML>

And there you have it: not only is the syntax error obvious now, but the HTML is malformed.

The processing of JavaScript doesn’t happen until after the browser has understood which parts are JavaScript. Until it sees that close </SCRIPT> tag, it doesn’t care what’s inside – quoted or not.
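Here is a rough Java sketch of that rule, as an illustration only. The class and method names are made up for this post, and the scan is a simplification of what a real HTML tokenizer does (it ignores things like HTML comments inside script blocks). It looks for the characters </script in any letter case, followed by >, / or whitespace, and pays no attention at all to JavaScript quotes.

```java
class ScriptEndScanner {
    private static final String NEEDLE = "</script";

    /**
     * Returns the index at which a simplified HTML tokenizer would close the
     * script element, or -1 if nothing in the text looks like a closing tag.
     * Quotes are deliberately ignored: the HTML parser does not know about them.
     */
    static int findScriptEnd(String body) {
        for (int i = 0; i + NEEDLE.length() < body.length(); i++) {
            // Case-insensitive comparison, so </SCRIPT and </script both match.
            if (body.regionMatches(true, i, NEEDLE, 0, NEEDLE.length())) {
                char c = body.charAt(i + NEEDLE.length());
                // The tag name must be followed by '>', '/' or whitespace;
                // any other character (such as the X in </SCRIPTX>) means no match.
                if (c == '>' || c == '/' || c == ' ' || c == '\t'
                        || c == '\n' || c == '\r' || c == '\f') {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] samples = {
            "alert( \"</SCRIPT>\" );",     // breaks: prints 8
            "alert( \"<SCRIPT>\" );",      // works: prints -1
            "alert( \"</SCRIPTX>\" );",    // works: prints -1
            "alert( \"<\\/SCRIPT>\" );",   // works: prints -1
            "alert( \"\\</SCRIPT\\>\" );"  // backslash-escaped brackets: prints -1
        };
        for (String sample : samples) {
            System.out.println(findScriptEnd(sample) + "  " + sample);
        }
    }
}
```

Only the first sample is cut short, at index 8, right where the quoted </SCRIPT> begins. That matches the list of strings that worked and broke in the browser tests above.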

Turns out, we all have seen this problem in traditional programming languages before. Ever run across hard-to-read code where the indentation conveys a block that doesn’t logically exist? Same thing. In this case instead of curly braces or begin/end pairs, it was the start and end tags of the JavaScript.

Upstream Processing


Remember, this wasn’t hand-rolled JavaScript. It was produced by an upstream piece of code that generated the actual JavaScript block, which is much more complex than the example shown.

It is getting an untrusted string which, to be shoved inside a JavaScript string, not only has to be sanitized but also escaped in such a way that the HTML parser cannot accidentally treat the string’s contents as a legal (or illegal!) tag.

To do this we need to build a helper function to scrub data that will directly be emitted as a raw JavaScript string.


  1. Escape all backslashes, replacing \ with \\, since backslash is the JavaScript escape character. This has to be done first so as not to escape the other escapes we’re about to add.
  2. Escape all quotes, replacing ' with \', and " with \" — this stops the string from getting terminated.
  3. Escape all angle brackets, replacing < with \<, and > with \> — this stops the tags from getting recognized.

private String safeJavaScriptStringLiteral(String str) {

  str = str.replace("\\", "\\\\"); // escape single backslashes
  str = str.replace("'", "\\'");   // escape single quotes
  str = str.replace("\"", "\\\""); // escape double quotes
  str = str.replace("<", "\\<");   // escape open angle bracket
  str = str.replace(">", "\\>");   // escape close angle bracket
  return str;
}
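Here is a minimal, self-contained sketch of how the helper might be called. The class name, the emitAlert method, and the sample payload are hypothetical and exist only for illustration; the helper is repeated as a static method so the example compiles on its own.

```java
class LiteralEscaperDemo {
    // Same replacements as the helper above, in the same order.
    static String safeJavaScriptStringLiteral(String str) {
        str = str.replace("\\", "\\\\"); // backslashes first
        str = str.replace("'", "\\'");   // single quotes
        str = str.replace("\"", "\\\""); // double quotes
        str = str.replace("<", "\\<");   // open angle bracket
        str = str.replace(">", "\\>");   // close angle bracket
        return str;
    }

    // Hypothetical generator step: wraps untrusted text in a JavaScript alert call.
    static String emitAlert(String untrusted) {
        return "alert( \"" + safeJavaScriptStringLiteral(untrusted) + "\" );";
    }

    public static void main(String[] args) {
        // An attacker-style payload that tries to close the block and open a new one.
        String payload = "</SCRIPT><SCRIPT>alert('gotcha')</SCRIPT>";
        System.out.println(emitAlert(payload));
        // prints: alert( "\</SCRIPT\>\<SCRIPT\>alert(\'gotcha\')\</SCRIPT\>" );
    }
}
```

In the generated line, every </SCRIPT is now followed by a backslash rather than a >, and JavaScript reads \< and \> back as plain < and >. Note that this sketch, like the helper, does nothing about line breaks in the input, which would also end a JavaScript string literal early.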

At this point we should have generated a JavaScript string which never has anything that looks like a tag in it, but is perfectly safe to an XML parser. All that’s needed next is to emit the JavaScript surrounded by a <![CDATA[]]> block, so the HTML parser doesn’t get confused over embedded angle brackets.

From a security perspective, I think this also goes to show that lone JavaScript fragment validation isn’t enough; one has to take it in the full context of the containing HTML parser. Pragmatically speaking, the JavaScript alone was valid, but once inside HTML, became problematic.

Firefox: Problem Loading Page

All of a sudden, Firefox started reporting it had problems loading pages and that the proxy server was refusing connections. Problem is, I don’t have a proxy server. The problem, however, was my own doing and very easy to fix.

I was checking mail in GMail, and when I went to my Sent folder, I got a loading message and then nothing. I went to check some other random website and got the message: Problem loading page with a more detailed message of:
The proxy server is refusing connections
Firefox is configured to use a proxy server that is refusing connections.
Check the proxy settings to make sure that they are correct.
Contact your network administrator to make sure the proxy server is working.

Odd. I don’t use a proxy server. So I try Safari. Things are working.

I shut down Firefox. Re-open. Same thing.

A Google search showed that a small number of people were having this problem, and they were being redirected to preference screens, virus checking their systems, etc.

That couldn’t be it — things were working just a moment before.

Cause & Fix
The cause of the problem was an accidental click on the Tor button in my browser’s status bar, which subtly switched me over to using Tor for web browsing. Only problem was, I hadn’t started the Tor virtual tunnels.

Firefox was correct – it was using a proxy server that wasn’t responding. I just hadn’t realized I’d activated it.

Simply clicking the “Tor Enabled” button once put it back to “Tor Disabled” and suddenly Firefox was working again.

Guess I just happened to bump the mouse at just the right place on the screen. Oops.

Anyhow, it was a five minute mystery, and since I suspect others have fallen victim to the same slip and not been able to figure out why, or worse, given Firefox a bum rap, I thought I’d share the solution so others can get back to browsing.

Registry Mechanic PCTLicHelper.dll Missing

Got a message from Registry Mechanic that the file PCTLicHelper.dll was missing or corrupt. Here’s the workaround until PC Tools fixes the problem.

I’m an avid fan of Registry Mechanic and can’t say enough nice things about the product.

I recently downloaded Registry Mechanic 6 (version 6.0.0.750 w/ engine 2.0.0.560), installed, and ran it. It worked great – I even like the slightly sleeker interface. However, after doing the Smart Update, Registry Mechanic displayed a message that C:\Program Files\Registry Mechanic\PCTLicHelper.dll is missing or invalid, and it suggested I uninstall and reinstall.

The uninstall worked fine. The reinstall worked fine. The re-running worked fine. But upon another update, I got the same error message.

My next thought was that the file might be getting deleted accidentally during the upgrade, and that I could simply install the software, copy the file to a safe haven, do the upgrade, and put it back. No such luck: the file doesn’t exist in the pre-upgraded application. This is a new file needed by the latest upgrade, and judging by the file name, I believe it’s closely related to PC Tools’ subscription-based license scheme.

A little experimentation with uninstalling and reinstalling shows that the fault lies in how the Smart Update fetches the updated Registry Mechanic. I suppose they wrote the code but forgot to bundle the DLL.

Simply uncheck the entry that says Registry Mechanic 1.43Mb, but leave all the other items checked. You’ll be able to scan just fine.

I tried reporting this problem to PC Tools at their website support page, but that reported an error: Fatal error: Undefined class name ‘datetime’ in /home/shared/include/tickets/rawmessage.php on line 222. I guess they have bigger problems than they thought.

After calling their USA support line, 1-800-764-5783, I got a recording that said they could also be reached at support@pctools.com. That would have been nice to have on the website somewhere obvious.

This trivial issue aside, Walt gives Registry Mechanic a big thumbs up!

UPDATE 22-Feb-2007: As Mike suggested, I got the latest update this morning and this time there was an updated program, a PC Tools License Helper, and a revision to the white list. I installed the update, and Registry Mechanic worked just fine. I never did hear back from PC Tools about the issue, or its website being broken.

UPDATE 24-Feb-2007: This morning I got an email from PC Tools informing me that the problem had been fixed and to simply reinstall, commenting there was no need to re-enter license information. They made no comment on the fact that their website was broken, other than to generically tell me it was a “great source of knowledge for future questions.”