
Using The ColdFusion XMLValidate Function To Validate User Content

Sometimes you want to allow a limited set of HTML tags in user-generated content. This can be done with the XMLValidate function in ColdFusion, as long as the content is valid XHTML.

Several years ago I read an article about escaping form values that Ben Nadel posted on his site, www.bennadel.com. Some discussion came up in the comments about allowing a limited set of HTML tags for paragraphs, bold text, and so on. I needed to do this for forum comments on a site I was working on. The site was written in ColdFusion, so I looked at some of the same options mentioned in Ben's article. I ended up doing something a little different, though.

We were using TinyMCE for the forum comments. TinyMCE produces XHTML, so I was able to use ColdFusion's XML handling to accomplish this task. By combining the XMLValidate function with an XML schema that accepts only a small list of tags and attributes, I could limit the comments correctly. Here is the relevant portion of the code. Below is an explanation of how it works.



The first line takes the content from the form and wraps a content tag around it. This is done because well-formed XML has to have a single root element. Naming this element content was an arbitrary decision; it could have been anything. Before parsing, non-breaking spaces in the content are also escaped. If I remember right, the non-breaking spaces were causing ColdFusion's XML functions to error, most likely because &nbsp; is an HTML entity that plain XML does not define.
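A minimal sketch of that first step, not the original code. The variable names rawComment, cleaned, wrapped, and doc are assumptions for illustration, and rawComment stands in for whatever form field holds the submitted comment.

```cfml
<!--- Hypothetical stand-in for the submitted form field (for example form.comment) --->
<cfset rawComment = "<p>Hello&nbsp;<strong>world</strong></p>">

<!--- &nbsp; is an HTML entity that plain XML does not define, so swap it for the
      numeric form. The doubled pound sign is how CFML writes a literal one in a string. --->
<cfset cleaned = Replace(rawComment, "&nbsp;", "&##160;", "all")>

<!--- Wrap the fragment in an arbitrary root element so it forms a complete XML document --->
<cfset wrapped = "<content>" & cleaned & "</content>">

<!--- XmlParse throws an error if the result is not well-formed XML --->
<cfset doc = XmlParse(wrapped)>
```

The XmlParse call is only there to show that the wrapped string now parses; XMLValidate accepts the XML string directly.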

The next section is a cfxml tag containing the XML schema. This XML was created from some examples and then modified to include the desired tags.

The schema allows the following tags:

  • any number of br, strong, em, ul, ol, u, strike, li, spans with style and class attributes, 
  • a tags with href, title, target, and class attributes, 
  • img tags with src, alt, height, and width attributes, 
  • p tags with align, class, and style attributes, 
  • and finally content tags 


As mentioned above, the content tag is just a container added to act as the root document element. Also, the href attribute on a tags can only contain URLs, not JavaScript.
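To give an idea of the shape of such a schema, here is a trimmed-down sketch that covers only a handful of the tags listed above (content, p, strong, em, br, and a). It is not the original schema: the names inline, inlineType, and urlType, the nesting rules, and the http/https pattern used to keep javascript: URLs out of href are all assumptions for illustration.

```cfml
<cfxml variable="schema">
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Assumed rule: href must start with http:// or https://, so javascript: URLs fail -->
  <xs:simpleType name="urlType">
    <xs:restriction base="xs:string">
      <xs:pattern value="https?://.+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Inline tags that may appear any number of times, in any order -->
  <xs:group name="inline">
    <xs:choice>
      <xs:element ref="strong"/>
      <xs:element ref="em"/>
      <xs:element ref="br"/>
      <xs:element ref="a"/>
    </xs:choice>
  </xs:group>

  <!-- Text mixed with any of the inline tags -->
  <xs:complexType name="inlineType" mixed="true">
    <xs:group ref="inline" minOccurs="0" maxOccurs="unbounded"/>
  </xs:complexType>

  <xs:element name="strong" type="inlineType"/>
  <xs:element name="em" type="inlineType"/>

  <!-- br is an empty element with no attributes -->
  <xs:element name="br">
    <xs:complexType/>
  </xs:element>

  <xs:element name="a">
    <xs:complexType mixed="true">
      <xs:group ref="inline" minOccurs="0" maxOccurs="unbounded"/>
      <xs:attribute name="href" type="urlType"/>
      <xs:attribute name="title" type="xs:string"/>
      <xs:attribute name="target" type="xs:string"/>
      <xs:attribute name="class" type="xs:string"/>
    </xs:complexType>
  </xs:element>

  <xs:element name="p">
    <xs:complexType mixed="true">
      <xs:group ref="inline" minOccurs="0" maxOccurs="unbounded"/>
      <xs:attribute name="align" type="xs:string"/>
      <xs:attribute name="class" type="xs:string"/>
      <xs:attribute name="style" type="xs:string"/>
    </xs:complexType>
  </xs:element>

  <!-- The wrapper root element: text, paragraphs, and inline tags in any order -->
  <xs:element name="content">
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="p"/>
        <xs:group ref="inline"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>
</cfxml>
```

Any element or attribute the schema does not declare, such as a script tag or an onclick attribute, makes the document invalid, which is what turns the schema into an allowlist.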

Then XMLValidate is called with the content and the schema. This returns a struct with information about the validity of the document. Finally, the status key is used to decide whether the content is acceptable or needs to be rejected.
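A minimal sketch of the validation step, again not the original code. To keep it runnable on its own it uses a stand-in comment and a deliberately tiny schema that allows only p tags inside content. The variable names, the ToString conversion of the cfxml document, and the cftry wrapper that treats any exception as a rejection are assumptions.

```cfml
<!--- Stand-in values so the sketch runs on its own --->
<cfset wrapped = "<content><p>Hello</p><script>alert(1)</script></content>">

<cfxml variable="schema">
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="content">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="p" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
</cfxml>

<cftry>
    <!--- XmlValidate returns a struct; status is true only when the document is valid --->
    <cfset result = XmlValidate(wrapped, ToString(schema))>
    <cfset isAcceptable = result.status>
    <cfcatch type="any">
        <!--- Assumption: treat any exception raised during validation as a rejection --->
        <cfset isAcceptable = false>
    </cfcatch>
</cftry>

<cfif isAcceptable>
    <cfoutput>Comment accepted.</cfoutput>
<cfelse>
    <cfoutput>Comment rejected.</cfoutput>
</cfif>
```

The stand-in comment contains a script tag that the schema does not declare, so it should fail validation. When validation fails, the errors and fatalErrors arrays in the returned struct describe what was wrong.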

This article was rewritten from an article that I wrote several years ago on a different blog.  That blog is no longer online.  I think the information is still useful though.  I have a few other articles from that old site that I am going to re-publish.
