
Using The ColdFusion XMLValidate Function To Validate User Content

Sometimes you want to allow a limited set of HTML tags in user-generated content. This can be done with the XMLValidate function in ColdFusion, as long as the content is valid XHTML.

Several years ago I read an article about escaping form values that Ben Nadel posted on his site, www.bennadel.com. Some discussion came up in the comments about allowing a limited set of HTML tags for paragraphs, bold text, and so on. I needed to do this for forum comments on a site I was working on. The site was written in ColdFusion, so I looked at some of the same options mentioned in Ben's article. I ended up doing something a little different, though.

We were using TinyMCE for the forum comments. TinyMCE produces XHTML, so I was able to use ColdFusion's XML handling to accomplish this task. By calling the XMLValidate function with an XML schema that was modified to accept a small list of tags and attributes, the comments were limited to the allowed markup. Here is the relevant portion of the code. Below is an explanation of how it works.



The first line takes the content from the form and wraps a content tag around it. This is done because valid XML has to have a single root element. Naming this element content was an arbitrary decision; it could have been anything. Before parsing, non-breaking spaces in the content are also escaped. If I remember right, the non-breaking spaces were causing the XMLFormat function to error.
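As a rough sketch of that first step (not the original listing), the wrapping and escaping might look like the following. The variable names and the sample comment are made up for illustration, and replacing &nbsp; with the numeric reference &#160; is an assumption about how the escaping was done. It relies on the fact that &nbsp; is an HTML entity that plain XML does not define, while numeric character references are always accepted.

```cfml
<!--- Sample input; on a real page this would come from the submitted form field. --->
<cfset rawComment = "<p>Hello&nbsp;<strong>world</strong></p>">

<!--- Assumption: swap the HTML-only entity for its numeric reference.
      The doubled ## is how CFML writes a literal # inside a string. --->
<cfset escapedComment = Replace(rawComment, "&nbsp;", "&##160;", "all")>

<!--- Wrap the fragment in a single root element so it forms a complete XML document. --->
<cfset wrappedComment = "<content>" & escapedComment & "</content>">

<!--- IsXML reports whether the string is well-formed XML. --->
<cfoutput>#IsXML(wrappedComment)#</cfoutput>
```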

The next section is a cfxml tag containing the XML schema. This XML was created from some examples and then modified to include the desired tags.

The schema allows the following tags:

  • any number of br, strong, em, ul, ol, u, strike, li, and span tags with style and class attributes
  • a tags with href, title, target, and class attributes
  • img tags with src, alt, height, and width attributes
  • p tags with align, class, and style attributes
  • and finally content tags


As mentioned above, the content tag is just a container added to act as the root document element. The href attribute on a tags can only contain URLs, not JavaScript.
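To give an idea of what such a schema can look like, here is a cut-down sketch that covers only part of the list above (content, p, br, strong, em, and a). It is not the original schema. The type and group names (inlineText, inline, httpUrl) are hypothetical, and restricting href with an http/https pattern is just one assumed way to keep javascript: links out.

```cfml
<cfxml variable="commentSchema">
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Inline tags that may appear anywhere text is allowed -->
  <xs:group name="inline">
    <xs:choice>
      <xs:element ref="br"/>
      <xs:element ref="strong"/>
      <xs:element ref="em"/>
      <xs:element ref="a"/>
    </xs:choice>
  </xs:group>

  <!-- Text mixed with any number of inline tags -->
  <xs:complexType name="inlineText" mixed="true">
    <xs:group ref="inline" minOccurs="0" maxOccurs="unbounded"/>
  </xs:complexType>

  <!-- Assumed rule: href must start with http:// or https:// -->
  <xs:simpleType name="httpUrl">
    <xs:restriction base="xs:anyURI">
      <xs:pattern value="https?://.+"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="br"><xs:complexType/></xs:element>
  <xs:element name="strong" type="inlineText"/>
  <xs:element name="em" type="inlineText"/>

  <xs:element name="a">
    <xs:complexType mixed="true">
      <xs:group ref="inline" minOccurs="0" maxOccurs="unbounded"/>
      <xs:attribute name="href" type="httpUrl"/>
      <xs:attribute name="title" type="xs:string"/>
    </xs:complexType>
  </xs:element>

  <xs:element name="p">
    <xs:complexType mixed="true">
      <xs:group ref="inline" minOccurs="0" maxOccurs="unbounded"/>
      <xs:attribute name="class" type="xs:string"/>
      <xs:attribute name="style" type="xs:string"/>
      <xs:attribute name="align" type="xs:string"/>
    </xs:complexType>
  </xs:element>

  <!-- The wrapper element added around every comment -->
  <xs:element name="content">
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="p"/>
        <xs:group ref="inline"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>
</cfxml>
```

Any tag or attribute that is not declared in the schema, such as a script tag or an onclick attribute, makes the document invalid, which is what limits the comments to the allowed markup.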

Then XMLValidate is called with the content and the schema. It returns a struct with information about the validity of the document. Finally, the status key of that struct is used to decide whether the content is acceptable or needs to be rejected.
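The validation step can be sketched as follows. This is an illustration under stated assumptions, not the original code: the tiny schema here allows only strong and em inside content, the variable names are hypothetical, and passing the schema through ToString is an assumption, made because the documented validator forms for XMLValidate are a string, a file name, or a URL.

```cfml
<!--- Hypothetical cut-down schema: a content root holding text plus strong and em. --->
<cfxml variable="tinySchema">
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="inlineText" mixed="true">
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="strong" type="inlineText"/>
      <xs:element name="em" type="inlineText"/>
    </xs:choice>
  </xs:complexType>
  <xs:element name="content" type="inlineText"/>
</xs:schema>
</cfxml>

<!--- Sample comment, already wrapped in the root element. --->
<cfset wrappedComment = "<content>Some <strong>bold</strong> text</content>">

<!--- Assumption: hand the schema to XmlValidate as a string. --->
<cfset result = XmlValidate(wrappedComment, ToString(tinySchema))>

<cfif result.status>
  <!--- The comment uses only the allowed tags, so it can be saved. --->
  <cfoutput>Comment accepted.</cfoutput>
<cfelse>
  <!--- result.errors and result.fatalErrors are arrays of parser messages. --->
  <cfoutput>Comment rejected: #ArrayLen(result.errors) + ArrayLen(result.fatalErrors)# problem(s) found.</cfoutput>
</cfif>
```

Besides status, the returned struct also carries errors, fatalErrors, and warning arrays, which can be logged to see why a particular comment was rejected.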

This article was rewritten from an article that I wrote several years ago on a different blog.  That blog is no longer online.  I think the information is still useful though.  I have a few other articles from that old site that I am going to re-publish.
