
Too Much Stripping :)

I spent the better part of my day stripping off… guess what? Let your imagination run as wild as possible.

Ha ha ha, well, it was stripping HTML tags, JavaScript code, and CSS code out of an HTML file to get back plain text. Yes, the plain text is what I am interested in, and the input is a URL.

As far as I know, there are two ways of programming it:

1. Use the DOM API

2. Use regular expressions

I have been using the DOM API lately, but somehow I had a sixth sense that the code would be better with regular expressions, and googling supported my view. I came across code for stripping HTML tags, but not a single piece of code for stripping JavaScript and CSS.

At one point I had decided to fall back on the DOM API, but I finally came up with a solution. Here it is.

The regular expression for stripping JavaScript from a page is

"(<[ \n\r]*script[^>]*>.*?<[ \n\r]*/script[^>]*>)"

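The regular expression for stripping CSS style sheets from a page is

"(<[ \n\r]*style[^>]*>.*?<[ \n\r]*/style[^>]*>)"

Both of these need to run with the dot matching line breaks (DOTALL or "single-line" mode), since script and style blocks usually span several lines, and case-insensitively, since tag names may be written in upper case.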

The regular expression for stripping HTML tags from a page is

"<(.|\n)*?>"

If you need the Java code for this, send me your email ID.
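In the meantime, here is a rough sketch of how the three expressions above might be applied in Java. It is not my original code, just a minimal illustration: the class name `HtmlStripper` and the method `strip` are made up for this example, the `CASE_INSENSITIVE` and `DOTALL` flags are my assumption about how the script and style patterns should be compiled, and it works on an HTML string that has already been downloaded (fetching the URL is left out).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; names are not from the original code.
final class HtmlStripper {
    // Assumed flags: DOTALL lets ".*?" cross line breaks, CASE_INSENSITIVE catches <SCRIPT>.
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;

    private static final Pattern SCRIPT =
        Pattern.compile("<[ \\n\\r]*script[^>]*>.*?<[ \\n\\r]*/script[^>]*>", FLAGS);
    private static final Pattern STYLE =
        Pattern.compile("<[ \\n\\r]*style[^>]*>.*?<[ \\n\\r]*/style[^>]*>", FLAGS);
    private static final Pattern TAG = Pattern.compile("<(.|\\n)*?>");

    // Removes script blocks, then style blocks, then any remaining tags.
    static String strip(String html) {
        String text = SCRIPT.matcher(html).replaceAll(" ");
        text = STYLE.matcher(text).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Collapse the whitespace left behind by the removed markup.
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p { color: red; }</style>\n"
            + "<script>\nvar x = 1;\n</script></head>\n"
            + "<body><p>Hello <b>world</b></p></body></html>";
        System.out.println(strip(html)); // prints "Hello world"
    }
}
```

The order matters: scripts and style sheets have to go first, because once the tags are stripped their contents would be left behind as ordinary text. On very large pages the `(.|\n)*?` pattern can recurse deeply in Java's regex engine, so `<[^>]*>` may be a safer alternative for the last step.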

November 2, 2008 at 7:09 pm

