Strip HTML tags from an existing HTML file (VB6 and up)

Version Compatibility: Visual Basic 6


More information: Note: Thanks to Paul King for fixing a bug in this program that caused it to strip away <> characters that were part of the content rather than the HTML (07/10/2001).

At http://www.freevbcode.com/ShowCode.Asp?ID=110, there is an example that strips HTML tags from a string and optionally saves the output to a text file. This application offers an alternative method. Compared with the other example, this application:

Provides a complete UI for opening the file, displaying the HTML, stripping the tags, and saving. The other example is a function that strips tags from a string and optionally saves to a file via a parameter provided to the function.
Because of the UI, you can edit the resulting text or run the strip procedure again if you don't like the output. With the other example, you would have to build a UI yourself to do this.
This example runs faster. On the other hand, the other example tries to do more, including outputting tables intelligently, displaying numbered and bulleted lists, and converting all special character codes (this one converts only the more commonly used special character codes).
Neither example is perfect in all cases (which would be impossible), but both should work fine when the HTML is well formed.
This example requires VB6; the other does not.
It is recommended that if you need this functionality, you look at both examples and determine which is best for you.
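
For readers who want a feel for the general technique before downloading, here is a minimal sketch of tag stripping in VB6. It is not the code in StripTags.zip: the function name StripTagsSketch, the rule used to decide whether a "<" opens a tag (it must be followed by a letter, "/" or "!" and have a closing ">" later in the text), and the short list of character codes are all assumptions made for illustration.

```vb
Option Explicit

' Hypothetical helper for illustration only; not taken from StripTags.zip.
Public Function StripTagsSketch(ByVal sHtml As String) As String
    Dim i As Long
    Dim lClose As Long
    Dim sNext As String
    Dim sOut As String

    i = 1
    Do While i <= Len(sHtml)
        If Mid$(sHtml, i, 1) = "<" Then
            sNext = Mid$(sHtml, i + 1, 1)
            lClose = InStr(i + 1, sHtml, ">")
            ' Assumed heuristic: "<" starts a tag only when a letter, "/" or "!"
            ' follows it and a closing ">" exists. Otherwise it is content.
            If lClose > 0 And (sNext Like "[A-Za-z/!]") Then
                i = lClose + 1          ' skip the whole tag
            Else
                sOut = sOut & "<"       ' keep a "<" that belongs to the text
                i = i + 1
            End If
        Else
            sOut = sOut & Mid$(sHtml, i, 1)
            i = i + 1
        End If
    Loop

    ' Convert a few common special character codes (Replace requires VB6).
    sOut = Replace(sOut, "&nbsp;", " ")
    sOut = Replace(sOut, "&lt;", "<")
    sOut = Replace(sOut, "&gt;", ">")
    sOut = Replace(sOut, "&quot;", """")
    ' Do "&amp;" last so that "&amp;lt;" becomes "&lt;" rather than "<".
    sOut = Replace(sOut, "&amp;", "&")

    StripTagsSketch = sOut
End Function
```

For example, Debug.Print StripTagsSketch("a < b <b>bold</b>") prints "a < b bold": the first "<" is followed by a space, so it is kept as content. The sketch is deliberately simple. It leaves the contents of script and style blocks in place, ends a comment at the first ">", and builds the result by repeated concatenation, which is slow on large files.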

Instructions: Click the link below to download the code. Select 'Save' from the IE popup dialog. Once downloaded, open the .zip file from your local drive using WinZip or a comparable program to view the contents.

code/StripTags.zip
