Skipping part of a text stream?

  1. #1

    I have been doing some work with HTML created dynamically from Word using ASP, and the HTML it is creating seems to be fine.

    I have managed to filter out most of the Word markup I don't want. My problem now is that I am trying to read the HTML as a text stream and insert it into a database.

    However, the insert falls over at the stylesheet contained within the HTML page.

    Does anyone know how I could filter the stylesheet out?

    The stylesheet currently looks like this:

        <style>
        <!--
         /* Font Definitions */
         @font-face
         {font-family:"Lucida Sans Unicode";
         panose-1:2 11 6 2 3 5 4 2 2 4;}
        @font-face
         {font-family:"Comic Sans MS";
         panose-1:3 15 7 2 3 3 2 2 2 4;}
        @font-face
         {font-family:"Century Gothic";
         panose-1:2 11 5 2 2 2 2 2 2 4;}
        @font-face
         {font-family:"Trebuchet MS";
         panose-1:2 11 6 3 2 2 2 2 2 4;}
        @font-face
         {font-family:Verdana;
         panose-1:2 11 6 4 3 5 4 4 2 4;}
         /* Style Definitions */
         p.MsoNormal, li.MsoNormal, div.MsoNormal
         {margin:0cm;
         margin-bottom:.0001pt;
         font-size:11.0pt;
         font-family:Verdana;}
        @page Section1
         {size:595.3pt 841.9pt;
         margin:72.0pt 90.0pt 72.0pt 90.0pt;}
        div.Section1
         {page:Section1;}
        -->
        </style>

    The ASP looks like this:

        Set fs = Server.CreateObject("Scripting.FileSystemObject")

        ' 1 = ForReading
        Set f = fs.OpenTextFile(Server.MapPath("test/html/2.htm"), 1)
        strStream = f.ReadAll
        f.Close

        Set f = Nothing
        Set fs = Nothing

  2. #2
    You have the entire string in a variable called strStream, so you could use the string functions (Mid, InStr, Replace, etc.) to strip out everything between the <style> and </style> tags. This assumes there is only one style block in the page.

    On a different note, you might want to consider using the ADO Stream object if you are reading large files:

        Set objStream = Server.CreateObject("ADODB.Stream")
        objStream.Open
        objStream.Type = 2    ' adTypeText
        ' The default Charset is "Unicode"; set it to the file's real encoding.
        ' "windows-1252" is an assumption about how Word saved the page.
        objStream.Charset = "windows-1252"
        objStream.LoadFromFile Server.MapPath("test/html/2.htm")

        ' ReadText returns the text; Read only works on binary streams
        strStream = objStream.ReadText()

        objStream.Close
        Set objStream = Nothing

    If you had a complex pattern that you had to find or remove, you might also consider using regular expressions. However, they would be overkill for this particular case.

    Good luck
    Pete
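
The string-function approach suggested above could look roughly like the sketch below. It is only an illustration: StripStyleBlocks and StripStyleBlocksRegEx are hypothetical helper names, not part of ASP or of the original posts. The first function uses InStr, Left and Mid and loops so that it also copes with more than one style block; the second shows the regular-expression alternative for comparison. Both assume the opening tag starts with "<style" and the closing tag is written exactly as "</style>" (in any letter case).

```vbscript
' Hypothetical helper: removes every <style ...>...</style> block from sHtml
' using plain string functions. ByVal keeps the caller's variable untouched.
Function StripStyleBlocks(ByVal sHtml)
    Const OPEN_TAG = "<style"
    Const CLOSE_TAG = "</style>"
    Dim nStart, nEnd

    nStart = InStr(1, sHtml, OPEN_TAG, vbTextCompare)
    Do While nStart > 0
        ' Look for the closing tag after the opening one
        nEnd = InStr(nStart, sHtml, CLOSE_TAG, vbTextCompare)
        If nEnd = 0 Then Exit Do   ' unclosed block: leave the rest as it is

        ' Keep the text before the block and the text after the closing tag
        sHtml = Left(sHtml, nStart - 1) & Mid(sHtml, nEnd + Len(CLOSE_TAG))

        ' Continue searching from where the removed block used to start
        nStart = InStr(nStart, sHtml, OPEN_TAG, vbTextCompare)
    Loop

    StripStyleBlocks = sHtml
End Function

' Hypothetical alternative using the VBScript RegExp object.
Function StripStyleBlocksRegEx(ByVal sHtml)
    Dim re
    Set re = New RegExp
    re.Pattern = "<style[\s\S]*?</style>"   ' non-greedy, spans line breaks
    re.IgnoreCase = True
    re.Global = True                         ' replace every match, not just the first
    StripStyleBlocksRegEx = re.Replace(sHtml, "")
    Set re = Nothing
End Function
```

With either function, the string read from the file would be cleaned before the database insert with a call such as strStream = StripStyleBlocks(strStream).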
