
Mark Brownell gizmotron at
Thu Feb 12 12:07:08 EST 2004


What you might need is one of my Parallel Numerical Lineal Parsers.
Here are three functions for pulling tags out:

-- get the title
-- put your HTML into tZap
-- put PNLPgetElement("<title>", "</title>", tZap) into theTitle

-- get the paragraphs in two steps
-- put getPNLPelements("<p>", "</p>", tZap) into theParagraphArray
-- put theParagraphArray[1] into parOne
-- put theParagraphArray[2] into parTwo

You can even get the attributes from this:
> -- <BODY TEXT="#000000" BGCOLOR="#F8D0B8" LINK="#999999" 
> VLINK="#000000" ALINK="#FF0000">

-- put PNLPgetElement("<body", "</body>", tZap) into theBody
-- put PNLPgetAttribute("TEXT", theBody) into theTEXTAttribute
-- put PNLPgetAttribute("BGCOLOR", theBody) into theBGCOLORAttribute

function PNLPgetElement tStTag, tEdTag, stngToSch
   put empty into zapped
   put the number of chars in tStTag into dChars
   put offset(tStTag,stngToSch) into tNum1
   put offset(tEdTag,stngToSch) into tNum2
   if tNum1 < 1 then
     return "error"
     exit PNLPgetElement
   end if
   if tNum2 < 1 then
     return "error"
     exit PNLPgetElement
   end if
   put char (tNum1 + dChars) to (tNum2 - 1) of stngToSch into zapped
   return zapped
end PNLPgetElement

-- put getPNLPelements("<record>", "</record>", tZap) into theArray
function getPNLPelements tStartTag, tEndTag, StringToSearch
   put empty into tArray
   put 0 into tStart1
   put 0 into tStart2
   put 1 into tElementNum
   put the number of chars in tStartTag into dChars
   repeat
     -- offset with chars to skip returns a position relative to the skip,
     -- so add the skip back to get the absolute position
     put offset(tStartTag,StringToSearch,tStart1) into tNum1
     put (tNum1 + tStart1) into tStart1
     if tNum1 < 1 then exit repeat
     put offset(tEndTag,StringToSearch,tStart2) into tNum2
     put (tNum2 + tStart2) into tStart2
     if tNum2 < 1 then exit repeat
     --if tNum2 < tNum1 then exit repeat
     put char (tStart1 + dChars) to (tStart2 - 1) of StringToSearch into zapped
     put zapped into tArray[tElementNum]
     add 1 to tElementNum
   end repeat
   return tArray
end getPNLPelements
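
The bookkeeping in getPNLPelements rests on how offset treats its third
parameter: with chars to skip, the result is counted from the end of the
skipped characters, so the skip has to be added back to get an absolute
position. A small sketch of just that step (the handler name
showOffsetSkip and the sample string are made up for illustration):

```livecode
on showOffsetSkip
   put "<p>one</p><p>two</p>" into tText
   -- no skip: absolute position of the first "<p>"
   put offset("<p>", tText) into tFirst
   -- skip tFirst chars: the result is relative to the skipped chars
   put offset("<p>", tText, tFirst) into tRel
   -- add the skip back to get the absolute position of the second "<p>"
   put tFirst + tRel into tSecond
   -- show all three numbers in the message box
   put tFirst && tRel && tSecond
end showOffsetSkip
```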

-- put PNLPgetAttribute("name", tZap) into theAttribute
function PNLPgetAttribute tAttribute, strngToSearch
   put empty into zapA
   put quote into Qx
   put tAttribute & "=" & Qx into tAttributeX
   put the number of chars in tAttributeX into dChars
   put offset(tAttributeX,strngToSearch) into tNum1
   if tNum1 < 1 then
     return "error"
     exit PNLPgetAttribute
   end if
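   -- tNum2 is the position of the attribute's opening quote
   put tNum1 + dChars - 1 into tNum2
   -- with chars to skip, offset counts from tNum2, not from char 1
   put offset(Qx,strngToSearch,tNum2) into tNum3
   put offset("=",strngToSearch,tNum2) into tNum4
   if tNum3 < 1 then
     return "error"
     exit PNLPgetAttribute
   end if
   -- an "=" before the closing quote means the closing quote is missing
   if tNum4 > 0 and tNum4 < tNum3 then
     return "error"
     exit PNLPgetAttribute
   end if
   put char (tNum2 + 1) to (tNum2 + tNum3 - 1) of strngToSearch into zapA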
   return zapA
end PNLPgetAttribute

Mark Brownell

P.S. I would love a multi-character delimiter.
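
Until then, one workaround is to swap the multi-character delimiter for a
single character that never occurs in the text and then use the ordinary
itemDelimiter. A minimal sketch, not a tested library routine:
splitOnString is a made-up name, and it assumes numToChar(1) does not
appear anywhere in the data.

```livecode
function splitOnString pText, pDelim
   -- assumption: numToChar(1) never occurs in pText
   put numToChar(1) into tMarker
   -- pText is a local copy, so the caller's string is not changed
   replace pDelim with tMarker in pText
   set the itemDelimiter to tMarker
   repeat with i = 1 to the number of items in pText
      put item i of pText into tParts[i]
   end repeat
   return tParts
end splitOnString
```

-- put splitOnString(tZap, "<!--display paragraphs-->") into tParts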

On Thursday, February 12, 2004, at 05:10  AM, Thomas McGrath III wrote:

> Hello everybody,
> I was wondering if the itemDelimiter can be more than one character? I 
> mean instead of "," or ":" can it be "<!--display paragraphs-->" ?
> If the answer is NO then can someone help me understand how to sift 
> through the html below to just extract the title text and the 2 
> paragraph texts? I know I can offset("<title>") field "HTML" but then 
> what can I do to extract the text after that?
> It may be obvious but for the life of me I can't 'see' it.
> Thanks
> TOm
> <html>
> <head>
> <title>Just for Today Meditation</title>
> </head>
> <BODY TEXT="#000000" BGCOLOR="#F8D0B8" LINK="#999999" VLINK="#000000" 
> ALINK="#FF0000">
> <!--calculate day of the year -->
> <!-- numdays=datediff("d",firstday,now)+1 -->
>  <!--display paragraphs-->
>    <p>Our fantasies and expectations about the future may be so 
> extreme that, on our first date with someone, we find ourselves 
> wondering which lawyer we'll use for the divorce. Almost every 
> experience causes us to remember something from the past or begin 
> projecting into the future. </p>
>    <!--display paragraphs-->
>    <p>At first, it's difficult to stay in the moment. It seems as 
> though our minds won't stop. We have a hard time just enjoying 
> ourselves. Each time we realize that our thoughts are not focused on 
> what's happening right now, we can pray and ask a loving God to help 
> us get out of ourselves. If we regret the past, we make amends by 
> living differently today; if we dread the future, we work on living 
> responsibly today. </p>
> </body></html>
> Thomas J McGrath III
