[Solved] HTML parsing galore
#1
Hello,

I'm now fighting with HTML text extraction.

I've got several different types of data to extract. I tried hard with the HTMLDoc class and the provided examples, but that's not enough.

NB: the red line is the text I want.

1)
<div class='text'>
<b>Covers</b><br/>
http://xxxyyyyzzzz.com/somefile.html <- I want that
</div>

2)
<a href="http://xxxyyyyzzzz.com/somefile.html" target="_blank">http://xxxyyyyzzzz.com/somefile.html</a></div>

3)
<div class="image">
<a href="http://xxxyyyyzzzz.com/somefile" target="_blank"><img src="http://xxxyyyyzzzz.com/somefile.jpeg"

4)
<a href="http://xxxyyyyzzzz.com/somefile" target="_top">Download</a><br>

5)
dd.d3.getElementById("lgpd").outerHTML: why d3, and what is it?

6)
Where can I find the reference for containerTag & containerNameOrIndex?

Sorry for the long post, but I've been searching for a long time :/

Kind regards,
Laurent.
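
For cases 1 to 4 above, here is a minimal sketch of the extraction logic. It is written in Python with the standard-library html.parser module, not with the HTMLDoc class, so it only illustrates the idea: collect the href of every <a> tag, and collect the bare text that sits directly inside <div class="text"> (skipping child elements such as <b>). The names LinkAndTextExtractor, extract and SAMPLE are hypothetical, and SAMPLE is adapted from snippets 1 and 4 with nothing else assumed about the real page.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LinkAndTextExtractor(HTMLParser):
    """Hypothetical helper: gathers <a href> values and direct text of <div class="text">."""

    def __init__(self):
        super().__init__()
        self.hrefs = []             # href of every <a> tag, in document order
        self.div_texts = []         # text found directly inside <div class="text">
        self._in_text_div = False   # True while inside a matching div
        self._child_depth = 0       # open child elements inside that div

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            self.hrefs.append(attr_map["href"])
        if tag in VOID_TAGS:
            return  # <br>, <img>, ... never get a closing tag
        if self._in_text_div:
            self._child_depth += 1
        elif tag == "div" and attr_map.get("class") == "text":
            self._in_text_div = True

    def handle_endtag(self, tag):
        if not self._in_text_div or tag in VOID_TAGS:
            return
        if self._child_depth > 0:
            self._child_depth -= 1      # closing a child such as </b>
        else:
            self._in_text_div = False   # closing the div itself

    def handle_data(self, data):
        text = data.strip()
        if self._in_text_div and self._child_depth == 0 and text:
            self.div_texts.append(text)


def extract(html):
    """Return (hrefs, div_texts) for the given HTML string."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.hrefs, parser.div_texts


# Adapted from snippets 1 and 4 of the post (assumption: this is representative).
SAMPLE = """<div class='text'>
<b>Covers</b><br/>
http://xxxyyyyzzzz.com/somefile.html
</div>
<a href="http://xxxyyyyzzzz.com/somefile" target="_top">Download</a><br>
"""

if __name__ == "__main__":
    hrefs, div_texts = extract(SAMPLE)
    print("hrefs:", hrefs)
    print("div texts:", div_texts)
```

The same two rules should carry over to cases 2 and 3, since both only need the href attribute of the <a> tag.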

