01-09-2016, 06:26 AM
I have made it further after finding the following code on the forum. However, it still isn't firing on all cylinders for me, because I'm missing the industry, city, state, and the date the contact was added, all of which I need from my extraction.
Gintaras, I sure would appreciate it if you could help me figure out the last piece of the puzzle here. I was trying to use .className to identify the "location" and "industry" classes, but for some reason I can't get a sel/case inside the for loop to capture these values separately. The other remaining piece is getting the data into the columns and rows of a tab-delimited CSV file. Any help you could provide with this would be great too.
str s=
 <BODY>
 <div class="detail-container">
 <div class="name-row">
 <a href="/contact/b070f5e9-30d7-3da5-bc39-780c3455b71e">Mitch Acker</a>
 </div>
 <div class="search-result-subheadline">
 <span class="large-black-text">President, Sales Executive at </span>
 <span class="contact-company-name"><a href="/company/66819229-e58e-36e8-a282-c11f68eb2453" class="clickable">Martinaire Inc</a></span>
 </div>
 <div class="compact-section">
 <div class="location">Addison,
 Texas,
 United States
 <div class="contact-industry">Airlines</div>
 </div>
 <div class="compact-section">
 <div class="small-data-label">Main:</div>
 <div class="inline-block black-text"><span id="gc-number-20" class="gc-cs-link" title="Call with Google Voice">972-349-5700</span></div>
 <div>
 <div class="small-data-label">Email:</div>
 <a class="black-text" href="mailto:[email protected]">[email protected]</a>
 </div>
 </div>
 <div class="">
 </div>
 </div>
 </div>
 <div class="detail-container">
 <div class="name-row">
 <a href="/contact/10611e14-c5b5-3cac-9679-7b69997eb75d">Alex Abadi</a>
 </div>
 <div class="search-result-subheadline">
 <span class="large-black-text">Chief Executive Officer at </span>
 <span class="contact-company-name"><a href="/company/d0a95324-611b-36b7-8a5b-b753ab957e36" class="clickable">Image Microsystems, Inc.</a></span>
 </div>
 <div class="compact-section">
 <div class="location">Austin,
 Texas,
 United States
 <div class="contact-industry">Computer and Peripheral Equipment Manufacturing</div>
 </div>
 <div class="compact-section">
 <div class="small-data-label">Main:</div>
 <div class="inline-block black-text"><span id="gc-number-24" class="gc-cs-link" title="Call with Google Voice">512-623-5621</span></div>
 <div>
 <div class="small-data-label">Direct:</div>
 <div class="inline-block black-text"><span id="gc-number-25" class="gc-cs-link" title="Call with Google Voice">512-623-5642</span></div>
 </div>
 <div>
 <div class="small-data-label">Email:</div>
 <a class="black-text" href="mailto:[email protected]">[email protected]</a>
 </div>
 </div>
 <div class="">
 </div>
 </div>
 </div>
 </BODY>
out
s.findreplace("span" "a")
HtmlDoc d.InitFromText(s)
ARRAY(MSHTML.IHTMLElement) h2 div
int i j
d.GetHtmlElements(div "div")
for i 0 div.len
	str cn=div[i].className
	if cn="detail-container"
		d.GetHtmlElements(h2 "a" "" div[i].sourceIndex)
		for j 0 h2.len
			out h2[j].innerText
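
The result described above (one row per contact, with location and industry in their own columns, written as tab-delimited text) could be sketched roughly as follows. This is Python using only the standard library, not Quick Macros, and it is meant only to illustrate the grouping logic. The names `ContactParser`, `to_rows`, `to_tsv`, and the column names are hypothetical. `SAMPLE` is a trimmed copy of the first contact with placeholder `href` values. The sample HTML contains no "date added" element, so that column is left out. The sketch assumes well-nested `div`/`span` tags, as in the sample.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical column names for the output file.
FIELDS = ["name", "title", "company", "city", "state", "country", "industry"]

# Class attribute -> record field that should receive the element's text.
CLASS_TO_FIELD = {
    "name-row": "name",
    "large-black-text": "title",
    "contact-company-name": "company",
    "location": "location",
    "contact-industry": "industry",
}


class ContactParser(HTMLParser):
    """Collects one dict per <div class="detail-container">."""

    def __init__(self):
        super().__init__()
        self.records = []
        self.stack = []  # one entry per open div/span: a field name or None

    def handle_starttag(self, tag, attrs):
        if tag not in ("div", "span"):
            return
        cls = dict(attrs).get("class") or ""
        if cls == "detail-container":
            self.records.append({})  # a new contact starts here
        self.stack.append(CLASS_TO_FIELD.get(cls))

    def handle_endtag(self, tag):
        if tag in ("div", "span") and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text goes to the innermost open element that has a mapped class,
        # so "Airlines" lands in "industry" even though it is nested in "location".
        field = next((f for f in reversed(self.stack) if f), None)
        if field and self.records:
            rec = self.records[-1]
            rec[field] = rec.get(field, "") + data


def to_rows(html):
    parser = ContactParser()
    parser.feed(html)
    rows = []
    for rec in parser.records:
        # Collapse line breaks and runs of spaces inside each value.
        clean = {k: " ".join(v.split()) for k, v in rec.items()}
        parts = [p.strip() for p in clean.pop("location", "").split(",")]
        parts += [""] * (3 - len(parts))
        clean["city"], clean["state"], clean["country"] = parts[:3]
        title = clean.get("title", "")
        if title.endswith(" at"):
            clean["title"] = title[:-3]  # drop the trailing "at" before the company
        rows.append(clean)
    return rows


def to_tsv(rows):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


SAMPLE = """
<div class="detail-container">
<div class="name-row"><a href="#">Mitch Acker</a></div>
<div class="search-result-subheadline">
<span class="large-black-text">President, Sales Executive at </span>
<span class="contact-company-name"><a href="#" class="clickable">Martinaire Inc</a></span>
</div>
<div class="compact-section">
<div class="location">Addison,
Texas,
United States
<div class="contact-industry">Airlines</div>
</div>
</div>
</div>
"""

if __name__ == "__main__":
    print(to_tsv(to_rows(SAMPLE)), end="")
```

The same idea should carry over to the loop above: decide which column a piece of text belongs to from the class of its nearest enclosing element, collect the values per "detail-container", and only then write the row.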
Thanks Again,
Paul