Extraction from HTML Page for inclusion in CSV

Postby insidemagic » Mon Mar 13, 2006 12:47 pm

Hi:

I am evaluating TextPipe and WebPipe. I have 1300 articles written and published in HTML that I want to move into a Drupal site, so I need to load them into MySQL. I have tried marking the relevant portions of each page's source with comment pairs such as <!-- item --> ... <!-- enditem --> and <!-- content --> ... <!-- endcontent -->.

I confess that I am not bright but at least I'm lazy. I want to be able to process the web pages, pull out the text (HTML code) between the ad hoc section markers, and put it into a CSV file in separate columns headed by a description, so that all of the content for a given article would be in the column "Content", the title for the same article would be under "Title", etc.
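
For comparison, the extraction described above can be sketched outside TextPipe in a few lines of Python. This is only an illustration under assumptions, not the TextPipe filter itself: it guesses that the <!-- item --> pair wraps the title and the <!-- content --> pair wraps the body, that each marker pair appears at most once per page, and the function names and the folder/file paths are made up for the example.

```python
import csv
import re
from pathlib import Path

# Assumed mapping of CSV column heading -> marker name used in the pages.
# The marker names come from the post; which one holds the title is a guess.
SECTIONS = {"Title": "item", "Content": "content"}


def extract_sections(html):
    """Return {column: HTML found between <!-- name --> and <!-- endname -->}."""
    row = {}
    for column, name in SECTIONS.items():
        pattern = re.compile(
            r"<!--\s*" + re.escape(name) + r"\s*-->"
            r"(.*?)"
            r"<!--\s*end" + re.escape(name) + r"\s*-->",
            re.DOTALL | re.IGNORECASE,
        )
        match = pattern.search(html)
        # A page without the marker pair gets an empty cell.
        row[column] = match.group(1).strip() if match else ""
    return row


def pages_to_csv(pages, out):
    """Write one CSV row per page to the open text stream `out`."""
    writer = csv.DictWriter(out, fieldnames=list(SECTIONS))
    writer.writeheader()
    for html in pages:
        writer.writerow(extract_sections(html))


def convert_folder(src_dir, csv_path):
    """Hypothetical helper: convert every *.html file in src_dir."""
    pages = (p.read_text(encoding="utf-8", errors="replace")
             for p in sorted(Path(src_dir).glob("*.html")))
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        pages_to_csv(pages, out)
```

The csv module quotes commas, double quotes and line breaks inside the extracted HTML, so each article stays in a single row. Calling convert_folder("articles", "articles.csv") (both names are placeholders) would produce a file with the headings Title and Content.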

Is there a way to do this? I appreciate any help at all. The two products have blown me away with their stability and power. Outstanding!

BTW, I could avoid all of this if there were a way to convert my web pages into RSS for import into Drupal.

Thank you in advance for any thoughts or suggestions!

Tim

Postby DataMystic Support » Tue Mar 14, 2006 9:28 am

Hi Tim,

Are these the only two markers?

If so, a judicious search/replace can do it. Can you email us a sample file so we can show you how?

Regards,

Simon Carter, http://DataMystic.com/forums/index.php