Extract certain HTML tags

Posts: 3
Joined: Thu Mar 15, 2007 10:19 pm

Post by JimC » Tue Oct 02, 2012 6:01 am

I am trying to extract some data from web pages. All of the content I need is contained within two tags of this form:
<div class="someclass">content</div>
What is the best filter to extract just these two tags from a file so I can then proceed with further processing?

Something like an "extract HTML/XML pair" filter would be perfect, but I don't see that as an option.

DataMystic Support
Site Admin
Posts: 2275
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Extract certain HTML tags

Post by DataMystic Support » Tue Oct 02, 2012 9:11 am

Hi Jim,

Just use a Perl pattern:

<div class="someclass">(.*)</div>
replace with

and check 'Extract'.
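
For anyone who wants to try the same idea outside the filter tool, here is a minimal Python sketch of the pattern above. The sample HTML, the function name extract_divs, the non-greedy group (.*?) and the DOTALL flag are all assumptions added for illustration and are not part of the original answer. The non-greedy form is used so that two divs in one file come back as two separate matches. A regex like this will likely misbehave on nested divs, where a real HTML parser is the safer choice.

```python
import re

# Hypothetical sample page; only the class name "someclass" comes from the question.
SAMPLE_HTML = """<html><body>
<div class="someclass">first block</div>
<p>ignored</p>
<div class="someclass">second
block</div>
</body></html>"""

# Same pattern as in the post, with two assumed changes:
#   (.*?)      non-greedy, so each div is matched on its own
#   re.DOTALL  lets "." cross line breaks inside a div
DIV_PATTERN = re.compile(r'<div class="someclass">(.*?)</div>', re.DOTALL)


def extract_divs(html):
    """Return the inner content of every matching div, in document order."""
    return DIV_PATTERN.findall(html)


if __name__ == "__main__":
    for content in extract_divs(SAMPLE_HTML):
        print(content)
```

Calling extract_divs on a file's contents returns a list of strings, one per matching div, which can then be passed on to whatever processing comes next.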

Simon Carter, http://DataMystic.com/forums/index.php
