Multiple patterns simultaneously

Post by Stevod » Fri Sep 09, 2005 10:58 pm

I need to scan HTML files to extract URLs.

I can create an individual EasyPattern to recognise and extract one particular form of URL. But how can I pick up several forms without rescanning the file? For example, I want the URLs that are part of an <a href> statement, and also those starting with "www." in plain text.

Can I combine the two forms in a single EasyPattern?

Post by DataMystic Support » Wed Sep 14, 2005 3:06 pm

The best approach for two or more extractions is described in our whitepaper:
http://www.datamystic.com/docs

Essentially, you output each piece of extracted text on a new line with a marker, then discard all lines that do not have the marker.
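
For readers who want to see the marker idea outside of EasyPatterns, here is a minimal sketch in Python. It is not EasyPattern syntax and is not taken from the whitepaper; the marker string, the function names and the regular expression are all assumptions made for illustration. A single pattern with two alternatives covers both URL forms in one pass, each match is written on its own line behind the marker, and a second step keeps only the marked lines.

```python
import re

# Hypothetical marker; any string that cannot occur in the input would do.
MARKER = "@@URL@@"

# One pattern, two alternatives (an assumption, not an EasyPattern):
#   group 1: the value of an href attribute inside an <a ...> tag
#   group 2: a bare address starting with "www." in plain text
URL_PATTERN = re.compile(
    r'''<a\s[^>]*?href\s*=\s*["']([^"']+)["']|\b(www\.[^\s<>"']+)''',
    re.IGNORECASE,
)


def mark_urls(text):
    """Replace each match with the URL on its own line, prefixed by MARKER."""
    def repl(match):
        url = match.group(1) or match.group(2)
        return "\n" + MARKER + url + "\n"
    return URL_PATTERN.sub(repl, text)


def keep_marked(text):
    """Discard every line without the marker and strip the marker itself."""
    return [
        line[len(MARKER):]
        for line in text.splitlines()
        if line.startswith(MARKER)
    ]


def extract_urls(html):
    """Single scan of the input, then the marker filter."""
    return keep_marked(mark_urls(html))


if __name__ == "__main__":
    sample = '<p>See <a href="http://example.com/a">link</a> or www.example.org today.</p>'
    for url in extract_urls(sample):
        print(url)
```

Because the alternation is tried left to right at each position, an address inside an href value is consumed by the first alternative and is not reported a second time by the "www." alternative. The plain-text alternative is deliberately simple and will keep trailing punctuation such as a full stop at the end of a sentence.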
Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments

