Multiple patterns simultaneously

A discussion of how to use EasyPatterns, the EasyPattern Helper and the EasyPattern library.

Moderator: DataMystic Support



Post by Stevod » Fri Sep 09, 2005 10:58 pm

I need to scan HTML files in order to extract URLs.

I can create individual EasyPatterns to recognise and extract one particular form of URL. However, how can I pick up several forms without rescanning the file? For example, I want those which are part of an <a href> statement, but also those starting with "www." in plain text.

Can I combine the two forms in a single EasyPattern?

DataMystic Support
Site Admin
Posts: 2328
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support » Wed Sep 14, 2005 3:06 pm

The best approach for two or more extractions is described in our whitepaper.

Essentially, you output all extracted text to a new line with a marker, then discard all lines without the marker.
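
To make the marker idea concrete, here is a minimal sketch in Python. It is not EasyPattern syntax and not the whitepaper's own method, only an illustration of the same two steps using an ordinary regular expression. The marker string, the function names and the exact URL pattern are all assumptions made for the example; the pattern is deliberately simple and will not cope with every form of HTML.

```python
import re

# Hypothetical marker: any string that is unlikely to occur in the input will do.
MARKER = "@@URL@@"

# One pattern with two alternatives, so the text is scanned only once:
#   1. the value of an href attribute inside an <a ...> tag
#   2. a bare address starting with "www."
URL_PATTERN = re.compile(
    r"""<a\s[^>]*?href\s*=\s*["']([^"']+)["']"""
    r"""|\b(www\.[^\s<>"']+)""",
    re.IGNORECASE,
)


def mark_urls(html):
    """Step 1: put every match on its own line, prefixed with MARKER."""
    def on_own_line(match):
        # Exactly one of the two groups is set, depending on which alternative matched.
        url = match.group(1) or match.group(2)
        return "\n" + MARKER + url + "\n"
    return URL_PATTERN.sub(on_own_line, html)


def keep_marked_lines(text):
    """Step 2: discard every line without the marker, then strip the marker."""
    return [
        line[len(MARKER):]
        for line in text.splitlines()
        if line.startswith(MARKER)
    ]


def extract_urls(html):
    """Run both steps in sequence."""
    return keep_marked_lines(mark_urls(html))
```

Calling extract_urls() on a page returns the href values and the plain-text "www." addresses in the order they appear. Note that a URL appearing both in an href and as the visible link text would be reported twice by this sketch.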

Simon Carter
