Word Pair or Word Triplet Extracts


Post by jgomberg » Tue Oct 14, 2003 12:38 am

I am trying to write a regular expression to match all adjacent word pairs and triplets in any given string of words. For example, the sentence:

"registrars have been contracted to perform services at very low prices" would produce the following word pairs:

"registrars have", "have been", "been contracted", "contracted to", "to perform", "perform services", etc.

or the following triplets:

"registrars have been", "have been contracted", "been contracted to", "contracted to perform", etc.

I can extract the first two words from a search string with an expression such as:
(.*Subject: )(\w* ){2}
and then filtering out the first back reference, but I am stuck writing an expression that will pull all of the consecutive word pairs from a string.

Any suggestions how this can be done with regex alone?




Post by DataMystic Support » Mon Oct 20, 2003 11:10 am

Hi Jeff,

I'm pretty sure you can't do this with regex alone. You can, however, get a regex to match each word, and then use a VBScript subfilter to process those words, keeping an array of two or three words and outputting them as each new word arrives.
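To make the idea concrete, here is a minimal sketch of that approach. It is written in Python rather than VBScript purely for illustration and is not a TextPipe filter; the function name `word_ngrams` is made up for this example. The regex only matches single words, and a fixed-size window holds the last two or three of them.

```python
import re
from collections import deque

def word_ngrams(text, n):
    """Return every run of n adjacent words in text, in order.

    Hypothetical helper for illustration; not part of TextPipe.
    """
    window = deque(maxlen=n)  # keeps only the most recent n words
    ngrams = []
    for match in re.finditer(r"\w+", text):  # the regex matches one word at a time
        window.append(match.group())
        if len(window) == n:  # emit once the window is full
            ngrams.append(" ".join(window))
    return ngrams

sentence = ("registrars have been contracted to perform services "
            "at very low prices")
pairs = word_ngrams(sentence, 2)
triplets = word_ngrams(sentence, 3)
print(pairs[:3])     # ['registrars have', 'have been', 'been contracted']
print(triplets[:3])  # ['registrars have been', 'have been contracted', 'been contracted to']
```

Because the window drops its oldest word automatically, the same loop handles pairs, triplets, or longer runs just by changing `n`.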

It looks like you're trying to generate a word concordance, which is something we started building into TP a long time ago but never finished to our satisfaction.

Simon Carter, http://DataMystic.com/forums/index.php



Post by jring » Wed Oct 22, 2003 2:28 am

Hmmmm, interesting Q/A. I don't know how to write VBScript, but I was able to create two filters that appear to have given me a solid jump on the problem. Simon, you know my email: send me and the guy who asked the original question a note, and I'll reply with my filters. Perhaps we can nip this one... what do you think? Here is the output so far:

"registrars have been","have been contracted","been contracted to","contracted to perform","to perform services","perform services at","services at very","at very low","very low prices"

"It looks like","looks like youre","like youre trying","youre trying to","trying to generate","to generate a","generate a word","a word concordance","word concordance which","concordance which is","something we started","we started buiding","started buiding into","buiding into TP","into TP along","TP along time","along time ago","time ago but","ago but never","but never finished"

joseph ring
