Text formatting



Post by Aircut » Sun Oct 28, 2012 2:16 pm

I face the job of cleaning up malformed essays.

Some of the writers leave no space after the full stop, and others put an extra space before it. The same goes for commas, exclamation marks and question marks.

My question is how to create a filter that removes unwanted spaces between a word and the full stop and adds one space after it, applied to the entire block of text BUT skipping email addresses and URLs.

Thanks for any hints.


Re: Text formatting

Post by DataMystic Support » Mon Oct 29, 2012 11:29 pm

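The perl pattern you want to use is:

Code:

 *([,\!\?]) *

The pattern starts with a space. Both ` *` parts are greedy, so any existing spaces before and after the punctuation mark are consumed by the match.

Replace with ($1 followed by a single space):

Code:

$1 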
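
For readers who want to try the idea outside TextPipe, here is a minimal sketch in Python. It is an illustration only: `tidy_punctuation` is a hypothetical helper, not a TextPipe feature, and it splits the job into two substitutions so that a run such as `?!` stays together and no space is appended at the end of a line.

```python
import re


def tidy_punctuation(text: str) -> str:
    """Hypothetical helper (not a TextPipe API): normalise spaces around , ! ?"""
    # Step 1: drop any spaces that sit directly before a punctuation mark.
    text = re.sub(r" +([,!?])", r"\1", text)
    # Step 2: after a run of punctuation, leave exactly one space when more
    # text follows on the same line (nothing is added at a line end).
    return re.sub(r"([,!?]+) *(?=[^\s,!?])", r"\1 ", text)


print(tidy_punctuation("Hello ,world!How are you ?Fine"))
# prints: Hello, world! How are you? Fine
```

Periods are deliberately left out here, because they also occur inside email addresses and URLs.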


For emails and URLs you will need a different strategy for handling periods. One option is to temporarily replace the periods inside email addresses and URLs with tabs (using a restriction), then use a perl pattern of:

Code:

 *([\.,\!\?]) *

Replace with ($1 followed by a single space):

Code:

$1 

Finally, convert the tabs back to periods.

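
The same goal can be sketched in Python. Note that this swaps in a different technique from the temporary tab replacement described above: the text is split on email and URL tokens, and only the pieces in between are cleaned. The `PROTECTED` pattern and the function names are assumptions made for illustration, and the token pattern is deliberately simple (for example, punctuation glued directly onto the end of a URL is treated as part of the URL, and decimals such as 3.14 would gain a space).

```python
import re

# Assumed, deliberately simple token pattern; real URLs and emails are messier.
PROTECTED = re.compile(r"((?:https?://|www\.)\S+|\S+@\S+\.\S+)")


def tidy_segment(segment: str) -> str:
    """Normalise spaces around . , ! ? in text that holds no URL or email."""
    # Remove spaces directly before a punctuation mark.
    segment = re.sub(r" +([.,!?])", r"\1", segment)
    # Leave exactly one space after a punctuation run when text follows.
    return re.sub(r"([.,!?]+) *(?=[^\s.,!?])", r"\1 ", segment)


def tidy_text(text: str) -> str:
    parts = PROTECTED.split(text)
    # With one capturing group, re.split returns the protected tokens at
    # odd indexes; those are passed through unchanged.
    return "".join(p if i % 2 else tidy_segment(p) for i, p in enumerate(parts))


sample = "Mail joe.bloggs@example.com ,please.See http://example.com/a.b?x=1 for more ."
print(tidy_text(sample))
# prints: Mail joe.bloggs@example.com, please. See http://example.com/a.b?x=1 for more.
```

Splitting keeps every character of a protected token intact, including `?` and `,` inside a query string, which the tab trick for periods alone would not cover.
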
Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments

