Best way to replace upper-case with title/proper case

Post by markwelch » Thu Jan 22, 2009 1:08 am

I've just paid for TextPipe Pro, despite the lack of response to both my pre-sales support questions and the "in-software chat," and despite the fact that neither telephone number (US toll-free or Australian direct-dial) is answered with any company name. It is clearly very useful software, even if I never get support from DataMystic.

Here's the question I didn't get answered: What is the best way to convert UPPER CASE TEXT to "Proper" or "Title" case text? I am working with a large number of merchant datafeeds, some of which contain some product titles in ALL CAPS, and some of which contain titles that have individual words in CAPS. I want to "fix what's broken, and not fix what's not broken." So what I want is:

(1) If the title contains only UPPER CASE letters, then convert the entire title to Proper case.
(2) If the title contains some lower-case letters, but contains any ALL-CAPS words longer than __ characters, convert that word to Proper case. (I'm not sure whether to say 4 or 5 characters.)

It's important to me that I not attempt to convert listings already in "reasonably OK" case; if I just convert everything to Proper Case, then I'd need more filters to fix capitalization of articles (e.g. "Gone With The Wind").

Examples:

"TOP GUN MOVIE POSTER" --> "Top Gun Movie Poster"
"REEBOK Air Jordan Basketball Shoes" --> "Reebok Air Jordan Basketball Shoes"
"Gone with the Wind MOVIE POSTER" --> "Gone With the Wind Movie Poster" (not "Gone With The Wind Movie Poster")
"Chrommatic ARG 368D digital camera" --> unchanged
"OSHA Compliance Manual" --> unchanged
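
For readers who want to experiment outside TextPipe, rules (1) and (2) can be sketched in Python. This is only a rough illustration, not a TextPipe filter: `fix_title` and `min_len` are hypothetical names, "longer than __ characters" is read here as "at least `min_len` letters" (an assumption), and Python's `str.title()` and `str.capitalize()` stand in for whatever Proper-case conversion is actually used.

```python
import re

def fix_title(title: str, min_len: int = 5) -> str:
    """Hypothetical helper: apply rules (1) and (2) to one product title."""
    has_lower = any(c.islower() for c in title)
    has_upper = any(c.isupper() for c in title)
    if has_upper and not has_lower:
        # Rule (1): no lower-case letters at all, so convert the whole title.
        return title.title()
    # Rule (2): only touch all-caps words of at least min_len letters.
    # The threshold is an assumption; the post leaves it open (4 or 5).
    shouting = re.compile(r"\b[A-Z]{%d,}\b" % min_len)
    return shouting.sub(lambda m: m.group(0).capitalize(), title)

for sample in (
    "TOP GUN MOVIE POSTER",
    "REEBOK Air Jordan Basketball Shoes",
    "Chrommatic ARG 368D digital camera",
    "OSHA Compliance Manual",
):
    print(fix_title(sample))
```

With `min_len=5` the last two samples come back unchanged. The sketch leaves lower-case words alone, so it would not turn "with" into "With" in the "Gone with the Wind" example, and a five-letter all-caps word would still be converted.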

Thanks for any help.

Post by markwelch » Thu Jan 22, 2009 3:41 am

Clarification: capitalization of articles is not the only reason I don't want to convert everything to Proper case; there are thousands of acronyms and abbreviations and designations that should remain all-upper-case or "special case." Most of these are 2 to 4 characters long (GB, MHz, dB, mA, OSHA, IBM, FBI), but of course some are longer (MRMIP, HICAP). I'm trying to find a "happy medium" that will improve the overall quality of the data.

Post by DataMystic Support » Thu Jan 22, 2009 3:19 pm

Hi Mark,

We have holidays here in Australia too, and the Aus phone number (at least) has DataMystic branding; the US phone number probably has LeapFrog branding.

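Anyway, the best approach is to use a Perl search/replace as a restriction, with search text of
[A-Z ]{5,}
and replacement text of
$0
Ensure Match Case is checked. The pattern matches any run of five or more upper-case letters and spaces, and the $0 replacement passes the matched text through unchanged.

Add a Convert to Title Case filter as a subfilter, so that only the matched runs are converted.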
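
To see what the pattern selects before building the filter, the same idea can be tried in Python. This is a rough sketch rather than TextPipe's behaviour: `title_case_restricted` is a hypothetical name, and `str.title()` is assumed as a stand-in for the Convert to Title Case subfilter.

```python
import re

# The pattern from the reply; Python regexes are case-sensitive by default,
# which corresponds to having Match Case checked.
RESTRICTION = re.compile(r"[A-Z ]{5,}")

def title_case_restricted(text: str) -> str:
    # Convert only the matched runs; everything else passes through untouched.
    # str.title() is an assumed stand-in for the Title Case subfilter.
    return RESTRICTION.sub(lambda m: m.group(0).title(), text)

for sample in (
    "TOP GUN MOVIE POSTER",
    "Gone with the Wind MOVIE POSTER",
    "OSHA Compliance Manual",
):
    print(repr(title_case_restricted(sample)))
```

Because the character class includes the space, a short acronym plus its neighbouring spaces or capitals can reach five characters: in this sketch "OSHA C" is matched in "OSHA Compliance Manual" and comes out as "Osha C". Raising the count, or matching letters only with word boundaries, are possible adjustments to try against real data.
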
Regards,

Simon Carter, http://DataMystic.com/forums/index.php