Unicode line separator U+2028

DFH
Posts: 644
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

Post by DFH » Wed Jul 27, 2011 2:51 am

How does TextPipe handle the Unicode line separator U+2028 ?

For example, what happens if the Files to be Processed use it as their EOL marker?

Assume that these are Unicode files - encoded in either UTF-16 LE or UTF-8 (with or without BOM).
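For reference, here is a small Python sketch of what U+2028 looks like on disk in the encodings mentioned above. It only illustrates the byte sequences; it says nothing about how TextPipe itself reads them, and the helper name `encoded_forms` is made up for this example.

```python
# Illustrative sketch only: shows the bytes for U+2028, not TextPipe's behaviour.
LINE_SEPARATOR = "\u2028"


def encoded_forms(ch: str) -> dict:
    """Return the byte sequences for `ch` in UTF-8 (with/without BOM) and UTF-16 LE."""
    return {
        "utf-8": ch.encode("utf-8"),
        "utf-8 with BOM": ch.encode("utf-8-sig"),
        "utf-16-le": ch.encode("utf-16-le"),
    }


for name, raw in encoded_forms(LINE_SEPARATOR).items():
    # utf-8 is E2 80 A8; utf-16-le is 28 20 (low byte first)
    print(f"{name:15} {raw.hex(' ')}")

# Python's str.splitlines() treats U+2028 as a line boundary.
sample = "first\u2028second\u2028third".encode("utf-16-le")
print(sample.decode("utf-16-le").splitlines())
```

Note that in UTF-16 LE the two bytes are 0x28 0x20, which a byte-oriented tool could mistake for "(" followed by a space.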

Also, how about in Perl pattern matching?
e.g. in the dialog behind the Patterns options button [...], which includes the tick option '.' matches newline.
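To show the kind of behaviour being asked about, here is a sketch using Python's built-in `re` module. This is not PCRE and not TextPipe, so the results may well differ there; it only shows how one regex engine treats U+2028 when '.' is (or is not) allowed to match newline. The function names are made up for the example.

```python
import re

# Illustrative only: Python's `re` module, not PCRE or TextPipe.
TEXT = "one\u2028two\nthree"


def dot_runs(text: str, dot_matches_newline: bool = False) -> list:
    """Return the runs matched by '.+', optionally with '.' matching newline."""
    flags = re.DOTALL if dot_matches_newline else 0
    return re.findall(r".+", text, flags)


def explicit_lines(text: str) -> list:
    """Split into lines by excluding CR, LF, U+2028 and U+2029 explicitly."""
    return re.findall(r"[^\r\n\u2028\u2029]+", text)


# Without DOTALL, '.' stops only at "\n", so it runs straight through U+2028.
print(dot_runs(TEXT))
# With DOTALL, '.' matches everything, including "\n" and U+2028.
print(dot_runs(TEXT, dot_matches_newline=True))
# An explicit character class treats U+2028 as a line break regardless of flags.
print(explicit_lines(TEXT))
```

An explicit character class like the last one is a way to sidestep the question of which characters an engine counts as a newline.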

David

PS. The attachment contains a simple TP filter to convert EOLs to U+2028.
Attachments
Change EOLs to U+2028.zip
TextPipe filter to change EOLs to U+2028.
(762 Bytes) Downloaded 303 times
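
For anyone without TextPipe to hand, the same kind of conversion can be sketched in Python. This is not the contents of the attached filter, just a minimal equivalent under the assumption that "EOL" means CRLF, a lone CR, or a lone LF; the function name `eols_to_separator` is hypothetical.

```python
import re

# Hypothetical helper, not the attached TextPipe filter.
# Assumes an EOL is "\r\n", a lone "\r", or a lone "\n".
EOL_PATTERN = re.compile(r"\r\n|\r|\n")


def eols_to_separator(text: str, separator: str = "\u2028") -> str:
    """Replace each conventional EOL with `separator` (U+2028 by default)."""
    # "\r\n" is listed first so a CRLF pair becomes one separator, not two.
    return EOL_PATTERN.sub(separator, text)


converted = eols_to_separator("a\r\nb\nc\rd")
print([hex(ord(ch)) for ch in converted])
```

When reading from a file, opening it with `newline=""` keeps Python from translating the line endings before the substitution runs. Passing `separator="\u2029"` gives the Paragraph Separator variant instead.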

DataMystic Support
Site Admin
Posts: 2154
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Unicode line separator U+2028

Post by DataMystic Support » Wed Jul 27, 2011 9:20 am

Thanks David - we've included your filter in a new 'Unicode' filter subfolder.

I don't believe that pattern matching in PCRE (the library we use) handles any line endings other than \r, \r\n and \n.

Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments

Re: Unicode line separator U+2028

Post by DFH » Wed Jul 27, 2011 10:47 pm

Well, well, well.

The help page entitled Unicode Pattern Reference includes this:
Definitions

Separator - any one of U+2028, U+2029, NL, CR.

So this suggests that TextPipe ought to be able to handle U+2028 and U+2029.

Something overlooked, perhaps?

David

PS. Of the various Unicode compatible text editors (for Windows) that I use regularly, only SC Unipad handles these correctly.

Re: Unicode line separator U+2028

Post by DFH » Wed Jul 27, 2011 10:52 pm

FWIW, here's a similar filter to change EOLs to U+2029 (Paragraph Separator).
Attachments
Change EOLs to U+2029.zip
TP filter to change EOLs to U+2029
(767 Bytes) Downloaded 290 times
