Unicode sorts?

DFH
Posts: 636
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

Postby DFH » Wed Jul 14, 2010 6:10 am

Sorting lines of Unicode text is a huge topic in its own right, yet TextPipe doesn't currently offer any way to sort UTF-8 (or other Unicode encodings), even on the simple basis of codepoint values.
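To illustrate what a plain codepoint sort of UTF-8 lines means, here is a minimal Python sketch. It is not a TextPipe feature; the function name `sort_lines_by_codepoint` and the sample text are made up for illustration.

```python
def sort_lines_by_codepoint(utf8_data: bytes) -> bytes:
    """Return the lines of UTF-8 encoded text sorted by codepoint values."""
    # Decode first so that comparison happens on codepoints.
    lines = utf8_data.decode("utf-8").split("\n")
    # Python compares str values codepoint by codepoint,
    # so a plain sort with no key is already a codepoint sort.
    lines.sort()
    return "\n".join(lines).encode("utf-8")


# Hypothetical sample: "zebra", "étude", "apple", "Ωmega"
sample = "zebra\n\u00e9tude\napple\n\u03a9mega".encode("utf-8")
print(sort_lines_by_codepoint(sample).decode("utf-8").split("\n"))
```

Note that the accented and Greek lines land after "zebra", which is exactly why this is no replacement for language-aware sorting. Comparing the raw UTF-8 bytes would give the same order, since UTF-8 preserves codepoint order.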

Although a codepoint sort is no substitute for language-aware sorting (collation), it would still have useful applications, such as analysing character frequencies in Unicode text files.
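As a rough sketch of the character-frequency use case, the Python below sorts the characters of a text by codepoint and then counts each run of identical characters. The function name `codepoint_frequencies` and the sample string are assumptions for illustration only.

```python
from itertools import groupby


def codepoint_frequencies(text: str) -> list:
    """Count how often each character occurs, listed in codepoint order."""
    # Sort all characters by codepoint (line breaks are ignored),
    # so that identical characters end up next to each other.
    chars = sorted(ch for ch in text if ch not in "\r\n")
    # Each run of equal characters becomes one (character, count) pair.
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(chars)]


# Hypothetical sample: "naïve café"
for ch, count in codepoint_frequencies("na\u00efve caf\u00e9\n"):
    print(f"U+{ord(ch):04X} {count}")
```

The sort is what makes the counting step trivial: once equal characters are adjacent, a single pass over the data is enough.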

See also my recent post on this topic in the Help and Support section.

DataMystic Support
Site Admin
Posts: 2136
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Unicode sorts?

Postby DataMystic Support » Wed Jul 14, 2010 8:58 am

Thanks David - we'll look into adding it shortly.
Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments

Re: Unicode sorts?

Postby DataMystic Support » Wed Jul 14, 2010 10:23 am

Hi David,

Windows does not provide native functions for sorting anything except ANSI and Unicode (which on Windows means UTF-16) text.

So sorting UTF-8 directly is out of the question: you would need to convert the text from UTF-8 to UTF-16LE first, then sort it (with a new widestring sort that we can add), and then convert it back to UTF-8 afterwards.
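The three steps (convert, sort, convert back) can be sketched in Python as follows. This only illustrates the idea and is not how TextPipe implements anything: the function names are hypothetical, and the sketch assumes the widestring sort compares raw 16-bit code units rather than using locale rules.

```python
def utf16_code_units(line: str) -> list:
    """Return the 16-bit UTF-16 code units of a line as integers."""
    raw = line.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def sort_via_utf16(utf8_data: bytes) -> bytes:
    # Step 1: convert UTF-8 to UTF-16LE.
    utf16_data = utf8_data.decode("utf-8").encode("utf-16-le")
    # Step 2: split into lines and sort them by UTF-16 code units
    # (assumed here to be what a "widestring sort" would compare).
    lines = utf16_data.decode("utf-16-le").split("\n")
    lines.sort(key=utf16_code_units)
    # Step 3: convert the sorted text back to UTF-8.
    return "\n".join(lines).encode("utf-8")


# Hypothetical sample: U+FF5E, U+1F600 (outside the BMP), and "abc"
data = "\uff5e\n\U0001f600\nabc".encode("utf-8")
print(ascii(sort_via_utf16(data).decode("utf-8").split("\n")))
```

One caveat with a code-unit sort: characters outside the Basic Multilingual Plane are stored as surrogate pairs (0xD800 to 0xDFFF), so they sort before U+E000 to U+FFFF, which differs from strict codepoint order.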

How does that sound?
Regards,

Simon Carter

Re: Unicode sorts?

Postby DFH » Thu Jul 15, 2010 2:05 am

That might be a useful (though somewhat awkward) workaround.

For now I'm content to just use Notepad++ | TextFX Tools | Sort.

Still, I can imagine that other users would find a widestring sort of UTF-16LE text a benefit.

