Convert XML & HTML entities to Unicode?

DFH
Posts: 636
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

Convert XML & HTML entities to Unicode?

Postby DFH » Tue May 15, 2012 5:36 pm

Hi Simon,

TextPipe has a filter to convert Numeric HTML entities to text.

This is fine as far as it goes, but it is rather narrow in scope, since it only handles numeric entities.

Please consider enhancing TextPipe with a filter that converts all XML & HTML entities to Unicode.

See http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

This would greatly improve the usefulness of TextPipe.
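
To illustrate the behaviour being requested, here is a minimal sketch in Python. It is not TextPipe's implementation, only an example of decoding named entities and both forms of numeric reference in one pass, using the standard library's `html.unescape`. The sample string is made up for illustration.

```python
import html

# html.unescape decodes named HTML entities, the predefined XML entities,
# and decimal/hexadecimal numeric character references.
sample = "&lt;p&gt; caf&eacute; &#8364;5 &#x20AC;5 &amp; more"

decoded = html.unescape(sample)
print(decoded)  # <p> café €5 €5 & more
```

Note that this decodes everything at once, with no way to leave a particular subset of entities alone.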

David

DataMystic Support
Site Admin
Posts: 2136
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Convert XML & HTML entities to Unicode?

Postby DataMystic Support » Wed May 16, 2012 4:09 pm

This filter has been changed to convert ALL entities, and the name and documentation updated accordingly.

Thanks for the link! It proved very useful.
Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments

Re: Convert XML & HTML entities to Unicode?

Postby DFH » Wed May 16, 2012 8:29 pm

That's great, Simon.

As Hannibal Smith used to say, "I love it when a plan comes together".

http://en.wikipedia.org/wiki/John_%22Hannibal%22_Smith

David

Re: Convert XML & HTML entities to Unicode?

Postby DataMystic Support » Wed May 16, 2012 11:30 pm

I love that A-Team movie! Very amusing...
Regards,

Simon Carter

Re: Convert XML & HTML entities to Unicode?

Postby DFH » Thu May 17, 2012 4:12 pm

Afterthoughts:

If the input file is XML, or the output file is (or will be) XML, then the user may wish to exclude the XML entities.
Therefore, please provide a tick box option to include or exclude the XML entities.
Giving the user a choice for each subset makes good sense, and makes the filter more versatile.

    ☑ Include the predefined XML entities
    ☑ Include the defined HTML entities
    ☑ Include NCRs (Numeric character references)
NB. For the latter, one might also find it useful to distinguish between the decimal and hexadecimal forms of NCR.
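
As a rough sketch of how such per-subset options could behave (in Python, and not a description of how TextPipe works internally), the function below takes one flag per tick box, plus separate flags for decimal and hexadecimal NCRs. The names `decode_entities`, `XML_PREDEFINED` and `ENTITY_RE` are hypothetical, and the HTML subset here is assumed to be the HTML 4 list in Python's `html.entities.name2codepoint`.

```python
import re
from html.entities import name2codepoint  # HTML 4 named entities

# The five entities predefined by XML.
XML_PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

# One pattern covering hex NCRs, decimal NCRs, and named entities.
ENTITY_RE = re.compile(r"&(?:#[xX]([0-9A-Fa-f]+)|#([0-9]+)|([A-Za-z][A-Za-z0-9]*));")


def _char(codepoint, original):
    """Return the character, or the original text if the code point is unusable."""
    if codepoint > 0x10FFFF or 0xD800 <= codepoint <= 0xDFFF:
        return original
    return chr(codepoint)


def decode_entities(text, xml=True, html_named=True, decimal=True, hexadecimal=True):
    """Decode only the entity subsets whose flag is True (hypothetical helper)."""
    def replace(match):
        original = match.group(0)
        hex_digits, dec_digits, name = match.groups()
        if hex_digits is not None:
            return _char(int(hex_digits, 16), original) if hexadecimal else original
        if dec_digits is not None:
            return _char(int(dec_digits), original) if decimal else original
        if name in XML_PREDEFINED:
            return XML_PREDEFINED[name] if xml else original
        if name in name2codepoint:
            return chr(name2codepoint[name]) if html_named else original
        return original  # unknown name: leave untouched

    # A single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
    return ENTITY_RE.sub(replace, text)


sample = "&lt;b&gt; &eacute; &#233; &#xE9; &amp;"
print(decode_entities(sample))             # everything decoded
print(decode_entities(sample, xml=False))  # XML entities left in place
```

With `xml=False` the `&lt;`, `&gt;` and `&amp;` in the sample stay as they are, which is the case that matters when the output must remain well-formed XML.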

By the way, the original name for this filter was somewhat inaccurate: it referred to "Numeric HTML entities",
whereas the proper name for those is numeric character references (NCRs).
See http://en.wikipedia.org/wiki/Numeric_character_reference

David

