Replacing bad HTML


bpobiak
Posts: 7
Joined: Wed Oct 13, 2004 9:37 am
Location: New York City

Replacing bad HTML

Post by bpobiak » Thu Jan 25, 2007 4:04 am

I have a number of HTML pages converted from Word. They are peppered with variations of a bad paragraph ending that affects the spacing between paragraphs:

<br>
&nbsp;&nbsp;&nbsp; <br>

which should be replaced with

<p>

An exact match works, of course, but I don't trust the exact layout of this example to be universal. I want a more inclusive search: any pair of <br> tags separated only by whitespace and one or more forced spaces ('&nbsp;').

I've tried a number of EZ Pattern variations but am stumped and my trial runs always miss the pattern.

Here is the trial data:

<p>That means that from this great State of Michigan we want that part of the leadership. After all, you have the Senator who is the head of the Republican Policy Committee in the Senate body. By all means you must send him back and support him with the big delegation that you are capable of sending.<br>
&nbsp;&nbsp;&nbsp; <br>
You have nominated great State and national tickets, your Governor,<br>
your Senators, your Congressmen, your State officers.<br>
&nbsp;&nbsp;&nbsp; <br>

Thanks in advance. TextPipe is a miracle worker! :D
-Regards

Bernie Pobiak
Pubcomm Group NYC

DataMystic Support
Site Admin
Posts: 2154
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support » Thu Jan 25, 2007 2:15 pm

Thanks Bernie,

Just use

<br>[ 0+ whitespace or '&nbsp;' or cr or lf ]<br>


and replace with

<p>
Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments


Almost there...

Post by bpobiak » Fri Jan 26, 2007 6:57 am

That makes sense, but when I tried it on the sample below, each match was replaced with several <p> tags rather than a single one (see the result below).

How can the replacement be limited to a single <p> for each pair of <br> tags?

Thanx, Simon!

b

New Result:

<p>That means that from this great State of Michigan we want that part of the leadership. After all, you have the Senator who is the head of the Republican Policy Committee in the Senate body. By all means you must send him back and support him with the big delegation that you are capable of sending.<p>
<p><p><p> <p>
You have nominated great State and national tickets, your Governor,<p>
your Senators, your Congressmen, your State officers.<p>
<p><p><p> <p>


Sample:

<p>That means that from this great State of Michigan we want that part of the leadership. After all, you have the Senator who is the head of the Republican Policy Committee in the Senate body. By all means you must send him back and support him with the big delegation that you are capable of sending.<br>
&nbsp;&nbsp;&nbsp; <br>
You have nominated great State and national tickets, your Governor,<br>
your Senators, your Congressmen, your State officers.<br>
&nbsp;&nbsp;&nbsp; <br>

Post by DataMystic Support » Fri Jan 26, 2007 2:21 pm

Sorry, it should be:

<br>[ longest 0+ whitespace or '&nbsp;' or cr or lf ]<br>

Post by bpobiak » Fri Jan 26, 2007 10:49 pm

Hmmm... the result still contains multiple <p> tags (see the new result below). Is it possible to apply a one-or-more match to the occurrences of &nbsp;?

New Result:

<p>That means that from this great State of Michigan we want that part of the leadership. After all, you have the Senator who is the head of the Republican Policy Committee in the Senate body. By all means you must send him back and support him with the big delegation that you are capable of sending.<p><p><p><p> <p>You have nominated grea

Thanx.

Post by DataMystic Support » Mon Jan 29, 2007 10:20 am

Sorry, second mistake.

It should be:

<br>[ longest 0+ (whitespace or '&nbsp;' or cr or lf) ]<br>
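
For readers who think in regular expressions, the effect of the added parentheses can be illustrated with a rough analogy in Python. This is only a sketch: it assumes that an ungrouped "or" behaves like regex alternation (lowest precedence), which may not be exactly how TextPipe parses the pattern internally. The variable names are illustrative only.

```python
import re

# Fragment of the trial data: two <br> tags separated only by a newline,
# three &nbsp; entities, and a space.
sample = "sending.<br>\n&nbsp;&nbsp;&nbsp; <br>\nYou have nominated"

# Ungrouped: "|" has the lowest precedence, so this is four separate
# alternatives: "<br>\s*", "&nbsp;", "\r", or "\n<br>".
ungrouped = re.compile(r"<br>\s*|&nbsp;|\r|\n<br>")

# Grouped: the alternatives sit inside (?:...) and "*" repeats the whole
# group, so a match must start and end with a <br> tag.
grouped = re.compile(r"<br>(?:\s|&nbsp;|\r|\n)*<br>")

ungrouped_result = ungrouped.sub("<p>", sample)
grouped_result = grouped.sub("<p>", sample)

print(repr(ungrouped_result))
print(repr(grouped_result))
```

In this analogy the ungrouped form replaces each <br> and each &nbsp; separately, giving the same run of <p> tags as the result posted above, while the grouped form collapses the whole run into one <p>.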

Bingo!

Post by bpobiak » Mon Jan 29, 2007 11:14 am

Perfect! That works exactly right! Thank you Simon!

So that I learn from the experience, let me try to break down the easy pattern:

<br>[ longest 0+ (whitespace or '&nbsp;' or cr or lf) ]<br>

means

Find occurrences where there are two <br> tags with, between them, the longest possible run of zero or more repetitions of whitespace, '&nbsp;', cr, or lf.

I think I get it. Thanx again!

Post by DataMystic Support » Wed Jan 31, 2007 8:33 am

Yep - that's correct!
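
The breakdown confirmed above maps fairly directly onto a regular expression. As a minimal sketch for anyone doing the same cleanup outside TextPipe, here is an approximate Python equivalent applied to the trial data. The names BLANK_BREAK and fix_paragraph_breaks are hypothetical, and Python's \s is assumed to stand in for "whitespace or cr or lf".

```python
import re

# Hypothetical regex analogue of the pattern from this thread.
BLANK_BREAK = re.compile(
    r"""
    <br>            # opening <br> tag
    (?:             # group the alternatives, like the parentheses in the pattern
        \s          # whitespace; in Python this already covers cr and lf
      | &nbsp;      # the literal non-breaking-space entity
    )*              # zero or more repetitions, greedy (the longest run)
    <br>            # closing <br> tag
    """,
    re.VERBOSE,
)


def fix_paragraph_breaks(html: str) -> str:
    """Replace each blank <br>...<br> run with a single <p>."""
    return BLANK_BREAK.sub("<p>", html)


trial = (
    "capable of sending.<br>\n"
    "&nbsp;&nbsp;&nbsp; <br>\n"
    "You have nominated great State and national tickets, your Governor,<br>\n"
    "your Senators, your Congressmen, your State officers.<br>\n"
    "&nbsp;&nbsp;&nbsp; <br>\n"
)
fixed = fix_paragraph_breaks(trial)
print(fixed)
```

The single <br> after "your Governor," is left alone because it is followed by ordinary text rather than a second <br>. Because the repetition is zero or more, two adjacent <br> tags with nothing between them are also replaced.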