CSV-like data merging


simicar
Posts: 5
Joined: Thu Feb 15, 2007 7:25 am

CSV-like data merging

Postby simicar » Wed Feb 28, 2007 7:39 pm

Hi.

I ran into a problem using "extract matching lines" with context lines selected (1 line before and after), because I get this:

Code: Select all

;Trial Input;noxious gases away from the users of the machine. ;
;Trial Input;Indoor generators and furnaces can quickly fill an enclosed s;
;Trial Input;pace with carbon monoxide or other poisonous exhaust gases ;

The problem is that I need those three lines to become a single row in Excel (not three), e.g. for the word (*enclosed*), to get:

Code: Select all

;Trial Input;noxious gases away from the users of the machine.
Indoor generators and furnaces can quickly fill an enclosed s
pace with carbon monoxide or other poisonous exhaust gases ;

How can I do this?
I've already tried the "convert end of lines" option and using headers and footers, but with headers/footers I get ';' only at the beginning and end of the whole extracted file, not around each row (3 rows here) as needed.

I generated the example above with these filters:
    File input:..
    Extract matching [*enclosed*]
    Replace ; with ,
    Insert column 1 [;@inputFilename;]
    Insert column 0 [;]
    Merge to file...
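
For readers who want to reproduce the symptom outside TextPipe, here is a rough Python sketch of the filter list above. It is not TextPipe's implementation: the function names, the sample lines and the exact row format are assumptions chosen to mimic the output shown earlier. It shows why three rows come out: every extracted line is wrapped separately.

```python
def extract_with_context(lines, keyword, before=1, after=1):
    """Return lines containing keyword (case-insensitive) plus context lines."""
    needle = keyword.lower()
    keep = set()
    for i, line in enumerate(lines):
        if needle in line.lower():
            start = max(0, i - before)
            stop = min(len(lines), i + after + 1)
            keep.update(range(start, stop))
    # Keep the original order and avoid emitting a line twice.
    return [lines[i] for i in sorted(keep)]


def wrap_rows(lines, filename):
    """Wrap every line on its own, so each line becomes a separate row."""
    return [";{};{};".format(filename, line.replace(";", ",")) for line in lines]


sample = [
    "unrelated line",
    "noxious gases away from the users of the machine.",
    "Indoor generators and furnaces can quickly fill an enclosed s",
    "pace with carbon monoxide or other poisonous exhaust gases",
    "another unrelated line",
]
rows = wrap_rows(extract_with_context(sample, "enclosed"), "Trial Input")
for row in rows:
    print(row)
```

The match plus one context line on each side gives three lines, and each one gets its own filename prefix and trailing ';'.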


Please help

DataMystic Support
Site Admin
Posts: 2138
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Postby DataMystic Support » Thu Mar 01, 2007 8:30 am

To join every 3 lines into one, use an EasyPattern like this:

Code: Select all

 [ capture(0+ not cr or lf), cr, lf,
    capture(0+ not cr or lf), cr, lf,
    capture(0+ not cr or lf), cr, lf ]


Replace with

Code: Select all

  $1 $2 $3
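
As a rough illustration only, the same idea can be written as a Python regular expression. This is an analogue of the EasyPattern above, not TextPipe syntax; the names `THREE_LINES` and `join_triples` and the `replacement` parameter are made up for this sketch.

```python
import re

# Three captured runs of non-line-break characters, each followed by CR LF,
# mirroring the three capture(...) groups in the EasyPattern.
THREE_LINES = re.compile(r"([^\r\n]*)\r\n([^\r\n]*)\r\n([^\r\n]*)\r\n")


def join_triples(text, replacement=r"\1 \2 \3"):
    """Replace every group of three CR LF terminated lines with one line."""
    return THREE_LINES.sub(replacement, text)


text = "a\r\nb\r\nc\r\nd\r\ne\r\nf\r\n"
joined = join_triples(text)
joined_with_break = join_triples(text, r"\1 \2 \3\r\n")
print(repr(joined))
print(repr(joined_with_break))
```

In this sketch the final CR LF of each group is consumed by the match, so with the plain `\1 \2 \3` replacement consecutive groups run together on one line. Ending the replacement with a line break keeps each joined group on its own row.
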
Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments

Not quite..

Postby simicar » Fri Mar 02, 2007 12:53 am

Unfortunately, when I used the suggested pattern with these filters:
    File input:..
    Extract matching [*tool*] - (with 1 context line before and after)
    Replace ; with ,
    Insert column 1 [;@inputFilename;]
    Insert column 0 [;]
    EasyPattern [...as given above...]
    Merge to file...

on the trial input:

Code: Select all

TextPipe provides a single point of maintenance for all your text processing tasks.
You learn one tool, rather than learning 4 or more - and their associated languages,
command line options, debugging schemes, idiosyncrasies and operating system differences and dependencies.

I got this result:

Code: Select all

;Trial Input;TextPipe provides a single point of maintenance for all your text processing tasks. ;
;Trial Input;You learn one tool, rather than learning 4 or more - and their associated languages, ;
;Trial Input;command line options, debugging schemes, idiosyncrasies and operating system differences and dependencies.;

instead of:

Code: Select all

;Trial Input;TextPipe provides a single point of maintenance for all your text processing tasks.
You learn one tool, rather than learning 4 or more - and their associated languages,
command line options, debugging schemes, idiosyncrasies and operating system differences and dependencies.;


Maybe the position of the EasyPattern filter (Replace -> Find EasyPattern) in the list is wrong.
I still can't sort this out...

Postby DataMystic Support » Fri Mar 02, 2007 7:28 am

If you only want the input filename shown once, then move the EasyPattern up just underneath the Extract filter.
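
A small Python sketch of why the order matters for the filename. It is an assumption-laden analogue, not TextPipe's behaviour: in the thread the late EasyPattern did not join the lines at all, whereas this sketch does join them. It only illustrates that the filename is emitted once when joining comes first. All names are hypothetical.

```python
import re

THREE_LINES = re.compile(r"([^\r\n]*)\r\n([^\r\n]*)\r\n([^\r\n]*)\r\n")


def join_triples(text):
    # Join three lines with spaces and write one line break back.
    return THREE_LINES.sub(lambda m: " ".join(m.groups()) + "\r\n", text)


def add_filename(text, filename):
    # Wrap each line as ;filename;line; (stand-in for the Insert column filters).
    return "".join(
        ";{};{};\r\n".format(filename, line) for line in text.splitlines()
    )


extracted = "first\r\nsecond\r\nthird\r\n"
join_then_wrap = add_filename(join_triples(extracted), "Trial Input")
wrap_then_join = join_triples(add_filename(extracted, "Trial Input"))
print(repr(join_then_wrap))
print(repr(wrap_then_join))
```

Joining first leaves one line to wrap, so the filename appears once. Wrapping first stamps the filename on all three lines before they are joined.
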

How about duplicates

Postby simicar » Fri Mar 02, 2007 9:42 pm

Thanks. It works, but only when the keyword appears once per file. The problem arises when it appears two or more times.

For example, from this input (extracting the word "tool"):

Code: Select all

TextPipe provides a single...
...one tool, rather than
command line options, debugging schemes,

TextPipe provides a single...
...one tool, rather than
command line options, debugging schemes,

TextPipe provides a single...
...one tool, rather than
command line options, debugging schemes,

I get:

Code: Select all

;Trial Input;TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,TextPipe provides a single...;
;Trial Input;...one tool, rather than;
;Trial Input;command line options, debugging schemes,;

instead of (where Trial Input is the same file in every row):

Code: Select all

;Trial Input;TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,;
;Trial Input;TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,;
;Trial Input;TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,;

Postby DataMystic Support » Sun Mar 04, 2007 7:23 am

Try this filter (note - this text comes from File\Export\Export to Clipboard):

Code: Select all

|--Extract lines matching [tool]
|     [ ] Include line numbers
|     [ ] Include filename
|     [ ] Match case
|     [ ] Count matches
|     Pattern type: 0
|     Context before: 1
|     Context after: 1
|   
|--EasyPattern [[ capture(0+ not cr or lf), cr, lf,\r\n    capture(0+ not cr or lf), cr, lf,\r\n    capture(longest 0+ not cr or lf), longest optional( cr, lf)  ]] with [@inputFilename;$1 $2 $3;\r\n]
|     [ ] Match case
|     [ ] Whole words only
|     [ ] Case sensitive replace
|     [X] Prompt on replace
|     [ ] Skip prompt if identical
|     [ ] First only
|     [ ] Extract matches
|     Maximum text buffer size 4096
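
To check what the merged rows should look like, here is a minimal Python sketch of the intended result. It is not an export of the filter above, and it groups by match rather than by every three lines. The function name `merge_context_blocks` and the sample data are assumptions; the row format follows the replacement text `@inputFilename;$1 $2 $3;`.

```python
def merge_context_blocks(lines, keyword, filename, before=1, after=1):
    """Build one output row per matching line: filename;context joined by spaces;"""
    needle = keyword.lower()  # "Match case" is unchecked, so compare case-insensitively
    rows = []
    for i, line in enumerate(lines):
        if needle in line.lower():
            block = lines[max(0, i - before): i + after + 1]
            rows.append("{};{};".format(filename, " ".join(block)))
    return rows


sample = [
    "TextPipe provides a single...",
    "...one tool, rather than",
    "command line options, debugging schemes,",
] * 3
rows = merge_context_blocks(sample, "tool", "Trial Input")
for row in rows:
    print(row)
```

Each occurrence of the keyword yields its own row, so three matches in one file give three rows with the same filename.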
