slow running program

sheridany
Posts: 39
Joined: Thu Nov 15, 2007 4:20 am

Postby sheridany » Thu May 21, 2009 9:16 am

Does anyone have a suggestion for making this go faster? It is abysmally slow. The input is a csv file of about 400,000 records, and we have to remove a lot of columns individually.

Filter List
-----------
Filter options
| [ ] Log to file
| [X] Append to logfile
| Log filename: textpipe.log
| Threshold 500
|
|--Input from file(s)
| [ ] Confirm before processing each file
| [ ] Confirm before processing read/only files
| [ ] Delete input files after processing
| Skip binary files
| Sample size 4 characters
|
|--Remove fields:Comma-delimited field 185 .. field 185
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 162 .. field 162
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 139 .. field 139
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 115 .. field 115
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 101 .. field 101
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 82 .. field 82
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 72 .. field 72
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 62 .. field 62
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 58 .. field 58
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 30 .. field 30
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 26 .. field 26
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 18 .. field 18
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 15 .. field 16
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 13 .. field 13
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 11 .. field 11
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 7 .. field 9
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 4 .. field 4
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 1 .. field 2
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Replace list: C:\Documents and Settings\youngs\Desktop\email analysis 2008\cleanuptaxonomy.xls EasyPattern
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
|--EasyPattern [[1 or more char'/.']] with [_]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
|--EasyPattern ["] with []
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
|--EasyPattern [ 12:00,] with [,]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
+--Merge output to file C:\Documents and Settings\youngs\Desktop\finalirtaxonomybp.csv
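
A general way to cut the cost of many single-column removals is to collect every unwanted position first and drop them all in one pass over each record. The sketch below shows that idea in standalone Python with the standard csv module. It is not TextPipe's implementation, and whether each Remove fields filter re-parses every record is an assumption here. The function name drop_columns and the sample data are made up for illustration. Field numbers are 1-based and inclusive, as in the filter list above.

```python
import csv
import io

def drop_columns(rows, ranges):
    """Yield each row with the given 1-based, inclusive field ranges removed."""
    drop = set()
    for start, end in ranges:
        drop.update(range(start, end + 1))
    for row in rows:
        # Keep only the values whose original position is not marked for removal.
        yield [value for pos, value in enumerate(row, start=1) if pos not in drop]

# Hypothetical five-column sample standing in for the real 400,000-record file.
sample = io.StringIO("h1,h2,h3,h4,h5\n1,2,3,4,5\n")
# Same shape as the last two filters above: field 4, then fields 1..2.
result = list(drop_columns(csv.reader(sample), [(4, 4), (1, 2)]))
print(result)
```

Because every position refers to the original column layout, the order in which the ranges are listed does not matter in this approach.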

DataMystic Support
Site Admin
Posts: 2138
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: slow running program

Postby DataMystic Support » Tue May 26, 2009 11:29 am

How many entries are in your replace list C:\Documents and Settings\youngs\Desktop\email analysis 2008\cleanuptaxonomy.xls ?

It will be much faster if you load this in .csv or .tab format instead of .xls.
Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments

Postby sheridany » Tue May 26, 2009 11:02 pm

There are approximately 190 entries in the replace list. I will try the csv format.

Postby sheridany » Wed May 27, 2009 1:52 am

It is still running very slowly, even after converting the search/replace list into a csv file. :shock:

Postby sheridany » Wed May 27, 2009 7:45 am

The issue seems to be in the search and replace list. The matching is exact in nature, so perhaps I am using the wrong filter? There are 190 search and replace rows that look like this:

Sales & Service Taxonomy./Cashback,Cashback
Sales & Service Taxonomy./Address Change,Addr_Chg
Sales & Service Taxonomy./Legal Ref,Legal_Ref
Sales & Service Taxonomy./Rewards,Rewards
Sales & Service Taxonomy./Cancellations/Savings Account,Cancel_SAV_Acct
Sales & Service Taxonomy./Cancellations/Checking Account,Cancel_CHK_Acct
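
For exact search/replace pairs like these, one common technique is to compile all search strings into a single alternation and look each match up in a table, so the text is scanned once instead of once per row. The sketch below is standalone Python, not a TextPipe filter. The name build_replacer and the sample input line are made up for illustration, and the matching is case sensitive, unlike the filter settings posted earlier.

```python
import re

def build_replacer(pairs):
    """Return a function that applies every (search, replacement) pair in one scan."""
    # Longest search strings first, so a longer term wins over a shorter prefix.
    ordered = sorted(pairs, key=lambda pair: len(pair[0]), reverse=True)
    table = dict(ordered)
    # re.escape makes '.', '/' and '&' match literally.
    pattern = re.compile("|".join(re.escape(search) for search, _ in ordered))
    return lambda text: pattern.sub(lambda m: table[m.group(0)], text)

# Three of the rows quoted above.
pairs = [
    ("Sales & Service Taxonomy./Cashback", "Cashback"),
    ("Sales & Service Taxonomy./Address Change", "Addr_Chg"),
    ("Sales & Service Taxonomy./Cancellations/Savings Account", "Cancel_SAV_Acct"),
]
replace_all = build_replacer(pairs)

# Hypothetical record; the real file layout is not shown in the thread.
line = "2008-01-05,Sales & Service Taxonomy./Address Change,open"
print(replace_all(line))
```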

Postby DataMystic Support » Wed May 27, 2009 8:25 am

I'd use a Perl search/replace or an EasyPattern search/replace. Exact matches can be slower.

Postby sheridany » Wed May 27, 2009 8:48 am

The only way I can see to do it currently is to break it up into two processes. The first step removes the unwanted columns and outputs to a csv file. Step 2 is a separate fll that does the search and replace using EP, plus some other EP cleanups. It runs much faster this way.

Postby DataMystic Support » Wed May 27, 2009 10:47 am

Strange that it should run faster like that.

Is it possible to send us your filter, the test file and the search/replace list so we can try and optimize it?

Postby sheridany » Fri May 29, 2009 8:21 am

I finally got it to run much better all in one program, but I am not seeing the results I expected when removing columns. I read in the help guide to remove columns from right to left, but it is still leaving in columns that I wanted to remove. Do I have to shift the position by -1 every time I remove a column to get the correct next column?

Filter List
-----------
Filter options
| [ ] Log to file
| [X] Append to logfile
| Log filename: textpipe.log
| Threshold 500
|
|--Input from file(s)
| [ ] Confirm before processing each file
| [ ] Confirm before processing read/only files
| [ ] Delete input files after processing
| Process binary files
|
|--Remove fields:Comma-delimited field 190 .. field 190
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 187 .. field 187
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 185 .. field 185
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 162 .. field 162
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 146 .. field 146
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 139 .. field 140
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 134 .. field 134
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 130 .. field 130
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 127 .. field 127
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 115 .. field 116
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 110 .. field 110
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 101 .. field 101
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 89 .. field 89
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 43 .. field 43
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 82 .. field 82
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 75 .. field 75
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 72 .. field 72
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 62 .. field 63
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 58 .. field 58
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 43 .. field 43
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 36 .. field 36
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 30 .. field 30
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 26 .. field 26
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 18 .. field 18
| Delimiter Type: 0
| Custom delimiter:
| [X] Has Header
|
|--Remove fields:Comma-delimited field 15 .. field 16
| Delimiter Type: 0
| Custom delimiter:
| [ ] Has Header
|
|--Remove fields:Comma-delimited field 13 .. field 13
| Delimiter Type: 0
| Custom delimiter:
| [ ] Has Header
|
|--Remove fields:Comma-delimited field 11 .. field 11
| Delimiter Type: 0
| Custom delimiter:
| [ ] Has Header
|
|--Remove fields:Comma-delimited field 7 .. field 9
| Delimiter Type: 0
| Custom delimiter:
| [ ] Has Header
|
|--Remove fields:Comma-delimited field 4 .. field 4
| Delimiter Type: 0
| Custom delimiter:
| [ ] Has Header
|
|--Remove fields:Comma-delimited field 1 .. field 2
| Delimiter Type: 0
| Custom delimiter:
| [ ] Has Header
|
|--Replace list: C:\Documents and Settings\youngs\Desktop\email analysis 2008\cleanuptaxonomy.csv EasyPattern
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
|--EasyPattern ["] with []
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
|--EasyPattern [ 12:00,] with [,]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
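
A quick way to audit a long right-to-left removal list is to check that every range lies strictly to the left of the one before it and that no range appears twice. The sketch below does this in standalone Python for the 30 ranges copied from the filter list above; the helper names are made up for illustration. Assuming each filter works on the output of the previous one, a removal that is out of sequence shifts every column to its right by one, so later filters aimed at higher numbers would hit the wrong columns.

```python
from collections import Counter

# (first field, last field) for each Remove fields filter, in the order posted.
ranges = [
    (190, 190), (187, 187), (185, 185), (162, 162), (146, 146), (139, 140),
    (134, 134), (130, 130), (127, 127), (115, 116), (110, 110), (101, 101),
    (89, 89), (43, 43), (82, 82), (75, 75), (72, 72), (62, 63), (58, 58),
    (43, 43), (36, 36), (30, 30), (26, 26), (18, 18), (15, 16), (13, 13),
    (11, 11), (7, 9), (4, 4), (1, 2),
]

def out_of_order(ranges):
    """Return (index, range) for each range not strictly left of its predecessor."""
    problems = []
    for i in range(1, len(ranges)):
        prev_start = ranges[i - 1][0]
        end = ranges[i][1]
        if end >= prev_start:
            problems.append((i, ranges[i]))
    return problems

def duplicates(ranges):
    """Return every range that is listed more than once."""
    return [r for r, count in Counter(ranges).items() if count > 1]

print(out_of_order(ranges))
print(duplicates(ranges))
```

For the list as posted, the check flags the range 82..82, which follows a field 43 entry sitting between 89 and 82, and reports 43..43 as listed twice.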

Postby DataMystic Support » Fri May 29, 2009 9:08 am

If you process columns from right to left then no - you don't have to adjust column numbers. If you started from left to right then you would have to take into account your deletions when choosing column numbers, which is much trickier.
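
The difference can be seen on a small row. The sketch below is standalone Python, with a made-up helper remove_fields that deletes 1-based, inclusive field positions from a list; it assumes each removal acts on the result of the previous one. The goal in all three cases is to remove the original fields 2 and 5.

```python
def remove_fields(row, start, end):
    """Return row without the 1-based, inclusive fields start..end."""
    return row[:start - 1] + row[end:]

row = ["a", "b", "c", "d", "e", "f"]

# Right to left: field 5 first, then field 2. No adjustment needed,
# because removing a field never moves the fields to its left.
right_to_left = remove_fields(remove_fields(row, 5, 5), 2, 2)

# Left to right with the original numbers: after field 2 is gone,
# position 5 now holds "f", so the wrong field is removed.
left_to_right_naive = remove_fields(remove_fields(row, 2, 2), 5, 5)

# Left to right with the second number reduced by one for the earlier deletion.
left_to_right_adjusted = remove_fields(remove_fields(row, 2, 2), 4, 4)

print(right_to_left)
print(left_to_right_naive)
print(left_to_right_adjusted)
```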