Using line count in global variable

tahoar
Posts: 10
Joined: Tue Sep 23, 2008 10:35 am

Using line count in global variable

Postby tahoar » Sun Jun 21, 2009 12:08 pm

Ok, I know there's a trick to this, and I just can't find it.

First, I count the lines in a file and save the number to an external text file. Next, I read the number from that text file and calculate x percent (5%) of the line count. Third, I use a VBScript to create an array of random numbers with x elements, where each element's value ranges from 1 to the total line count. Finally, I use VBScript to add a tag to the randomly chosen lines.

Right now, I do all four of these steps in two different TextPipe scripts, because I can't assign the first step's line count to a global variable and use it later in the same script. I think there's a trick to using the T-filter, but I haven't been able to make it work. A similar task would be to insert the total line count as a file header. I can get the T-filter to insert the global variable into the footer, but not into the header (or the left / right margins).

Can you point me to a sample filter?

Thanks,
Tom
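
For readers following along, the third step in Tom's description (choosing 5% of the line numbers at random) might be sketched as below. This is only an illustration under assumptions, not Tom's actual script: the function name PickRandomLines is hypothetical, and it picks distinct line numbers so that no line is chosen twice, which the original post does not specify.

```vbscript
' Hypothetical helper: choose a percentage of line numbers at random.
' Returns a Scripting.Dictionary whose keys are distinct Long values
' in the range 1..totalLines.
Function PickRandomLines(totalLines, percent)
    Dim picked, wanted, candidate
    Set picked = CreateObject("Scripting.Dictionary")
    wanted = Int(totalLines * percent / 100)   ' number of lines to choose
    Randomize                                  ' seed from the system timer
    Do While picked.Count < wanted
        ' Rnd is in [0, 1), so this is an integer in 1..totalLines
        candidate = CLng(Int(Rnd * totalLines)) + 1
        If Not picked.Exists(candidate) Then picked.Add candidate, True
    Loop
    Set PickRandomLines = picked
End Function
```

A Dictionary is used instead of a plain array so that checking whether a given line number was chosen does not need a scan of the whole list.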

DataMystic Support
Site Admin
Posts: 2138
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Using line count in global variable

Postby DataMystic Support » Mon Jun 22, 2009 11:10 am

Hi Tom,

Given that TextPipe is designed to handle files that are gigabytes in size, it doesn't let you add a line count to the start of a file, because it doesn't know the count until it has read the whole file. It is also designed to pass over the source file only once, whereas you want it to pass over it twice, which is inefficient. And TextPipe hates being inefficient!

A header can easily be output in the startFile() function.

You can easily add a left or right margin by prepending or appending it to line in the processLine(line, EOL) function.

Regards,

Simon Carter, http://DataMystic.com/forums/index.php
http://PredictBGL.com - Insulin dose calculator for Type 1 diabetes
http://DownloadPipe.com - 250,000 free software downloads
http://DetachPipe.com - send huge email attachments
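
As a rough illustration of the two suggestions above, a script filter might look like the sketch below. The function names startFile() and processLine(line, EOL) come from the thread; the return conventions are assumptions (that the string returned by startFile is written ahead of the file's first line, and that processLine returns the text to emit, EOL included), and the header and margin strings are placeholders.

```vbscript
' Assumption: whatever startFile returns is written before the first line.
Function startFile()
    startFile = "=== header ===" & vbCrLf      ' placeholder header text
End Function

' Assumption: processLine returns the text to emit for this line,
' including its end-of-line sequence.
Function processLine(line, EOL)
    ' Prepend a left margin and append a right margin (both placeholders)
    processLine = "> " & line & " <" & EOL
End Function
```

If the real return conventions differ, only the two assignment lines would need to change.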

tahoar
Posts: 10
Joined: Tue Sep 23, 2008 10:35 am

Re: Using line count in global variable

Postby tahoar » Tue Jun 23, 2009 11:02 am

Yes, multi-gigabyte processing is great. My three-pass solution saved the "Output count of matches" result to a one-line temp file; a batch file then read the temp file into an environment variable, which a second TextPipe pass used to calculate a percentage of the line total. Your suggestion to use the startFile() function reduces that processing to one pass using the functions below. The following VBScript works fine.

Thanks,
Tom

function startFile()
  ' Count the lines of the input file once, before TextPipe processes it,
  ' and store the result in the global variable "linecount"
  TextPipe.setGlobalVar "linecount", getLineCount( TextPipe.fullInputFilename )
  startFile = ""
end function

function getLineCount( srcFile )
  Const ForReading = 1
  Dim objFSO, objReadFile, b
  Set objFSO = CreateObject("Scripting.FileSystemObject")
  Set objReadFile = objFSO.OpenTextFile(srcFile, ForReading)
  b = 0
  ' Skip through the file line by line; the line contents are not needed
  Do Until objReadFile.AtEndOfStream
    objReadFile.SkipLine
    b = b + 1
  Loop
  objReadFile.Close
  Set objReadFile = Nothing
  Set objFSO = Nothing
  getLineCount = b
end function
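
Once the count is available in startFile(), the final tagging step could live in the same script. The sketch below is one possible shape, not Tom's code: it assumes the script host keeps script-level variables alive between calls, that processLine returns the text to emit (EOL included), and that the chosen line numbers arrive as a Scripting.Dictionary with Long keys. BeginTagging and the "[SAMPLE] " marker are hypothetical names.

```vbscript
Dim lineNumber      ' script-level: lines seen so far in the current file
Dim taggedLines     ' script-level: Dictionary keyed by line numbers to tag

' Hypothetical helper: call from startFile() with the chosen line numbers.
Sub BeginTagging(chosen)
    Set taggedLines = chosen
    lineNumber = CLng(0)
End Sub

' Assumption: processLine returns the text to emit, EOL included.
Function processLine(line, EOL)
    lineNumber = lineNumber + 1
    If taggedLines.Exists(CLng(lineNumber)) Then
        processLine = "[SAMPLE] " & line & EOL   ' placeholder tag
    Else
        processLine = line & EOL
    End If
End Function
```

In this arrangement startFile() would count the lines, build the Dictionary of chosen line numbers, and pass it to BeginTagging before the first line is processed.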

DataMystic Support
Site Admin
Posts: 2138
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Using line count in global variable

Postby DataMystic Support » Tue Jun 23, 2009 12:50 pm

I shudder to think how long that vbscript will take to run on a file with 100,000 lines or more!
Regards,

Simon Carter

tahoar
Posts: 10
Joined: Tue Sep 23, 2008 10:35 am

Re: Using line count in global variable

Postby tahoar » Wed Jun 24, 2009 7:37 pm

100,000? Try 700,000 lines!

Surprisingly, not too bad. The entire pipeline included 7 TextPipe filters tied together with a command-line batch file. Two of the filters made character-by-character passes on paired data using a Perl replace filter. One filter made two separate passes on paired halves of the data in a different VBScript.

I started the pipeline at 3:00 AM today and it finished at 9:30 AM. I don't have a breakdown of which filters took how long, but this VBScript count was one of the least challenging tasks.

If I were a programmer, I could be dangerous!

Tom

tahoar
Posts: 10
Joined: Tue Sep 23, 2008 10:35 am

Re: Using line count in global variable

Postby tahoar » Wed Jun 24, 2009 7:42 pm

Oops, make that 960,000 lines.

