Filbreak



Author: Dan Mares, dmares @ dmares . com
Portions Copyright © 1998-2021 by Dan Mares and Mares and Company, LLC
Phone: 678-427-3275
Last UPDATE of the program: November 11, 2021
Get the 32 bit .exe

All programs are command line programs.
MUST be run within a command window as administrator.

One liner: "Breaks" up records to user selected fields

Sample Maresware Batches: an executable with data that demonstrates various Maresware software. Download it and run the appropriate _08_xx batch for a Filbreak demo.



Purpose

To reformat the record structure of a file.

Filbreak is a program that will allow you to select sections of an input record and place them into an output record of a different layout. Maresware's Search program has a subset of the filbreak operation. For the most part, Filbreak's options are available in the Search program and are identical to these.

You can take selected fields of the input record, rearrange them, and write them to a new output record which is formatted to your specifications.

You can use this program to create a data record that will be formatted as if it were a final report. Then use a word processor, or copy the output directly to a printer.

With Pagefmt or Mouse you can easily create on-the-fly reports without the use of a database.



Operation

After reading the contents of the parameter file, Filbreak begins to read the input file in blocks of up to 32000 bytes. It then processes each record in accordance with the options found on the command line and the parameter file requirements.

After it processes the record it writes the record to an output buffer.

When the output buffer gets filled up (approx 32000 bytes), it writes that block of data to the output file.
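The read, process, and buffered-write cycle described above can be sketched in Python. This is only an illustration of the general technique, not Filbreak's actual code: the names `process_file`, `transform`, and `BLOCK_LIMIT` are hypothetical, and the 32000-byte figure is taken from the text.

```python
import io

BLOCK_LIMIT = 32000  # approximate block/buffer size named in the text


def process_file(src, dst, record_length, transform):
    """Read fixed-length records in blocks, transform each, write buffered output."""
    # Largest whole number of records that fits in one input block.
    records_per_block = max(1, BLOCK_LIMIT // record_length)
    block_size = records_per_block * record_length
    out_buffer = bytearray()
    while True:
        block = src.read(block_size)
        if not block:
            break
        for start in range(0, len(block), record_length):
            record = block[start:start + record_length]
            out_buffer += transform(record)
            # Flush once the output buffer has filled up.
            if len(out_buffer) >= BLOCK_LIMIT:
                dst.write(out_buffer)
                out_buffer.clear()
    # Write whatever is left after the last input block.
    if out_buffer:
        dst.write(out_buffer)


# Example: keep the first 3 bytes of each 10-byte record and end it with CR/LF.
source = io.BytesIO(b"AAA0000000BBB1111111")
target = io.BytesIO()
process_file(source, target, 10, lambda rec: rec[:3] + b"\r\n")
result = target.getvalue()
```

Buffering the output this way means the output file is written in a few large blocks rather than once per record.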



Command Lines

C:> filbreak  input  output  parameter.fle  -[options]

Item 1: Program name [filbreak].
Item 2: Input file name [max of 40 characters].
Item 3: Output file name [max of 40 characters].
Item 4: Name of a parameter file [max of 40 characters].
Item 5: Options (if used).

C:> filbreak  myfile.in  yourfile.out  param.fle
C:> filbreak  myfile.in  yourfile.out  paramfile  -c  -h  3  -M  9999

where:
filbreak == program name
myfile.in == input file name
yourfile.out == output file name
param.fle == parameter file name

and

-c -h: are the options

3 is the number of headers (including an EOF) to be found on the input tape.

-M 9999: Means replace a field with the characters 9999  (see ‘m’ in parameter file)



Parameter File

The parameter file is an ASCII text file created with a text editor, not a word processor.

The contents of the parameter file tell the program the input file blocksize (if using tape) or the blocksize you wish to use if reading off the disk. They also give the record length and identify which pieces or fields of the input record you wish to place in the output record.

The format of the parameter file is one line for each item described here.

Line 1: Input blocksize (max of 32767) [must be a multiple of the record length, or the tape blocksize]. As of August 2010, if you add the letter "A" (for "AUTO" blocksizing) after the number portion of the blocksize, the program will automatically calculate the largest acceptable input blocksize, using the record length provided on line 2 to resize the maximum blocksize. This recalculation will often increase processing speed. The autoblock line might look like: 1234A

Line 2: Input record length. If using the filbreak option of Search, line 1 and 2 are ignored by the Search program, but MUST exist. This value is used to recalculate a more efficient blocksize.

The succeeding lines are used to build the output record. (The first character of the record is considered displacement 0000.) The succeeding lines must have the following format:

Pos. 1-4  Displacement in the input record at which to begin the copy. The first character of the input record is displacement 0000. This item must be exactly 4 characters long, with leading zeros if necessary. (ie: 0030...., start at displacement 30 counting from zero, which is actually character 31, and perform the following)

Pos. 5:  Any one of the following:

=   An equal sign “=” means copy this many characters. For example, =030 copies 30 characters.

c   Convert this field to spaces (default). (see also, command option -C)

C   Convert this field to UPPER CASE. Use primarily when converting hex fields. C004 does this for 4 characters: abcd becomes ABCD

i   Means insert what follows in this next field of the output record. Whatever is typed up to the next carriage return is included in the record. There is a maximum of 70 characters per field. However, fields can be consecutively added after each other. (0000iTHIS IS ADDED HERE)

I   Means insert an 'I'ndex number here. The format is: 0000Innnn. The nnnn is the starting index for this field, and the width of the field produced is determined by the number of n's in the field. So, 0000I010000, would be a field 6 characters wide, and starts at 010000. (Leading zeros always). The value increases once for each successive record.

g   This field is gregorian date format,  convert it to julian format (ie., YYDDD ). Current gregorian formats recognised are: YYMMDD, YY-MM-DD, YYYYMMDD, MMDDYY, MM-DD-YY, MMDDYYYY.

j   This field is a 7 character julian date field (ie., of the format YYYYDDD ). Convert it to a gregorian date field. (ie., MMDDYYYY ).

m   Use if the field contents are to be modified(replaced) in the output record. See -M command line option argument.

$   “$” (dollar sign). Use to convert this field from signed numbers (where an alpha character encodes the sign) to numbers with an explicit + or - sign. See SIGNED below.

n   To convert a name field that is comma delimited with the last name, first name, middle initial in a fixed position and fixed length to a fixed length lastname, first name. SEE EXPAND below. (Not available in the Search program.)

r   To insert a 2-byte DOS carriage return/line feed (0x0d0a) in the record at this point. This is probably a good line to use as the end of the record. 0000r002

R   To insert a UNIX carriage return (0x0a) in the record at this point.

#   Insert the decimal record number of this record. Use this as the last item of the output record. The field width is determined by the width of the number following the #. (ie: #0000 will produce a record number 4 digits wide. Make sure your total record count will not be wider than the nnnn width.)

B   Insert binary (4 byte) record number of this record. Similar to the #, except this produces an output compatible with COBOL with hi-byte first, and lo-byte last. This is what you would normally expect to see.

t   To total this field with others so designated and place total in a 15 character field at end of record. See TOTAL below (not available in Search program).

The following 4 parameter file options are outdated mainframe holdovers and may not work in Windows.

p   Indicates that this is a packed decimal field and do not convert it from EBCDIC along with the rest of the record. It does not unpack the field, it merely copies it without an EBCDIC to ASCII conversion. If you want the field converted (expanded) you must list an additional line using the “u”npack parameter file designation. If the command line option -w (no write) is used, this field is not written to the output record. This assumes you have used the “u” designation to write the field in unpacked fashion. (-p option not available in Search program; -u is.)
NOTE: If the -w (no write) command line option is chosen, none of the -p fields are written to the output.

u   This tells the program that this field is a packed decimal and to unpack the field. It will naturally double the size of the field when it unpacks the field. It will also pad a blank on the rightmost character if the expansion leads to an odd length field and it needs the padding to create an even length. (ex., a packed decimal SSN will equate to a left justified 9 digits and a blank fill on the right, like: "123456789 ", and a zip code will be "30341"). It will also handle COBOL comp-3 signed fields, and place a sign in the leftmost position of the expanded field.

x   Indicates the field beginning at this displacement is an Informix date field that you wish to convert to a string. Position 6-8 should be a length of either 6 or 8, ie., 006 or 008. If a 006 is used, then the string will be of the format yymmdd. If the 8 is used, the string will be of the format mm/dd/yy. Later enhancements will let you further define the forward or reverse format of the strings. If anything other than 6 or 8 is entered for the length, 6 is defaulted.

l    Indicates that the field beginning at this displacement is an Informix integer field of 4 bytes in length. The program expands this integer to a 15 character ASCII string, and places it in the output record at this point.

Pos. 6-8   Number of characters to copy to the output record. This also must have leading zeros, so a 1 becomes 001, etc. Depending on the parameter option, this item can be wider than 3 characters.

LAST LINE   The last line of the parameter file MUST end with eight zeros to indicate to the program that this is the end of the parameters. OR, a blank line right after the last line is now acceptable.

It would look like this.

00000000
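The layout described above can be illustrated with a short Python sketch that parses a parameter file and applies a few of the field codes. It is a simplified illustration, not Filbreak's implementation: the function names are hypothetical, only the `=`, `i`, `r` and `R` codes are handled, and the "A" auto-blocksize rule is assumed to mean the largest multiple of the record length that fits within 32767.

```python
MAX_BLOCKSIZE = 32767  # maximum input blocksize given for line 1


def parse_parameters(text):
    """Return (blocksize, record_length, fields) from parameter-file text."""
    lines = text.splitlines()
    first = lines[0].strip()
    record_length = int(lines[1].strip())
    if first.upper().endswith("A"):
        # Assumed auto rule: largest multiple of the record length in range.
        blocksize = (MAX_BLOCKSIZE // record_length) * record_length
    else:
        blocksize = int(first)
    fields = []
    for line in lines[2:]:
        if line.strip() == "" or line.startswith("00000000"):
            break  # eight zeros or a blank line ends the parameters
        displacement = int(line[0:4])  # Pos. 1-4
        code = line[4]                 # Pos. 5
        argument = line[5:]            # Pos. 6 onward (length or text)
        fields.append((displacement, code, argument))
    return blocksize, record_length, fields


def build_record(record, fields):
    """Apply the '=', 'i', 'r' and 'R' codes to one input record (bytes)."""
    out = bytearray()
    for displacement, code, argument in fields:
        if code == "=":
            length = int(argument[:3])  # Pos. 6-8
            out += record[displacement:displacement + length]
        elif code == "i":
            out += argument.encode("ascii")
        elif code == "r":
            out += b"\r\n"
        elif code == "R":
            out += b"\n"
        else:
            raise ValueError(f"code {code!r} is not handled in this sketch")
    return bytes(out)


params = "8000A\n800\n0000=010\n0020=005\n0000r002\n00000000\n"
blocksize, record_length, fields = parse_parameters(params)
sample = bytes(range(65, 91)) * 40  # the letters A..Z repeated
output = build_record(sample[:800], fields)
```

With an 800-byte record the assumed auto rule gives a blocksize of 32000, and the output record holds the first 10 characters, 5 characters from displacement 20, and a CR/LF.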

Two sample parameter files might look like this: (My comments are inside the slash asterisk. /* comments */)

8000 /* blocksize */
800 /* input record length */
0000=010 /* begin at first character in record and copy 10 */
0020=005 /* copy 5 from displacement 20 */
0150=200 /* copy 200 from displacement 150 */
0000r002 /* terminate the record with a CR/LF */
00000000 /* last line must be zeros */

8000
800
0000=010
0011c003 /* convert this field to blanks or dashes */
0020m005 /* modify option used */
0000iThis text is to be inserted at this point in the record
0150=200 /* at displacement 150 take 200 characters */
0120$010 /* money field expansion */
0050n030 /* name field expansion is to take place */
0002t005
0010t005 /* total both fields beginning at 2 and 10 */
0030p005 /* assumes EBCDIC input and this is a packed decimal field */
0030u005 /* unpack this field to a 10 digit ASCII field */
0052x006 /* turn the Informix date into 6 characters */
0052x008 /* and turn it into 8 characters */
0015l004 /* turn these 4 characters into a 15 character number string */
0000r002 /* carriage return */
00000000 /* terminate the parameter file */

The “c” on the line representing the second field indicates that after the first 10 characters are written we want to convert the next 3 characters of the output record to either a space or dash, depending on what option was used in the command line. Default is a space conversion.

The “m” on the line representing the third field indicates that after the first 13 characters, we want the next 5 to be those characters which were entered from the command line after the -M (Modify) option. NOTE: on the “m” character modification option: The output field position referenced by this item will be replaced in the output record by the string of characters input at the command line after the -M argument. It is used to replace a string of characters with another string. A possible use is to change/modify one field in the record before placing it in output. A date or period may be one item to do this with. You can use this to hardcode a date, period, etc.

The “$” indicates that this field is to be converted to a SIGNED field in the output file.

The “t” indicates that the two fields beginning at 2 and 10 each 5 digits long are to be totalled and the result placed at the end of the record.

NOTE: the ‘t’ option and the $ option CANNOT operate on the same field in the same pass of the program. If you have two $ fields that are to be converted and added, then you must run the program twice--once to convert the $ and a second time to add the fields.

After the last line and a blank line, comments may be added to the parameter file.
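The `g` and `j` date conversions listed among the Pos. 5 codes can be illustrated with Python's standard `datetime` module. This sketch only shows the YYYYMMDD to YYDDD and YYYYDDD to MMDDYYYY cases; the function names are hypothetical, and how Filbreak tells the other recognised input formats apart is not covered here.

```python
from datetime import datetime


def gregorian_to_julian(yyyymmdd):
    """Convert a YYYYMMDD string to the 5-character YYDDD form."""
    parsed = datetime.strptime(yyyymmdd, "%Y%m%d")
    # %y is the 2-digit year, %j is the zero-padded day of the year.
    return parsed.strftime("%y%j")


def julian_to_gregorian(yyyyddd):
    """Convert a 7-character YYYYDDD string to MMDDYYYY."""
    parsed = datetime.strptime(yyyyddd, "%Y%j")
    return parsed.strftime("%m%d%Y")


julian = gregorian_to_julian("20211111")    # November 11, 2021
gregorian = julian_to_gregorian("2021315")  # day 315 of 2021
```

November 11 is day 315 of 2021, so the two calls are inverses of each other apart from the width of the year.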



Signed Fields

Occasionally you will have a file which has a numeric field containing signed numbers off the mainframe computer. A signed field might look like this: 1234A or 1234M. The alpha characters are there to take the place of a sign + or - and a digit 0-9. The mainframe does this to save space.

Informix and many other databases do not accept alpha characters in numeric fields. So the $ modifier in the parameter file tells the program to take the alpha character and convert it to the proper sign and digit.

What happens is that the program decodes the alpha character and places the sign to the left of the field. So, 1234A would end up as +12341 (because an A is a +1).

The only thing you have to remember is that if the field is 10 characters long, you should allow for the one extra character of the + or -. You do this by adding one to the size of the field, and backing up one from your normal starting point of the field in the parameter file.

There is a problem with some files, in that the signed field sometimes has an S in it. This S does not conform to any known standard. You will have to find your own way out of those records.

EXAMPLE:

The signed field is 10 characters long, and it begins at displacement 30 in the record. So if you convert this entire field it will ultimately be 11 characters long (10 for the digits, and 1 for the sign). You back up 1 character when you tell the program where to look for the field. The line in the parameter file would look like this:

0029$011

This means the program will write eleven characters to the output record. At the first character position it will substitute the sign (+ or -), and the next 10 positions will be the digits. Don’t worry about trashing that leading character you picked up from the prior field. If you write it from one of the other lines in the parameter file, it will still be there. It is like the conversion ‘c’ option. It is only a space saver. Signed and Total cannot operate on the same field at the same time.
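The decoding of the trailing alpha character can be sketched in Python. The mapping used here is the common mainframe overpunch convention ({ and A-I for +0 to +9, } and J-R for -0 to -9), which agrees with the 1234A example above; whether Filbreak uses exactly this table is an assumption, and the function name is hypothetical.

```python
POSITIVE = "{ABCDEFGHI"  # overpunch characters for +0 .. +9
NEGATIVE = "}JKLMNOPQR"  # overpunch characters for -0 .. -9


def expand_signed(field):
    """Return the field with a leading sign and the last character decoded."""
    last = field[-1]
    if last in POSITIVE:
        sign, digit = "+", POSITIVE.index(last)
    elif last in NEGATIVE:
        sign, digit = "-", NEGATIVE.index(last)
    elif last.isdigit():
        sign, digit = "+", int(last)  # assumption: a plain digit is positive
    else:
        # For example the non-standard 'S' mentioned above.
        raise ValueError(f"unrecognised sign character {last!r}")
    return f"{sign}{field[:-1]}{digit}"


plus = expand_signed("1234A")
minus = expand_signed("1234M")
```

The result is always one character longer than the input, which is why the parameter file line allows for the extra sign position.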



Expand option

This option in the parameter file is used if your input file has a name field that is a fixed length but has the name in the following format: Lastname(comma), Firstname(comma), Middle initial. This poses a problem in that you cannot easily create an output that will have the last name, first name and middle initial in fixed positions in the output record.

This name-expansion option allows you to advise the program that this field is a name field of the L,F,Mi format and that you want it reformatted to: lastname(fixed) firstname(fixed) mi. Here is an example of what happens:

Input file: MARES,DAN,J *
(* indicates end of input field)

Output: MARES    DAN  J

The parameter file format for this field is:
Pos. 1-4: Displacement to beginning of field, e.g. 0020
Pos. 5: An ‘n’ indicating to the program it is to do name expansion.
Pos. 6-8: Length of the field (actual length of the field)
    For example: 0020n030 /*beginning at displacement 20 expand a 30 character name field*/

The expansion percentages are standard Last name, first name percentages for the entire field. The last name is expanded to 3/5 the entire length, the first name gets 2/5 and the MI gets one character. The entire length of the output field can be increased but it can never be decreased with any predictable results.
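The 3/5 and 2/5 split can be sketched in Python as follows. This is a rough illustration with a hypothetical function name; it assumes the widths are rounded down and that the middle initial is added after the two name columns, so a 30 character field gives 18 + 12 + 1 characters.

```python
def expand_name(field, length):
    """Reformat 'LAST,FIRST,MI' into fixed-width last, first and initial columns."""
    parts = [part.strip() for part in field.split(",")]
    parts += [""] * (3 - len(parts))  # tolerate a missing first name or initial
    last, first, middle = parts[:3]
    last_width = length * 3 // 5   # last name gets 3/5 of the field
    first_width = length * 2 // 5  # first name gets 2/5 of the field
    return (
        last[:last_width].ljust(last_width)
        + first[:first_width].ljust(first_width)
        + middle[:1].ljust(1)
    )


expanded = expand_name("MARES,DAN,J         ", 30)
```

In the result the last name starts at position 0, the first name at position 18, and the middle initial at position 30.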



Total Option

The total option is used to total two or more fields within a record and place the result in a 15 character field at the end of the record. The fields that are totaled are not affected and are NOT placed into the output file unless you include them in a normal field to be broken out by using the “=” format. Signed and Total cannot operate on the same field at the same time.

If all lower case t’s are used then the total field appended to the end of the record is right justified, and blank filled on the left.
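A minimal Python sketch of the totalling step, assuming the fields hold plain ASCII digits and using the 15 character, right justified, blank filled result described above. The function name and the sample record are hypothetical.

```python
def total_fields(record, spans, width=15):
    """Sum the numeric fields given as (displacement, length) pairs."""
    total = 0
    for start, length in spans:
        text = record[start:start + length].strip()
        total += int(text or "0")  # assumption: a blank field counts as zero
    # Right justify the total and blank fill on the left.
    return str(total).rjust(width)


# Fields at displacements 2 and 10, each 5 digits long (0002t005, 0010t005).
record = "AB00125CDE00300XYZ"
appended = total_fields(record, [(2, 5), (10, 5)])
```

Here 00125 and 00300 add up to 425, which is placed at the right edge of the 15 character field.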



Packed Option

The packed “p” parameter file option assumes the input is EBCDIC and that you are converting all of it to ASCII except this field, which is known to be a packed decimal field. EBCDIC packed decimal fields are not converted to ASCII. You leave them as is, and later use the “u” unpack option to expand the field to real ASCII characters.



Unpack Option

The unpack ‘u’ parameter file option tells the program that this field is packed decimal, and to unpack it to 2x the length given. It will right pad with blanks to fill the extra character if necessary.
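Unpacking can be sketched in Python by splitting each byte into two 4-bit digits. This is an illustration under stated assumptions, not Filbreak's code: the function name is hypothetical, a final nibble of C is taken as plus, B or D as minus (the usual COMP-3 convention), and any other non-digit final nibble is treated as unsigned filler that becomes the right-hand blank.

```python
def unpack_decimal(packed):
    """Expand packed decimal bytes into a string twice as long."""
    nibbles = []
    for byte in packed:
        nibbles.append(byte >> 4)    # high nibble
        nibbles.append(byte & 0x0F)  # low nibble
    last = nibbles[-1]
    body = nibbles[:-1] if last > 9 else nibbles
    if any(nibble > 9 for nibble in body):
        raise ValueError("non-digit nibble inside the packed field")
    digits = "".join(str(nibble) for nibble in body)
    if last <= 9:
        return digits          # every nibble was a digit
    if last in (0x0B, 0x0D):
        return "-" + digits    # sign goes in the leftmost position
    if last == 0x0C:
        return "+" + digits
    return digits + " "        # unsigned filler: blank pad on the right


ssn = unpack_decimal(bytes.fromhex("123456789F"))
negative = unpack_decimal(bytes.fromhex("12345D"))
```

In both cases the output is exactly twice the length of the packed input, as the text describes.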



Options

-a    (a)ppends output to existing output file. The default is that the output file (if one exists) is truncated before writing output to it.

-A    (A)dds/creates to statistic accounting file, called ‘ACCT-ING’.

/A    Turns off auto-accounting set by environment options.

-1 + fname  (that’s an option of one, not an ell) where fname is the name of the accounting file, other than a default acct-ing.

-[8|9]   This option operates the same way as the -A or -1 (accounting) option.

In addition, at the end of the search keys, if you place at least one blank line, you can then add comments to the parameter file. These comments will be added to the accounting file. If the byte count of the search keys is less than 250, then the entire parameter file is added to the accounting file.

-b or -s   The -s or -b option indicates that those fields identified in the parameter file by a ‘c’ for “conversion” of the field are to be converted to blanks. This is an easy way to add (include) blanks in your output record.

-d    The -d option works just like the -s and -b options except that instead of blanks, these characters (the field) will be replaced by dashes. This is ideal for adding dashes to SSNs. See sample parameter file.

-D + #    The -D # option works like the -d and -b options in that it will replace the characters identified in the field in the parameter file with a specific character. The character to use is the decimal equivalent of the item entered on the command line in place of the # sign. Ex., to have the field replaced with the upper case ‘A’ you would use -D 65, because a decimal 65 is the equivalent of ‘A’.

-c    Used to cause all unprintable characters to be converted to a “~” in the output file.

-C + #   Convert all unprintable characters in entire record to decimal value #. Or, if a space is required, enter -C b. If a zero is required, enter -C z. If a dash is required, enter -C d.
You must always have a space after -C.

-e    To convert EBCDIC input files to ASCII output.

-fF +#    Format field totals with this code.

When a t is used in the Filbreak parameter file, the final total is a 20 character field added to the end of the record. If this field is to be formatted, use this -f option with the following code for the type of format needed:

1 = add dollar sign justified to figures
2 = add dollar sign left justified. at beginning of field
4 = zero fill on left of the numbers
8 = add comma in thousands positions
16 = star(*) fill left of the numbers(after any left justified $)
32 = leading sign, right justified (butted against numbers)
64 = leading sign, left justified(left of field, before $)
128 = trailing sign, after numbers (for cobol programmers)
256 = no sign

The numbers can be added to obtain custom results. (ex., 1 + 8 + 16 = 25 will produce format like this: $****12,345)
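Because each format code is a distinct power of two, a combined value can be split back into its component codes with bitwise tests. The following Python sketch only decodes the number; it does not attempt to reproduce Filbreak's formatting, and the dictionary and function names are illustrative.

```python
FORMAT_FLAGS = {
    1: "dollar sign justified to figures",
    2: "dollar sign left justified at beginning of field",
    4: "zero fill on left of the numbers",
    8: "comma in thousands positions",
    16: "star fill left of the numbers",
    32: "leading sign, right justified",
    64: "leading sign, left justified",
    128: "trailing sign, after numbers",
    256: "no sign",
}


def decode_format(code):
    """Return the individual format codes contained in a combined value."""
    return [value for value in sorted(FORMAT_FLAGS) if code & value]


codes = decode_format(25)
descriptions = [FORMAT_FLAGS[value] for value in codes]
```

For the value 25 used in the example above, this yields the codes 1, 8 and 16.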

-h + #    Use to bypass headers when using tape. Enter the number of tape blocks you wish to pass. This includes an EOF after the last legitimate header. Not available on NT version.

-H + #  To bypass characters at the beginning of the file. This option treats the characters just as headers, and passes them before starting to process the file. The integer following the -H is the number of characters to pass. (Not available in the Search program.)

-L  The display erases for long running programs.

-M + string of characters   enter the string of characters you wish to replace a field with. See the parameter file option ‘m’. The length of this argument must be at least as long as the field width designated in the parameter file.

-n    Use with the -c option to cause all unprintable characters to print as “~” in the output file but do not change any newline characters. This option leaves the newline characters intact.

-N   (N)o rewind of an input tape if using an IDT tape drive.

-P    Used to automatically insert pipes ‘|‘ after each field in the output record. Good for preparing files for databases. If the -P is followed by anything EXCEPT another valid option, then that item is taken to be the delimiter. So if you wanted a single comma as the delimiter, use (-P     ,). However, if you want a comma delimited file you must use the following syntax to get the quotes also: (-P    \",\"). Don't forget the backslash before the quotes so the command line doesn't get expanded. This will get an output record like: "Field 1","Field2      ","Field     #3","Field      4". Notice however, that trailing spaces are NOT removed from the fields.

-r   Insert a DOS carriage return/line feed (0x0d 0x0a) at the end of the output record.

-R   Insert a UNIX carriage return (0x0a) at the end of the record.

-t   Forces the INPUT to be treated as if it were an IDT tape drive. If the input file is mt0, then the -t is a default and the input is expected to be the tape drive.

-u   Do not unload tape after completion. Default is unload.

-v   When $ (signed field conversion) is used, causes leading blanks or zeros to all be converted to zeros (0).

-V    When $ (signed field conversion) is used, causes leading blanks or zeros to all be converted to blanks ( ).

-w  If the ‘p’acked decimal option is chosen in the parameter file and the “u”npack option is also used, then the -w option will not print the field designated as a “p”acked decimal to the output file, but will expand (“u”npack) the field based upon the “u” field designated in the parameter file.

--auto  (8/2010) Use --auto to automatically recalculate the maximum allowed input blocksize under 65K. This is similar to adding the 'A' after the blocksize in the parameter file.

--header=filename  (8/2010) Use this to have Filbreak read the contents of the file "filename" and use its entire contents as the header line of the output file. This would usually be a formatted item, relating to the appropriate "columns" of the output file. This file should be only a single line of text, properly formatted. No error checking is done to see if the columns line up correctly.
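The effect of the -c and -n options described in this section can be sketched in Python. The sketch assumes that "unprintable" means any byte outside the ASCII range 0x20 to 0x7E and that "newline characters" covers both CR and LF; Filbreak's exact definitions may differ, and the function name is hypothetical.

```python
def replace_unprintable(record, fill=b"~", keep_newlines=False):
    """Replace unprintable bytes with a fill character, optionally keeping CR/LF."""
    out = bytearray()
    for byte in record:
        printable = 0x20 <= byte <= 0x7E
        newline = keep_newlines and byte in (0x0D, 0x0A)
        out += bytes([byte]) if printable or newline else fill
    return bytes(out)


# Like -c alone: every unprintable byte, including CR/LF, becomes "~".
converted = replace_unprintable(b"AB\x00C\r\n")
# Like -c with -n: newline characters are left intact.
kept = replace_unprintable(b"AB\x00C\r\n", keep_newlines=True)
```

Passing a different `fill` byte gives the behaviour of a replacement character chosen by the user, in the spirit of the -C option.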



Related Programs

Bsearch

Pipefix

Search
