WXP version 5
Program Reference

PARSE

NAME

parse - Text data parsing program

SYNOPSIS

parse [parameters...] filename

PARAMETERS

Command Line Resource Default Description
-h help No Lists basic help information.
-df=filename default .wxpdef Sets the name of the resource file.
-na=name name parse Specifies the name used in resource file parsing.
-ba batch No Run program in batch mode
-me=level message out2 Specifies level of messages to be displayed
  • file information - mess
-fp=filepath file_path current directory Specifies location of database files.  
-dp=datapath data_path current directory Specifies the location (path) of the input raw data files. This may be modified in the name convention file.
-nc=name_conv name_conv name_conv The name convention file specifies how files are named in WXP. This sets which name convention file to use.
-if=in_file in_file raw_dat Specifies the input file name tag. The default tag is raw_dat but will need to be modified for most applications. This can be determined from the product header if possible. Otherwise, it must be explicitly specified.
-cu=[hour|la] current None Specifies that current data files are to be used. The current filename is based on the name convention. An optional hour can be given to select older data. If la is specified, the program searches back to find the most recent available file.
-ho=hour hour None This resource specifies the exact hour that a data file is valid for. This locks in the start hour for a multi-file sequence.
-nh=num_hour num_hour 0 This specifies the number of hours that will be used. If this is not specified, a single hour will be parsed. Otherwise a set of hours will be parsed.
-ph=product product User Prompt This specifies the product header to search for.
-id=identifier identifier None Specifies the station to parse for. This is a string within a product that printing will start with.  If this is not specified, the whole product will be displayed.
-pa=param parameter None Specifies additional parsing parameters. See the parameter resource for more details. Some possibilities are:
  • blank -- stop parsing at a blank line
  • 3blank -- stop parsing after 3 blank lines
  • dollar -- stop parsing at a dollar sign
  • equal -- stop parsing on a trailing equals sign
  • line[=lines] -- stop parsing after set number of lines (default 1)
  • first -- print only the first occurrence
  • last -- print only the last occurrence
  • cont -- keep file open to search for new products as they arrive
  • hdr -- print just the headers
  • prod -- print just the product
  • hdr+prod -- print the header and the product
filename[#seq] filename None; User Prompt; Batch: current=la The name of the surface data file to be used. An optional sequence number can be added to designate the time for non-WXP files.

DESCRIPTION

This program parses text data for a specific product and identifier.  The input to the program is a raw ingested data file. The type of data file can be determined either from the product header or the in_file resource. When a product is specified, it is cross-referenced against the parse.lup file to determine a file name tag to use. A sample of this lookup file is:

W       sev_dat
F       for_dat
C       cli_dat
...

If a product does not exactly match what is in the lookup file, a tag can be specified with the in_file resource.
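The lookup step can be pictured with a short Python sketch. This is an illustration only, not the WXP implementation: the function names are hypothetical, and the rule that a lookup entry matches when it is a leading prefix of the product header (longest prefix wins) is an assumption. The fallback tag raw_dat is the documented default of the in_file resource.

```python
# Hypothetical sketch of a parse.lup style lookup; not WXP source code.
SAMPLE_LUP = """\
W       sev_dat
F       for_dat
C       cli_dat
"""

def parse_lookup(text):
    """Read 'prefix tag' pairs, skipping lines that do not have two fields."""
    table = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            table.append((fields[0], fields[1]))
    return table

def file_tag_for(product, table, default="raw_dat"):
    """Return the tag of the longest prefix that starts the product header.

    Prefix matching is an assumption made for this sketch. When nothing
    matches, the in_file default (raw_dat) is returned.
    """
    best = None
    for prefix, tag in table:
        if product.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, tag)
    return best[1] if best else default

table = parse_lookup(SAMPLE_LUP)
print(file_tag_for("FPUS1_KIND", table))
print(file_tag_for("/SFPIN", table))
```

Under these assumptions, a header such as FPUS1_KIND resolves to for_dat, while /SFPIN falls through to the default, which is why the in_file resource has to be set explicitly in that case.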

The program starts off by prompting the user for the input data file name. The user may specify the input file either via the command line or through the current resource. This will depend on the type of file, specified either by the in_file resource or by the product header.

Next, the user enters a product header. The header can have wildcard characters to parse for multiple product types:

Product Pattern Matching
. or ? match a single character
- or * match any character
[letters] match a single character from the set.
[^letters] match any character except those from the set.
(str1[|str2...]) match strings
_ underscore matches a space.
/secondline second line parsing

Second line parsing is also possible.  For many products, the second line of the product is the AWIPS header:

   ** FPUS1 KIND 022030 ***
   SFPIN

which in this case is "SFPIN". To parse for this, specify either "FPUS1_KIND" or "/SFPIN".

If "all" is specified, all bulletins are searched.

Once the product header has been specified, the file will be opened and all products matching the given header will be displayed in their entirety.
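The pattern rules listed under Product Pattern Matching can be approximated by translating a pattern into a regular expression. The Python sketch below is an illustration under stated assumptions, not the matching code used by parse: the function names are hypothetical, "-" and "*" are assumed to match any run of characters, a pattern is assumed to match from the start of the header line, and only a leading "/" is treated as a second line pattern.

```python
import re

def wxp_pattern_to_regex(pattern):
    """Translate a product pattern into a Python regular expression (sketch)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch in ".?":
            out.append(".")            # single character
        elif ch in "-*":
            out.append(".*")           # assumed: any run of characters
        elif ch == "_":
            out.append(" ")            # underscore matches a space
        elif ch == "[":
            end = pattern.index("]", i)
            out.append(pattern[i:end + 1])   # [letters] or [^letters]
            i = end
        elif ch == "(":
            end = pattern.index(")", i)
            alts = pattern[i + 1:end].split("|")
            out.append("(?:" + "|".join(re.escape(a) for a in alts) + ")")
            i = end
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)

def product_matches(pattern, header, second_line=""):
    """Check one product against a pattern; 'all' matches every bulletin."""
    if pattern == "all":
        return True
    if pattern.startswith("/"):
        # Assumed: a leading slash selects the second line of the product.
        return re.match(wxp_pattern_to_regex(pattern[1:]), second_line) is not None
    return re.match(wxp_pattern_to_regex(pattern), header) is not None

print(product_matches("FPUS1_KIND", "FPUS1 KIND 022030", "SFPIN"))
print(product_matches("/SFPIN", "FPUS1 KIND 022030", "SFPIN"))
```

Both calls correspond to the two ways of selecting the state forecast product described above, once by its header and once by its second line.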

Selective Output

At times, the entire product is not desirable. By using a combination of the identifier resource and various output parameters, specific subsets of products can be displayed. By specifying a station identifier, the printing will start on a line that contains the identifier. Once an identifier is found, printing will continue until the end of product, unless otherwise specified.  The identifier can be:

Printing normally continues to the end of product. To terminate it earlier, use one of the parameters in the parameter resource: blank, 3blank, dollar, equal or line[=lines], as described under PARAMETERS.

Since more than one product can appear, it may be desirable to use only the first or last occurrence. Since products are continually appended to data files, it may be desirable to continue parsing even when the program has hit the end of file. This way the latest products will be printed as they are ingested. Additional parameters are available for these cases: first, last and cont, as described under PARAMETERS.
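The interaction between the identifier and the terminating parameters can be sketched in Python. This is a hedged illustration, not the logic of parse itself: the function select_lines is hypothetical, the terminating line is assumed to be dropped for blank and dollar and kept for equal, and the 3blank, first, last and cont parameters are left out.

```python
def select_lines(lines, identifier=None, stop=None, max_lines=1):
    """Return the lines of one product that would be printed (sketch).

    identifier: printing starts at the first line containing this string;
                with no identifier the whole product is a candidate.
    stop:       one of None, "blank", "dollar", "equal", "line".
    max_lines:  line count used when stop == "line".
    """
    selected = []
    printing = identifier is None
    for line in lines:
        text = line.rstrip()
        if not printing:
            if identifier not in text:
                continue
            printing = True
        if stop == "blank" and text == "":
            break                      # assumed: blank line is not printed
        if stop == "dollar" and "$" in text:
            break                      # assumed: dollar line is not printed
        selected.append(text)
        if stop == "equal" and text.endswith("="):
            break                      # assumed: line with trailing = is kept
        if stop == "line" and len(selected) >= max_lines:
            break
    return selected

zones = ["INZ001-", "FIRST ZONE", "$$",
         "INZ029-", "SECOND ZONE", "WIND 5 MPH", "$$",
         "INZ040-"]
print(select_lines(zones, identifier="INZ029", stop="dollar"))
```

With the made-up zone list above, only the segment that starts at INZ029 and ends before the next dollar sign line is returned, which mirrors the intent of combining the identifier resource with the dollar parameter.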

Header Files

The use of a header file can considerably improve access to data files. Rather than parsing the entire data file, which at times is larger than 1MB, the program can parse the product headers directly out of a header file. Header files are much smaller and parse very quickly. The header file contains byte offsets into the large data file.
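The idea of a header index with byte offsets can be illustrated with a small Python sketch. The actual WXP header file format is not described here, so everything below is an assumption: the index is held in memory as (header, offset) pairs, header lines are assumed to begin with "** " in the first column as in the examples, and the function names are hypothetical.

```python
import io

def build_index(raw):
    """Scan a raw data stream once and record (header, byte offset) pairs."""
    index = []
    raw.seek(0)
    while True:
        offset = raw.tell()
        line = raw.readline()
        if not line:
            break
        if line.startswith(b"** "):    # assumed header line marker
            header = line.decode("ascii", "replace").strip(" *\r\n")
            index.append((header, offset))
    return index

def read_product(raw, index, position):
    """Seek straight to one product using its stored offset."""
    start = index[position][1]
    raw.seek(start)
    if position + 1 < len(index):
        size = index[position + 1][1] - start
        data = raw.read(size)
    else:
        data = raw.read()              # last product runs to end of file
    return data.decode("ascii", "replace")

raw = io.BytesIO(
    b"** FPUS1 KIND 022030 ***\nSFPIN\nTEXT A\n"
    b"** FPUS53 KIND 022040 COR ***\nTEXT B\n"
)
index = build_index(raw)
print([header for header, _ in index])
print(read_product(raw, index, 1))
```

Searching the short list of headers and then seeking to a stored offset avoids rereading the whole data file for every query, which is the benefit the header file is described as providing.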

EXAMPLES

To parse for the latest state forecast from KIND

   parse -cu -nh=-12 -ph=FPUS1_KIND -pa=last
   ** FPUS1 KIND 022030 ***
   SFPIN
   INZ002>089-031000-

   STATE FORECAST FOR INDIANA
   NATIONAL WEATHER SERVICE INDIANAPOLIS IN
   330 PM EST THU OCT 2 1997

   .TONIGHT...FAIR AND WARMER. LOWS 50 TO 55.
   .FRIDAY...MOSTLY SUNNY...BREEZY AND WARMER. HIGHS 80 TO 85.
   .FRIDAY NIGHT...BECOMING MOSTLY CLOUDY. A CHANCE OF THUNDERSTORMS.
   LOWS IN THE LOWER 60S.
   .SATURDAY...MOSTLY CLOUDY...BREEZY AND A CHANCE OF THUNDERSTORMS.
   WARM. HIGHS MIDDLE 70S TO AROUND 80.

   .EXTENDED FORECAST...
   .SUNDAY AND MONDAY...MOSTLY CLEAR AND WARM. LOWS MIDDLE 50S TO AROUND
   60. HIGHS UPPER 70S TO LOWER 80S.
   .TUESDAY...PARTLY CLOUDY AND MILD. LOWS AROUND 50 TO MIDDLE 50S.
   HIGHS IN THE 70S.
   DS

To parse for the latest state forecast using the AFOS PIL. Note the in_file is specified since the product header does not appear in the parse.lup file.

   parse -cu -nh=-12 -if=for_dat -ph=/SFPIN -pa=last

To parse for the latest zone forecast

   parse -cu -nh=-12 -ph=FPUS53_KIND -id=%INZ029 -pa=dollar,last
   ** FPUS53 KIND 022040 COR ***
   INZ020>023-028>030-035-036-043-044-051-052-060-067-030930-
   CARROLL-CASS-CLAY-CLINTON-FOUNTAIN-KNOX-MIAMI-MONTGOMERY-PARKE-
   SULLIVAN-TIPPECANOE-VERMILLION-VIGO-WARREN-WHITE-
   INCLUDING THE CITIES OF...CRAWFORDSVILLE...FRANKFORT...LAFAYETTE...
   LOGANSPORT...TERRE HAUTE...VINCENNES
   330 PM EST THU OCT 2 1997

   .TONIGHT...PARTLY CLOUDY AND WARMER. LOW IN THE MIDDLE 50S. SOUTHWEST
   WIND 5 TO 10 MPH.
   .FRIDAY...MOSTLY SUNNY AND WARMER. HIGH 80 TO 85. BREEZY SOUTHWEST
   WIND 15 TO 20 MPH.
   .FRIDAY NIGHT...BECOMING MOSTLY CLOUDY. A 40 PERCENT CHANCE OF
   THUNDERSTORMS. MILD. LOW IN THE LOWER 60S.
   .SATURDAY...MOSTLY CLOUDY...BREEZY AND A 40 PERCENT CHANCE OF
   THUNDERSTORMS. MILD. HIGH IN THE UPPER 70S.

FILES

SEE ALSO


Last updated Oct 2, 1997