Input and Output Files

Most WXP programs require a data file to process. WXP can process many different types of files and plot information based on their content. Because of this, each program generally handles one specific type of file; there is no single generic program that processes every file type.

Name Conventions

The programs can use explicit filenames or generate filenames based on a naming convention. For data to be processed properly, it is critical that filenames are time stamped. In general, the filenames include the date, type of data and data format. These are all defined by the name convention. Because WXP allows users to define their own name conventions, it can work with a wide variety of data files and naming schemes.

The naming conventions are defined in a name convention file. The default file is "name.cnv", but WXP will also look for the WXP 5 file "name_conv". The structure of each line in the convention file is:

tag filename [headerfilename] [format] [params]

The tag is an abbreviation to reference the actual name convention. The filename is the name convention for the data file. The headerfilename is the header filename which is often saved by the ingestor as a quick lookup into the data file. This header file will contain WMO headers or GRIB parameters along with the byte offset into the data file. The format is the format of the data such as "grib", "nids", "gini", etc. The params are name convention parameters that need to be specified. This will include min search time and minimum allowable file size.
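The field layout above can be sketched in Python. This is only an illustration, not WXP's own parser: the Convention class and the parse_convention_line function are hypothetical names, and the sketch assumes that fields are separated by whitespace, that lines starting with "#" are comments, and that a lone "-" means "no header file", as in the sample listing later in this section.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Convention:
    """One entry of a name convention file (hypothetical container)."""
    tag: str
    filename: str
    header_filename: Optional[str] = None
    data_format: Optional[str] = None
    params: dict = field(default_factory=dict)


def parse_convention_line(line):
    """Parse one 'tag filename [headerfilename] [format] [params]' line."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None  # blank line or comment
    parts = text.split()
    if len(parts) < 2:
        raise ValueError(f"expected at least a tag and a filename: {line!r}")
    conv = Convention(tag=parts[0], filename=parts[1])
    # Assumption: "-" is a placeholder meaning there is no header file.
    if len(parts) > 2 and parts[2] != "-":
        conv.header_filename = parts[2]
    if len(parts) > 3:
        conv.data_format = parts[3]
    if len(parts) > 4:
        # Parameters look like "min=45,size=12000000".
        for item in parts[4].split(","):
            key, _, value = item.partition("=")
            conv.params[key] = value
    return conv


entry = parse_convention_line(
    "grib_nam %M/%y%m%d%6h_nam.grb %M/%y%m%d%6h_nam.hdr grib min=420"
)
print(entry.tag, entry.data_format, entry.params)
```

Parameter values are kept as strings here because the units of each parameter are not spelled out in this section.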

The name convention is often a data path plus a time stamped filename. Here is an example:

sfc_dat %D/%Y%m%d%h_sao.wmo

This is a tag for raw surface data (METARs, SAOs, etc). The name convention has:

%D - specifies the data path (see data_path resource)
%Y - 4 digit year
%m - 2 digit month
%d - 2 digit day of month
%h - 2 digit hour
_sao - type of data
.wmo - data format

Anything starting with a "%" is a wildcard that will be replaced to create the final filename. Here are the possible wildcards:

PATHS
%C	Converted data path - con_path resource
%D	Raw data path - data_path resource
%F	Database file path - file_path resource
%G	Grid file path - grid_path resource
%I	Image file path - image_path resource
%M	Model data file path - model_path resource
%R	Raw file path - raw_path resource
%S	Satellite data path - sat_path resource
%T	Text file path - text_path resource
%W	Severe watches file path - watch_path resource
%{xxxxx}	Generic path - value taken from resource specified with the braces

DATE
%Y	Year (4 digits)
%y	Year (last 2 digits)
%m	Month (01-12)
%B	Month abbreviation (JAN-DEC)
%b	Month abbreviation (jan-dec)
%j	Julian day (001-366)
%d	Day of month (01-31)
%h	Hour (00-23)
%n	Minute (00-59)

MISC
%e	Name extension
%x	Generic wildcard where "x" is any lower case letter not used in other wildcards

To handle multiple hour/minute files, a number can be added to the wildcard. A %6h represents a data file created once every 6 hours or a 6 hour compilation file. A %5n specifies that a data file is created once every 5 minutes.
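As a rough illustration of how these wildcards turn into a filename, here is a minimal Python sketch. It is not WXP code: the expand function and the directory values in PATHS are hypothetical, only two path wildcards are filled in, and the sketch assumes that a count such as %6h rounds the hour down to the nearest multiple of 6. Wildcards it does not know (for example %e or %{xxxxx}) are left untouched.

```python
import re
from datetime import datetime

# Hypothetical values standing in for the data_path and con_path resources.
PATHS = {"D": "/data/raw", "C": "/data/convert"}

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def expand(convention, when, paths=PATHS):
    """Replace path and date wildcards in a name convention."""
    fields = {
        "Y": lambda step: f"{when.year:04d}",
        "y": lambda step: f"{when.year % 100:02d}",
        "m": lambda step: f"{when.month:02d}",
        "B": lambda step: MONTHS[when.month - 1],
        "b": lambda step: MONTHS[when.month - 1].lower(),
        "j": lambda step: f"{when.timetuple().tm_yday:03d}",
        "d": lambda step: f"{when.day:02d}",
        # Assumption: %6h or %5n rounds down to a multiple of the count.
        "h": lambda step: f"{when.hour - when.hour % step:02d}",
        "n": lambda step: f"{when.minute - when.minute % step:02d}",
    }

    def replace(match):
        step = max(int(match.group(1) or 1), 1)
        code = match.group(2)
        if code in paths:
            return paths[code]
        if code in fields:
            return fields[code](step)
        return match.group(0)  # unknown wildcard: leave it in place

    return re.sub(r"%(\d*)([A-Za-z])", replace, convention)


when = datetime(2021, 1, 5, 14, 37)
print(expand("%D/%Y%m%d%h_sao.wmo", when))
print(expand("%C/%y%m%d%6h_nam.grb", when))
```

With the hypothetical paths above and a time of 14:37 on 5 January 2021, the first call yields /data/raw/2021010514_sao.wmo and the second yields /data/convert/21010512_nam.grb, because hour 14 falls in the 6 hour block that starts at 12.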

The in_file and out_file Resources

The name convention file defines tags that reference naming conventions. The in_file resource specifies the input name convention for a program. The out_file resource specifies the output name convention for a program. For example:

sfcplot -if=syn_cvt

This will tell the program to plot converted synoptic data rather than METAR data.

Also, these resources can be used to specify a full name convention. If the specification does not match a tag in the convention file, it will be assumed to be an actual name convention. For example:

sfcplot -if=%C/%h%m%d%y.sao

In prior versions of WXP, the data format also had to be specified. This is now handled by the format parameter in the name convention file. The format can still be specified with the input and output resources, but it is no longer required.

File Name Extension

The name extension is a way of using one name convention for more than one set of filenames. The "%e" wildcard is replaced with everything after the last "_" in the tag specification. A good example is NIDS data. A generic name convention can be set up:

nids %D/nids/%Y%m%d%h%n_%e.nid

When specifying the in_file resource, an extension can be appended to the tag:

-if=nids_n0r

Everything after the last underscore "_" is considered the extension and the "n0r" will be put into the file name for the %e wildcard.

If more than one tag starts with "nids" the first match will always be used. This means that a generic tag such as "grib" should always be placed after more specific tags such as "grib_nam".
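The lookup rules described in this section (tag match, extension substitution, first match wins, and falling back to treating the specification as a name convention) can be sketched as follows. This is a guess at the behavior, not WXP's implementation: the resolve function is a hypothetical name, and the sketch assumes a specification matches an entry when it equals the tag or starts with the tag followed by "_". The generic "grib" entry is made up for the example.

```python
def resolve(spec, conventions):
    """Return the name convention for an in_file/out_file specification.

    conventions is a list of (tag, filename) pairs in file order.
    """
    for tag, filename in conventions:
        # Assumption: a spec matches a tag exactly or as "tag_extension".
        if spec == tag or spec.startswith(tag + "_"):
            # Everything after the last underscore is the extension.
            extension = spec.rsplit("_", 1)[1] if "_" in spec else ""
            return filename.replace("%e", extension)
    # No tag matched: treat the spec itself as a name convention.
    return spec


conventions = [
    ("grib_nam", "%M/%y%m%d%6h_nam.grb"),
    ("grib", "%M/%y%m%d%6h_%e.grb"),  # hypothetical generic entry
    ("nids", "%D/nids/%Y%m%d%h%n_%e.nid"),
]

print(resolve("nids_n0r", conventions))
print(resolve("grib_nam", conventions))
print(resolve("%C/%h%m%d%y.sao", conventions))
```

If the generic "grib" entry were listed before "grib_nam", the specification "grib_nam" would match the generic entry first, which is why specific tags belong earlier in the file.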

Generic Wildcards

A generic wildcard is any lower case letter not already defined as a wildcard. Going back to the NIDS example:

nids %D/nids/%i/%Y%m%d%h%n_%t.nid

When specifying the in_file resource, wildcard values can be appended to the tag:

-if=nids:i=DIX:t=n0r

The string "DIX" will replace the "%i" specification in the name convention and "n0r" will replace the "%t" specification. Some programs, such as radplot, set the %i and %t wildcards from the identifier and type resources:

radplot -if=nids -id=DIX -ty=n0r

This simplifies name convention specification.
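The substitution of generic wildcards can be sketched in a few lines of Python. The split_spec and apply_wildcards functions are hypothetical names used only for illustration, and the sketch assumes the specification is a tag followed by colon-separated "letter=value" pairs, as in the example above.

```python
def split_spec(spec):
    """Split 'nids:i=DIX:t=n0r' into the tag and its wildcard values."""
    tag, *assignments = spec.split(":")
    values = {}
    for item in assignments:
        name, sep, value = item.partition("=")
        # Generic wildcards are single lower case letters.
        if not sep or len(name) != 1 or not name.islower():
            raise ValueError(f"bad wildcard assignment: {item!r}")
        values[name] = value
    return tag, values


def apply_wildcards(convention, values):
    """Replace each %x in the convention with the value given for x."""
    for name, value in values.items():
        convention = convention.replace("%" + name, value)
    return convention


tag, values = split_spec("nids:i=DIX:t=n0r")
print(tag, values)
print(apply_wildcards("%D/nids/%i/%Y%m%d%h%n_%t.nid", values))
```

For the NIDS convention above, this leaves the path and date wildcards in place and produces %D/nids/DIX/%Y%m%d%h%n_n0r.nid.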

Sample Name Convention

Here is a sample listing:

#
# Surface SAO/METAR data
#
sfc_dat %D/%y%m%d%h_sao.wmo
sfc_xml %C/%y%m%d%h_sao.xml - xml
sfc_cdf %C/%y%m%d%h_sao.cdf - cdf
sfc_cvt %C/%y%m%d%h_sao.wxp - wxp min=60
#
# RCM radar data
#
rcm_dat %D/%y%m%d%h%30n_rcm.wmo - wmo
rcm_cvt %C/%y%m%d%h%30n_rcm.wxp - rcm min=60
#
# Upper air data
#
upa_dat %D/%y%m%d%12h_upa.wmo - wmo
upa_cvt %C/%y%m%d%12h_upa.wxp - wxp
#
# NOAAPORT Satellite data
#
sat_vis_ec %S/%y%m%d%h%15n_gecvi.sat - giniz min=45,size=12000000
sat_ir_ec %S/%y%m%d%h%15n_geci11.sat - giniz min=45,size=850000
#
# GRIB data
#
grib_nam2 %M/%y%m%d%6h_nam2.grb %M/%y%m%d%6h_nam2.hdr grib min=420
grib_nam5 %M/%y%m%d%6h_nam5.grb %M/%y%m%d%6h_nam5.hdr grib min=420
grib_nam %M/%y%m%d%6h_nam.grb %M/%y%m%d%6h_nam.hdr grib min=420
#
# NIDS data
#
nids %{nids_path}/%i/%y%m%d%h%n_%t.nid - nids min=12

Ingest Programs

The ingest program, whether it is the WXP wmoingest program or the LDM, will produce several types of files based on the content of the incoming data. These include surface, upper air, radar, NIDS, GRIB and text data files.

The ingestor has a configuration file which defines which data types and products are to be saved as well as defining the file name convention that these files will have. The setup file is either ingest.prd for the WXP ingestor or the pqact.conf file for the LDM.

Products File

This is the file the WXP ingestor uses to determine output. This file maps WMO headers to output name conventions. This file is either "ingest.prd" or whatever the prod_file resource is set to. The syntax is:

header action filename [headerfilename]

The header is the WMO header to match on. These are generally regular expressions to match on a set of possible headers. The action is:

Action
>	Write to file (overwrites)
>>	Appends to file
|	Pipes to script or program
P	Create a Product Arrival Notification (PAN)
B	Binary data (do not strip non-printable characters)
R	Save data exactly as it arrives
U	Only select data if it hasn't already been selected
+##	Subtract ## minutes from time before generating filename
-##	Add ## minutes to time before generating filename

The name conventions for filename and headerfilename are basically identical to those used in the name convention file. The major difference is the "p" prefix on the time wildcards (for example, %pY or %ph). This tells the ingestor to use the time specified in the WMO header rather than clock time.

For example, textual forecasts may go into a forecast file based on the date in the WMO header and then have a file extension describing the type of data in that file:

FO >> %D/data/%pY%pm%pd%ph_mod.wmo %D/data/%pY%pm%pd%ph_mod.hdr

This will put all the MOS products in a file with a mod.wmo extension. It will also create a header file for quick searching of data.

U[^AB] >>-65 %D/data/%Y%m%d%12h_upa.wmo

This will put upper air data into 12 hourly files and will start saving data into each file 65 minutes early. So data will start going into the 12Z file at 1055Z and will continue until 2255Z, when data will start going into the 00Z file.

S[AP] >>-15 %D/data/%Y%m%d%h_sao.wmo

This will write surface observations to an hourly file but the file will start writing 15 minutes before the hour. This is done to pick up observations sent before the top of the hour.
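The arithmetic behind these two examples can be checked with a short Python sketch. It is an illustration only: the file_time function is a hypothetical name, and the sketch assumes the ingestor first adds the lead time to the current time and then rounds down to the start of the file interval, which reproduces the 1055Z and 2255Z switch times quoted above.

```python
from datetime import datetime, timedelta


def file_time(now, lead_minutes, hours_per_file=1):
    """Return the time stamp of the file that data arriving at 'now' goes to.

    lead_minutes is the ## from an action such as '>>-65'.
    """
    # Assumption: shift the clock forward, then round down to the interval.
    shifted = now + timedelta(minutes=lead_minutes)
    hour = shifted.hour - shifted.hour % hours_per_file
    return shifted.replace(hour=hour, minute=0, second=0, microsecond=0)


# Upper air: '>>-65' with %12h files.
print(file_time(datetime(2021, 1, 5, 10, 54), 65, 12))  # still the 00Z file
print(file_time(datetime(2021, 1, 5, 10, 55), 65, 12))  # now the 12Z file
print(file_time(datetime(2021, 1, 5, 22, 55), 65, 12))  # next day's 00Z file

# Surface: '>>-15' with hourly files.
print(file_time(datetime(2021, 1, 5, 13, 45), 15))  # the 14Z file
```

Because the shift is applied before rounding, the date rolls over correctly: data arriving at 2255Z goes into the 00Z file of the following day.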

Decoder Programs

Each decoder has default tags that are used to find the naming conventions for its input and output files. For example, when converting surface data using the sfcdec program, the input tag is sfc_dat and the output tag is sfc_cvt. These are then cross-referenced in the name convention file:

sfc_dat %D/%Y%m%d%h_sao.wmo
sfc_cvt %C/%Y%m%d%h_sao.wxp

The tags themselves can be changed using the in_file and out_file resources.

The out_file resource can also be used to select the type of decoded output file, such as ASCII (wxp) or netCDF (cdf). The name convention specifies the output file name, while the format parameter in the name convention file specifies the type of output:

sfc_cvt_wxp %C/%Y%m%d%h_sao.wxp - wxp
sfc_cvt_cdf %C/%Y%m%d%h_sao.nc - cdf
sfc_cvt_xml %C/%Y%m%d%h_sao.xml - xml

Updated January 2021
