Data Loader Options

The data loader was completely redesigned for Saada 1.5. The goal was to make it faster and much simpler to use, thanks to a default behavior that spares beginners from dealing with complicated concepts (class mapping, product configuration).




Using the dataloader in command line mode

The dataloader can be spawned like any Java program, but the classpath set-up is quite long. Basically, all needed libraries can be found either in the $SAADA_DB_HOME/lib directory or in the $SAADA_DB_HOME/jtools subdirectories. The dataloader can also be run with the sant script provided with the distribution.

- Java Mode : java -classpath .... saadadb.dataloader.loader [params...] SaadaDBName
  In this mode, parameters have the following form : -parameter=value

- Sant Mode : sant data.load [params]
  In this mode, parameters have the following form : -Dparameter=value
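The two modes differ only in the prefix put in front of each parameter. As a minimal sketch (Python, with a hypothetical helper `format_params` and a made-up collection name, neither of which is part of Saada), the same parameter set could be rendered for either mode like this:

```python
def format_params(params, mode):
    """Render loader parameters as "-name=value" (Java mode)
    or "-Dname=value" (Sant mode)."""
    prefixes = {"java": "-", "sant": "-D"}
    if mode not in prefixes:
        raise ValueError(f"unknown mode: {mode!r}")
    prefix = prefixes[mode]
    return [f"{prefix}{name}={value}" for name, value in params.items()]


# "MyCollection" is a hypothetical collection name, for illustration only.
params = {"collection": "MyCollection", "category": "image"}
print(format_params(params, "java"))
print(format_params(params, "sant"))
```

The helper only builds the argument strings; it does not launch the dataloader, and it does not handle valueless flags such as -debug.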

Data loading in Default Mode

Only 3 parameters are required to load data :

- -collection=value : Gives the name of the collection where data files will be stored.

- -filename=value : Gives the file to load. If value is a directory, the dataloader will try to load all its content. All FITS files and all VOTables having an extension matching the requested category will be loaded.

- -category=value : Gives the category of the products to be loaded. The category can have one of the following values :

  • misc : The dataloader looks for either FITS files or VOTables. Saada will ingest all keywords of the primary header.
  • image : The dataloader looks for FITS files having one extension of IMAGE type. Saada will ingest all keywords of the primary header and all keywords of the IMAGE extension. When a keyword is defined in both the first HDU and the extension, the extension value is kept.
  • spectrum : The dataloader looks for either FITS files having one extension of BINTABLE type or VOTables having one TABLE. Saada will ingest all keywords of the primary header, as for the misc category. If the spectral data are in another extension, the -extension parameter must be used (see below).
  • table : The dataloader looks for either FITS files having one extension of BINTABLE type or VOTables having one TABLE. Saada will ingest all keywords of the primary header and all keywords of the TABLE extension. When a keyword is defined in both the first HDU and the extension, the extension value is kept. Saada will also store the row data, with one Saada column for each product column.
  • flatfile : The dataloader will store all files in the Saada repository.

In default mode, all products with the same format (keyword set), the same category and the same collection are stored in the same Saada class (and in the same SQL table). The dataloader attempts to detect the coordinate system automatically; FK5/J2000 is taken by default. WCS keywords, position, position errors and spectral ranges are also detected automatically. In case of failure, these fields are not set but the products are loaded anyway.

Extension to be loaded

If the given extension does not match the product category, the product is rejected.

- -extension=value : The extension can be given either by name (e.g. -extension=I/131A/sao) or by number (e.g. -extension=#1). The number must be prefixed with a ’#’. Extension numbers start at 0.
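To make the two forms of the -extension value concrete, here is a minimal Python sketch that tells them apart. The function name `parse_extension` and its return shape are hypothetical; this is not how Saada itself parses the parameter.

```python
def parse_extension(value):
    """Split an -extension value into ("number", int) or ("name", str).

    A leading '#' marks an extension number; numbers start at 0.
    Anything else is treated as an extension name.
    """
    if value.startswith("#"):
        number = int(value[1:])  # raises ValueError if not an integer
        if number < 0:
            raise ValueError("extension numbers start at 0")
        return ("number", number)
    if not value:
        raise ValueError("empty extension")
    return ("name", value)


print(parse_extension("#1"))
print(parse_extension("I/131A/sao"))
```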

The coordinates

These parameters do not concern MISC products or TABLE headers. All coordinate parameters (position, errors and system) can be either taken from dataloader parameters or automatically detected in the data. The method to use is set by the coordinate mapping parameters. If a mapping priority is only, values are taken from the dataloader parameters alone. If the priority is first, values are first taken from the dataloader parameters and, in case of failure, they are searched in the product data. If the priority is last, values are first searched in the product data and, in case of failure, they are taken from the dataloader parameters. The default priority is always first.
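The three priorities can be summarized by a small decision function. The following Python sketch is only an illustration of the rule described above; the name `resolve` is hypothetical, and `None` stands for "not given" or "detection failed".

```python
def resolve(priority, from_params, from_data):
    """Pick a value according to the mapping priority.

    from_params: value given on the command line, or None.
    from_data:   value detected in the product, or None.
    """
    if priority == "only":
        return from_params
    if priority == "first":
        return from_params if from_params is not None else from_data
    if priority == "last":
        return from_data if from_data is not None else from_params
    raise ValueError(f"unknown priority: {priority!r}")


print(resolve("first", None, "FK5"))       # falls back to the product data
print(resolve("last", "GALACTIC", "FK5"))  # product data wins
print(resolve("only", None, "FK5"))        # product data is never used
```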

- The Coordinate System :

  • -sysmapping=only|first|last : Gives the priority of the coordinate system mapping (see above).
  • -system=SYSTEM[,DEQUINOX] : Gives the system applied to the loaded products. The system can be GALACTIC, ICRF, FK4 or FK5. The equinox can be set for FK4 and FK5. When this parameter is used (see priority), all products are assumed to use this system. Their coordinates are converted from this system to the SaadaDB system at loading time to populate collection attributes.

- The Position : The coordinates can be given by keywords, by object name or by numeric values.

  • -posmapping=only|first|last : Gives the priority of the usage of the position given in parameter (see above).
  • -position=value : Defines a position.
    • -position=KW1,KW2 : Keyword KW1 is taken as latitude and KW2 is taken as longitude. Both keywords must exist for this mapping to be used.
    • -position='OBJECT_NAME' : The value enclosed in single quotes is an object name which is resolved by the dataloader. The name resolution must be successful for the mapping to be used.
    • -position=numeric_1,numeric_2 : The position is given by a pair of numeric values, expressed either in decimal (1.234) or in sexagesimal notation (D:M:S or H:M:S). Two values must be given for this mapping to be used.

- The Position Errors : Position errors are assumed to be elliptic. Error ellipses are defined by 3 parameters : major axis (maj), minor axis (min) and the angle between the north axis and the major axis (angle). If only one parameter is given, the ellipse is assumed to be a circle. If only 2 parameters are given, the ellipse is assumed to be North to South oriented; the first parameter is taken as the error along the latitude and the second as the error along the longitude. Errors are assumed to be expressed in degrees unless another unit is given. Angles are always considered to be in degrees.

  • -poserrormapping=only|first|last : Gives the priority of the usage of the position error given in parameter (see above).
  • -poserrorunit=unit : Gives the unit used to express errors in the product to load. Unit can be deg, arcmin, arcsec, mas or uas.
  • -poserror=KW1,KW2,KW3 : Keyword KW1 is taken as the major axis, KW2 as the minor axis and KW3 as the angle. At least one keyword must be given.
  • -poserror=num_1,num_2,num_3 : Numeric values of the error ellipse. At least one value must be given.
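The rules for one, two or three error values can be sketched as follows. This is a hedged Python illustration, not Saada code: the names `TO_DEGREES` and `error_ellipse` are hypothetical, and using an angle of 0 for the two-value case is one reading of "North to South oriented".

```python
# Conversion factors to degrees for the units accepted by -poserrorunit.
TO_DEGREES = {
    "deg": 1.0,
    "arcmin": 1.0 / 60.0,
    "arcsec": 1.0 / 3600.0,
    "mas": 1.0 / 3.6e6,
    "uas": 1.0 / 3.6e9,
}


def error_ellipse(values, unit="deg"):
    """Return (major, minor, angle), all in degrees, from 1 to 3 values."""
    if not 1 <= len(values) <= 3:
        raise ValueError("1 to 3 values are expected")
    factor = TO_DEGREES[unit]
    major = values[0] * factor
    # One value only: the ellipse degenerates into a circle.
    minor = values[1] * factor if len(values) >= 2 else major
    # The angle is always in degrees, so no unit conversion is applied.
    # Assumption: without an explicit angle, 0 is used.
    angle = values[2] if len(values) == 3 else 0.0
    return (major, minor, angle)


print(error_ellipse([2.0]))
print(error_ellipse([3.0, 1.5, 45.0], unit="arcmin"))
```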

Spectral Coordinates

Saada attempts to extract a spectral range (DISPERSION or WAVELENGTH) from each spectrum. This range is converted into the Saada unit to set the spectral range at collection level. This feature is necessary to implement a proper SSA service. Spectra are loaded anyway, even when the spectral range cannot be set.

- -spcmapping=only|first|last : Gives the usage priority of the spectral range mapping parameters (see above).

- -spcrange=value : Defines how to compute the spectral range.

  • -spcrange=KW : Considers the table data column KW as the spectral dispersion. The range is defined by the column extrema.
  • -spcrange=value1,value2 : Takes both numeric values as the spectral range.

- -spcunit=unit : Gives the unit of the product spectral range. The default unit is CHANNEL NUMBER, which cannot be converted into any other unit.
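The two forms of -spcrange can be illustrated with a short Python sketch. The function `spectral_range` and the `columns` dictionary (a stand-in for the table data of a spectrum) are hypothetical and only mirror the behavior described above.

```python
def spectral_range(spcrange, columns):
    """Return (low, high) for an -spcrange value, or None on failure.

    columns maps column names to lists of numbers.
    """
    parts = spcrange.split(",")
    if len(parts) == 2:
        # Two numeric values: taken directly as the range.
        try:
            return (float(parts[0]), float(parts[1]))
        except ValueError:
            return None
    if len(parts) == 1:
        # One keyword: the range is given by the column extrema.
        data = columns.get(spcrange)
        if data:
            return (min(data), max(data))
    # The range cannot be set; the spectrum would be loaded anyway.
    return None


table = {"WAVE": [4000.0, 3500.0, 6800.0]}  # made-up sample data
print(spectral_range("WAVE", table))
print(spectral_range("3000,7000", table))
```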

Special keyword mapping

Some keywords can be used in a specific way. For instance, they can be rejected, used to build product names, or used to populate collection attributes added by the operator at creation time (user attributes).

- -name=KW1,String,KW2... : When this option is set, product names (collection attribute namesaada) are built by concatenating the keyword values and the constant strings listed in the parameter value. Constant strings are enclosed in single quotes. Example : -name=OBJECT,’seen by’,TELESCOP will generate names made of the value of the OBJECT keyword, followed by the string ’seen by’, followed by the value of the TELESCOP keyword. Name components are separated by blanks. Components not found are replaced with empty strings.

- -ename=KW1,String,.... : This parameter works the same way as above but it is used for table entries. In this case, keywords are table column names.

- -ignore=KW1,KW2 : Gives a list of keywords which must be ignored by the dataloader. Wildcards are accepted here. Example : -ignore=NAXIS,T* makes the dataloader ignore NAXIS keywords and all other keywords beginning with a ’T’.

- -eignore=KW1,KW2 : This parameter works the same way as above but it is used for table entries. In this case, keywords are table column names.

- -ukw UserKeyword=KW : This double parameter associates a user keyword UserKeyword with the product keyword KW. If the keyword KW is not found, the user keyword is set to null. The double parameter can be set several times in the command line.

- -eukw UserKeyword=KW : This parameter works the same way as above but it is used for table entries. In this case, keywords are table column names.
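The name building and keyword filtering rules can be sketched in a few lines of Python. Both helpers (`build_name`, `is_ignored`) are hypothetical. Shell-style matching through the standard `fnmatch` module is an assumption about what "wildcards" means here, and splitting on commas means a constant string containing a comma is not supported by this sketch.

```python
from fnmatch import fnmatchcase

QUOTES = ("'", "\u2019")  # straight and typographic single quotes


def build_name(spec, keywords):
    """Build a product name from a -name value and a keyword dictionary."""
    parts = []
    for item in spec.split(","):
        if len(item) >= 2 and item[0] in QUOTES and item[-1] in QUOTES:
            parts.append(item[1:-1])  # constant string, quotes removed
        else:
            # Keyword value; a missing keyword yields an empty string.
            parts.append(str(keywords.get(item, "")))
    return " ".join(parts)  # components are separated by blanks


def is_ignored(keyword, patterns):
    """True if the keyword matches one of the -ignore patterns."""
    return any(fnmatchcase(keyword, pattern) for pattern in patterns)


header = {"OBJECT": "M31", "TELESCOP": "XMM", "NAXIS": 2}  # made-up header
print(build_name("OBJECT,'seen by',TELESCOP", header))
print([kw for kw in header if not is_ignored(kw, ["NAXIS", "T*"])])
```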

Class Mapping

Saada can merge the keywords of various products into a single class (or SQL table): this is the SAADA_FUSION mode. Otherwise, products with the same keyword set are put in the same class: this is the SAADA_CLASSIFIER mode. SAADA_CLASSIFIER is the default mode, with class names generated by Saada. Class names are unique across the whole database.

- -classifier=ClassName : Asks the dataloader to run in SAADA_CLASSIFIER mode. The ClassName is mandatory. If multiple classes have to be created because the input product set is heterogeneous, classes are suffixed with a rank number (ClassName_1, ...). If the class name is already used (ClassName to ClassName_N), new classes are also suffixed with a rank number (ClassName_N+1).

- -classfusion=ClassName : Asks the dataloader to run in SAADA_FUSION mode. The ClassName is mandatory. If the class name is already used (ClassName to ClassName_N), the new class is suffixed with a rank number (ClassName_N+1).
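The rank suffix rule can be illustrated with the following Python sketch. It reflects one reading of the description above (the bare name counts as rank 0, and the next free name uses the highest existing rank plus one); the function `next_class_name` is hypothetical and Saada's actual naming logic may differ in details.

```python
def next_class_name(base, existing):
    """Return base if it is unused, else base_<highest rank + 1>."""
    prefix = base + "_"
    highest = None
    for name in existing:
        if name == base:
            rank = 0
        elif name.startswith(prefix) and name[len(prefix):].isdigit():
            rank = int(name[len(prefix):])
        else:
            continue  # unrelated class name
        highest = rank if highest is None else max(highest, rank)
    if highest is None:
        return base
    return f"{base}_{highest + 1}"


# "Spectra" is a made-up class name, for illustration only.
print(next_class_name("Spectra", set()))
print(next_class_name("Spectra", {"Spectra", "Spectra_1", "Spectra_2"}))
```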

Repository Mode

Sets the input file management mode (see article 197) :

- -repository=move : All loaded files are moved into the repository. In this mode, the download facility is properly set and input files are removed from their original location. This mode is a good way to save disk space.

- -repository=no : The download facility stores the original path of the input files in order to retrieve them. This mode saves copy time, but it assumes that input files will not be moved; otherwise the download facility can fail.

 -debug : Makes the dataloader quite talkative.

last update 2016-06-07