# Processor configuration #

## AOI ##

The *AOI* entity represents a logical hierarchy in a processor definition that is also mapped in the DES folder structure. Each processor can have 1 - n *AOI* entities, which may share the same name if required.

```xml
<EURAC_GENERIC_P>
  <Processing>
    <AOI name="yourAOI" active="yes">
      ...
      ...
    </AOI>
  </Processing>
</EURAC_GENERIC_P>
```

### Attributes ###

**name** =
: Name of the AOI

**active** = yes | no
: Enable or disable AOI

[**sleepFactor**] = positive int
: Factor that multiplies *AOI_THREAD_SLEEP*, the sleeping interval for the threads inside AOIs. This allows individual AOIs to run their threads with longer sleeping intervals

### Tags ###

[[Distribution] [Download] [Task] [TaskGroup] [DataCleaner]] (1 - n)

## Distribution ##

The *Distribution* entity provides file transfer functionality for uploading data.

```xml
<Distribution type="ftp" active="no" priority="NORM">
  <consumer>
  </consumer>
</Distribution>
```

### Attributes ###

**type** = ftp | sftp
: Transfer protocol

**active** = yes | no
: Enable or disable Distribution

**priority** = MIN | NORM | MAX
: Thread priority

### Tags ###

Consumer (1 - n)

## Download ##

The *Download* entity provides file transfer functionality for downloading data.

```xml
<Download type="sftp" active="yes" priority="NORM">
  <consumer>
  </consumer>
</Download>
```

### Attributes ###

**type** = sftp
: Transfer protocol

**active** = yes | no
: Enable or disable Download

**priority** = MIN | NORM | MAX
: Thread priority

### Tags ###

Consumer (1 - n)

## Consumer ##

The *Consumer* entity represents the configuration for *Distribution* and *Download* entities.

```xml
<consumer name="user@host.yourdomain" active="yes">
  <ProductType>*.txt</ProductType>
  <EMail>mail@yourdomain</EMail>
  <NotifyEmail>false</NotifyEmail>
  <Recursive>true</Recursive>
</consumer>
```

### Attributes ###

**name** =
: Name reference for authentication in DistributionAuth.xml

**active** = yes | no
: Enable or disable Consumer

NOTE: Each name reference in DistributionAuth.xml should be unique. It is strongly suggested to use a naming convention such as user@hostname, as shown in the following example:

```xml
<AuthRef name="user@host.yourdomain" active="yes">
  <Host>host.yourdomain</Host>
  <User>user</User>
  <Pwd>pass</Pwd>
  <Param></Param>
</AuthRef>
```

### Tags ###

**RunOnTrigger** =
: Absolute path of the file that serves as trigger (has priority over StartDate/StopDate and overrides the Runnable state). The trigger file is deleted iff ExitCode == 0 || ExitCode == 1

**StartDate** =
: Date and time to start the Task (YYYY-MM-DDThh:mm:ss)

**StopDate** =
: Date and time to stop the Task (YYYY-MM-DDThh:mm:ss)

**FileType** (1 - n) =
: Regular expression for file type filtering (default: `*.*`)

**MD5Filter** = true | false
: Filters the files defined in FileType/ProductType by checking for an existing MD5 file. NOTE: for Downloads this can be set to 'false' whenever the server-side file does not have an equivalent md5 file. For Distributions and Tasks it is advisable to keep the default 'true' value in order to have a file completion validation; otherwise the dataflow logic needs to guarantee file completion (default: true)

**MD5Generation** = true | false
: Generates the MD5 file if it is not present. NOTE: `<MD5Filter>` must be false for this to have an effect

**MD5Check** = true | false
: Iff used in a `<Download>` and `<MD5Filter>` is true, recomputes the md5 checksum and compares it to the .md5 file

**EMail** (1 - n) =
: Email notification recipient

**NotifyEmail** = true (OnError) | OnSuccess | OnError | OnWarning | false
: Enable e-mail notifications on data delivery

**RemoteDestinationPath** =
: Path of the directory on the destination Host (default: $AOI/$PROCESSOR). The deprecated keyword 'DestinationPath' is still valid

**LocalDestinationPath** =
: Path of the directory on the local Host (default: $BASEPATH/$AOI/$PROCESSOR)

**ProductRename** =
: File name delivered to the remote Host (default: file name as filtered by FileType)

**Recursive** = true | false
: Enable recursive filtering and upload of data in directory hierarchies. Currently only applicable to `<Distribution>`

## DataCleaner ##

The *DataCleaner* entity provides functionality for deleting filesystem contents.

```xml
<DataCleaner name="HOME" type="filesystem" active="yes" priority="NORM">
  <EMail>mail@host.yourdomain</EMail>
  <NotifyEmail>yes</NotifyEmail>
  <IncludePath>/raid0/abz01rstest.eurac.edu/AdminCoA/*</IncludePath>
  <ExcludePath>/raid0/abz01rstest.eurac.edu/AdminCoA/doNOTdelete/*</ExcludePath>
  <ExcludeFileType>*.java</ExcludeFileType>
  <Size>+1k</Size>
  <LastAccess>+1</LastAccess>
</DataCleaner>
```

### Attributes ###

**name** =
: Name of DataCleaner

**type** = filesystem
: Type of DataCleaner

**active** = yes | no
: Enable or disable DataCleaner

**priority** = MIN | NORM | MAX
: Thread priority

### Tags ###

**RunOnTrigger** =
: Absolute path of the file that serves as trigger (has priority over StartDate/StopDate and overrides the Runnable state). The trigger file is deleted iff ExitCode == 0 || ExitCode == 1

**StartDate** =
: Date and time to start the DataCleaner (YYYY-MM-DDThh:mm:ss)

**StopDate** =
: Date and time to stop the DataCleaner (YYYY-MM-DDThh:mm:ss)

**EMail** (1 - n) =
: Email notification recipient

**NotifyEmail** = true (OnError) | OnSuccess | OnError | OnWarning | false
: Enable e-mail notifications

**IncludePath** (1 - n) =
: Path to be included in the cleaning policy

**ExcludePath** (1 - n) =
: Path to be excluded from the cleaning policy

**ExcludeFileType** (1 - n) =
: Regular expression for file types to be excluded from the cleaning policy

**Size** = +int (+1c, +1k, +1M, +1G)
: Size in bytes, kilobytes, megabytes or gigabytes (default: MIN_SIZE = "+1000k")

**LastAccess** = +int
: Number of days (default: LAST_ACCESS = "30")

**LastModified** = +int
: Number of days ago from the current date (default: LAST_MODIFIED = "+30")

**Type** = **f** (file) | **d** (directory) | **l** (link)
: Type of data, only a single value allowed (default: f)

NOTE: Only available on Linux/UNIX compatible environments or on Windows with Cygwin

## Task ##

The *Task* entity encapsulates a given command or piece of code that is executed concurrently in a dedicated execution thread.

```xml
<Task name="testPerl" type="cmd" active="yes" priority="NORM">
  <EMail>mail@host.yourdomain</EMail>
  <NotifyEmail>false</NotifyEmail>
  <Command>perl</Command>
  <FileType>*.txt</FileType>
  <Parameter>./test_data/scripts/testPerl.pl</Parameter>
  <Parameter>$CWD</Parameter>
</Task>
```

### Attributes ###

**name** =
: Name of executing Task

**type** = mail | class | code | cmd
: Execution type

**active** = yes | no
: Enable or disable Task

**priority** = MIN | NORM | MAX
: Thread priority

### Tags ###

**RunOnTrigger** =
: Absolute path of the file that serves as trigger (has priority over StartDate/StopDate and overrides the Runnable state). The trigger file is deleted iff ExitCode == 0 || ExitCode == 1

**StartDate** =
: Date and time to start the Task (YYYY-MM-DDThh:mm:ss)

**StopDate** =
: Date and time to stop the Task (YYYY-MM-DDThh:mm:ss)

**EMail** (1 - n) =
: Email notification recipient

**NotifyEmail** = true (OnError) | OnSuccess | OnError | OnWarning | false
: Enable e-mail notifications

**Command** = class | code | cmd (sh | perl | python | cmd)
: Command to be executed

**FileType** | ProductType (1 - n) =
: Regular expression for file type filtering. If not defined, the task is executed without passing a file list

**LocalDestinationPath** =
: Relative path of the directory in which to list the files specified by a given `<FileType>` (relative to directory: $BASEPATH/$AOI/$PROCESSOR)

**MD5Filter** = true | false
: Filters the files defined in FileType/ProductType by checking for an existing MD5 file. NOTE: for Downloads this can be set to 'false' whenever the server-side file does not have an equivalent md5 file. For Distributions and Tasks it is advisable to keep the default 'true' value in order to have a file completion validation; otherwise the dataflow logic needs to guarantee file completion (default: true)

**LOCKFilter** = true | false
: Filters the files defined in FileType/ProductType by checking for an existing LOCK file. This is used whenever a given Task should have exclusive access to a file, i.e. for parallelization. The LOCK file is deleted automatically whenever a Task returns from the execution on a given file

**Parameter** (1 - n) =
: Parameters to Command

**<...>** (1 - n) =
: Custom tags that can be added and are passed as a hashtable (only available to Commands of type class)

## TaskGroup ##

The *TaskGroup* entity encapsulates a given set of *Task* entities and provides *serial* or *parallel* execution.

```xml
<TaskGroup name="taskGroup1" type="parallel" active="yes" priority="NORM">
  <Task>
    ...
  </Task>
  <Task>
    ...
  </Task>
</TaskGroup>
```

### Attributes ###

**name** =
: Name of the TaskGroup

**type** = parallel | serial
: Run tasks in parallel or serial mode within this TaskGroup

**active** = yes | no
: Enable or disable the TaskGroup

**priority** = MIN | NORM | MAX
: Thread priority

### Tags ###

**RunOnTrigger** =
: Absolute path of the file that serves as trigger (has priority over StartDate/StopDate and overrides the Runnable state). The trigger file is deleted iff ExitCode == 0 || ExitCode == 1

**StartDate** =
: Date and time to start the TaskGroup (YYYY-MM-DDThh:mm:ss)

**StopDate** =
: Date and time to stop the TaskGroup (YYYY-MM-DDThh:mm:ss)

**EMail** (1 - n) =
: Email notification recipient

**NotifyEmail** = true (OnError) | OnSuccess | OnError | OnWarning | false
: Enable e-mail notifications

**Task** (1 - n)

## Built-in variables ##

Inside a *Processor* configuration file, built-in variables can be used to keep the configuration generic. They are replaced at runtime in all `<Parameter>` tags and in all custom tags (e.g. *$DAY*).

*$BASEPATH*
: The base path variable configured in *DES.ini*

*$AOI*
: The name of the executing *AOI*

*$PROCESSOR*
: The name of the executing *Processor*

*$CWD*
: The current working directory, which expands to *$BASEPATH/$AOI/$PROCESSOR*

*$STAMPS_PATH*
: The path for Stamp files configured in *DES.ini*

*$FILE*
: Iff `<FileType>` is configured, references the current file being processed (`<ProductType>` is deprecated but still valid for backward compatibility)

*$YEAR*
: The current year

*$MONTH*
: The current month

*$DAY*
: The current day
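
As an illustration of how the built-in variables might be combined, the following sketch configures a Task that hands each matching file and the current date to a script. The task name, the script path and the e-mail address are hypothetical placeholders, not values taken from a real installation. Only tags and attributes described in the Task section are used.

```xml
<!-- Sketch only: "archiveDaily" and ./scripts/archive.pl are hypothetical -->
<Task name="archiveDaily" type="cmd" active="yes" priority="NORM">
  <EMail>mail@host.yourdomain</EMail>
  <NotifyEmail>OnError</NotifyEmail>
  <Command>perl</Command>
  <!-- $FILE is only available because FileType is configured -->
  <FileType>*.txt</FileType>
  <Parameter>./scripts/archive.pl</Parameter>
  <!-- Replaced at runtime with the file currently being processed -->
  <Parameter>$FILE</Parameter>
  <!-- Replaced at runtime with $BASEPATH/$AOI/$PROCESSOR -->
  <Parameter>$CWD</Parameter>
  <!-- Replaced at runtime with the current date components -->
  <Parameter>$YEAR</Parameter>
  <Parameter>$MONTH</Parameter>
  <Parameter>$DAY</Parameter>
</Task>
```

Assuming, purely for the sake of example, that *$BASEPATH* is configured as /data in *DES.ini*, that the AOI is named yourAOI and that the Processor is named EURAC_GENERIC_P, *$CWD* would expand to /data/yourAOI/EURAC_GENERIC_P. The exact formatting of *$YEAR*, *$MONTH* and *$DAY* (for instance zero padding) is not specified here and should be checked against the running system.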