Armin Costa committed
 
# Processor configuration #


  
## AOI ##
 
The *AOI* entity represents a logical hierarchy in a processor definition that is also mapped onto the DES folder structure.

Each processor can have 1 - n *AOI* entities, which may share the same name if required.
 
```xml
<EURAC_GENERIC_P>
	<Processing>
		<AOI name="yourAOI" active="yes">
			...
			...
		</AOI>
	</Processing>
</EURAC_GENERIC_P>
```

 
### Attributes ###
 
**name** = : Name of the AOI

**active** = yes | no : Enable or disable AOI

[**sleepFactor**] = positive int : Factor that multiplies *AOI_THREAD_SLEEP*, the sleep interval of the threads inside an AOI. This allows individual AOIs to run their threads with longer sleep intervals

 
### Tags ###
 
[[Distribution] [Download] [Task] [TaskGroup] [DataCleaner]] (1 - n)
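
As an illustration of how the attributes and tags above combine, the following sketch defines two AOIs in one processor, the second one with a *sleepFactor*. The AOI names, the factor value, the paths and the nested entities are assumptions chosen for the example and are not taken from a real installation.

```xml
<EURAC_GENERIC_P>
	<Processing>
		<!-- Hypothetical AOI that uses the plain AOI_THREAD_SLEEP interval -->
		<AOI name="fastAOI" active="yes">
			<Download type="sftp" active="yes" priority="NORM">
				<consumer name="user@host.yourdomain" active="yes">
					<FileType>*.txt</FileType>
				</consumer>
			</Download>
		</AOI>
		<!-- Hypothetical AOI whose threads sleep 10 times AOI_THREAD_SLEEP -->
		<AOI name="slowAOI" active="yes" sleepFactor="10">
			<DataCleaner name="cleanup" type="filesystem" active="yes" priority="MIN">
				<IncludePath>/path/to/data/*</IncludePath>
			</DataCleaner>
		</AOI>
	</Processing>
</EURAC_GENERIC_P>
```

Giving low-urgency entities such as a DataCleaner their own AOI with a larger factor is one possible way to keep them from polling as often as the data transfer threads.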

  
## Distribution ##
 
The *Distribution* entity provides file transfer functionality for uploading data 
 
```xml
<Distribution type="ftp" active="no" priority="NORM">
	<consumer>
	</consumer>
</Distribution>
```
 
### Attributes ###
 
**type** = ftp | sftp : Transfer protocol

**active** = yes | no : Enable or disable Distribution

**priority** = MIN | NORM | MAX : Thread priority

 
### Tags ###
 
Consumer (1 - n)

  
  
## Download ##
 
The *Download* entity provides file transfer functionality for downloading data 
 
```xml
<Download type="sftp" active="yes" priority="NORM">
	<consumer>
	</consumer>
</Download>
```
 
### Attributes ###
 
**type** = sftp : Transfer protocol

**active** = yes | no : Enable or disable Download

**priority** = MIN | NORM | MAX : Thread priority

 
### Tags ###
 
Consumer (1 - n)

  
  
## Consumer ##
 
The *Consumer* entity represents the configuration for *Distribution* and *Download* entities
 
```xml
<consumer name="user@host.yourdomain" active="yes">
	<ProductType>*.txt</ProductType>
	<EMail>mail@yourdomain</EMail>
	<NotifyEmail>false</NotifyEmail>
	<Recursive>true</Recursive>
</consumer>
```
 
### Attributes ###
 
**name** = : Name reference for authentication in DistributionAuth.xml

**active** = yes | no : Enable or disable Consumer

NOTE: Each name reference in DistributionAuth.xml should be unique. A naming convention such as user@hostname is strongly recommended, as in the following entry:
```xml
<AuthRef name="user@host.yourdomain" active="yes">
	<Host>host.yourdomain</Host>
	<User>user</User>
	<Pwd>pass</Pwd>
	<Param></Param>
</AuthRef>
```
 
### Tags ###
 
**RunOnTrigger** = : Absolute path of a file that serves as a trigger (takes priority over StartDate/StopDate and overrides the Runnable state). The trigger file is deleted if and only if ExitCode == 0 || ExitCode == 1

**StartDate** = : Date and time at which to start the Task (YYYY-MM-DDThh:mm:ss)

**StopDate** = : Date and time at which to stop the Task (YYYY-MM-DDThh:mm:ss)

**FileType** (1 - n) = : Regular expression for file type filtering (default: *.*)

**MD5Filter** = true | false : Filters the files defined in FileType/ProductType by checking for an existing MD5 file. NOTE: for Downloads this can be set to 'false' whenever the server-side file does not have an equivalent md5 file. For Distributions and Tasks it is advisable to keep the default value 'true' so that file completion is validated; otherwise the dataflow logic needs to guarantee file completion (default: true)

**MD5Generation** = true | false : Generates the MD5 file if it is not present. NOTE: `<MD5Filter>` must be false for this to take effect

**MD5Check** = true | false : Only for `<Download>` with `<MD5Filter>` set to true: recomputes the md5 checksum and compares it to the .md5 file

**EMail** (1 - n) = : Email notification recipient

**NotifyEmail** = true (OnError) | OnSuccess | OnError | OnWarning | false : To enable e-mail notifications on data delivery 

**RemoteDestinationPath** = : Path of the directory on the destination host (default: $AOI/$PROCESSOR). The deprecated keyword 'DestinationPath' is still valid

**LocalDestinationPath** = : Path of the directory on the local host (default: $BASEPATH/$AOI/$PROCESSOR)

**ProductRename** = : File name delivered to the remote host (default: file name as filtered by FileType)

**Recursive** = true | false : Enable recursive filtering and upload of data in directory hierarchies. Currently only applicable to `<Distribution>`
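
The tags above can be combined as in the following sketch, which pairs a *Download* from a server that is assumed to publish no .md5 files with a recursive *Distribution*. The file pattern and the remote directory are hypothetical, and both consumers reuse the `user@host.yourdomain` reference from the *AuthRef* example.

```xml
<!-- Download: the remote server is assumed to provide no .md5 files,
     so the MD5 filter is switched off and the .md5 file is generated locally -->
<Download type="sftp" active="yes" priority="NORM">
	<consumer name="user@host.yourdomain" active="yes">
		<FileType>*.txt</FileType>
		<MD5Filter>false</MD5Filter>
		<MD5Generation>true</MD5Generation>
		<EMail>mail@yourdomain</EMail>
		<NotifyEmail>OnError</NotifyEmail>
	</consumer>
</Download>

<!-- Distribution: recursive upload into a hypothetical remote directory;
     MD5Filter is left at its default (true) to validate file completion -->
<Distribution type="sftp" active="yes" priority="NORM">
	<consumer name="user@host.yourdomain" active="yes">
		<FileType>*.txt</FileType>
		<RemoteDestinationPath>incoming/txt</RemoteDestinationPath>
		<Recursive>true</Recursive>
	</consumer>
</Distribution>
```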


## DataCleaner ##
 
The *DataCleaner* entity provides functionality for deleting filesystem contents
 
```xml
<DataCleaner name="HOME" type="filesystem" active="yes" priority="NORM">
	<EMail>mail@host.yourdomain</EMail>
	<NotifyEmail>true</NotifyEmail>
	<IncludePath>/raid0/abz01rstest.eurac.edu/AdminCoA/*</IncludePath>
	<ExcludePath>/raid0/abz01rstest.eurac.edu/AdminCoA/doNOTdelete/*</ExcludePath>
	<ExcludeFileType>*.java</ExcludeFileType>
	<Size>+1k</Size>
	<LastAccess>+1</LastAccess>
</DataCleaner>
```
 
### Attributes ###

**name** = : Name of DataCleaner

**type** = filesystem : Type of DataCleaner

**active** = yes | no : Enable or disable DataCleaner

**priority** = MIN | NORM | MAX : Thread priority
 
 
### Tags ###
 
**RunOnTrigger** = : Absolute path of a file that serves as a trigger (takes priority over StartDate/StopDate and overrides the Runnable state). The trigger file is deleted if and only if ExitCode == 0 || ExitCode == 1

**StartDate** = : Date and time at which to start the Task (YYYY-MM-DDThh:mm:ss)

**StopDate** = : Date and time at which to stop the Task (YYYY-MM-DDThh:mm:ss)

**EMail** (1 - n) = : Email notification recipient

**NotifyEmail** = true (OnError) | OnSuccess | OnError | OnWarning | false : Enables e-mail notifications

**IncludePath** (1 - n) = : Path to be included in the cleaning policy

**ExcludePath** (1 - n) = : Path to be excluded in the cleaning policy

**ExcludeFileType** (1 - n) = : Regular expression for file types to be excluded from the cleaning policy 

**Size** = +int (+1c, +1k, +1M, +1G) : Size in bytes, kilobytes, megabytes or gigabytes (default: MIN_SIZE = "+1000k")

**LastAccess** = +int : Number of days (default: LAST_ACCESS = "30")

**LastModified** = +int : Number of days before the current date (default: LAST_MODIFIED = "+30")

**Type** = **f** (file) | **d** (directory) | **l** (link) : Type of data, only a single value allowed (default: f)
  
NOTE: Only available on Linux/UNIX-compatible environments or on Windows with Cygwin
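
A minimal sketch of a cleaning policy that combines the size, age and type tags might look as follows. The paths are hypothetical, and the reading of the `+` prefix as "more than" is an assumption based on the defaults listed above rather than documented behaviour.

```xml
<DataCleaner name="scratchCleaner" type="filesystem" active="yes" priority="MIN">
	<EMail>mail@host.yourdomain</EMail>
	<NotifyEmail>OnError</NotifyEmail>
	<!-- Hypothetical directory to clean and a subdirectory to preserve -->
	<IncludePath>/path/to/scratch/*</IncludePath>
	<ExcludePath>/path/to/scratch/keep/*</ExcludePath>
	<!-- Assumed meaning: regular files larger than 1 megabyte ... -->
	<Size>+1M</Size>
	<Type>f</Type>
	<!-- ... that were last modified more than 7 days ago -->
	<LastModified>+7</LastModified>
</DataCleaner>
```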



## Task ##
 
The *Task* entity encapsulates a given command or piece of code that is executed concurrently in the form of a dedicated execution thread.
 
```xml
<Task name="testPerl" type="cmd" active="yes" priority="NORM">
	<EMail>mail@host.yourdomain</EMail>
	<NotifyEmail>false</NotifyEmail>
	<Command>perl</Command>
	<FileType>*.txt</FileType>
	<Parameter>./test_data/scripts/testPerl.pl</Parameter>
	<Parameter>$CWD</Parameter>
</Task>
```
 
### Attributes ###
 
**name** = : Name of executing Task

**type** = mail | class | code | cmd : Execution type

**active** = yes | no : Enable or disable Task

**priority** = MIN | NORM | MAX : Thread priority


 
### Tags ###
 
**RunOnTrigger** = : Absolute path of a file that serves as a trigger (takes priority over StartDate/StopDate and overrides the Runnable state). The trigger file is deleted if and only if ExitCode == 0 || ExitCode == 1

**StartDate** = : Date and time at which to start the Task (YYYY-MM-DDThh:mm:ss)

**StopDate** = : Date and time at which to stop the Task (YYYY-MM-DDThh:mm:ss)

**EMail** (1 - n) = : Email notification recipient

**NotifyEmail** = true (OnError) | OnSuccess | OnError | OnWarning | false : Enables e-mail notifications

**Command** = class | code | cmd (sh | perl | python | cmd ) : Command to be executed

**FileType** | ProductType (1 - n) = : Regular expression for file type filtering. If not defined, the Task is executed without being passed a file list

**LocalDestinationPath** = : Relative path of the directory in which to list the files matching a given `<FileType>` (relative to $BASEPATH/$AOI/$PROCESSOR)

**MD5Filter** = true | false : Filters the files defined in FileType/ProductType by checking for an existing MD5 file. NOTE: for Downloads this can be set to 'false' whenever the server-side file does not have an equivalent md5 file. For Distributions and Tasks it is advisable to keep the default value 'true' so that file completion is validated; otherwise the dataflow logic needs to guarantee file completion (default: true)

**LOCKFilter** = true | false : Filters the files defined in FileType/ProductType by checking for an existing LOCK file. This is used whenever a given Task should have exclusive access to a file, e.g. for parallelization. The LOCK file is deleted automatically whenever a Task returns from the execution on a given file

**Parameter** (1 - n) = : Parameters to Command

**<...>** (1 - n) = : Custom tags that can be added and are passed as a hashtable (only available to Commands of type class)
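
The file-related tags above could be combined as in the following sketch of a *Task* that runs once per matching file and relies on LOCK files for exclusive access. The script path, the `incoming` subdirectory and the file pattern are hypothetical; `$FILE` is one of the built-in variables of this document.

```xml
<Task name="convertFiles" type="cmd" active="yes" priority="NORM">
	<EMail>mail@host.yourdomain</EMail>
	<NotifyEmail>OnError</NotifyEmail>
	<Command>python</Command>
	<FileType>*.txt</FileType>
	<!-- Hypothetical subdirectory, relative to $BASEPATH/$AOI/$PROCESSOR -->
	<LocalDestinationPath>incoming</LocalDestinationPath>
	<!-- Only pick up files that have an MD5 file and are not locked by another Task -->
	<MD5Filter>true</MD5Filter>
	<LOCKFilter>true</LOCKFilter>
	<!-- Hypothetical script, called with the file currently being processed -->
	<Parameter>./scripts/convert.py</Parameter>
	<Parameter>$FILE</Parameter>
</Task>
```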


## TaskGroup ##
 
The *TaskGroup* entity encapsulates a given set of *Task* entities and provides *serial* or *parallel* execution
 
```xml
<TaskGroup name="taskGroup1" type="parallel" active="yes" priority="NORM">
	<Task>
		...
	</Task>
	<Task>
		...
	</Task>
</TaskGroup>
```

### Attributes ###
 
**name** = : Name of the TaskGroup

**type** = parallel | serial : Run tasks in parallel or serial mode within this TaskGroup

**active** = yes | no : Enable or disable the TaskGroup

**priority** = MIN | NORM | MAX : Thread priority
 
 
### Tags ###
 
**RunOnTrigger** = : Absolute path of a file that serves as a trigger (takes priority over StartDate/StopDate and overrides the Runnable state). The trigger file is deleted if and only if ExitCode == 0 || ExitCode == 1

**StartDate** = : Date and time at which to start the Task (YYYY-MM-DDThh:mm:ss)

**StopDate** = : Date and time at which to stop the Task (YYYY-MM-DDThh:mm:ss)

**EMail** (1 - n) = : Email notification recipient

**NotifyEmail** = true (OnError) | OnSuccess | OnError | OnWarning | false : Enables e-mail notifications

**Task** (1 - n)
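
As a sketch of how these tags fit together, the following *TaskGroup* runs two shell scripts one after the other once a trigger file appears. The trigger path, the task names and the script paths are hypothetical.

```xml
<TaskGroup name="prepareAndPublish" type="serial" active="yes" priority="NORM">
	<!-- Hypothetical trigger file; per the description above it is deleted
	     only if ExitCode == 0 || ExitCode == 1 -->
	<RunOnTrigger>/path/to/triggers/run.trigger</RunOnTrigger>
	<EMail>mail@host.yourdomain</EMail>
	<NotifyEmail>OnError</NotifyEmail>
	<!-- In serial mode the tasks are assumed to run in document order -->
	<Task name="prepare" type="cmd" active="yes" priority="NORM">
		<Command>sh</Command>
		<Parameter>./scripts/prepare.sh</Parameter>
	</Task>
	<Task name="publish" type="cmd" active="yes" priority="NORM">
		<Command>sh</Command>
		<Parameter>./scripts/publish.sh</Parameter>
	</Task>
</TaskGroup>
```

Changing `type` to `parallel` would, going by the attribute description, start both tasks concurrently, which only makes sense when they do not depend on each other's output.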
 
 
## Built-in variables ##
  
Inside a *Processor* definition configuration file, some built-in variables can be used to abstract parts of the configuration.
The built-in variables are replaced at runtime in all `<Parameter>` tags and in all custom tags (e.g. *$DAY*).

  
*$BASEPATH* The base path variable configured in *DES.ini*

*$AOI* The name of the executing *AOI*

*$PROCESSOR* The name of the executing *Processor*

*$CWD* The current working directory, which expands to *$BASEPATH/$AOI/$PROCESSOR*

*$STAMPS_PATH* The path for Stamp files configured in *DES.ini*

*$FILE* If and only if `<FileType>` is configured, references the file currently being processed (`<ProductType>` is deprecated but still valid for backward compatibility)

*$YEAR* The current year

*$MONTH* The current month

*$DAY* The current day
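
A minimal sketch of how the variables above might be used in a *Task* is shown below. The task name and the script are hypothetical. Each variable is passed as a separate `<Parameter>` because this document does not state whether variables embedded in a longer string are substituted as well.

```xml
<Task name="dailyReport" type="cmd" active="yes" priority="NORM">
	<Command>perl</Command>
	<FileType>*.txt</FileType>
	<!-- Hypothetical script that receives the values below as arguments -->
	<Parameter>./scripts/report.pl</Parameter>
	<!-- Working directory, i.e. $BASEPATH/$AOI/$PROCESSOR -->
	<Parameter>$CWD</Parameter>
	<!-- File currently being processed (available because FileType is set) -->
	<Parameter>$FILE</Parameter>
	<!-- Current date, passed as three separate arguments -->
	<Parameter>$YEAR</Parameter>
	<Parameter>$MONTH</Parameter>
	<Parameter>$DAY</Parameter>
</Task>
```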