Directory Templates

By default, the Hadoop FS destination uses directory templates to create output and late record directories. Hadoop FS writes records to the directories based on the configured time basis.

Alternatively, you can write records to directories based on the targetDirectory record header attribute. When you use the targetDirectory attribute, you cannot define directory templates.

When you define a directory template, you can use a mix of constants, field values, and datetime variables. You can use the every function to create new directories at regular intervals based on hours, minutes, or seconds, starting on the hour. You can also use the record:valueOrDefault function to create new directories from field values or a default in the directory template.

For example, the following directory template creates output directories for event data based on the state and timestamp of each record. Because hours is the smallest unit of measure in the template, the destination creates a new directory every hour:
/outputfiles/${record:valueOrDefault("/State", "unknown")}/${YY()}-${MM()}-${DD()}-${hh()}
You can use the following elements in a directory template:
Constants
You can use any constant, such as output.
Datetime Variables
You can use datetime variables, such as ${YYYY()} or ${DD()}. The destination creates directories as needed, based on the smallest datetime variable that you use. For example, if the smallest variable is hours, then the directories are created for every hour of the day that receives output records.
When you use datetime variables in an expression, use all of the datetime variables between one of the year variables and the smallest variable that you want to use. Do not skip a variable within the progression. For example, to create directories on a daily basis, use a year variable, a month variable, and then a day variable. You might use one of the following datetime variable progressions:
${YYYY()}-${MM()}-${DD()}
${YY()}_${MM()}_${DD()}
For details about datetime variables, see Datetime Variables.
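The "do not skip a variable" rule can be checked mechanically. The following Python sketch is illustrative only and is not part of the destination: the helper name skipped_units is hypothetical, and the variable list assumes that ss() exists alongside the variables named above. It does not account for the every function.

```python
import re

# Datetime variables ordered from largest to smallest unit.
# YYYY and YY are alternatives for the year position.
# Assumption: ss() exists; the text above only names variables down to hours.
PROGRESSION = [("YYYY", "YY"), ("MM",), ("DD",), ("hh",), ("mm",), ("ss",)]


def skipped_units(template):
    """Return units missing between the year and the smallest variable used."""
    used = set(re.findall(r"\$\{(YYYY|YY|MM|DD|hh|mm|ss)\(\)\}", template))
    present = [any(name in used for name in names) for names in PROGRESSION]
    if not any(present):
        return []  # no datetime variables at all, nothing to check
    smallest = max(i for i, found in enumerate(present) if found)
    # Every position from the year down to the smallest one must be present.
    return [PROGRESSION[i][0] for i in range(smallest + 1) if not present[i]]
```

An empty result means the progression is complete. A template such as /out/${YYYY()}-${DD()} reports the month as skipped.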
every function
You can use the every function in a directory template to create directories at regular intervals based on hours, minutes, or seconds, beginning on the hour. The interval should be an integer factor of 60. For example, you can create directories every 15 minutes or every 30 seconds.
Use the every function to replace the smallest datetime variable used in the template.
For example, the following directory template creates directories every 5 minutes, starting on the hour:
/HDFS_output/${YYYY()}-${MM()}-${DD()}-${hh()}-${every(5,mm())}
For details about the every function, see Miscellaneous Functions.
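To illustrate how intervals that begin on the hour partition time, the following Python sketch maps a timestamp to a directory for the 5-minute template above. It is a rough model, not the destination's implementation: the function names are hypothetical, and it assumes each directory is named for the minute at which its interval starts (00, 05, 10, and so on).

```python
from datetime import datetime


def every_minutes(interval, ts):
    """Return the start minute of the interval that contains ts, zero-padded."""
    if interval <= 0 or 60 % interval != 0:
        raise ValueError("interval must be a positive integer factor of 60")
    # Round the minute down to the nearest multiple of the interval.
    return f"{(ts.minute // interval) * interval:02d}"


def directory_for(ts):
    """Model /HDFS_output/${YYYY()}-${MM()}-${DD()}-${hh()}-${every(5,mm())}."""
    return f"/HDFS_output/{ts:%Y-%m-%d-%H}-{every_minutes(5, ts)}"
```

Under these assumptions, a record with a timestamp of 14:07:30 on 2015-07-31 lands in /HDFS_output/2015-07-31-14-05, together with every other record from 14:05:00 through 14:09:59. Because the interval divides 60 evenly, the last interval of each hour ends exactly on the next hour.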
record:valueOrDefault function
You can use the record:valueOrDefault function in a directory template to create directories with the value of a field or the specified default value if the field does not exist or if the field is null:
${record:valueOrDefault(<field path>, <default value>)}
For example, the following directory template creates a directory based on the Product field every day, and if the Product field does not exist or is null, uses Misc in the directory path:
/${record:valueOrDefault("/Product", "Misc")}/${YYYY()}-${MM()}-${DD()}
This template might create the following paths:
/Shirts/2015-07-31 
/Misc/2015-07-31
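The fallback behavior can be modeled in a few lines of Python. This is a sketch under stated assumptions rather than the destination's code: records are represented as plain dictionaries, the function names are hypothetical, and the date is formatted with a four-digit year to match the example paths.

```python
from datetime import date


def value_or_default(record, field, default):
    """Return the field value, or the default if the field is missing or null."""
    value = record.get(field)  # a missing key and an explicit None both give None
    return default if value is None else str(value)


def directory_for(record, day):
    """Build a daily directory path keyed by the Product field."""
    product = value_or_default(record, "Product", "Misc")
    return f"/{product}/{day:%Y-%m-%d}"
```

With a date of 2015-07-31, a record whose Product field is Shirts resolves to /Shirts/2015-07-31, while a record with no Product field, or with a null Product field, resolves to /Misc/2015-07-31.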