The Directory origin reads files in ascending order
based on the timestamp or file name:
- Last Modified Timestamp
- The Directory origin can read files in ascending order based on the timestamp associated
with the file. The origin checks both the last-modified timestamp and the
changed timestamp, then uses the more recent of the two when
ordering files for processing.
- When the origin reads from a secondary location - not the directory where
the files are created and written - the last-modified timestamp should
reflect when the file was moved to the directory to be processed. When files
are copied to the secondary location, the changed timestamp should reflect
when the file was copied to the directory to be processed.
- Tip: Avoid moving files using commands that preserve the
existing timestamp, such as cp -p. Preserving the
existing timestamp can be problematic in some cases, such as moving
files across time zones.
- When ordering based on timestamp, any files with the same timestamp are read
in lexicographically ascending order based on the file names.
- For example, when reading files with the log*.json file
name pattern, the origin reads the following files in the following
order:
| File Name | Last Modified | Changed Timestamp |
|---|---|---|
| log-1.json | APR 24 2016 14:03:35 | APR 24 2016 14:03:35 |
| log-903.json | APR 24 2016 14:05:03 | APR 24 2016 14:05:03 |
| log-0054.json | APR 24 2016 14:00:03 | APR 24 2016 14:40:44 |
| log-2.json | APR 24 2016 14:45:11 | APR 24 2016 14:45:11 |
| log-3.json | APR 24 2016 14:45:11 | APR 24 2016 14:45:11 |
- Notice that log-0054.json is processed after log-903.json because its changed
timestamp is later than both the last-modified and changed timestamps of
log-903.json. The log-2.json and log-3.json files have identical timestamps,
so they are processed in lexicographically ascending order based on their
file names.
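The ordering rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not the origin's actual implementation. The FileInfo type and the read_order function are hypothetical names introduced for this example, and the timestamps are copied from the table above.

```python
from datetime import datetime
from typing import NamedTuple


class FileInfo(NamedTuple):
    """Hypothetical record holding the two timestamps the origin compares."""
    name: str
    last_modified: datetime
    changed: datetime


def read_order(files):
    """Sort by the more recent of the two timestamps, then by file name."""
    return sorted(files, key=lambda f: (max(f.last_modified, f.changed), f.name))


# Timestamps from the table above, listed deliberately out of order.
files = [
    FileInfo("log-3.json", datetime(2016, 4, 24, 14, 45, 11), datetime(2016, 4, 24, 14, 45, 11)),
    FileInfo("log-0054.json", datetime(2016, 4, 24, 14, 0, 3), datetime(2016, 4, 24, 14, 40, 44)),
    FileInfo("log-2.json", datetime(2016, 4, 24, 14, 45, 11), datetime(2016, 4, 24, 14, 45, 11)),
    FileInfo("log-903.json", datetime(2016, 4, 24, 14, 5, 3), datetime(2016, 4, 24, 14, 5, 3)),
    FileInfo("log-1.json", datetime(2016, 4, 24, 14, 3, 35), datetime(2016, 4, 24, 14, 3, 35)),
]

ORDER = [f.name for f in read_order(files)]
print(ORDER)
# ['log-1.json', 'log-903.json', 'log-0054.json', 'log-2.json', 'log-3.json']
```

To check real files before the origin picks them up, the same key could be built from os.stat results (st_mtime and st_ctime). Note that st_ctime is the metadata change time on Unix-like systems but the creation time on Windows, so this mapping is an assumption that holds only on the former.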
- Lexicographically Ascending File Names
- The Directory origin can read files in lexicographically ascending order
based on file names. Note that lexicographically ascending order reads the
numbers 1 through 11 as follows:
- 1, 10, 11, 2, 3, 4, ..., 9
- For example, when reading files with the web*.log file name
pattern, Directory reads the following files in the following
order:
web-1.log
web-10.log
web-11.log
web-2.log
web-3.log
web-4.log
web-5.log
web-6.log
web-7.log
web-8.log
web-9.log
- To read these files in logical and lexicographically ascending order, you
might add leading zeros to the file naming convention as
follows:
web-0001.log
web-0002.log
web-0003.log
...
web-0009.log
web-0010.log
web-0011.log
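As a quick way to see both orderings, the following Python sketch sorts the example names as plain strings, then again after zero-padding the sequence number. The pad_sequence_number helper and the four-digit width are assumptions made for this example. In practice, the padding would be applied by whatever process names the files.

```python
import re


def pad_sequence_number(name, width=4):
    """Zero-pad the first run of digits in a file name (hypothetical helper)."""
    return re.sub(r"\d+", lambda m: m.group().zfill(width), name, count=1)


names = [f"web-{i}.log" for i in range(1, 12)]

# Plain string sorting compares character by character, so "10" sorts before "2".
lexicographic = sorted(names)
print(lexicographic[:4])
# ['web-1.log', 'web-10.log', 'web-11.log', 'web-2.log']

# With leading zeros, lexicographic order and numeric order agree.
padded = sorted(pad_sequence_number(n) for n in names)
print(padded[0], padded[-1])
# web-0001.log web-0011.log
```

A fixed width of four digits keeps the two orderings aligned up to 9999 files. A longer sequence would need a wider padding chosen up front.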