File Compression Formats

Origins and processors that read files can read uncompressed files, compressed files, archives, and compressed archives.

Hadoop FS reads compressed files automatically. For other origins and processors that read files, you configure the compression format.
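As an illustration of what configuring a compression format amounts to, here is a minimal Python sketch that chooses a reader based on a configured format name. The function name `read_text`, the `OPENERS` mapping, and the format keys are hypothetical and are not part of the product; the sketch uses only the Python standard library and covers only a few of the formats.

```python
import bz2
import gzip
import lzma

# Hypothetical mapping from a configured format name to a stdlib opener.
# lzma.open auto-detects both .xz and legacy .lzma containers when reading.
OPENERS = {
    "none": open,
    "gzip": gzip.open,
    "bzip2": bz2.open,
    "xz": lzma.open,
    "lzma": lzma.open,
}


def read_text(path, compression="none"):
    """Return the decoded text of a file stored in the configured format."""
    try:
        opener = OPENERS[compression]
    except KeyError:
        raise ValueError(f"unsupported compression format: {compression}")
    # Every opener above accepts text mode with an explicit encoding.
    with opener(path, "rt", encoding="utf-8") as handle:
        return handle.read()
```

Calling `read_text("records.json.gz", "gzip")` would return the decompressed text, while an unrecognized format name raises `ValueError` rather than silently reading compressed bytes.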

The following table describes the supported compression formats:
Compression Format | Description
Uncompressed | Processes uncompressed files of the configured data format.
Compressed | Processes files compressed by the following compression formats: gzip, bzip2, xz, lzma, Pack200, DEFLATE, Z.
Archive | Processes files archived by the following archive formats: 7z, ar, arj, cpio, dump, tar, zip.
Compressed Archive | Processes files in compressed archives created by supported compression and archive formats.
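A compressed archive combines an archive format with a compression format, for example a tar archive compressed with gzip. The following Python sketch shows one way to read the files inside such an archive. It assumes a tar-based archive and uses only the standard `tarfile` module; the function name `iter_archive_files` is hypothetical and does not describe how the product reads archives.

```python
import tarfile


def iter_archive_files(path):
    """Yield (member_name, raw_bytes) for each regular file in a tar archive.

    Mode "r:*" lets tarfile detect gzip, bzip2, or xz compression, or none,
    so the same function handles plain and compressed tar archives.
    """
    with tarfile.open(path, "r:*") as archive:
        for member in archive:
            # Skip directories, links, and other non-file entries.
            if member.isfile():
                with archive.extractfile(member) as handle:
                    yield member.name, handle.read()
```

Each yielded pair contains the path of a file within the archive and its uncompressed contents, which could then be parsed according to the configured data format.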