RAR

Supported formats

RAR (up to and including WinRAR version 7)

Description

This backend extracts files and metadata from RAR archives, which are widely used across different operating systems.

Info: Available in Contextal Platform 1.0 and later.

Features

The backend provides the following information about the archive itself:

  • archive comment
  • list of directories
  • whether the archive uses encrypted headers
  • whether the archive has a recovery record
  • whether the archive is locked (has the lock attribute)
  • whether the archive file is the first volume of a multi-volume archive
  • whether the archive is one of the volumes of a multi-volume archive
  • whether the archive follows the new numbering scheme for volume names
  • whether the archive is signed
  • whether the archive is solid

The following information is provided about each archive entry:

  • file modification time
  • optional atime, mtime (in a different precision and format from the modification time above), and ctime
  • operating system that created the archive entry
  • file attributes (as defined by the operating system that created the archive entry)
  • hash function used to verify the entry's integrity
  • CRC32 hash
  • optional BLAKE2sp hash
  • whether the entry is encrypted
  • file path and name
  • dictionary size
  • compressed size, optional uncompressed size, and optional compression ratio
  • unrar version required to extract the entry
  • compression method
  • redirection type (none, symlink, hardlink, junction, etc.)
  • optional redirection target

Symbols

Object

  • COMMENT_TRUNCATED → the archive comment is truncated
  • COMMENT_EXTRACTION_FAILED → the archive comment couldn't be extracted
  • LIMITS_REACHED → limits triggered while processing the archive
  • HAS_CHECKSUM_INCONSISTENCY → checksum mismatches have been detected for some children
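
As a rough illustration, the object-level symbols above could be turned into readable warnings on the client side. This is a minimal sketch, not part of the platform: the function name `describe_object_symbols` is hypothetical, and it assumes symbols are listed under `ok.symbols`, as in the Example Metadata section.

```python
# Object-level symbols documented for the RAR backend.
RAR_OBJECT_SYMBOLS = {
    "COMMENT_TRUNCATED": "the archive comment is truncated",
    "COMMENT_EXTRACTION_FAILED": "the archive comment couldn't be extracted",
    "LIMITS_REACHED": "limits triggered while processing the archive",
    "HAS_CHECKSUM_INCONSISTENCY": "checksum mismatches have been detected for some children",
}


def describe_object_symbols(record: dict) -> list[str]:
    """Return descriptions for the RAR object symbols present in a record.

    Assumption: symbols live under record["ok"]["symbols"].
    """
    symbols = record.get("ok", {}).get("symbols", [])
    return [
        f"{name}: {RAR_OBJECT_SYMBOLS[name]}"
        for name in symbols
        if name in RAR_OBJECT_SYMBOLS
    ]


# Hypothetical record carrying two of the documented symbols.
record = {"ok": {"symbols": ["LIMITS_REACHED", "HAS_CHECKSUM_INCONSISTENCY"]}}
for line in describe_object_symbols(record):
    print(line)
```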

Children

  • INVALID_FILETIME → invalid file time
  • INVALID_MTIME → invalid modified timestamp
  • INVALID_ATIME → invalid access timestamp
  • INVALID_CTIME → invalid changed timestamp
  • INVALID_PASSWORD → invalid password
  • TOOBIG → this child object was not extracted as it exceeds the limits
  • ENCRYPTED → this child object is encrypted
  • DECRYPTED → this child object has been successfully decrypted
  • PARTIAL_DATA → this child object has been partially extracted as it's split between multiple volumes
  • SOLID → this child object is part of a solid archive
  • CHECKSUM_MISMATCH → a checksum mismatch was detected for the extracted child

Example Metadata

{
  "org": "ctx",
  "object_id": "c41b74238fecce0feafe456a72fbe357eae2528b108d7583ff13c1833d5a092b",
  "object_type": "RAR",
  "object_subtype": null,
  "recursion_level": 1,
  "size": 343171,
  "hashes": {
    "sha1": "7aea001d8856c170fe64c9b754ad5c16088d2ab9",
    "sha512": "b1e8438e53cfd14536a4b30350fdbdc427670a9c746fa41f4fb8f3441ddb381230ac03d4fe1d3016e9505dc7225a1b40c11ec8a2f73c5e3662ef0e2d98c791e7",
    "md5": "4eb98136cd894399a8776593bd221954",
    "sha256": "c41b74238fecce0feafe456a72fbe357eae2528b108d7583ff13c1833d5a092b"
  },
  "ctime": 1726235643.045153,
  "ok": {
    "symbols": [],
    "object_metadata": {
      "_backend_version": "1.0.0",
      "directories": [],
      "has_comment": false,
      "has_encrypted_headers": false,
      "has_recovery_record": false,
      "is_locked": false,
      "is_multivolume_part": false,
      "is_multivolume_start": false,
      "is_new_numbering_scheme": true,
      "is_signed": false,
      "is_solid": false
    },
    "children": [
      {
        "org": "ctx",
        "object_id": "2e66b7e7982b88190544e9cfca34279e11c208302187d1d3c4160d3191768c53",
        "object_type": "PE",
        "object_subtype": null,
        "recursion_level": 2,
        "size": 429056,
        "hashes": {
          "md5": "5aaa85df01c62cdacb612c7bb3f72728",
          "sha1": "20e3a2e9daa2c64b2be4df2cbcdb3b75ecd24d73",
          "sha256": "2e66b7e7982b88190544e9cfca34279e11c208302187d1d3c4160d3191768c53",
          "sha512": "2340517a991fbee3a30e675623919c3922ef578116d7c2320eb1b13c37f87290cf615eacb97dc49d63a3e1afee902577a13e0d8b03f87c9f8a0f449200951655"
        },
        "ctime": 1726235643.045153,
        "relation_metadata": {
          "attributes": "000040",
          "compress_ratio": 1.250910401344299,
          "compressed_size": 342995,
          "compression_method": "normal",
          "creation_os": "Windows",
          "dict_size": 524288,
          "expected_crc32": "8cd3572a",
          "filetime": {
            "parsed": 1546902892
          },
          "hash_type": "CRC32",
          "is_encrypted": false,
          "mtime": {
            "parsed": 1546902892
          },
          "name": "Purchase Order1819.exe",
          "redir_type": "None",
          "uncompressed_size": 429056,
          "version_to_extract": "5.0"
        },
        [...]
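
The child entry in the example above can be sanity-checked with a few lines of Python. This is a sketch under two assumptions that the page does not state: that `compress_ratio` is `uncompressed_size` divided by `compressed_size`, and that the `parsed` timestamps are seconds since the Unix epoch (rendering them as UTC is likewise an assumption).

```python
from datetime import datetime, timezone

# Values copied from the example child's relation_metadata.
relation_metadata = {
    "compress_ratio": 1.250910401344299,
    "compressed_size": 342995,
    "uncompressed_size": 429056,
    "mtime": {"parsed": 1546902892},
}

# Assumption: compress_ratio = uncompressed_size / compressed_size.
ratio = relation_metadata["uncompressed_size"] / relation_metadata["compressed_size"]
# The reported value looks like a single-precision float, so compare with a tolerance.
ratio_matches = abs(ratio - relation_metadata["compress_ratio"]) < 1e-6

# Assumption: "parsed" holds seconds since the Unix epoch, shown here in UTC.
mtime = datetime.fromtimestamp(relation_metadata["mtime"]["parsed"], tz=timezone.utc)

print(ratio_matches)      # True
print(mtime.isoformat())  # 2019-01-07T23:14:52+00:00
```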

Example Queries

object_type == "RAR"
&& @has_child(object_type == "PE"
&& @has_name(iregex("(order|invoice)"))
)
  • This matches a RAR object that has a child of type PE (Windows executable) whose name contains the substring order or invoice (case-insensitive).
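
To make the logic of the query concrete, the same check can be approximated in Python against a record shaped like the Example Metadata. This is only an illustrative sketch, not how the platform evaluates queries: the function `matches_query` is hypothetical, and it assumes the child name is taken from `relation_metadata.name` and that `iregex` behaves like an unanchored, case-insensitive search.

```python
import re


def matches_query(record: dict) -> bool:
    """Approximate the query: a RAR object with a PE child named like order/invoice."""
    if record.get("object_type") != "RAR":
        return False
    # Assumption: iregex is an unanchored, case-insensitive regex search.
    name_pattern = re.compile(r"(order|invoice)", re.IGNORECASE)
    for child in record.get("ok", {}).get("children", []):
        # Assumption: the child's name is stored in relation_metadata.name.
        name = child.get("relation_metadata", {}).get("name", "")
        if child.get("object_type") == "PE" and name_pattern.search(name):
            return True
    return False


# Trimmed-down version of the example metadata.
record = {
    "object_type": "RAR",
    "ok": {
        "children": [
            {
                "object_type": "PE",
                "relation_metadata": {"name": "Purchase Order1819.exe"},
            }
        ]
    },
}
print(matches_query(record))  # True
```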

Configuration Options

  • max_processed_size → maximum size of the input object that will be processed (default: 262144000)
  • max_children → maximum number of child objects to create (default: 100)
  • max_child_input_size → maximum size of a single input child object (default: 41943040)
  • max_child_output_size → maximum size of a single output child object (default: 41943040)
  • max_entries_to_process → maximum number of archive entries that will be processed (this includes entries such as directories, for which no children are created)
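
The default size limits are easier to read in larger units. The sketch below assumes the values are byte counts, which the option list does not state explicitly; the names `DEFAULT_SIZE_LIMITS` and `to_mib` are introduced here for illustration only.

```python
MIB = 1024 * 1024

# Defaults as listed above; treating them as byte counts is an assumption.
DEFAULT_SIZE_LIMITS = {
    "max_processed_size": 262144000,
    "max_child_input_size": 41943040,
    "max_child_output_size": 41943040,
}


def to_mib(value: int) -> float:
    """Convert a byte count to mebibytes."""
    return value / MIB


for option, value in DEFAULT_SIZE_LIMITS.items():
    print(f"{option}: {to_mib(value):g} MiB")
```

Under that assumption, the defaults correspond to 250 MiB for the input archive and 40 MiB for each child object.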