Office

Supported formats

Microsoft Word and Excel (all popular versions)

Description

The backend performs a thorough analysis of Office files and extracts text, VBA code, embedded objects and a plethora of properties.

Available in Contextal Platform 1.0 and later.

Features

Encryption

Supported encryption formats:

  • XOR Obfuscation
  • Office Binary Document RC4 Encryption
  • Office Binary Document RC4 CryptoAPI Encryption
  • ECMA-376 Standard Encryption
  • ECMA-376 Agile Encryption

Supported agile algorithms:

  • AES-128
  • AES-192
  • AES-256

Supported agile cipher chaining modes:

  • CBC
  • CFB-8

Supported agile hashes:

  • SHA1
  • SHA256
  • SHA384
  • SHA512
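As an illustration of how the lists above fit together, the following minimal sketch checks whether the parameters declared in an agile EncryptionInfo descriptor fall within the supported sets. It is not the backend's implementation. It assumes the `keyData` attribute names and namespace used by the agile descriptor format, assumes that CFB-8 is declared as `ChainingModeCFB`, and uses a hypothetical, trimmed descriptor as input.

```python
import xml.etree.ElementTree as ET

# Namespace of the agile EncryptionInfo XML descriptor (assumption of this sketch).
NS = {"e": "http://schemas.microsoft.com/office/2006/encryption"}

# Values taken from the lists above; CFB-8 is assumed to appear as "ChainingModeCFB".
SUPPORTED_KEY_BITS = {128, 192, 256}
SUPPORTED_CHAINING = {"ChainingModeCBC", "ChainingModeCFB"}
SUPPORTED_HASHES = {"SHA1", "SHA256", "SHA384", "SHA512"}


def agile_params_supported(xml_text: str) -> bool:
    """Return True if the <keyData> parameters fall within the listed sets."""
    root = ET.fromstring(xml_text)
    key_data = root.find("e:keyData", NS)
    if key_data is None:
        return False
    return (
        key_data.get("cipherAlgorithm") == "AES"
        and int(key_data.get("keyBits", "0")) in SUPPORTED_KEY_BITS
        and key_data.get("cipherChaining") in SUPPORTED_CHAINING
        and key_data.get("hashAlgorithm") in SUPPORTED_HASHES
    )


# Hypothetical, trimmed descriptor for illustration only.
SAMPLE = (
    '<encryption xmlns="http://schemas.microsoft.com/office/2006/encryption">'
    '<keyData saltSize="16" blockSize="16" keyBits="256" hashSize="64" '
    'cipherAlgorithm="AES" cipherChaining="ChainingModeCBC" '
    'hashAlgorithm="SHA512" saltValue="AAAA"/>'
    "</encryption>"
)

print(agile_params_supported(SAMPLE))  # True
```

A descriptor declaring any other hash, key size or chaining mode would make the function return False.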

VBA

Visual Basic for Applications (VBA) structures are stored in two flavors: a version-independent format (which is publicly documented) and a version-dependent format (the infamous, undocumented PerformanceCache). Since the version-independent data is regularly overwritten ("stomped") by malware authors and almost never used by MS Office, this backend also extracts the cached data and decompiles the P-Code.

Although Office forms are rare these days, they are often abused by malware to store fragments of code; this backend extracts all forms as metadata.

Symbols

Object

  • ENCRYPTED → the document is encrypted
  • DECRYPTED → the document has been successfully decrypted
  • LIMITS_REACHED → limits triggered while processing the document
  • VBA → the document contains VBA (Visual Basic for Applications)
  • CORRUPTED_VBA → the document contains VBA, but it cannot be parsed
  • HAS_FORMS → VBA contains forms
  • HAS_MACRO_SHEET → the document contains Excel 4.0 macro sheets
  • OLE → the document is stored in an OLE container
  • DOC → the document is a legacy Word document
  • XLS → the document is a legacy Excel document
  • DOCX → the document is an OOXML Word document
  • XLSX → the document is an OOXML Excel document

Children

  • ENCRYPTED → the parent (the document) is encrypted
  • DECRYPTED → this child object has been successfully decrypted
  • VBA → this child object contains the result of processing a VBA project
  • DECOMPILED → the VBA project was decompiled from P-Code
  • CORRUPTED → the VBA contains corrupted content
  • TOOBIG → this child object was truncated or was not stored as it exceeds the limits
  • NOT_FOUND → this child object was referenced in the document but not found in the document container

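To show how the object symbols above might be consumed downstream, here is a small, purely illustrative sketch that turns a set of symbols into human-readable notes. The `triage` function and the wording of its notes are hypothetical and not part of the platform.

```python
def triage(symbols: set[str]) -> list[str]:
    """Map object symbols (as listed above) to short, illustrative notes."""
    notes = []
    if "ENCRYPTED" in symbols and "DECRYPTED" not in symbols:
        notes.append("encrypted, not decrypted")
    if "CORRUPTED_VBA" in symbols:
        notes.append("VBA present but cannot be parsed")
    elif "VBA" in symbols:
        notes.append("VBA present")
    if "HAS_FORMS" in symbols:
        notes.append("VBA forms present")
    if "HAS_MACRO_SHEET" in symbols:
        notes.append("Excel 4.0 macro sheet present")
    if "LIMITS_REACHED" in symbols:
        notes.append("processing limits were triggered")
    return notes


print(triage({"DOCX", "VBA"}))  # ['VBA present']
```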
Example Metadata

{
  "org": "ctx",
  "object_id": "587bfa9fe6162e0c74dfaa1e48b2ff1b596f803b95648d362daa412cf9dcbb3a",
  "object_type": "Office",
  "object_subtype": "DOCX",
  "recursion_level": 5,
  "size": 250807,
  "hashes": {
    "sha256": "587bfa9fe6162e0c74dfaa1e48b2ff1b596f803b95648d362daa412cf9dcbb3a",
    "sha1": "ee867ac81fcb3e51995dcf90aaad659cd340a750",
    "sha512": "a9a4715ec8d374cfce9137bd5fa91e662f90bade289d07c7f9f134b5bef53338502642c2f14b745946a4dba8e3afb2d914e2aa32dad410c9a27e33837beedbe1",
    "md5": "065fee6d19cb04e56ab15b1682c463b6"
  },
  "ctime": 1725869059.435991,
  "relation_metadata": {
    "decoded_size": 250807,
    "encoded_size": 334443,
    "mime_type": "application/msword"
  },
  "ok": {
    "symbols": [
      "DOCX",
      "VBA"
    ],
    "object_metadata": {
      "_backend_version": "1.0.0",
      "properties": {
        "app_version": "14.0000",
        "application": "Microsoft Office Word",
        "characters": 565152,
        "characters_with_spaces": 662976,
        "created": "2019-06-27 10:51:00.0 +00:00:00",
        "creator": "",
        "doc_security": {
          "locked": true,
          "password_protected": false,
          "read_only_enforced": false,
          "read_only_recommended": false
        },
        "hyperlinks_changed": false,
        "last_modified_by": "",
        "lines": 4709,
        "links_up_to_date": false,
        "modified": "2019-06-27 11:57:00.0 +00:00:00",
        "pages": 120,
        "paragraphs": 1325,
        "revision": "1",
        "scale_crop": false,
        "shared_doc": false,
        "template": "Normal.dotm",
        "total_time": "0s",
        "words": 99149
      },
      "user_properties": {},
      "vba": {
        "rsvd2": 0,
        "rsvd3": 1,
        "version": 151
      }
    }
  }
}

Example Queries

object_type == "Office"
&& @match_object_meta($properties.doc_security.password_protected == true)
  • This query matches password-protected documents.
object_type == "Office"
&& @match_object_meta($properties.app_version == "14.0000")
&& @has_symbol("VBA")
  • This query matches documents created by a specific Office version that contain VBA macros.
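For readers who work with exported metadata outside the platform, the two queries above can be approximated locally. The sketch below is not the platform's query engine. It is a plain Python emulation that assumes the metadata has the same layout as the example in this page, and the helper names (`get_path`, `is_password_protected`, `has_vba_from_version`) are hypothetical.

```python
from typing import Any


def get_path(data: Any, dotted: str, default: Any = None) -> Any:
    """Follow a dotted key path through nested dicts, returning default if absent."""
    node = data
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def is_password_protected(record: dict) -> bool:
    """Rough local equivalent of the first query above."""
    return (
        record.get("object_type") == "Office"
        and get_path(record, "ok.object_metadata.properties.doc_security.password_protected") is True
    )


def has_vba_from_version(record: dict, app_version: str) -> bool:
    """Rough local equivalent of the second query above."""
    return (
        record.get("object_type") == "Office"
        and get_path(record, "ok.object_metadata.properties.app_version") == app_version
        and "VBA" in get_path(record, "ok.symbols", [])
    )


# Trimmed copy of the example metadata shown earlier on this page.
RECORD = {
    "object_type": "Office",
    "ok": {
        "symbols": ["DOCX", "VBA"],
        "object_metadata": {
            "properties": {
                "app_version": "14.0000",
                "doc_security": {"password_protected": False},
            }
        },
    },
}

print(is_password_protected(RECORD))            # False
print(has_vba_from_version(RECORD, "14.0000"))  # True
```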

Configuration Options

  • max_processed_size → maximum size of the input object that will be processed (default: 262144000)
  • max_children → maximum number of child objects to create (default: 100)
  • max_child_output_size → maximum size of a single child object (default: 41943040)
  • sheet_size_limit → size limit of an Excel sheet (default: 5242880)
  • shared_strings_cache_limit → size limit of shared strings cache (default: 10000000)
  • create_domain_children → whether to create Domain children out of collected domain names for further processing (default: true)
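The size defaults above are easier to read when converted. The sketch below assumes the values are expressed in bytes, which this page does not state explicitly. `shared_strings_cache_limit` is left out because its unit is less clear. The `within_limit` helper is hypothetical and only illustrates how such a limit would compare against an object's size.

```python
# Size-related defaults copied from the list above (assumed to be bytes).
DEFAULTS = {
    "max_processed_size": 262144000,
    "max_child_output_size": 41943040,
    "sheet_size_limit": 5242880,
}

MIB = 1024 * 1024


def to_mib(value: int) -> float:
    """Convert a byte count to mebibytes."""
    return value / MIB


def within_limit(size: int, limit: int = DEFAULTS["max_processed_size"]) -> bool:
    """Return True if an object of the given size does not exceed the limit."""
    return size <= limit


for name, value in DEFAULTS.items():
    print(f"{name}: {value} = {to_mib(value):g} MiB")
# max_processed_size: 262144000 = 250 MiB
# max_child_output_size: 41943040 = 40 MiB
# sheet_size_limit: 5242880 = 5 MiB

# The example document on this page (size 250807) is well under the default limit.
print(within_limit(250807))  # True
```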