Body File
A body file may also be referred to as a "bodyfile"; however, the official documentation uses "body file" (two separate words).
The body file format is a delimiter-separated timeline output format that,
as far as is known, was introduced by The Sleuth Kit. Body files are pipe
(|) delimited and are referred to as an "intermediate file", as they are not
sorted chronologically and are often staged for post-processing. Subsequent
timeline sorting is done via the mactime tool.
Data & Fields
All times within a body file are reported in UNIX time format. Lines that start
with # are ignored and treated as comments.
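As an illustration of the UNIX time format, the following minimal Python sketch converts a time field to a date and time. The function name body_time_to_utc is hypothetical, and the sketch assumes the value is relative to UTC, which the format itself does not guarantee. Dropping any fractional part is also an assumption about how a reader might treat such values.

```python
from datetime import datetime, timezone


def body_time_to_utc(value: str) -> datetime:
    """Interpret a body file time field as whole seconds since 1970-01-01 UTC."""
    # Assumption: any fractional part is dropped rather than rounded.
    whole_seconds = int(value.split(".", 1)[0])
    # Assumption: the value is in UTC; the format does not record a time zone.
    return datetime.fromtimestamp(whole_seconds, tz=timezone.utc)


print(body_time_to_utc("1580550524").isoformat())
```

If the source file system stored local time, the result needs to be shifted by the appropriate offset, which has to come from outside the body file.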
The body file output was rewritten in The Sleuth Kit 3.0; however, some tools may still use the version 2.x format.
Default output fields are as follows:
Version 2.x
MD5 | path/name | device | inode | mode_as_value | mode_as_string | num_of_links | UID | GID | rdev | size | atime | mtime | ctime | block_size | num_of_blocks
Version 3.x
MD5 | name | inode | mode_as_string | UID | GID | size | atime | mtime | ctime | crtime
Note: Outputs do not include spaces between values and pipes; we have inserted them here for easier readability.
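The version 3.x layout can be read with a few lines of Python. The sketch below is not taken from The Sleuth Kit; the names BodyEntry and parse_body_line are hypothetical. It assumes that only the name field can contain a pipe character, so it splits the first field from the left and the last nine fields from the right. All values are kept as strings, since the inode field is not always a plain number.

```python
from typing import NamedTuple, Optional


class BodyEntry(NamedTuple):
    """One version 3.x body file entry; every field is kept as a string."""
    md5: str
    name: str
    inode: str
    mode_as_string: str
    uid: str
    gid: str
    size: str
    atime: str
    mtime: str
    ctime: str
    crtime: str


def parse_body_line(line: str) -> Optional[BodyEntry]:
    """Parse a single line; return None for blank lines and # comments."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    # The MD5 field never contains a pipe, so split it off from the left.
    md5, _, rest = line.partition("|")
    # Assumption: the nine fields after the name never contain a pipe,
    # so splitting from the right leaves the name intact.
    parts = rest.rsplit("|", 9)
    if len(parts) != 10:
        raise ValueError(f"expected 11 fields, got {len(parts) + 1}")
    return BodyEntry(md5, *parts)


entry = parse_body_line("0|/$BadClus:$Bad|8-128-1|r/rr-xr-xr-x|0|0|7270400|"
                        "1580550524|1580550524|1580550524|1580550524")
print(entry.name, entry.inode, entry.size)
```

Splitting from both ends is a design choice that tolerates names with unescaped pipe characters. It does not undo any escaping applied by the tool that wrote the file.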
Observed behavior and known issues
Known shortcomings of the body file format are:
- Undocumented granularity of timestamps; the current implementation by The Sleuth Kit appears to use seconds. See here.
- Undocumented extended file mode of -/-rrwxrwxrwx. Characters -/- are appended to the body file entry and r file entry type indication, presumably to indicate a "regular file".
- Undocumented and inconsistent application of TSK metadata addresses. See here.
- Undocumented and inconsistent application of owner identifier (UID). See here.
- Date and time values do not indicate a time zone or whether daylight saving time applies. Timestamps can be in either UTC or local time depending on the original file system.
- Body file encoding is not specified; UTF-8 is assumed.
- It is unclear how "invalid" Unicode characters should be handled, such as unpaired surrogates in NTFS file names.
- The name field can contain ($FILE_NAME) to indicate the body file entry was derived from an NTFS $FILE_NAME attribute instead of $STANDARD_INFORMATION and $DATA attributes. Note that the exact behavior is not documented by the Sleuth Kit project.
- The name field can contain -> symbolic_link_target but fls does not appear to support this for NTFS. Also see here.
- It is unclear if the symbolic link target can be used in combination with the ($FILE_NAME) suffix.
- It is unclear which characters should be escaped. By observation, | and \ are both escaped with \ in the name field by fls, but mactime is unable to handle a name that contains the | character. See here. Note: Other implementations are known to not escape \.
- It is unclear from the specification if control characters should be escaped. See here. Other implementations are known to escape control characters as x## where ## contains the hexadecimal byte value of the control character.
- The format of the inode field is unclear for file systems like NTFS. The documentation indicates that it uses a TSK Metadata Address; however, by observation the implementation is TSK specific and does not seem to match what is documented. Also see here. Other implementations are known to store the 64-bit NTFS file reference value as the inode field.
- The format of the mode_as_string field is unclear for file systems like NTFS; this can likely be derived from the source code. Also see here.
- The Sleuth Kit currently does not correctly identify symbolic links for NTFS in the body file output. Also see here.
- The atime, mtime, ctime, and crtime fields typically contain the number of seconds since January 1, 1970. It is unknown if a fractional part is allowed by specification. The corresponding mactime tool does allow for a fractional part to be present but ignores it. Also see here. This limits the usefulness of the format for timelines with a vast amount of sub-second activity.
- The format of the MD5 field is undefined; however, documentation indicates:
  - If hashing is disabled, the value will be 0.
  - If hashing is enabled, but no MD5 was calculated, the value will be 00000000000000000000000000000000. See here.
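The escaping observed for fls, where | and \ in the name field are each preceded by \, can be sketched as follows. This is an illustration of the observed behavior only, not code from The Sleuth Kit, and the function names escape_name and unescape_name are hypothetical. Control character escaping is left out because its form is not specified.

```python
def escape_name(name: str) -> str:
    """Escape backslash and pipe with a backslash, as observed for fls."""
    # Escape backslashes first so the ones added for pipes are not doubled.
    return name.replace("\\", "\\\\").replace("|", "\\|")


def unescape_name(raw: str) -> str:
    """Reverse escape_name; unknown escape sequences are kept unchanged."""
    out = []
    chars = iter(raw)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, None)
        if nxt in ("|", "\\"):
            out.append(nxt)          # recognized escape: keep the literal
        elif nxt is None:
            out.append("\\")         # lone trailing backslash: keep as-is
        else:
            out.append("\\" + nxt)   # assumption: leave other sequences alone
    return "".join(out)


for sample in ["plain.txt", "a|b", "dir\\file", "odd\\|mix"]:
    assert unescape_name(escape_name(sample)) == sample
```

Since other implementations are known to not escape \, a reader that unescapes names may need to know which tool produced the file.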
FAT-12, FAT-16 and FAT-32
Output of fls includes:
* regular files and directories
* volume label directory entries
* Virtual metadata file $MBR, which represents the FAT Boot Record
* Virtual metadata file $FAT#, which represents the File Allocation Table, where # is the number of the table e.g. 1 or 2
* Virtual metadata file $OrphanFiles
Note that FAT-12, FAT-16 and FAT-32 have no root directory entry.
The MD5 calculation of fls includes:
* Contents of regular files
* Contents of the directory entries data stream
* Contents of volume label directory entries, with " (Volume Label Entry)" appended to the name
* Contents of "Virtual metadata files/directories" like $MBR
Noteworthy observed behavior:
* the root directory and virtual metadata files have a mode_as_string value of ----------
* the inode for regular file entries can be calculated as follows: (((offset of directory entry / sector size) - data area start sector) * (sector size / directory entry size)) + 3 + ((offset of directory entry % sector size) / directory entry size), where 3 is a hardcoded "first normalized inode number"
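The inode calculation above can be written out as a small Python function. This is a sketch under assumptions: the function and parameter names are hypothetical, the divisions are taken to be integer divisions, the offset is assumed to be a byte offset measured from the same origin as the data area start sector, and the directory entry size defaults to the 32 bytes used by FAT directory entries.

```python
FAT_DIRECTORY_ENTRY_SIZE = 32   # bytes per FAT directory entry
FIRST_NORMALIZED_INODE = 3      # hardcoded "first normalized inode number"


def fat_inode(dirent_offset: int, sector_size: int,
              data_area_start_sector: int,
              dirent_size: int = FAT_DIRECTORY_ENTRY_SIZE) -> int:
    """Compute the inode of a regular file entry from its directory entry offset."""
    entries_per_sector = sector_size // dirent_size
    # Sector that holds the directory entry, relative to the data area start.
    relative_sector = (dirent_offset // sector_size) - data_area_start_sector
    # Position of the directory entry within its sector.
    index_in_sector = (dirent_offset % sector_size) // dirent_size
    return (relative_sector * entries_per_sector
            + FIRST_NORMALIZED_INODE
            + index_in_sector)


# Third directory entry in the first sector of a data area starting at sector 100.
print(fat_inode(100 * 512 + 2 * 32, sector_size=512, data_area_start_sector=100))
```

With 512-byte sectors each sector holds 16 directory entries, so the first entry of the following sector gets an inode that is 16 higher than the first entry of the current one.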
NTFS
Output of fls includes:
* regular files and directories
* symbolic links and junctions
* $FILE_NAME attributes, with " ($FILE_NAME)" appended to the name
* Alternate Data Streams (ADS) like $BadClus:$Bad
* Named indexes like $Secure:$SII
* file system metadata files like $MFT and $Bitmap
* Virtual metadata file $OrphanFiles
Note that the root directory entry is not included.
The MD5 calculation of fls includes:
* Contents of regular files
* Contents of file system metadata files. Note that $BadClus:$Bad is treated as if it were 0 bytes in size
* Contents of the directory entries data stream
* Contents of symbolic links data stream, not its target
Noteworthy observed behavior:
* Multiple entries for the same NTFS ADS. Also see here.
HFS+ and HFSX
Output of fls includes:
* regular files and directories
* symbolic links, with " -> " followed by the symbolic link target appended to the name
* Virtual metadata file $CatalogFile
The MD5 calculation of fls includes:
* Contents of regular files
* Contents of the directory entries data stream
* Contents of symbolic links data stream, not its target
* Virtual metadata files like $CatalogFile
Noteworthy observed behavior:
* On HFS+ and HFSX the / character in a file name will be replaced by :, which
corresponds with the behavior of Mac OS Terminal. Also see here.
* For hard links on HFS+ the Catalog Node Identifier (CNID) of the link target (indirect node) file record is used as the inode value instead of the CNID of the (hard link) file record itself. This matches the behavior of Mac OS (file) stat as described here, in the section "Hard Links".
ext2, ext3 and ext4
Output of fls includes:
* regular files and directories
* symbolic links, with " -> " followed by the symbolic link target appended to the name
* Virtual metadata file $OrphanFiles
Note that the root directory entry is not included.
The MD5 calculation of fls includes:
* Contents of regular files
* Contents of the directory entries data stream
* Contents of named pipes and character devices, but not block devices
* Contents of symbolic links data stream, not its target
* Virtual metadata files like $OrphanFiles
Output Format
The Sleuth Kit 4.7.0 was used to create the example below from a sample NTFS MFT file entry:
Body file data:
0|/$BadClus|8-128-2|r/rr-xr-xr-x|0|0|0|1580550524|1580550524|1580550524|1580550524
0|/$BadClus:$Bad|8-128-1|r/rr-xr-xr-x|0|0|7270400|1580550524|1580550524|1580550524|1580550524
0|/$BadClus ($FILE_NAME)|8-48-3|r/rr-xr-xr-x|0|0|82|1580550524|1580550524|1580550524|1580550524
Corresponding NTFS MFT entry:
MFT Entry Header Values:
Entry: 8 Sequence: 8
$LogFile Sequence Number: 1069751
Allocated File
Links: 1
$STANDARD_INFORMATION Attribute Values:
Flags: Hidden, System
Owner ID: 0
Security ID: 256 ()
Created: 2020-02-01 10:48:44.857384500 (CET)
File Modified: 2020-02-01 10:48:44.857384500 (CET)
MFT Modified: 2020-02-01 10:48:44.857384500 (CET)
Accessed: 2020-02-01 10:48:44.857384500 (CET)
$FILE_NAME Attribute Values:
Flags: Hidden, System
Name: $BadClus
Parent MFT Entry: 5 Sequence: 5
Allocated Size: 0 Actual Size: 0
Created: 2020-02-01 10:48:44.857384500 (CET)
File Modified: 2020-02-01 10:48:44.857384500 (CET)
MFT Modified: 2020-02-01 10:48:44.857384500 (CET)
Accessed: 2020-02-01 10:48:44.857384500 (CET)
Attributes:
Type: $STANDARD_INFORMATION (16-0) Name: N/A Resident size: 72
Type: $FILE_NAME (48-3) Name: N/A Resident size: 82
Type: $DATA (128-2) Name: N/A Resident size: 0
Type: $DATA (128-1) Name: $Bad Non-Resident size: 7270400 init_size: 0
Behavior of the TSK Metadata Address:
- 8-128-2 references the $DATA attribute in MFT entry 8; note that this $DATA attribute is the 3rd attribute on-disk in the MFT entry.
- 8-128-1 references the $DATA attribute in MFT entry 8; note that this $DATA attribute is the 4th attribute on-disk in the MFT entry.
- 8-48-3 references the first $FILE_NAME attribute in MFT entry 8; note that the $FILE_NAME attribute is the 2nd attribute on-disk in the MFT entry.
Note that due to an issue within The Sleuth Kit, NTFS metadata addresses for $FILE_NAME attributes in an $ATTRIBUTE_LIST are not unique, and it is not deterministic as to what the "first" MFT attribute is. See here.
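To make the structure of these addresses concrete, the following Python sketch splits an inode field such as 8-128-2 into its parts. The names NtfsMetadataAddress and parse_metadata_address are hypothetical, and reading the three parts as MFT entry, attribute type, and attribute identifier is an interpretation based on the example above rather than a documented guarantee. Accepting a plain number without attribute parts is also an assumption.

```python
from typing import NamedTuple, Optional

# Attribute type values as they appear in the example MFT entry.
NTFS_ATTRIBUTE_TYPES = {
    16: "$STANDARD_INFORMATION",
    48: "$FILE_NAME",
    128: "$DATA",
}


class NtfsMetadataAddress(NamedTuple):
    mft_entry: int
    attribute_type: Optional[int]
    attribute_id: Optional[int]


def parse_metadata_address(inode_field: str) -> NtfsMetadataAddress:
    """Split an inode field like 8-128-2 into entry, type, and identifier."""
    parts = [int(part) for part in inode_field.split("-")]
    if len(parts) == 1:
        # Assumption: a plain number carries no attribute information.
        return NtfsMetadataAddress(parts[0], None, None)
    if len(parts) == 3:
        return NtfsMetadataAddress(*parts)
    raise ValueError(f"unexpected metadata address: {inode_field!r}")


address = parse_metadata_address("8-48-3")
print(address.mft_entry, NTFS_ATTRIBUTE_TYPES.get(address.attribute_type))
```

Given the uniqueness issue noted above, such a parsed address should not be relied on as a unique key for $FILE_NAME entries.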
External Links
- Body file - SleuthKit Wiki
- Bodyfile format, by dfImageTools project