09 - File system analysis techniques Flashcards

(47 cards)

1
Q

Why do we want to analyze a file system?

A

To access overt content
To access deleted content
To access hidden content

2
Q

What is the default file system for many Linux distributions?

A

The ExtX file system family.

3
Q

Overview of main ExtX data structures.

A

File contents are stored inside blocks (e.g., 4KB)
The blocks allocated to a file are tracked by a record called an inode
Directory entries associate the file name with the file’s inode

4
Q

Inode block pointers

A

Each inode is the root of an unbalanced tree of blocks that
belong to a given file.
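
As a rough illustration of how unbalanced this tree is, the sketch below counts the blocks reachable at each level. It assumes the classic Ext2/Ext3 inode layout (12 direct pointers plus one single, one double, and one triple indirect pointer) and 4-byte block addresses, neither of which is stated on the card; the function name is made up for this example.

```python
def addressable_blocks(block_size: int, direct: int = 12, ptr_size: int = 4) -> dict:
    """Count data blocks reachable from one inode at each indirection level."""
    ptrs = block_size // ptr_size  # pointers that fit in one indirect block
    levels = {
        "direct": direct,
        "single_indirect": ptrs,
        "double_indirect": ptrs ** 2,
        "triple_indirect": ptrs ** 3,
    }
    levels["total"] = sum(levels.values())
    return levels


for name, count in addressable_blocks(4096).items():
    print(f"{name:16} {count:>13,}")
```

Small files never leave the direct pointers, so the tree is shallow for them and only becomes deep for very large files.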

5
Q

How do we keep track of inode and block allocation?

A

Using bitmaps: bit arrays in which each bit indicates allocation status
- Inode bitmap: tells which inodes are allocated to files
- Block bitmap: tells which data blocks are allocated to files
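
A minimal sketch of how such a bitmap can be read. It assumes that bit 0 is the least significant bit of the first byte, which is how ExtX bitmaps are commonly described; the function names are hypothetical.

```python
def is_allocated(bitmap: bytes, index: int) -> bool:
    """Return True if bit `index` is set (bit 0 = lowest bit of byte 0)."""
    return bool(bitmap[index // 8] >> (index % 8) & 1)


def allocated_indexes(bitmap: bytes, count: int) -> list[int]:
    """List every index in [0, count) whose bit is set."""
    return [i for i in range(count) if is_allocated(bitmap, i)]


bitmap = bytes([0b00000101, 0b10000000])
print(allocated_indexes(bitmap, 16))
```

The same helper works for both the inode bitmap and the block bitmap; only the meaning of the index changes.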

6
Q

What happens when a new file is created in Ext2?

A

Create a new file
- Allocate a new inode (inode bitmap is updated)
- Allocate required blocks (block bitmap is updated)
- Allocate entry in directory (entry points to inode)
- Update data blocks, inode, and directory entry

7
Q

What happens when a file is deleted in Ext2?

A

Delete an existing file
- Update the inode and block bitmaps, unallocate directory entry
- Most contents of inode, data blocks, and directory entry remain intact.

Deleting a file involves updating the record length of the previous entry in the directory.

The name does not need to be physically erased from the directory. Instead:

The rec_len of the previous entry is increased so that it "swallows" the entry of the deleted file.

The next read of the directory therefore skips the deleted entry, because the previous entry now covers its space.
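
The rec_len trick can be sketched on a toy directory block. The entry layout used here (4-byte inode, 2-byte rec_len, 1-byte name_len, 1-byte file type, then the name padded to a multiple of 4 bytes) is the classic Ext2 layout and is an assumption of this example; all function names are hypothetical.

```python
import struct

HEADER = struct.Struct("<IHBB")  # inode, rec_len, name_len, file_type


def pack_entry(inode, name, rec_len=None, file_type=1):
    raw = name.encode()
    min_len = (HEADER.size + len(raw) + 3) & ~3  # round up to 4 bytes
    rec_len = rec_len or min_len
    return (HEADER.pack(inode, rec_len, len(raw), file_type) + raw).ljust(rec_len, b"\0")


def walk(block):
    """Yield (offset, inode, rec_len, name) by following rec_len, as the FS does."""
    off = 0
    while off + HEADER.size <= len(block):
        inode, rec_len, name_len, _ = HEADER.unpack_from(block, off)
        if rec_len < HEADER.size:
            break
        start = off + HEADER.size
        yield off, inode, rec_len, block[start:start + name_len].decode(errors="replace")
        off += rec_len


def delete(block, name):
    """Ext2-style delete: grow the previous entry's rec_len over the victim."""
    buf, prev = bytearray(block), None
    for off, _, rec_len, entry_name in walk(block):
        if entry_name == name and prev is not None:
            struct.pack_into("<H", buf, prev[0] + 4, prev[1] + rec_len)
            return bytes(buf)
        prev = (off, rec_len)
    raise KeyError(name)


def recover_deleted(block):
    """Look for leftover entries hidden in the slack of each live entry."""
    found = []
    for off, _, rec_len, name in walk(block):
        pos = off + ((HEADER.size + len(name.encode()) + 3) & ~3)
        end = off + rec_len
        while pos + HEADER.size <= end:
            inode, d_len, d_nlen, _ = HEADER.unpack_from(block, pos)
            if inode == 0 or d_len < HEADER.size or d_nlen == 0 or pos + HEADER.size + d_nlen > end:
                break
            start = pos + HEADER.size
            found.append((inode, block[start:start + d_nlen].decode(errors="replace")))
            pos += (HEADER.size + d_nlen + 3) & ~3
    return found


block = pack_entry(2, ".") + pack_entry(12, "keep.txt") + pack_entry(13, "secret.txt")
block += pack_entry(14, "last.txt", rec_len=128 - len(block))
after_delete = delete(block, "secret.txt")
print([name for _, _, _, name in walk(after_delete)])
print(recover_deleted(after_delete))
```

A normal directory walk no longer shows secret.txt, yet its name and inode number are still sitting inside the enlarged keep.txt record.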

8
Q

What are directories?

A

Directories are special files that contain file name entries.

In Ext2, a directory has no separate on-disk structure of its own.

It is simply stored in one or more blocks of the file system, like any other file.

The content of those blocks is a sequence of directory entries.

9
Q

What does a journal do?

A

Journal records transactions of FS operations.

A journal is a special area of the disk (or sometimes a file) that the file system uses to keep a record (“log”) of recent operations before they are fully written to the main disk structures.

Think of it as a diary of pending updates to the file system.

If the system crashes or loses power in the middle of those updates, the disk can become inconsistent - e.g., a file is marked in the directory but its inode was never updated.

The journal prevents that by letting the system replay or roll back incomplete changes after a crash.

When the system wants to modify something, it first writes the intended changes to the journal (a log).

Once the journal entry is safely written to disk, it is marked as "committed."

The actual data structures are then updated, after which the journal space can be reused.

If the computer crashes during an update:

On reboot, the system checks the journal.

If it finds committed transactions that were not fully applied, it replays them (finishes them); incomplete, uncommitted transactions are discarded safely.

10
Q

Why are journals useful for forensics?

A

Journals as circular buffers

The journal isn’t infinite — it’s usually a fixed-size region on disk (say a few MBs).

It’s implemented as a circular buffer, meaning:

New transactions are appended at the “front”.

When it gets full, the system wraps around and overwrites the oldest entries.

So the journal constantly cycles through old and new file system updates.

Append-only behavior

Journals are append-only — you only add new records.

When updates (transactions) are no longer needed for crash recovery, the OS simply moves a pointer that marks them as “freed”.

Nothing is physically deleted.

The old data remains on disk until it’s overwritten by new transactions.
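
The circular, append-only behavior can be illustrated with a toy model. This is only a sketch of the idea, not how any real journal is laid out on disk; the class and method names are invented for the example.

```python
class ToyJournal:
    """Fixed-size circular log: releasing a record only moves a pointer."""

    def __init__(self, slots: int):
        self.slots = [None] * slots
        self.head = 0  # next slot to write
        self.tail = 0  # oldest record still needed for recovery
        self.live = 0  # number of records still needed

    def append(self, record: str) -> None:
        if self.live == len(self.slots):
            raise RuntimeError("journal full: older records must be released first")
        self.slots[self.head] = record  # overwrites whatever was there before
        self.head = (self.head + 1) % len(self.slots)
        self.live += 1

    def release(self, count: int = 1) -> None:
        count = min(count, self.live)
        self.tail = (self.tail + count) % len(self.slots)  # nothing is erased
        self.live -= count

    def _live_indexes(self):
        return [(self.tail + i) % len(self.slots) for i in range(self.live)]

    def live_records(self):
        return [self.slots[i] for i in self._live_indexes()]

    def stale_records(self):
        live = set(self._live_indexes())
        return [r for i, r in enumerate(self.slots) if i not in live and r is not None]


journal = ToyJournal(4)
for op in ["create a.txt", "write a.txt", "unlink a.txt"]:
    journal.append(op)
journal.release(3)
print(journal.stale_records())  # released, but still readable
```

The stale records are what an analyst hopes to find: they stay readable until new transactions wrap around and overwrite their slots.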

Why this is useful for forensics

Because old journal entries linger on disk (sometimes for minutes or hours before being overwritten), forensic analysts can:

Recover traces of recent file activity, even after files were deleted or modified.

See metadata about files that once existed — their names, inode numbers, timestamps, etc.

Potentially reconstruct who did what recently, even if the current filesystem no longer shows it.

11
Q

In which inode does Ext3 typically maintain a journal?

A

In inode 8.

12
Q

On-disk organization of an ExtX file system.

A

Organized as a sequence of logical blocks.
Blocks are grouped into larger units called block groups.
The first block, also known as the boot block, is not used by the FS.

13
Q

What is the superblock?

A

Block group internals: the superblock
Superblock: contains fundamental information about the file system
E.g., block size, total number of blocks, number of blocks per group…
- The size of the superblock is 1024 bytes

The superblock is replicated in all block groups
- The first superblock is 1024 bytes past the beginning of the FS
- Copies of the superblock are in the first block of each block group
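
A hedged sketch of reading a few of these fields from a raw image. The 1024-byte offset comes from the card; the field offsets (block count at 4, first data block at 20, log block size at 24, blocks per group at 32, inodes per group at 40, magic at 56) and the magic value 0xEF53 follow the usual Ext2 superblock layout and are assumptions here. The function name is hypothetical.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # bytes from the start of the file system
EXT_MAGIC = 0xEF53        # assumed ExtX signature


def parse_superblock(image: bytes) -> dict:
    """Decode a handful of superblock fields from a raw file system image."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 1024:
        raise ValueError("image too small to hold a superblock")
    inodes, blocks = struct.unpack_from("<II", sb, 0)
    first_data_block, log_block_size = struct.unpack_from("<II", sb, 20)
    (blocks_per_group,) = struct.unpack_from("<I", sb, 32)
    (inodes_per_group,) = struct.unpack_from("<I", sb, 40)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT_MAGIC:
        raise ValueError(f"unexpected magic 0x{magic:04x}")
    # Ceiling division: a partial last group still counts as a group.
    groups = -(-(blocks - first_data_block) // blocks_per_group)
    return {
        "inodes": inodes,
        "blocks": blocks,
        "block_size": 1024 << log_block_size,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        "block_groups": groups,
    }


# Synthetic image so the example runs without a real disk image.
demo = bytearray(4096)
struct.pack_into("<II", demo, 1024 + 0, 16384, 65636)
struct.pack_into("<II", demo, 1024 + 20, 0, 2)
struct.pack_into("<I", demo, 1024 + 32, 32768)
struct.pack_into("<I", demo, 1024 + 40, 8192)
struct.pack_into("<H", demo, 1024 + 56, EXT_MAGIC)
print(parse_superblock(bytes(demo)))
```

On a real case, the image bytes would come from an acquired disk image rather than the synthetic buffer built above.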

14
Q

What is the group descriptor table?

A

Block group internals: group descriptor table

Group descriptor table: array of descriptors, one per block group
Every copy of the group descriptor table describes all block groups
Each descriptor provides the location of the block bitmap, inode bitmap, and inode table, the number of free blocks and inodes, etc.
The size of the descriptor table depends on how many groups are defined
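
The sketch below decodes descriptors and estimates the table size. It assumes the 32-byte Ext2/Ext3 descriptor (three 4-byte block addresses, three 2-byte counters, then padding), which is not given on the card; names are hypothetical.

```python
import struct

# block bitmap, inode bitmap, inode table, free blocks, free inodes, used dirs, padding
GROUP_DESC = struct.Struct("<IIIHHH14x")


def parse_group_descriptors(table: bytes, group_count: int) -> list[dict]:
    """Decode `group_count` descriptors from the raw descriptor table."""
    descriptors = []
    for group in range(group_count):
        bb, ib, it, free_blocks, free_inodes, dirs = GROUP_DESC.unpack_from(
            table, group * GROUP_DESC.size
        )
        descriptors.append({
            "group": group,
            "block_bitmap": bb,
            "inode_bitmap": ib,
            "inode_table": it,
            "free_blocks": free_blocks,
            "free_inodes": free_inodes,
            "used_dirs": dirs,
        })
    return descriptors


def descriptor_table_blocks(group_count: int, block_size: int) -> int:
    """Blocks needed to hold the table (ceiling division)."""
    return -(-group_count * GROUP_DESC.size // block_size)


table = GROUP_DESC.pack(3, 4, 5, 100, 50, 2) + GROUP_DESC.pack(32771, 32772, 32773, 7, 8, 9)
for desc in parse_group_descriptors(table, 2):
    print(desc)
print(descriptor_table_blocks(200, 1024))
```

With 200 groups and 1 KB blocks, the table needs 7 blocks, which shows how its size grows with the number of groups.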

15
Q

Block group internals: block bitmap and inode bitmap

A

Block bitmap: monitors the state of each data block
Inode bitmap: monitors the state of each inode

16
Q

Block group internals: inode table and data blocks

A

Inode table: contains the inode data structures.
Each inode contains the information about a single physical file on the system

Data blocks: contain chunks of data that belong to files or directories
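
To make the inode table concrete, here is a small worked example of locating an inode on disk. It assumes that inode numbers start at 1 and that inodes are 128 bytes (the classic Ext2 size); the inode table start blocks and the function name are made up.

```python
def locate_inode(ino, inodes_per_group, inode_size, inode_table_blocks, block_size):
    """Return (group, index within group, byte offset in the image) for inode `ino`."""
    group, index = divmod(ino - 1, inodes_per_group)  # inode numbering starts at 1
    offset = inode_table_blocks[group] * block_size + index * inode_size
    return group, index, offset


# Hypothetical layout: inode tables start at blocks 5 and 32773.
tables = [5, 32773]
print(locate_inode(8, 8192, 128, tables, 4096))
print(locate_inode(8193, 8192, 128, tables, 4096))
```

The start block of each group's inode table would normally be read from the group descriptor table.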

17
Q

What is TSK?

A

TSK stands for The Sleuth Kit — it’s one of the most widely used digital forensics toolkits in the world.

TSK is an open-source collection of command-line forensic tools that let investigators:

Analyze disk images (like .dd, .img, .E01, etc.)
Examine file systems (FAT, NTFS, Ext2/3/4, etc.)
Recover deleted files
Inspect metadata, inodes, partitions, and journals

It was created by Brian Carrier, and it’s the backend used by Autopsy, the popular graphical forensic interface.

18
Q

In ExtX, evidence in the content category includes …?

A

In ExtX, evidence in this category includes blocks and the block bitmap.

  • The content category includes the storage locations allocated to files so
    that they can save data: data units (e.g., called blocks in ExtX)
  • A data unit is a set of consecutive sectors of the volume
  • Each sector has a physical address and a logical FS address
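
The relationship between a data unit and its sectors is simple arithmetic, sketched below. The 4096-byte block, 512-byte sector, and partition start sector are example values, and the function name is hypothetical.

```python
def block_to_sectors(block, block_size=4096, sector_size=512, fs_start_sector=0):
    """Return (logical FS sectors, physical sectors) covered by one block."""
    per_block = block_size // sector_size
    first = block * per_block
    logical = range(first, first + per_block)
    physical = range(fs_start_sector + first, fs_start_sector + first + per_block)
    return logical, physical


logical, physical = block_to_sectors(1, fs_start_sector=2048)
print(list(logical))
print(list(physical))
```

The logical addresses are relative to the start of the file system, while the physical addresses are relative to the start of the disk.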
19
Q

What does the dcat command do, and where is it from?

A

dcat dumps the content of a data unit, e.g., block 1 of an Ext3 FS image.
It is from TSK (current releases call the tool blkcat).

20
Q

What is Data Unit viewing?

A

In digital forensics, data unit viewing refers to the ability to examine the raw contents of a storage device at the smallest addressable unit level — such as sectors, clusters, or blocks — without any interpretation by the file system.

So, instead of seeing “files” or “folders”, you’re looking directly at the binary (or hexadecimal) data that lives in those low-level units on disk.

⚙️ “Data unit” = smallest chunk of readable data

Depending on the context:

On a hard disk → a sector (usually 512 bytes or 4096 bytes)
On a file system → a block or cluster (can be 1 KB, 4 KB, etc.)
On an SSD or flash → a page

So, a “data unit” is basically the smallest chunk the system can read or write at once.
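
Data unit viewing can be approximated in a few lines: seek to the unit and dump it as hex. This is a sketch of the idea only, not how any particular forensic tool is implemented; the function names are hypothetical and the in-memory stream stands in for a disk image file.

```python
import io


def read_unit(stream, index: int, unit_size: int) -> bytes:
    """Return the raw bytes of data unit `index` from a seekable image stream."""
    stream.seek(index * unit_size)
    return stream.read(unit_size)


def hexdump(data: bytes, base: int = 0, width: int = 16) -> str:
    """Format bytes as offset, hex values, and printable ASCII."""
    pad = width * 3 - 1
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{base + i:08x}  {hexpart:<{pad}}  {text}")
    return "\n".join(lines)


image = io.BytesIO(bytes(32) + b"forensic evidence lives here....")
unit = read_unit(image, 1, 32)
print(hexdump(unit, base=32))
```

With a real image, the stream would be an open file in binary mode and the unit size would match the file system's block size.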

21
Q

Logical file system-level searching.

A

We know what to look for, but not the place: a logical file
system search looks in each DU for specific values
- E.g., search for “forensics” or a specific file header value
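
A minimal sketch of such a search, checking each data unit on its own for a byte pattern. The function name and the tiny 16-byte unit size are choices made for the example.

```python
def search_data_units(image: bytes, needle: bytes, unit_size: int) -> list[int]:
    """Return the indexes of the data units that contain `needle`."""
    hits = []
    unit_count = (len(image) + unit_size - 1) // unit_size
    for index in range(unit_count):
        unit = image[index * unit_size:(index + 1) * unit_size]
        if needle in unit:
            hits.append(index)
    return hits


image = bytes(16) + b"..forensics".ljust(16, b".") + bytes(16)
print(search_data_units(image, b"forensics", 16))
```

Because each unit is examined in isolation, the result says where a value sits on disk but not which file, if any, owns that unit.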

22
Q

In logical file system-level searching, if the value is located in two non-consecutive DUs of a fragmented file, will search find it?

A

No. A logical file system search examines each DU on its own, so a value split across two non-consecutive DUs of a fragmented file is missed. A logical file search, which follows the DUs allocated to each file in order, would find it.

23
Q

Unallocated data unit searching.

A

If we do not know the location of evidence, but we know that it
is unallocated, we can focus our attention there.
Some tools can extract all unallocated DUs to a separate file;
others restrict analysis to only the unallocated areas.
The definition of unallocated space may vary: you need to
know what your analysis tool considers unallocated data.
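
Extracting the unallocated DUs to a separate buffer can be sketched as follows. Here "unallocated" means "bit clear in the block bitmap", which is only one possible definition, and the bit order (lowest bit first) is an assumption; function names are hypothetical.

```python
def is_allocated(bitmap: bytes, index: int) -> bool:
    """Bit 0 is assumed to be the lowest bit of the first byte."""
    return bool(bitmap[index // 8] >> (index % 8) & 1)


def extract_unallocated(image: bytes, bitmap: bytes, unit_size: int) -> bytes:
    """Concatenate every data unit whose bitmap bit is clear."""
    count = len(image) // unit_size
    return b"".join(
        image[i * unit_size:(i + 1) * unit_size]
        for i in range(count)
        if not is_allocated(bitmap, i)
    )


image = b"AAAA" + b"dele" + b"BBBB" + b"ted!"
bitmap = bytes([0b00000101])  # units 0 and 2 are allocated
print(extract_unallocated(image, bitmap, 4))
```

The extracted buffer can then be searched or carved without touching allocated content.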

24
Q

Consistency checking.

A

Consistency checks allow us to determine if the file system is in a suspicious state.
Examples: orphans and double metadata entries

1️⃣ Orphaned inodes (orphans)

An orphan is an inode that’s marked as allocated (in use),
but no directory entry points to it.

In other words, the data still exists on disk, but the file has no name anymore.
(This often happens when a file is deleted but not fully cleaned up, or after a crash.)

2️⃣ Double metadata entries

A data block or inode is referenced by more than one file, even though the file system doesn’t allow that (except for hard links).

This means two files claim to own the same physical blocks — a serious inconsistency.
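
Both checks can be sketched on simplified in-memory structures. The inputs (a set of allocated inode numbers, a name-to-inode mapping, and an inode-to-blocks mapping) are stand-ins invented for this example, and reserved inodes are ignored for simplicity.

```python
from collections import defaultdict


def find_orphans(allocated_inodes: set[int], dir_entries: dict[str, int]) -> set[int]:
    """Allocated inodes that no directory entry points to."""
    return allocated_inodes - set(dir_entries.values())


def find_shared_blocks(inode_blocks: dict[int, list[int]]) -> dict[int, list[int]]:
    """Blocks claimed by more than one inode (hard links share an inode, so they pass)."""
    owners = defaultdict(list)
    for ino, blocks in inode_blocks.items():
        for block in blocks:
            owners[block].append(ino)
    return {block: inos for block, inos in owners.items() if len(inos) > 1}


allocated = {2, 11, 12, 13}
names = {"/": 2, "b.txt": 11, "a.txt": 12, "a_link.txt": 12}
blocks = {11: [100, 101], 12: [200], 13: [101, 300]}
print(find_orphans(allocated, names))
print(find_shared_blocks(blocks))
```

In this toy data, inode 13 is an orphan and block 101 is claimed by two different inodes.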

Another check examines each data unit that is listed as damaged and tests whether it is filled with non-zero data.

“Listed as damaged”

Some blocks are marked in the block bitmap (or in a separate list) as damaged (bad blocks): the operating system considers that they must not be used, usually because of physical errors on the disk.

“Filled with non-zero data”

This is the interesting point: even if a block is marked as “damaged”, it may contain valid data.
A consistency check can read these blocks to see whether:

They are entirely zero (as would be expected for blocks that are really inactive or empty), or

They hold non-zero data, which may indicate that:

The block was wrongly marked as damaged; or

Someone hid data there on purpose (for example, data hiding in “bad block” areas); or

The disk is corrupted and the system did not clean up correctly.
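
The bad-block check reduces to testing whether each listed unit is all zeros. A minimal sketch, with a hypothetical function name and a made-up list of damaged units:

```python
def suspicious_bad_units(image: bytes, bad_units: list[int], unit_size: int) -> list[int]:
    """Return the damaged-listed units that contain any non-zero byte."""
    return [
        unit for unit in bad_units
        if any(image[unit * unit_size:(unit + 1) * unit_size])
    ]


image = bytes(4) + b"hide" + bytes(4)
print(suspicious_bad_units(image, [0, 1], 4))
```

Any unit returned here deserves a closer look with a data unit viewer.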

25
Q

Summary of techniques in the content category. (6)

A

1. Data unit viewing
2. Logical file system-level searching
3. Unallocated data unit searching
4. Consistency analysis
5. Data carving (next class)
6. Anti-forensic techniques (e.g., data wiping)

26
Q

What is in the metadata category?

A

The metadata category contains the descriptive data.
- E.g., last accessed time, addresses of DUs allocated to a file

Many metadata structures are stored in a fixed or dynamic-length table, and each entry has an address.

In ExtX, evidence in this category includes inodes and the inode bitmap.

27
Q

1. Metadata lookup in the metadata category.

A

We found the name of a file that points to a specific metadata entry and we want to learn more about the file.
We just need to locate the metadata and process it.

File names do not "contain" metadata directly: they point to a separate structure, the inode, which stores all of the file's metadata.

28
Q

What does istat do?

A

It shows the metadata of the file associated with inode #x.

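To give an idea of what such a metadata summary contains, the sketch below decodes the first 100 bytes of a raw inode. The field order (mode, uid, size, four timestamps, gid, link count, sector count, flags, one OS-specific word, then 15 block pointers) follows the classic Ext2 inode and is an assumption; this is not how istat itself is implemented, and the names are hypothetical.

```python
import struct
from datetime import datetime, timezone

INODE = struct.Struct("<HHIIIIIHHIII15I")  # first 100 bytes of an Ext2-style inode


def _ts(seconds: int):
    """Convert a Unix timestamp to ISO 8601 UTC, or None if it is zero."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat() if seconds else None


def parse_inode(raw: bytes) -> dict:
    fields = INODE.unpack_from(raw)
    mode, uid, size, atime, ctime, mtime, dtime, gid, links, sectors, _flags, _osd1 = fields[:12]
    return {
        "mode": oct(mode),
        "uid": uid,
        "gid": gid,
        "size": size,
        "links": links,
        "sectors_512": sectors,
        "accessed": _ts(atime),
        "inode_changed": _ts(ctime),
        "modified": _ts(mtime),
        "deleted_at": _ts(dtime),  # non-zero only after deletion
        "block_pointers": list(fields[12:]),
    }


raw = INODE.pack(0o100644, 1000, 6000, 1700000000, 1700000000, 1700000000, 0,
                 1000, 1, 16, 0, 0, *([500, 501] + [0] * 13))
for key, value in parse_inode(raw).items():
    print(f"{key:15} {value}")
```

A non-empty deletion time together with surviving block pointers is exactly the kind of residue that makes deleted files recoverable.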
29
Q

2. Logical file viewing in the metadata category.

A

After we look up the metadata for a file, we can view the file contents by reading the DUs allocated to the file.
This process occurs in both the metadata and content categories.
During this process, we need to keep slack space in mind.

What slack space is

When a file is saved on disk, the disk is divided into data blocks (data units, DUs), for example 4 KB each.
A file must allocate a full DU, even if it needs only part of it, so a file may not completely fill its last block.

Example:
Block = 4 KB (4096 bytes)
File = 6000 bytes
The file occupies 2 blocks:
- First block = 4096 bytes (completely filled)
- Second block = 1904 bytes of file data + 2192 bytes remaining
The remaining 2192 bytes are slack space.

Slack space in practice

When you do logical file viewing:
- You read the blocks (data units) allocated to the file.
- You see the actual content of the file.
- In the last block, you need to watch for slack space, because it can hold residual evidence that does not belong to the file but is still on disk.
If not wiped, the sectors will contain data from a previous file!

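The slack arithmetic from the example can be checked with a short helper (the function name is hypothetical):

```python
def slack_space(file_size: int, unit_size: int = 4096) -> tuple[int, int]:
    """Return (data units allocated, slack bytes in the last unit)."""
    units = -(-file_size // unit_size)  # ceiling division
    return units, units * unit_size - file_size


print(slack_space(6000))  # the 6000-byte file from the example
print(slack_space(8192))  # a file that ends exactly on a block boundary
```

A file whose size is an exact multiple of the block size leaves no slack at all.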
30
Q

What does icat do?

A

The icat tool allows you to view the contents of the data units that are allocated to a metadata structure.
- If the -s flag is given, the slack space is shown
- If the -r flag is given, it attempts to recover deleted files

31
Q

3. Logical file searching in the metadata category.

A

Oftentimes, however, we need to find a file based on its content
- For example, we want all files with the term "forensics" in them
- That is when we use a logical file search

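A sketch of a logical file search: each file's data units are read in file order and the reassembled content is searched. The image, the name-to-blocks mapping, and the function name are made up for the example; in practice the block lists would come from each file's metadata.

```python
def logical_file_search(image: bytes, files: dict[str, list[int]],
                        needle: bytes, unit_size: int) -> list[str]:
    """Return the names of files whose reassembled content contains `needle`."""
    matches = []
    for name, blocks in files.items():
        content = b"".join(image[b * unit_size:(b + 1) * unit_size] for b in blocks)
        if needle in content:
            matches.append(name)
    return matches


# a.txt is fragmented: its two units are separated by a unit of b.txt.
image = b"....fore" + b"--------" + b"nsics..."
files = {"a.txt": [0, 2], "b.txt": [1]}
print(logical_file_search(image, files, b"forensics", 8))
```

No single 8-byte unit of this image contains the whole keyword, so a unit-by-unit search would miss it, while the search over reassembled file content reports a.txt.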
32
Q

4. (Un)allocated metadata analysis in the metadata category.

A

When searching for deleted content, your evidence could be sitting in an unallocated metadata entry
- You cannot see it because it no longer has a name
- Some tools can list the unallocated entries for you, e.g., ils, from TSK

33
Q

Give me a major method to search for evidence in deleted files.

A

A major method is metadata-based recovery.
- Metadata-based recovery works when metadata from the deleted file still exists
- It does not work if the metadata was wiped or reallocated to a new file

Note: metadata and data units can become out of sync because the data units are allocated to new files.

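Metadata-based recovery and the out-of-sync risk can be sketched together: follow the block pointers left in the deleted file's metadata, and flag any block that the bitmap now shows as allocated again. All structures and names here are simplified stand-ins, and the bitmap bit order (lowest bit first) is an assumption.

```python
def is_allocated(bitmap: bytes, index: int) -> bool:
    """Bit 0 is assumed to be the lowest bit of the first byte."""
    return bool(bitmap[index // 8] >> (index % 8) & 1)


def recover_from_metadata(image, block_pointers, size, block_bitmap, unit_size):
    """Return (recovered bytes, blocks that have since been reallocated)."""
    data = b"".join(image[b * unit_size:(b + 1) * unit_size] for b in block_pointers)
    reallocated = [b for b in block_pointers if is_allocated(block_bitmap, b)]
    return data[:size], reallocated


image = b"root" + b"dele" + b"NEW!" + b"ted."
bitmap = bytes([0b00000101])  # units 0 and 2 are currently allocated

print(recover_from_metadata(image, [1, 3], 7, bitmap, 4))  # clean recovery
print(recover_from_metadata(image, [1, 2], 7, bitmap, 4))  # block 2 was reused
```

A non-empty second value means the recovered bytes may partly belong to a newer file and must be treated with caution.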
34
Q

5. Metadata attribute searching and sorting in the metadata category.

A

Search for files based on one of their metadata values.
Example: if we found interesting data in one of the data units, we might want to search the metadata entries for the data unit address.

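The data unit address example can be sketched as a reverse lookup over a simplified inode-to-blocks mapping (the mapping and the function name are invented for illustration):

```python
def inodes_referencing(block: int, inode_blocks: dict[int, list[int]]) -> list[int]:
    """Return the inode numbers whose block list contains `block`."""
    return sorted(ino for ino, blocks in inode_blocks.items() if block in blocks)


inode_blocks = {11: [100, 101], 12: [200], 13: [300, 301]}
print(inodes_referencing(301, inode_blocks))
print(inodes_referencing(999, inode_blocks))
```

An empty result suggests the data unit is not referenced by any metadata entry that was examined.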
35
Q

6. Consistency checking in the metadata category.

A

A consistency check with the metadata may reveal attempts to hide data or internal errors, e.g.,
- Verify that the number of data units allocated is consistent with the size of the file
- Verify that every allocated directory entry has an allocated name that points to it
- Check the ranges of dates or other non-essential data

36
Q

Summary of techniques in the metadata category. (6)

A

1. Metadata lookup
2. Logical file viewing
3. Logical file searching
4. Unallocated metadata analysis
5. Metadata attribute searching and sorting
6. Consistency checking

37
Q

Evidence in the file name category.

A

Includes the names of files and allows the user to refer to a file by its name instead of its metadata address
- In other words: we are analyzing directories

In ExtX, evidence in this category includes directory entries.

An important part of file name analysis is to determine where the root directory is located, e.g., / in ExtX
- Each file system has its own way of defining the location of the root directory

38
Q

With file name-based recovery, how do we recover a file's content?

A

With file name-based recovery, we use a deleted file name and its corresponding metadata address to recover the file content using metadata-based recovery.

39
Q

1. File name listing in the file name category.

A

List the names of the files and directories when searching for evidence based on the name, path, or extension of a file
- First locate the root directory of the file system and metadata
- Then, obtain the file list and corresponding metadata

Featured tool: fls, from TSK

40
Q

2. File name searching in the file name category.

A

Listing file names works well if we know what file we are looking for.
If we don't know the full file name, we can search for the part that we do know
- For example, based on the file's extension
- The process required to search for a name is similar to what we saw for file name listing

Featured tool: ffind, from TSK

41
Q

3. Consistency checking in the file name category.

A

Consistency checks for the file name data include verifying that all allocated names point to allocated metadata structures.

42
Q

Is it valid for some file systems to have multiple file names for the same file?

A

Yes, and many of them implement this functionality by having more than one file name entry with the same metadata address: hard links.

Hard links

A hard link is a directory entry that points to an existing inode.
Since multiple directory entries can point to the same inode, the same file content can appear under different names.

Example: suppose you have a file report.txt with inode 1234. You can create a hard link:

ln report.txt report_backup.txt

Now both report.txt and report_backup.txt point to inode 1234.
Both file names reference the same metadata and data blocks.
If you modify the file through one name, the changes appear when it is accessed via the other name.
The file is only deleted when all hard links are removed (the inode's link count reaches 0).

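The hard link behavior can be observed from Python with the standard library. This sketch assumes a file system that supports hard links (as ExtX does) and works in a temporary directory; the function name is made up.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file and a hard link, then report what the two names share."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "report.txt")
        link = os.path.join(tmp, "report_backup.txt")
        with open(original, "w") as fh:
            fh.write("v1")
        os.link(original, link)  # second name, same inode
        with open(link, "a") as fh:
            fh.write(" + edit via link")
        first, second = os.stat(original), os.stat(link)
        with open(original) as fh:
            content = fh.read()
        os.remove(original)  # removes one name only
        survives = os.path.exists(link)
        return first.st_ino == second.st_ino, first.st_nlink, content, survives


same_inode, link_count, content, survives = hard_link_demo()
print(same_inode, link_count, repr(content), survives)
```

The edit made through the second name is visible through the first, and the content survives the removal of the original name because the link count has not reached 0.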
43
Q

Summary of techniques in the file name category. (3)

A

1. File name listing
2. File name searching
3. Consistency checking

44
Q

File system evidence: File system category.

A

It has the superblock and the group descriptor.

Featured tool: fsstat, from TSK.

45
Q

What is the volume slack?

A

Volume slack is the space between the end of the file system and the end of the partition or physical volume.
- The data structures in this category frequently have unused values that can be used to hide small amounts of data
- A consistency check in this category is to compare the size of the file system with the size of the volume in which it is located
- If the volume is larger, the sectors after the file system are called volume slack and could be used to hide data.

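The size comparison is a one-line calculation, sketched here with example values (a 1 GiB volume of 512-byte sectors and a file system of 262,000 blocks of 4096 bytes); the function name is hypothetical.

```python
def volume_slack_sectors(volume_sectors: int, fs_blocks: int,
                         block_size: int = 4096, sector_size: int = 512) -> int:
    """Sectors left between the end of the file system and the end of the volume."""
    fs_sectors = fs_blocks * (block_size // sector_size)
    return max(volume_sectors - fs_sectors, 0)


print(volume_slack_sectors(2_097_152, 262_000))
```

In practice, the file system size would come from the superblock and the volume size from the partition table.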
46
Q

File system evidence: Application category.

A

It has the journal.

Some file systems contain data that belongs in the application category
- These data are not essential to the FS, and exist as special FS data instead of living inside a normal file
- One of the most common application category features is called journaling

Featured tool: jls, from TSK.

47
Q

FS evidence can be grouped into categories. Name them. (5)

A

- Content category: contains the data that comprise the actual content of a file
- Metadata category: contains the data that describe a file, e.g., file content location, file size, times and dates of file accesses, and access control information
- File name category: contains the data that assign a name to each file (directories)
- File system category: contains the general file system information; the “map” of the FS
- Application category: contains data that provide special features (e.g., journaling)