Sunday, September 20, 2015

Who's your Master? : MFT Parsers Reviewed

The Master File Table (MFT) contains the information related to folders and files on an NTFS system. Brian Carrier (2005) stated, “The Master File Table is the heart of NTFS because it contains the information about all files and directories” (p. 274). Many forensic tools, such as EnCase, FTK and X-Ways, parse the MFT to display the file and folder structure to the user.

During Incident Response, there could be hundreds if not thousands of computers to examine. A way to quickly review these systems for Indicators of Compromise (IOCs) is to grab the MFT file rather than take a full disk image. The MFT file is much smaller in size than a disk image and can be parsed to show existing as well as deleted files on a system.

During a case, I noted some anomalies with a tool that I use to accomplish this task, AnalyzeMFT. This led me to do some testing and verification of several MFT parsers – and I was a little surprised with the results. Foremost, I would like to say that I am appreciative to all the authors of these tools.  My intent with this post is to draw attention to understanding the outputs of these tools so that the examiner can correctly interpret the results.

Many of the differences and issues arose due to the handling of deleted files. The documentation of one of the tools I tested, MFTDump, explains the issues with deleted files in the MFT:
"Since MFTDump only has access to the $MFT file, it is not possible to “chase” down deleted files through the $INDEX_ALLOC structures to determine if the file is an orphan. Instead, the tool uses the resident $FILE_NAME attribute to determine its parent folder, and follows the folder path to the root folder. In the case of deleted files, this information may or may not be accurate. To determine the exact status of a deleted file, you need to analyze the file system in a forensic tool."
Some of the tools did not notify the examiner that the file path associated with the deleted file may be incorrect  – which could lead to some false conclusions.

There are a lot of tools that parse the MFT. For this testing, I focused on tools that are free, command line and output the results into Bodyfile format. The reason I chose to do this is that when I parse the MFT, I am using it to create a timeline, usually in an automated fashion. The one exception to this was the tool MFTDump.  The output was a TSV file that I wrote a parser for that converted it into Bodyfile format.
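As a rough illustration of that kind of conversion, here is a minimal Python sketch that turns tab-separated rows into Bodyfile lines. It is not the script I used: the column names (FullPath, RecNo, Size, Accessed, Modified, Changed, Created) and the timestamp layout are assumptions made for the example and would need to be matched to the real header row. The output follows the Sleuth Kit 3.x layout, MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column names: change these to match the TSV header row.
COLUMNS = {
    "name": "FullPath", "inode": "RecNo", "size": "Size",
    "atime": "Accessed", "mtime": "Modified",
    "ctime": "Changed", "crtime": "Created",
}


def to_epoch(text):
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed UTC) to epoch seconds, 0 if blank."""
    text = text.strip()
    if not text:
        return 0
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def tsv_to_bodyfile(tsv_text):
    """Yield one bodyfile line for each row of tab-separated input text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        times = [str(to_epoch(row[COLUMNS[key]]))
                 for key in ("atime", "mtime", "ctime", "crtime")]
        # MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime
        fields = ["0", row[COLUMNS["name"]], row[COLUMNS["inode"]],
                  "0", "0", "0", row[COLUMNS["size"]]] + times
        yield "|".join(fields)
```

Fields the TSV does not carry (MD5, mode, UID, GID) are written as 0 so that downstream bodyfile parsers still see eleven fields.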

There were four “things” that I was checking each tool for:  File Size, Deleted Files, Deleted File Paths and Speed. These criteria may not be important to everyone, but I’ll explain why they are important to me.
  1. File Size
    When looking for IOCs, file size can be used to distinguish a legitimate file from malware that has the same name. It could also be used in lieu of file hashes: instead of hashing every file on the computer, which can be time consuming, the size of a known (hashed) file can be compared against the file sizes in the MFT to flag suspect files (thanks to @rdormi for that idea).
  2. Deleted Files
    MFT records can contain deleted file information. Does the output show deleted files? In some cases the attacker’s tools and malware have been removed from the system, so being able to see deleted files is nice.
  3. Deleted File Paths
    Is the tool able to resolve and display any portion of the previous file path for the deleted file? Knowing the parent path helps give context to the file. For example, it may be located under a user account, or a suspicious location, like a temp folder.
  4. Speed
    If I am processing thousands of machines, I need a tool that will parse the MFT relatively quickly. 10 minutes per machine or 1 hour per machine can make a big difference.
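The file size idea from the first criterion can be sketched in a few lines of Python. This is only an illustration, assuming the parsed MFT is already in Bodyfile format (eleven pipe-separated fields, size in the seventh); the function names and the IOC values in the usage note are made up for the example.

```python
def parse_bodyfile(lines):
    """Yield (path, size) from bodyfile lines, skipping malformed ones."""
    for line in lines:
        parts = line.rstrip("\n").split("|")
        if len(parts) != 11 or not parts[6].isdigit():
            continue
        yield parts[1], int(parts[6])


def base_name(path):
    """Lower-cased file name without tool notes such as ' (deleted)'."""
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    # Crude: a real file name containing ' (' would be truncated here.
    return name.split(" (", 1)[0].lower()


def flag_by_size(lines, ioc_sizes):
    """Return (path, size, name_matches) for entries whose size equals an IOC size."""
    by_size = {}
    for name, size in ioc_sizes.items():
        by_size.setdefault(size, set()).add(name.lower())
    hits = []
    for path, size in parse_bodyfile(lines):
        if size in by_size:
            hits.append((path, size, base_name(path) in by_size[size]))
    return hits
```

Calling flag_by_size(open("bodyfile.txt"), {"svchost.exe": 4995}) would list every entry that is 4995 bytes long and note whether the name matches as well. A size match is only a lead to follow up on, and the check is useless if the parser reports sizes incorrectly.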


The tools I tested were AnalyzeMFT, log2timeline, list-mft and MFTDump. Below is a summary of the findings. Further below, I explain the results in more detail, along with some sample data.

AnalyzeMFT
  1. Many files, both deleted and existing, show an incorrect file size of 0
  2. Deleted files were not designated as deleted in the output
  3. Deleted files were prepended with incorrect file paths
  4. Time to parse MFT: 11 minutes

list-mft
  1. File sizes were shown in the output
  2. Deleted files were designated as deleted
  3. No file paths were shown for deleted files
  4. Time to parse MFT: 1 hour, 49 minutes

log2timeline
  1. No file sizes were shown in the output
  2. Deleted files were designated as deleted
  3. Deleted files were shown with correct file paths
  4. Time to parse MFT: 39 minutes

MFTDump
  1. File sizes were shown in the output
  2. Deleted files were designated as deleted
  3. Deleted file paths were enclosed with ‘?’ to alert the examiner that file paths may be inaccurate
  4. Time to parse MFT: 7 minutes
Please note, I did not cross reference and verify every single file in the output. The observations made above were for the files that I reviewed.

What does this mean, or why are these results important?

No file size reported
The file size can help give context to a file. Having the file size can help determine if a file is suspect or not. If no file size is provided, this context is lost.

'0’ File size reported
The incorrect file size of ‘0’ can be misleading to an investigator. Take into consideration a RAM scraper output file. If an examiner is checking various systems and they see a file size of ‘0’, they might think the file is empty, when in fact, it could have thousands of credit card numbers written to it.

Files are not being reported/noted as deleted
Since there is no designation that the file is deleted, malware might appear to exist on a system, when in fact, it has been deleted. A suspect may have deleted a file and it is still showing as active in the output.

Deleted files are being associated with the wrong parent path
As noted above, due to issues with looking up the parent folder for deleted files, incorrect file paths were found to be prepended to deleted files. Even though a portion of the path may be correct, the prepended path could cause the examiner to draw an incorrect conclusion.

For example, many times a malware file will have a legitimate Windows system name, such as svchost.exe. What flags the file as suspicious is where it was/is located. If the parent path is reported incorrectly, a malicious file may be missed. Or, a file may be attributed to an incorrect user account because the path is listed incorrectly.
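To make the location check concrete, here is a small Python sketch that flags a well-known system file name found outside the folders it normally lives in. The EXPECTED_DIRS table and the function names are illustrative assumptions, not part of any of the tools tested, and the table would need to be extended for real use.

```python
# Illustrative table: file name -> folders where it is normally found.
EXPECTED_DIRS = {
    "svchost.exe": ("/windows/system32", "/windows/syswow64"),
}


def normalize(path):
    """Lower-case the path, use forward slashes and drop a drive prefix."""
    path = path.replace("\\", "/").lower()
    if len(path) > 1 and path[1] == ":":
        path = path[2:]
    return path


def in_unexpected_location(path):
    """True if a known system file name sits outside its expected folders."""
    folder, _, name = normalize(path).rpartition("/")
    name = name.split(" (", 1)[0]  # drop notes such as ' (deleted)'
    expected = EXPECTED_DIRS.get(name)
    if expected is None:
        return False  # not a name this sketch knows about
    return folder not in expected
```

A check like this is only as good as the parent path the parser reports. If a deleted file has been given the wrong parent folder, the result is wrong in either direction, which is exactly the risk described above. Side-by-side copies under folders such as WinSxS would also be flagged and need a manual look.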


Based on my testing and criteria, MFTDump seems to be the best fit for my process. It contains the file sizes, and designates between an active file and a deleted file. In the event that it recovers a file path for a deleted file, it lets the examiner know that it might be inaccurate by making a notation in the output.  If any important files are found using any of these tools, it would be prudent for the examiner to verify with a full disk image.

Sample Test Data

Below, I show some examples from the output for each tool. Although I did some testing and verification, it is up to each examiner to test their tools – I accept no liability or responsibility for using these tools and relying on my results. For demonstrative purposes only. :)

I used FLS from the Sleuthkit and X-Ways to check a deleted file. I then compared how this deleted file was handled with the different tools. I also used Harlan Carvey’s tools (bodyfile.exe and parse.exe) to convert the bodyfile generated by the tool into TLN format for readability.

The deleted file I reviewed was “048002.jpg”.  The path was shown as C:/$OrphanFiles/Pornography/048002.jpg (deleted) in both FLS and X-Ways.

Each of the outputs was grepped for the file 048002.jpg, and the entries located are displayed below in TLN format. I omitted the "Type" (File), "Host" (Computer1) and "User" (blank) columns in order to better display the results.

I have also included how long each process took. The system I used was Windows 7 with an Intel i7 and 16GB of RAM. The size of the MFT was about 1.8GB (which is much larger than most systems I process).

FLS Output
fls -m C: -f ntfs -r \\.\[Mounted Drive] >> C:\path\to\bodyfile

Date Description
2076-11-29 08:54:34 MA.B [4995] C:/$OrphanFiles/Pornography/048002.jpg (deleted)
2014-01-11 01:25:45 ..C. [4995] C:/$OrphanFiles/Pornography/048002.jpg (deleted)
2013-10-28 20:38:37 MACB [124] C:/$OrphanFiles/Pornography/048002.jpg ($FILE_NAME) (deleted)

FLS was used as the baseline for the test, and the output was verified with X-Ways. It shows the file as a deleted Orphan file, with a partially recovered directory listing of "/Pornography/048002.jpg". According to The Sleuth Kit documentation on orphan files:
"Orphan files are deleted files that still have file metadata in the file system, but that cannot be accessed from the root directory."
fls took about 20 minutes to run across the mounted image.

AnalyzeMFT Output
analyzeMFT.py -f "C:\path\to\$MFT" -b "C:\path\to\output\bodyfile.txt" --bodyfull -p

Date Description
2013-10-28 20:38:37 MACB [0] /Users/SpeedRacer/AppData/Roaming/Scooter Software/Beyond Compare 3/BCState.xml/Pornography/048002.jpg

AnalyzeMFT showed 0 for the file size. It had no designation in the output that flags whether the file is deleted or active. Although it was able to recover the deleted file path "/Pornography/", it prepended the file path with a folder that currently exists on the system rather than identifying it as an Orphan file.

This makes it appear to the examiner that this is an active file, under the location "Users/SpeedRacer/AppData/Roaming/Scooter Software/Beyond Compare 3/BCState.xml/Pornography",  when in fact, it is a deleted Orphan file.

During my review of the outputs, I noticed quite a few files were showing an incorrect file size of '0', including active files.  A review of the open issues on GitHub shows that these issues appear to have been noted.

I also ran AnalyzeMFT with the default output, a csv file. In this output, the file did have a flag designating it as deleted; however, the bodyfile format does not.

log2timeline Output
log2timeline -z local -f mft -o tln -w /path/to/bodyfile.txt 

Date Description
2014-01-11 01:25:45 FILE,-,-,[$SI ..C.] /Pornography/048002.jpg (deleted)|UTC|  inode:781789

The “old” version of log2timeline has an -f mft option that parses an MFT file into bodyfile format. The “new” version of log2timeline with Plaso does not have the option to parse the MFT separately (at least I couldn't find it). log2timeline was run from a SIFT Virtual Machine. I gave the VM about 11GB of RAM and 6 CPUs. With this setup, it took about 39 minutes to parse the MFT.

No file size was provided in the log2timeline output for any files. The file is flagged as deleted, and the output includes the correct partially recovered path "/Pornography/". Out of all the MFT tools I tested, this one most accurately depicts the deleted file path. However, it's interesting to note that it did not include the FileName attribute.

list-mft Output
list-mft "C:\path\to\$MFT" >> "C:\path\to\output\bodyfile.txt"

Date Description
2014-01-11 01:25:45 ,..C. [4995] \\$ORPHAN\048002.jpg (inactive)
2013-10-28 20:38:37 ,MACB [4995] \\$ORPHAN\048002.jpg (filename, inactive)

list-mft provided the file size, and a designation that the file was deleted (inactive). It also identified the file as an Orphan, however, it did not recover the partial path of /Pornography/. This may be important as the partial path can help provide context for the deleted file.

This program took the longest to run at 1 hour and 49 minutes. There is a -c, cache option that can be configured. This can be increased for better performance, however, I just used the default settings.

MFTDump Output
mftdump.exe "C:\path\to\$MFT" /o "C:\path\to\output\mftdump-output.txt"

Date Description
2076-11-29 08:54:34 MA.B [4995] ?\Users\SpeedRacer\AppData\Roaming\Scooter Software\Beyond Compare 3\BCState.xml\Pornography\048002.jpg?(DELETED)
2014-01-11 01:25:45 ..C. [4995] ?\Users\SpeedRacer\AppData\Roaming\Scooter Software\Beyond Compare 3\BCState.xml\Pornography\048002.jpg?(DELETED)
2013-10-28 20:38:37 MACB [4995] ?\Users\SpeedRacer\AppData\Roaming\Scooter Software\Beyond Compare 3\BCState.xml\Pornography\048002.jpg? (DELETED)(FILENAME)

The file sizes are displayed, and a designation is included showing that the file has been deleted. Deleted file paths were enclosed with ‘?’ to alert the examiner that file paths may be incorrect. This tool ran the fastest, clocking 7 minutes for a 1.8 GB MFT file. The output from this tool is a TSV file. I wrote a Python script to parse it into bodyfile format.

To keep this post relatively short, I just demonstrated the output for one file; however, I used the same process on several files and the results were consistent. Whatever tool an examiner chooses to use will depend on their particular needs. For example, an examiner may not be interested in file sizes, and in this case they may choose to use log2timeline.  However, if speed is an issue, MFTDump might make more sense. As long as the examiner knows what information the output is portraying, and can verify the results independently, any of these tools can get the job done.

Carrier, B. (2005). File System Forensic Analysis. Upper Saddle River, NJ: Pearson Education.

Monday, June 22, 2015

SQLite Deleted Data Parser Update - Leave no "Leaf" unturned

One of the things I love about open source is that people have the ability to update and share code.  Adrian Long, aka @Cheeky4n6Monkey, did just that. Based upon some research, he located additional deleted data that can be harvested from re-purposed SQLite pages - specifically the Leaf Table B-Tree page type. He updated my code on GitHub and BAM!! just like that, the SQLite Deleted Data parser now recovers this information.

He has detailed all the specifics and technical goodies in a blog post, so I won't go into detail here. It involved a lot of work and I would like to extend a huge thank you to Adrian for taking the time to update the code and for sharing his research.

You can download the most recent version on my GitHub page. I've also updated the command line and GUI versions to support the changes.

Tuesday, June 9, 2015

Does it make sense?

Through all my high school and college math classes, my teachers always taught me to step back after a problem was completed and ask if the answer made sense.  What did this mean?  It meant don't just punch numbers into the calculator, write the answer, and move on. It meant step back, review the problem, consider all the known information and ask, "Does the answer I came up with make sense?"

Take for instance the Pythagorean Theorem. Just by looking at the picture, I can see that C should be longer than A or B. If my answer for C was smaller than A or B, I would know to recheck my work.

Although the above example is relatively simple, this little trick applied to more complicated math problems, and many times it helped me catch incorrect answers.

But what does this have to do with DFIR?  I believe the same principle can be applied to investigations. When looking at data, sometimes stepping back and asking myself the question, "Does what I am looking at make sense?" has helped me locate issues I may not have otherwise caught.

I have had at least a couple of DFIR situations where using this method paid off.

You've got Mail...
I was working a case where a user had Mozilla Thunderbird for an email client. I parsed the email with some typical forensic tools and began reviewing emails.

While reviewing the output, I noticed it seemed pretty sparse, even though the size of the profile folder was several gigs. This is where I stepped back and asked myself, does this make sense? It was a large profile, yet contained very few emails.  This led to my research and eventual blog post on the Thunderbird MBOXRD file format. Many of the programs were just parsing out the MBOX format and not the MBOXRD format, and thus, missing lots of emails. Had I just accepted the final output of the program, I would have missed a lot of data.

All your files belong to us...
Many times I will triage a system either while waiting for an image to complete, or as an alternative to taking an image. This is especially useful when dealing with remote systems that need to be looked at quickly. Exporting the MFT and other files such as the Event Logs and Registry files results in a much smaller data set than a complete image. These artifacts can then be used to create a mini-timeline so analysis can begin immediately (see Harlan's post here for more details on creating mini-timelines).
To parse the MFT file into timeline format, I use a tool called Analyze MFT to produce a bodyfile. Once the MFT is in bodyfile format, I use Harlan Carvey's tools to convert it into TLN format and add it into the timeline.

While working a case, I created timelines using the above method for several dozen computers. After the timelines were created, I grepped out various malware references from these timelines. While reviewing the results, I noticed many of the malware files had a file size of zero. Weird. I took a closer look and noticed ALL the malware files had a file size of zero. Hmmm.. what did that mean? What are the chances that ALL of those files would have a zero file size??? Since the full disk images were not available, validating this information with the actual files was not an option. But I stepped back and asked myself, given what I knew about my case and how the malware behaved.. does that make sense?

So I decided to "check my work" and do some testing with Analyze MFT. I created a virtual machine  with Windows XP and exported out the MFT.  I parsed the MFT with Analyze MFT and began looking at the results for files with a zero file size.

I noticed right away that all the prefetch files had a file size of zero, which is odd.  I was able to verify that the prefetch file sizes were in fact NOT zero by using other tools to parse the MFT, as well as looking at the prefetch files themselves in the VM. My testing confirmed that Analyze MFT was incorrectly reporting a file size of zero for some files.
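The same "does it make sense" check can be run across a whole bodyfile. The Python sketch below counts zero-size entries per file extension and reports extensions where every entry is zero, which is the pattern that gave the prefetch files away. The function names and the minimum threshold are my own choices for illustration, and the input is assumed to be in Bodyfile format with the size in the seventh field.

```python
from collections import defaultdict


def zero_size_by_extension(lines):
    """Map extension -> (zero_size_count, total_count) for bodyfile lines."""
    counts = defaultdict(lambda: [0, 0])
    for line in lines:
        parts = line.rstrip("\n").split("|")
        if len(parts) != 11 or not parts[6].isdigit():
            continue  # skip anything that is not a well-formed entry
        name = parts[1].replace("\\", "/").rsplit("/", 1)[-1].lower()
        ext = name.rsplit(".", 1)[-1] if "." in name else ""
        counts[ext][1] += 1
        if int(parts[6]) == 0:
            counts[ext][0] += 1
    return {ext: tuple(pair) for ext, pair in counts.items()}


def all_zero_extensions(lines, minimum=2):
    """Extensions with at least `minimum` entries, all reported as size 0."""
    stats = zero_size_by_extension(lines)
    return sorted(ext for ext, (zero, total) in stats.items()
                  if total >= minimum and zero == total)
```

An extension such as pf showing up in the result does not prove the parser is wrong, but it is a good prompt to verify a few of those files with a second tool.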

After the testing I reached out to David Kovar, the author of Analyze MFT, to discuss the issue. I also submitted a bug report on the GitHub page.

If I had not "checked my work" and had assumed that the file size of zero meant the files were empty, it could have led to an incorrect "answer".

So thanks to those teachers that ground the "does it make sense" check into my head, as it has proved to be a valuable tip that has helped me numerous times  (more so than the Pythagorean Theorem...)

Tuesday, April 7, 2015

Dealing with compressed vmdk files

Whenever I get vmdk files, I take a deep breath and wonder what issues might pop up with them. I recently received some vmdk files and when I viewed one of these in FTK Imager (and some other mainstream forensic tools), it showed up as the dreaded "unrecognized file system".

To verify that I had not received some corrupted files, I used VMware's disk utility to check the partitions in the vmdk file. This tool showed two volumes, so it appeared the vmdk file was not corrupted:

When I tried to mount the vmdk file using vmware-mount, the drive mounted, but was not accessible. A review of the documentation, specifically the limitations section, pointed out that the utility could not mount compressed vmdk files:

"You cannot mount a virtual disk if any of its files are encrypted, compressed, or have read-only permissions. Change these attributes before mounting the virtual disk."

Bummer. It appeared I had some compressed vmdk files.

So after some Googling and research, I found a couple different ways to deal with these compressed vmdk files - at least until they are supported by the mainstream forensic tools. The first way involves converting the vmdk file, and the second way is by mounting it in Linux.

Which method I choose ultimately depends on my end goals. If I want to bring the file into one of the mainstream forensics tools, converting it into another format may work best. If I want to save disk space and time and do some batch processing, mounting it in Linux may be ideal.

One of the first things I do when I get an image is create a mini-timeline using fls and some of Harlan's tools. Mounting the image in Linux enables me to run these tools without the additional overhead of converting the file first.

Method 1: "Convert"

The first method is to "convert" the vmdk file.
I'm using "quotes" because my preferred method is to "convert" it right back to the vmdk file format, which in essence, decompresses it.

The vmdk file format is usually much smaller than the raw/dd image and appears to take less time to "convert".

I used the free VBoxManage.exe that comes with VirtualBox. This is a command line tool located under C:\Program Files\Oracle\VirtualBox. This tool gives you the option to convert the compressed vmdk (or a regular vmdk) into several formats: VHD, VDI, VMDK and Raw. The syntax is:

VBoxManage.exe clonehd "C:\path\to\compressed.vmdk" "C:\path\to\decompressed.vmdk" --format VMDK

It gives you a nice status indicator during the process:

Now the file is in a format that can be worked with like any other normal vmdk file.

Method 2: Mount in Linux

This is the method that I prefer when dealing with LOTS of vmdk files. This method uses Virtual Box Fuse, and does not require you to decompress/convert the vmdk first.

I had a case involving over 100 of these compressed files. Imagine the overhead time involved with converting 100+ vmdk files before you can even begin to work with them. This way, I was able to write a script to mount each one in Linux, run fls to create a bodyfile, throw some of Harlan's parsers into the mix, and create 100+ mini-timelines pretty quickly.

There is some initial setup involved but once that's done, it's relatively quick and easy to access the compressed vmdk file.

I'll run through how to install Virtual Box Fuse, how to get the compressed vmdk file mounted, and then how to run fls on it.

1) Install VirtualBox:

sudo apt-get install virtualbox-qt

2) Install Virtual Box Fuse. It is no longer in the app repository, so you will need to download and install the .deb file - don't worry, it's pretty easy, no compiling required :)

Download the .deb from Launchpad under "Published Versions". Once you have it downloaded, install it by typing:

sudo dpkg -i --force-depends virtualbox-fuse_4.1.18-dfsg-1ubuntu1_amd64.deb

Note - this version is not compatible with VirtualBox v. 4.2. At the time of this writing, when I installed VirtualBox on my Ubuntu distro, it was version 4.1 and worked just fine. If you have a newer version of VirtualBox, it will still work - you just unpack the .deb file and run the binary without installing it. See the bottom of the thread here for more details.

3) Mount the compressed VMDK file read-only:

vdfuse -r -t VMDK -f /mnt/evidence/compressed.vmdk /mnt/vmdk

This will create a device called "EntireDisk" and Partition1, Partition2 etc. under /mnt/vmdk

(even though I got this fuse error - everything seems to work just fine)

At this point, you can use fls to generate a bodyfile. fls is included in The Sleuth Kit, and is installed on SIFT by default. You may need to specify the offset for your partition.  Run mmls to grab this:

Now that we have the offsets, we can run fls to generate a bodyfile:

fls -r -o 2048 /mnt/vmdk/EntireDisk -m C: >> /home/sansforensics/win7-bodyfile
fls -r -o 206848 /mnt/vmdk/EntireDisk -m C: >> /home/sansforensics/win7-bodyfile
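For batch work, the vdfuse and fls invocations shown in this post can be wrapped in a small script. The Python sketch below is not the script I used on the case. It assumes the partition offsets are already known for each image (for example from mmls), and the function names, the output naming scheme and the use of fusermount -u to unmount afterwards are assumptions made for the example.

```python
import subprocess
from pathlib import PurePosixPath


def build_commands(vmdk_path, mount_dir, offsets, out_dir):
    """Return (list of commands, bodyfile path) for one compressed vmdk."""
    vmdk = PurePosixPath(vmdk_path)
    bodyfile = PurePosixPath(out_dir) / (vmdk.stem + "-bodyfile")
    commands = [["vdfuse", "-r", "-t", "VMDK", "-f", str(vmdk), mount_dir]]
    for offset in offsets:
        commands.append(["fls", "-r", "-o", str(offset),
                         f"{mount_dir}/EntireDisk", "-m", "C:"])
    # Assumption: unmount the FUSE mount once fls has finished.
    commands.append(["fusermount", "-u", mount_dir])
    return commands, str(bodyfile)


def run_plan(commands, bodyfile):
    """Run the commands in order, appending fls output to the bodyfile."""
    with open(bodyfile, "ab") as out:
        for command in commands:
            target = out if command[0] == "fls" else None
            subprocess.run(command, stdout=target, check=True)
```

Building the command lists separately from running them makes it easy to print and review the plan for 100+ images before anything touches the evidence.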

Next, if you want access to the files/folders etc, you will need to mount the EntireDisk image as an ntfs mount for each partition. This is assuming you have a Windows system - if not, adjust the type accordingly:

Mount Partition 1, Offset 2048:

Mount Partition 2, Offset 206848:
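A loop mount takes its offset in bytes, while mmls reports it in sectors, so the sector offset has to be multiplied by the sector size first. The Python sketch below shows that arithmetic and builds a mount command from it. The 512-byte sector size, the mount options and the mount point names are assumptions for illustration, so check the sector size that mmls reports for your image.

```python
SECTOR_SIZE = 512  # assumption: use the sector size reported by mmls


def mount_command(image, sector_offset, mount_point, sector_size=SECTOR_SIZE):
    """Build a read-only NTFS loop-mount command for one partition."""
    byte_offset = sector_offset * sector_size
    options = f"ro,loop,offset={byte_offset}"
    return ["mount", "-t", "ntfs", "-o", options, image, mount_point]


if __name__ == "__main__":
    # Hypothetical mount points; the offsets are the two used in this post.
    for offset, point in ((2048, "/mnt/part1"), (206848, "/mnt/part2")):
        print(" ".join(mount_command("/mnt/vmdk/EntireDisk", offset, point)))
```

With 512-byte sectors, offset 2048 works out to 1048576 bytes and offset 206848 to 105906176 bytes.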

There are multiple ways to deal with this compressed format, such as using the VMware or VirtualBox GUI to import/export the compressed file... these are just a couple of examples of how to do it. I tend to prefer command line options so I can script out batch files if necessary.

Wednesday, March 4, 2015

USN Journal: Where have you been all my life

One of the goals of IR engagements is to locate the initial infection vector and/or patient zero. In order to determine this, timeline analysis becomes critical, as does determining when the  malware was created and/or executed on a system.

This file create time may become extremely critical if you're dealing with multiple or even hundreds of systems and trying to determine when and where the malware first made its way into the environment.

But what happens when the malware has already been remediated  by a Systems Administrator, deleted by an attacker, or new AV signatures are being pushed out, resulting in the malware being removed?

Many of the residual artifacts demonstrate execution,  however, it seems very few actually document when the file was created on the system. This is where the USN Journal recently helped me on a case. The USN Journal is by no means new.. but I just wanted to talk about a case study and share my experience with it, as I feel it's an often overlooked artifact.

For purposes of demonstrative data, I downloaded and infected a Windows 7 VM with malware.  This malware was from a phishing email that contained a zip file. This zip file contained a payload, voice.exe. For more details on this malware sample, check out
So let's run through some typical artifacts that demonstrate execution along with the available timestamps and see what they do and don't tell us...

The MFT contains the filesystem information - Modified, Accessed and Created dates, file size etc. However, a deleted file's MFT record may be overwritten. If you're lucky, your deleted malware file will still have an entry in the MFT - however, in my case this was not to be.

The ShimCache
I won't go into too much detail here as Mandiant has a great white paper on this artifact. Basically, on most systems this artifact contains information on files that have been executed including path, file size and last modified date. I parsed this registry key with RegRipper, and located an entry for the test malware, voice.exe:

ModTime: Wed Jan 28 15:28:46 2015 Z

So what does this tell me? That voice.exe was in the Downloads path, was executed, and has a last modified date of 01/28/2015 - <sigh> no create date </sigh>.
The User Assist is another awesome key... it displays the last time a file was executed, along with a run count. Once again, using RegRipper to parse this I located an entry for the test malware:

Mon Feb 23 02:33:34 2015 Z C:\Users\user1\Downloads\voice#5734223\voice.exe (2)*

By looking at this artifact, I can see that the file was executed twice - once on February 23rd, however, I don't know when the first time was. It could have been minutes, hours or days earlier. It still does very little to let me know when the file was created on the system, although I do know it should be sometime before this time stamp.

Prefetch File
This is a great artifact that can show execution and even multiple times of execution. But what if the system is a Server, where prefetching may not be enabled?  In my case, prefetching was enabled, but there was no prefetch file for the malware in question - at least that is what I thought until I checked the USN Journal. And, once again, it does not contain information related to when the file was created on the system.

Ok, so I've reviewed a couple of typical artifacts that demonstrated that the malware executed (for more artifacts related to execution, check out this blog post by Mandiant, "Did it Execute"). With timeline analysis, I may even get an idea of when this file was most likely introduced on the system - however, a definitive create date would be nice to have. This brings me to.....the USN Journal.

USN Journal
There are a couple of tools I use to parse the USN Journal. A quick, easy to use script is available from Harlan Carvey's GitHub page.

Parsing the USN Journal and looking for the malware in question, I see some interesting entries for voice.exe, highlighted in red below:

Sweeet! I now have a File_Create timestamp for voice.exe I can use in my timeline. I can also see that voice.exe was deleted relatively quickly, about 30 seconds after it was created. This deletion occurred at about the same time the prefetch file for it was created. This might be an indication that the malware deleted itself upon execution.
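Once the journal has been parsed, pairing create and delete events is easy to script. The Python sketch below assumes the parser output has already been loaded into a list of dictionaries with "time", "name" and "reasons" keys. That record layout is an assumption for the example, not the output format of any particular tool, and the timestamps in the usage example are made up.

```python
from datetime import datetime


def lifetimes(records):
    """Map file name -> seconds between its first FILE_CREATE and FILE_DELETE."""
    created, result = {}, {}
    for rec in sorted(records, key=lambda r: r["time"]):
        name = rec["name"].lower()
        if "FILE_CREATE" in rec["reasons"] and name not in created:
            created[name] = rec["time"]
        elif ("FILE_DELETE" in rec["reasons"]
              and name in created and name not in result):
            result[name] = (rec["time"] - created[name]).total_seconds()
    return result


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    sample = [
        {"time": datetime(2015, 1, 1, 12, 0, 0), "name": "voice.exe",
         "reasons": "FILE_CREATE"},
        {"time": datetime(2015, 1, 1, 12, 0, 30), "name": "voice.exe",
         "reasons": "FILE_DELETE|CLOSE"},
    ]
    print(lifetimes(sample))
```

Keying on the file name alone can mix up unrelated files that share a name in different folders. Each USN record also carries a file reference number, which would be a safer key where the parser exposes it.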

It's also interesting to note that around the same time the prefetch file was created for voice.exe, a file called testmem.exe was created and executed (highlighted in yellow)..hmmmm.

Time to dig deeper. For a little more detail on the USN Journal, there is the TriForce tool. This tool processes three files: $MFT, $J and $Logfile. It then cross references these three files to build out some additional relationships. In my experience, this tool takes a bit longer to run. As you can see by the output below, I now have full file paths that may help add a little more context:

That testmem.exe just became all the more suspicious due to its location - a temp folder.

By reviewing the USN Journal file, I was able to establish a create date of the malware. This create date located in the USN Journal gave me an additional pivot point to work with. This pivot point led to some additional findings -  a suspicious file, testmem.exe. (For some more on timeline pivoting, check out Harlan's post here). Not only did the create date help locate additional artifacts, but it can also help me home in on which systems may be patient zero. Malware may arrive on a system long before it's executed.
Just because it's not there - doesn't mean it didn't exist. For the case I was working, I did not have any existing prefetch files for the malware. However, when I parsed the USN Journal, I saw the prefetch file for the malware get created and deleted within the span of 40 minutes. I also saw some additional temporary files get created and deleted, some of which were not in the MFT.

Alas, as sad as it is, my relationship with the USN Journal does have some shortcomings (and it's not my fault). Since it is a log file, it does "roll over". The USN Journal can be limited in the amount of data that it holds - sometimes it seems all I see in it are Windows Update files. If a system is heavily used, and if the file was deleted months ago, it may no longer be in the USN Journal. However, all is not lost. Rumor has it that there may be some USN Journal files located in the Volume Shadow Copies, so there may still be hope. Also, David Cowen points out the log file is not circular (as I once thought), and just frees the old pages to disk:

"After talking to Troy Larson though I now understand that this behavior is due to the fact that the journal is not circular but rather pages are allocated and deallocated as the journal grows"

This means that you can carve USN Journal records! Check out his blog post here for more information.

Happy Hunting!

Additional reading on the USN Journal

*The run count number was modified from 1 to 2 on this output to illustrate a point.

Tuesday, October 7, 2014

Timestomp MFT Shenanigans

I was working a case a while back and I came across some malware that had time stomping capabilities. There have been numerous posts written on how to use the MFT as a means to determine if time stomping has occurred, so I won't go into too much detail here.

Time Stomping

Time Stomping is an Anti-Forensics technique. Many times, knowing when malware arrived on a system is a question that needs to be answered. If the timestamps of the malware have been changed, i.e., time stomped, this can make it difficult to identify a suspicious file as well as answer the question, "When".

Basically there are two "sets" of timestamps that are tracked in the MFT. These two "sets" are the $STANDARD_INFORMATION and $FILE_NAME. Both of these track 4 timestamps each - Modified, Accessed, Changed and Born. Or if you prefer - Created, Last Written, Accessed and Entry Modified (To-mato, Ta-mato). The $STANDARD_INFORMATION timestamps are the ones normally viewed in Windows Explorer as well as most forensic tools.

Most time stomping tools only change the $STANDARD_INFORMATION set. This means that by using tools that display both the $STANDARD_INFORMATION and $FILE_NAME attributes, you can compare the two sets to determine if a file may have been time stomped. If the $STANDARD_INFORMATION timestamps predate the $FILE_NAME timestamps, it is a possible red flag (example to follow).

In my particular case, by reviewing the suspicious file's $STANDARD_INFORMATION and $FILE_NAME attributes, it was relatively easy to see that there was a mismatch, and thus, combined with other indicators, that time stomping had occurred. Below is an example of what a typical time stomped malware file looked like. As you can see, the $STANDARD_INFORMATION Modified and Born dates (first set) predate the $FILE_NAME dates (second set). Test data was used for demonstrative purposes.

System A \test\malware.exe

M: Fri Jan  1 07:08:15 2010 Z
A: Tue Oct  7 06:19:23 2014 Z
C: Tue Oct  7 06:19:23 2014 Z
B: Fri Jan  1 07:08:15 2010 Z

M: Thu Oct  2 05:41:56 2014 Z
A: Thu Oct  2 05:41:56 2014 Z
C: Thu Oct  2 05:41:56 2014 Z
B: Thu Oct  2 05:41:56 2014 Z
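The comparison on the System A listing above can be sketched in a few lines of Python. This is only a minimal illustration, assuming timestamps in the text format shown in the listing. The names parse_ts and si_predates_fn are made up for this example and are not part of any tool mentioned in this post.

```python
from datetime import datetime

def parse_ts(text):
    """Parse a timestamp such as 'Fri Jan  1 07:08:15 2010 Z' (treated as UTC)."""
    # split() collapses the double space that precedes single-digit days
    _dow, mon, day, clock, year, _zone = text.split()
    return datetime.strptime(f"{mon} {day} {clock} {year}", "%b %d %H:%M:%S %Y")

def si_predates_fn(si, fn, keys=("M", "A", "C", "B")):
    """Return the timestamp letters where $SI is earlier than $FN."""
    return [k for k in keys if parse_ts(si[k]) < parse_ts(fn[k])]

# Values copied from the System A listing
SYSTEM_A_SI = {
    "M": "Fri Jan  1 07:08:15 2010 Z",
    "A": "Tue Oct  7 06:19:23 2014 Z",
    "C": "Tue Oct  7 06:19:23 2014 Z",
    "B": "Fri Jan  1 07:08:15 2010 Z",
}
SYSTEM_A_FN = dict.fromkeys("MACB", "Thu Oct  2 05:41:56 2014 Z")

print(si_predates_fn(SYSTEM_A_SI, SYSTEM_A_FN))  # ['M', 'B']
```

A non-empty result is only a possible red flag, not proof, so it still needs to be weighed against other indicators.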

However, on a couple of systems there were a few outliers where the Modified and Born dates of the time stomped malware matched in both the $STANDARD_INFORMATION (first set) and $FILE_NAME (second set):

System B \test\malware.exe

M: Fri Jan  1 07:08:15 2010 Z
A: Tue Oct  7 06:19:23 2014 Z
C: Tue Oct  7 06:19:23 2014 Z
B: Fri Jan  1 07:08:15 2010 Z

M: Fri Jan  1 07:08:15 2010 Z
A: Thu Oct  2 05:41:56 2014 Z
C: Thu Oct  2 05:41:56 2014 Z
B: Fri Jan  1 07:08:15 2010 Z

Due to other indicators, it was pretty clear that these files were time stomped. However, I was curious to know what may have caused these dates to match, while all the others did not. In effect, it appeared that the Modified and Born dates were time stomped in both the $SI and $FN timestamps, but this was not the MO in all the other instances.

Luckily, I remembered a blog post written by Harlan Carvey where he ran various file system operations and reported the MFT and USN change journal output for these tests. I remembered that during one of his tests, some dates had been copied from the $STANDARD_INFORMATION into the $FILE_NAME attributes. A quick review of his blog post revealed that this had occurred during a rename operation. Below is a quote from Harlan's post:

"I honestly have no idea why the last accessed (A) and creation (B) dates from the $STANDARD_INFORMATION attribute would be copied into the corresponding time stamps of the $FILE_NAME attribute for a rename operation"

In my particular case it was not the accessed and creation (B) dates that appeared to have been copied, but the modified and creation (B) dates. Shoot... not the same results as Harlan's test. But his system was Windows 7, and the system I was examining was Windows XP. Because my system was different, I decided to follow the procedure Harlan used and do some testing on a Windows XP system to see what happened when I did a file rename.


My test system was Windows XP Pro SP3 in a VirtualBox VM. I used FTK Imager to load up the vmdk file after each test and export out the MFT record. I then parsed the MFT record with Harlan Carvey's mft.exe.

First, I created "New Text Document.txt" under My Documents. As expected, all the timestamps in both the $STANDARD_INFORMATION and $FILE_NAME were the same:

12591      FILE Seq: 1    Link: 2    0x38 4     Flags: 1  
.\Documents and Settings\Mari\My Documents\New Text Document.txt
    M: Thu Oct  2 23:22:05 2014 Z
    A: Thu Oct  2 23:22:05 2014 Z
    C: Thu Oct  2 23:22:05 2014 Z
    B: Thu Oct  2 23:22:05 2014 Z
  FN: NEWTEX~1.TXT  Parent Ref: 10469  Parent Seq: 1
    M: Thu Oct  2 23:22:05 2014 Z
    A: Thu Oct  2 23:22:05 2014 Z
    C: Thu Oct  2 23:22:05 2014 Z
    B: Thu Oct  2 23:22:05 2014 Z
  FN: New Text Document.txt  Parent Ref: 10469  Parent Seq: 1
    M: Thu Oct  2 23:22:05 2014 Z
    A: Thu Oct  2 23:22:05 2014 Z
    C: Thu Oct  2 23:22:05 2014 Z
    B: Thu Oct  2 23:22:05 2014 Z

Next, I used the program SetMACE to change the $STANDARD_INFORMATION timestamps to "2010:07:29:03:30:45:789:1234". As expected, the $STANDARD_INFORMATION changed, while the $FILE_NAME stayed the same. Once again, this is common to see in files that have been time stomped:

12591      FILE Seq: 1    Link: 2    0x38 4     Flags: 1  
.\Documents and Settings\Mari\My Documents\New Text Document.txt
    M: Wed Jul 29 03:30:45 2010 Z
    A: Wed Jul 29 03:30:45 2010 Z
    C: Wed Jul 29 03:30:45 2010 Z
    B: Wed Jul 29 03:30:45 2010 Z

  FN: NEWTEX~1.TXT  Parent Ref: 10469  Parent Seq: 1
    M: Thu Oct  2 23:22:05 2014 Z
    A: Thu Oct  2 23:22:05 2014 Z
    C: Thu Oct  2 23:22:05 2014 Z
    B: Thu Oct  2 23:22:05 2014 Z
  FN: New Text Document.txt  Parent Ref: 10469  Parent Seq: 1
    M: Thu Oct  2 23:22:05 2014 Z
    A: Thu Oct  2 23:22:05 2014 Z
    C: Thu Oct  2 23:22:05 2014 Z
    B: Thu Oct  2 23:22:05 2014 Z

Next, I used the rename command via the command prompt to rename the file from "New Text Document.txt" to "Renamed Text Document.txt" (I know - creative naming). The interesting thing here is, unlike the Windows 7 test where two dates were copied over, all four dates were copied over from the original file's $STANDARD_INFORMATION into the $FILE_NAME:

12591      FILE Seq: 1    Link: 2    0x38 6     Flags: 1  
.\Documents and Settings\Mari\My Documents\Renamed Text Document.txt
    M: Wed Jul 29 03:30:45 2010 Z
    A: Wed Jul 29 03:30:45 2010 Z
    C: Thu Oct  2 23:38:36 2014 Z
    B: Wed Jul 29 03:30:45 2010 Z
  FN: RENAME~1.TXT  Parent Ref: 10469  Parent Seq: 1
    M: Wed Jul 29 03:30:45 2010 Z
    A: Wed Jul 29 03:30:45 2010 Z
    C: Wed Jul 29 03:30:45 2010 Z
    B: Wed Jul 29 03:30:45 2010 Z

  FN: Renamed Text Document.txt  Parent Ref: 10469  Parent Seq: 1
    M: Wed Jul 29 03:30:45 2010 Z
    A: Wed Jul 29 03:30:45 2010 Z
    C: Wed Jul 29 03:30:45 2010 Z
    B: Wed Jul 29 03:30:45 2010 Z

Based upon my testing, a rename could have caused the 2010 dates to be the same in both the $SI and $FN attributes in my outliers. This scenario "in the wild" makes sense...the malware is dropped on the system, time stomped, then renamed to a file name that is less conspicuous on the system. This sequence of events on a Windows XP system may make it difficult to use the MFT analysis alone to identify time stomping.
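To make that sequence concrete, here is a toy model of the three test steps in Python. It only mimics the behavior I observed in the Windows XP test above (a rename copying all four $STANDARD_INFORMATION timestamps into $FILE_NAME). It is not how NTFS is implemented, and every function name here is made up for illustration.

```python
# Timestamps are ISO-style strings, so plain string comparison is chronological.

def create(ts):
    """New file: all $SI and $FN timestamps start out identical."""
    return dict.fromkeys("MACB", ts), dict.fromkeys("MACB", ts)

def timestomp(si, fn, fake_ts):
    """Typical time stomp: only the $SI set is overwritten."""
    return dict.fromkeys("MACB", fake_ts), dict(fn)

def rename_xp(si, fn, now):
    """Observed XP behavior: $FN receives all four pre-rename $SI values."""
    new_fn = dict(si)
    new_si = dict(si, C=now)  # the rename updates the $SI entry-modified time
    return new_si, new_fn

def looks_stomped(si, fn):
    """The simple check: does the $SI born date predate the $FN born date?"""
    return si["B"] < fn["B"]

si, fn = create("2014-10-02 23:22:05")
si, fn = timestomp(si, fn, "2010-07-29 03:30:45")
print(looks_stomped(si, fn))  # True

si, fn = rename_xp(si, fn, "2014-10-02 23:38:36")
print(looks_stomped(si, fn))  # False
```

After the rename, the simple $SI versus $FN comparison no longer flags the file, which matches the outliers I saw.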

So what if you run across a file where you suspect this may be the case? On Windows XP you could check the restore points' change.log files. These files track changes such as file creations and renames. Once again, Mr. HC has a Perl script that parses these change log files. If you see a file creation and a rename, you can use the restore point as a guideline to when the file was created and renamed on the system.

You could also parse the USN change journal to see if and when the suspected file had been created and renamed. Tools such as Triforce or Harlan's do a great job.
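If you are curious what one of these journal records holds, here is a minimal Python sketch that decodes a single USN_RECORD_V2 structure. The field layout and reason flag values follow Microsoft's documented USN_RECORD_V2 definition. The function name and the synthetic record are made up for illustration, and this is not a description of how Triforce or Harlan's tools work internally.

```python
import struct
from datetime import datetime, timedelta

# A few of the documented USN_REASON_* flags
REASONS = {
    0x00000100: "FILE_CREATE",
    0x00001000: "RENAME_OLD_NAME",
    0x00002000: "RENAME_NEW_NAME",
    0x80000000: "CLOSE",
}

# RecordLength, MajorVersion, MinorVersion, FileReferenceNumber,
# ParentFileReferenceNumber, Usn, TimeStamp, Reason, SourceInfo,
# SecurityId, FileAttributes, FileNameLength, FileNameOffset (60 bytes)
HEADER = struct.Struct("<IHHQQqqIIIIHH")
EPOCH_1601 = datetime(1601, 1, 1)

def parse_usn_v2(buf):
    """Decode one USN_RECORD_V2 from the start of buf."""
    (_length, major, _minor, file_ref, parent_ref, usn, filetime, reason,
     _source, _sec_id, _attrs, name_len, name_off) = HEADER.unpack_from(buf, 0)
    if major != 2:
        raise ValueError("not a USN_RECORD_V2")
    return {
        "name": buf[name_off:name_off + name_len].decode("utf-16-le"),
        "entry": file_ref & 0xFFFFFFFFFFFF,   # low 48 bits: MFT entry number
        "seq": file_ref >> 48,                # high 16 bits: sequence number
        "parent_entry": parent_ref & 0xFFFFFFFFFFFF,
        "usn": usn,
        # FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC
        "time": EPOCH_1601 + timedelta(microseconds=filetime // 10),
        "reasons": [label for bit, label in REASONS.items() if reason & bit],
    }

# Synthetic record for demonstration only (not real case data)
_name = "Renamed Text Document.txt".encode("utf-16-le")
_when = datetime(2014, 10, 2, 23, 38, 36)
_filetime = int((_when - EPOCH_1601).total_seconds()) * 10_000_000
SAMPLE = HEADER.pack(112, 2, 0, (1 << 48) | 12591, (1 << 48) | 10469, 0,
                     _filetime, 0x00002000 | 0x80000000, 0, 0, 0x20,
                     len(_name), HEADER.size) + _name + b"\x00\x00"

record = parse_usn_v2(SAMPLE)
print(record["name"], record["time"], record["reasons"])
```

Seeing a FILE_CREATE followed later by a RENAME_NEW_NAME for the same MFT entry would give you the "if and when" for the creation and the rename.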

If the change.log file and journal file do not go back far enough, checking the compile date of the suspicious file with a program like CFF Explorer may also help shed some light. If a program has a compile date years after the born date... red flag.
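The compile date check can also be scripted. Below is a minimal Python sketch that reads the TimeDateStamp field from the PE file header and compares it with a born date. It assumes a well-formed PE file, the function names are my own, and the sample bytes are a synthetic stub rather than real malware. Keep in mind that the compile timestamp is written by the linker and can itself be altered, so treat it as one more indicator.

```python
import struct
from datetime import datetime, timezone

def pe_compile_time(data):
    """Return the COFF TimeDateStamp of a PE file as a UTC datetime."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew at offset 0x3C points to the PE signature
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Signature (4) + Machine (2) + NumberOfSections (2), then TimeDateStamp
    (stamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)

def compiled_after_born(data, born):
    """True when the compile time is later than the file's born date."""
    return pe_compile_time(data) > born

# Synthetic stub: just enough header bytes to exercise the parser
COMPILED = datetime(2014, 9, 30, tzinfo=timezone.utc)
stub = bytearray(0x100)
stub[0:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x80)
stub[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<I", stub, 0x88, int(COMPILED.timestamp()))
SAMPLE_PE = bytes(stub)

born = datetime(2010, 1, 1, 7, 8, 15, tzinfo=timezone.utc)
print(pe_compile_time(SAMPLE_PE), compiled_after_born(SAMPLE_PE, born))
```

For a real file, you would pass in the bytes read from disk, for example open(path, "rb").read().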

I don't think anything I've found is groundbreaking or new. In fact, the Forensics Wiki entry on timestomp demonstrates this behavior with time stomping and moved files, but I thought I would share anyway.

Happy hunting, and may the odds be ever in your favor...

Tuesday, September 2, 2014

SQLite Deleted Data Parser - GUI Added

Last year I wrote a Python script to parse deleted data from SQLite Databases (original post here).
Every once in a while, I get emails asking for help on how to use the SQLite Parser from users who are not that familiar with using Python or command line tools in general.

As an everyday user of command line tools and Python, I forget the little things that may challenge these users (we were all there at one point in time!). This includes things like quotes around file paths, which direction slashes go, and how to execute a Python script if Python is not in your PATH environment variable.

So, to that end, I have created a Windows GUI for the SQLite Parser to make the process a little less painful.

The GUI is pretty self explanatory:
  • Choose the path to the SQLite database
  • Choose the file to save the results to
  • Select Formatted or Raw output

This means there are now three flavors of the SQLParser available:
  • - python script
  • sqlparse_CLI.exe - Windows command line tool
  • sqlparse_GUI.exe - Windows GUI tool
All three files are available for download here on my GitHub page.

Coming soon... a blog post/tutorial on how to use python scripts :-)