Those of us working in the Digital Forensics and Incident Response (DFIR) realm rely on tools to harvest data for analysis, not to mention to perform the actual analysis. Let’s be honest: Without tools, we would have a dickens of a time doing our jobs. Unfortunately, this has led examiners to place an inherently high level of trust in their tools. This article aims to highlight how dangerous that can be, given the propensity for software to contain bugs and/or procedural defects.
We will begin by discussing the concept of Dual Tool Verification (DTV). Next, we will cover the advantages and disadvantages of the methodology, followed by ways to implement it. Finally, we will walk through two real-world examples that highlight the methodology’s importance.
Let’s get started!
Definition: Dual Tool Verification
Dual Tool Verification involves using more than one tool to obtain data, comparing the results obtained by these tools, and then ensuring that the results produced by these tools are the same.
The methodology is best illustrated with a pseudocode example:

```
result_set1 = tool1.process(forensic_data)
result_set2 = tool2.process(forensic_data)
if not (result_set1 == result_set2):
    dig_and_find_the_damn_reason()
```
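As a runnable sketch in Python, assuming the two `parse_with_tool` functions stand in for real, independent tools (their hard-coded results are purely illustrative):

```python
# Hypothetical DTV harness: parse_with_tool1/parse_with_tool2 stand in
# for two real, independent tools run against the same evidence.

def parse_with_tool1(evidence):
    # Pretend this is tool #1's view of browser history: (URL, visit count)
    return {("http://example.com/some-page", 1)}

def parse_with_tool2(evidence):
    # Tool #2 disagrees on the visit count for the same URL
    return {("http://example.com/some-page", 84)}

def dual_tool_verify(evidence):
    set1 = parse_with_tool1(evidence)
    set2 = parse_with_tool2(evidence)
    if set1 == set2:
        return True, set()
    # Surface the discrepancies so the examiner can dig in and
    # find the reason before anything reaches a report.
    return False, set1 ^ set2  # symmetric difference

ok, discrepancies = dual_tool_verify("places.mork")
```

The point is not the comparison itself but the branch: a mismatch is not the end of the workflow, it is the start of the "find the reason" step.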
So… Why Are We Doing This?
Remember back in grade school when your (math) teacher would mandate that you check your work constantly? Many of us checked our manual work using a calculator. In the digital realm, the idea is similar: Your initial method may have been flawed, so use a secondary method to obtain the data, compare, and make sure that you have the same results.
If software were not so volatile in nature, we would not need to run multiple tools and/or evaluate our tools to ensure that they provide correct data. Think about it this way: Software is not like a physical tape measure. Have you ever used a secondary measurement device to verify that a tape measure’s markings sit at the correct intervals? Probably not.
Tape measures are manufactured via a machine-based process that creates the device in a systematic manner. The physical replication of tape measures might have a degree of variance, but the general idea is that these devices are churned out easily and differences between produced devices are rare. The same cannot be said about software.
Ask yourself the following:
- Have you ever run a software product that didn’t require an update?
- Have you ever run a software product that was flawless, devoid of bugs?
- Have you ever blah blah you get the idea blah blah
The maxims (ooooh, another list!):
- Software is complicated
- Software is developed by humans
- Software is error-prone
- Software should not carry an implicit high level of trust
Advantages and Disadvantages
- Ensure Correctness of Findings
- In forensic analysis, the correctness of findings is of utmost importance. Imagine submitting an analysis riddled with errors. Deriving the wrong timeframe for activity, attributing activity to the wrong individual, and incorrectly determining that a host has been “popped” are all examples of incorrect findings that could result from relying on the wrong tool(s).
- Enhance Strength of Evidence
- Anyone skeptical of your findings might find it difficult to remain skeptical when shown that multiple tools were used to derive said data. Imagine someone digging up a dinosaur bone in the desert and having a professional paleontologist verify the remains. That paleontologist could be subject to a voir dire challenge. But what if two professional paleontologists deemed the remains legit? Sure, sure, they could both be subjected to voir dire, but you get the idea!
- Increased Familiarity with Tools
- As you increase the number and types of tools that you use to analyze digital evidence, you’ll become familiar with a more varied toolset. More tools in the toolbox is always a good thing!
- KEEP YOUR DAMN JOB!
- Hey, if you submit a bogus analysis, you could cost your company money; cause someone to lose his or her freedom; lose respect in the industry; or freakin’ outright be fired!
Let’s end this section with the disadvantage: Time. Yup, the major blocker to implementing DTV is the time it takes to run multiple tools and compare the results. When in a pinch for time, implementing the methodology could pose issues. A good example is when an IR team must deal with hundreds of potentially compromised hosts. Are we really going to run multiple tools to derive each data point when time is of the essence? Hmmmm…
Example Tool Comparison Options
So if this DTV thing is so advantageous, what are some common methods we can use?
| Method | Description | Example |
|---|---|---|
| Competing Products | Use major competing products. | EnCase and F-Response |
| Closed Source vs. Open Source | Use a closed source and an open source utility. | EnCase and Google Rapid Response |
| Different Software Versions | Yup! The exact same software product can be tested against older versions. Quite often, I’ve received different results from different versions of software. | Wireshark 1.12.6 has an issue with its AOL Mail parser that does not exist in other versions. |
While the first two options above might speak for themselves, I would like to highlight the comparison of different versions of the same software. I was stuck in a recent Capture the Flag (CTF)-type event because my version of Wireshark was not parsing a particular protocol properly. At the time, I did not know that was why I was failing. After deciding to move to a different environment for a “change of pace” (I switched VMs), I noticed that the place I was looking was indeed the correct location! Dangit!
A Potential “Gotcha”
When attempting to compare tool results, you might run into issues with direct result comparison. Keep in mind that not all tools output the same information. For that matter, some tools might produce “different” information based on how the tool itself stomps on data.

For example, if you have EnCase agents deployed throughout your organization, you could pull a memory image via this mechanism. However, if you compare the resultant memory image against one pulled via FTK Imager installed on the host post-incident, you will most likely see FTK Imager itself running in the second image. This might sound obvious, but many folks fall victim to the “A and B don’t match” dilemma when the real issue is that one of the tools altered the evidence. That, of course, is a whole different topic, so let’s continue.
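One way to avoid that trap is to normalize each tool's output before comparing, explicitly excluding artifacts you know the acquisition introduced. A minimal sketch (the process names and the artifact list are illustrative, not authoritative):

```python
# Sketch: normalize each tool's output before comparing, so that
# artifacts introduced by acquisition itself don't trigger a false
# "A and B don't match" alarm.

# Hypothetical list of processes known to belong to acquisition tooling
KNOWN_ACQUISITION_ARTIFACTS = {"ftk imager.exe", "encase_agent.exe"}

def normalize(process_list):
    # Case-fold, then drop processes the acquisition itself added
    return {p.lower() for p in process_list} - KNOWN_ACQUISITION_ARTIFACTS

# Illustrative process lists from two memory images of the same host
image_a = ["svchost.exe", "explorer.exe", "EnCase_Agent.exe"]
image_b = ["svchost.exe", "explorer.exe", "FTK Imager.exe"]

match = normalize(image_a) == normalize(image_b)
```

The key design choice is that the exclusion list is explicit and documented, so every "explained away" difference is something you can defend later rather than a silent fudge.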
Real World Examples
1st Degree Murder, or Manslaughter?
After a lengthy trial that concluded in July of 2011, a woman named Casey Anthony was cleared of murdering her daughter. Don’t worry: I do not intend to delve into the specifics of this particular fiasco. The Casey Anthony trial concluded in a highly contested verdict and, quite frankly, was an ugly, ugly case. For our purposes, we will focus on the digital evidence presented in this matter.
The prosecution presented digital evidence analysis that was performed on a Mork database file containing user history from Mozilla’s Firefox Web browser. The sheriff’s department originally analyzed this database file using NetAnalysis v1.37. These initial results pointed to a single visit (that’s one, folks) to a Web page that detailed the production of chloroform. This tool also showed 84 visits to MySpace.com. However, a follow-up analysis was conducted using a secondary tool (whose name will remain redacted). This secondary tool purported 84 visits to the chloroform page versus the single occurrence (Wilson, 2011).
This evidence was crucial to the case, as searches for chloroform could point to premeditation, which could mean the difference between murder in the first degree and manslaughter. During the trial, the defense jumped on the fact that the evidence was flawed. Post-trial, Craig Wilson, the author of NetAnalysis, conducted his own analysis on the Mork database file. Given a deep understanding of the Mork database specifications along with a knack for manual analysis, Wilson (2011) found that the results from his tool were correct: One visit. ONE.
Now, let’s analyze what happened in this case. A single piece of digital evidence was analyzed using two different tools. These tools yielded different results. However, both sets of results made their way into the courtroom. I will avoid the “Why didn’t the prosecution clear up the issue before presenting their digital evidence?” line of questioning for now. The fact of the matter is: This was a highly publicized MURDER trial that was marred by poor analysis. Properly implementing DTV in such a trial should have included identifying the exact reason(s) why the result sets differed. However, this did not happen.
What if Casey was found guilty? Would the evidence as presented be enough to charge her with first degree murder? Some might think, “Well, yeah, a single visit or 84 visits would both point to premeditation.” But come on folks, that’s not how trials work! The digital evidence was not confirmed and as such would have been challenged, potentially even thrown out completely. Thus the concept of premeditation could have gone completely out of the window, simply because the prosecution did not implement DTV correctly.
Battle of the MFT Parsers
Mari DeGrazia, a well-known and respected forensic analyst from the Phoenix area, noticed anomalies in a tool she was using to parse the Master File Table (MFT) for a case. This spurred her to compare the results of four (4) tools built to perform this task. Her findings were… troublesome. Regarding her analysis, she notes, “Some of the tools did not notify the examiner that the file path associated with the deleted file may be incorrect – which could lead to some false conclusions” (DeGrazia, 2015, par. 6).
Per Microsoft (2016, par. 1), “All information about a file, including its size, time and date stamps, permissions, and data content, is stored either in MFT entries, or in space outside the MFT that is described by MFT entries.” As such, one can see that analysis of the MFT can be pivotal to a forensic investigation. Proper understanding of which files reside on disk, their sizes, their MAC times, etc. is a foundation of many investigations.
Before we delve into DeGrazia’s analysis, ask yourself the following:
- Which MFT parser do you use most?
- Are you 100% confident that your tool of choice is not flawed?
Yeah… about that. Let’s review her findings!
MFT Parsers Compared:
| Tool Name | File Sizes | Deleted Files |
|---|---|---|
| AnalystMFT | Incorrect | Not Designated & Incorrect Paths |
| List-mft | Correct | Designated & No File Paths |
| Log2timeline | None | Designated & Correct Paths |
| MFTDump | Correct | Designated & Path Confused |
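Discrepancies like those in the table can be surfaced automatically by diffing each parser's export field by field. A minimal sketch, assuming CSV exports keyed on MFT record number (the column names and sample rows are hypothetical; real tools each use their own schema, so you would map columns first):

```python
import csv
import io

# Sketch: field-by-field comparison of two MFT parsers' CSV exports,
# keyed on MFT record number. Columns and rows are hypothetical.

def load(csv_text):
    return {row["record"]: (row["path"], row["size"], row["deleted"])
            for row in csv.DictReader(io.StringIO(csv_text))}

# Two parsers describing the same record 5, disagreeing on file size
tool_a = "record,path,size,deleted\n5,\\Windows\\foo.dll,4096,no\n"
tool_b = "record,path,size,deleted\n5,\\Windows\\foo.dll,0,no\n"

a, b = load(tool_a), load(tool_b)
# Records both tools saw, but described differently
mismatches = {rec for rec in a.keys() & b.keys() if a[rec] != b[rec]}
```

In practice you would also diff the key sets themselves (records only one tool reported), since a parser silently dropping entries is exactly the kind of flaw DTV is meant to catch.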
So! Which of the above tools got everything correct? Which tool was that, you say? Yeah… exactly :(. DeGrazia’s notes do not include version numbers, but also think of this: Is the version of tool X that you’re using as reliable as a previous version?
Personally, I rely heavily on one of the above tools. However, I also implement DTV, which allows me to ensure that my findings are correct. When I have time. I mean always. Yeah, totally always. Totally.
That’s all folks! I think I made my point here. Overall, I believe that DTV is an essential methodology whose use will help ensure correctness of findings, bolster evidentiary confidence, and basically keep your butt from getting fired.
I look forward to any and all feedback, so please do not hesitate to drop a comment!
References

DeGrazia, M. (2015, September 20). Who’s your Master? : MFT Parsers Reviewed.

Microsoft. (2016). Master file table.

Wilson, C. (2011, July 11). Digital Evidence Discrepancies – Casey Anthony Trial.