2014/04/09

Wireshark Primer: Manual Carve HTTP Objects

by InterDimensional_Shambler
Categories: Analysis, Network Forensics

Description:

This is the first Wireshark primer article (there will be more). It covers how to manually carve HTTP objects from network captures (PCAPs) using Wireshark.

A lot of this can be done automatically with tools like NetworkMiner, PhotoRec, bulk_extractor, and Foremost, but this article is meant as a point of reference for when those tools fail.


Background:

Before any of this, a brief explanation of a few things:

  • Capture Filters – These are used while collecting network data to keep only your desired traffic. (They use the same syntax as tcpdump.)
  • Display Filters – These only display the packets in a loaded capture that match your filter; the other packets are hidden, not deleted.
  • Hex Editor – You will need a hex editor tool (like HxD or 010 Editor) to do manual carves.

The easiest way to build a filter is to right-click the desired field in the packet details pane and choose Apply as Filter.


Extracting Objects from TCP Streams:

This is one of the most common and useful functions that I use.

Note: By default Wireshark does not show the TCP stream index as a column, so you might have to add it. Add a custom column that uses the field name tcp.stream.

This will show you which TCP stream each packet belongs to.

How To:

Step 1 Finding the Desired Data:

If you want to search the PCAP “easily” you can use CTRL+F and search for a string.

OR

You can apply a display filter that searches the frames for text: frame contains "TEXT".

Step 2 Following the Stream:

Now if you want to follow a TCP stream, just right-click the packet you want (if you’ve added the TCP stream column, you can see which packets belong to that stream) and choose Follow TCP Stream.

This will open another window with all the “data” that the application layer is supposed to see.

You will see many things like HTTP headers, HTTP data, and raw files, all without the Ethernet, IP, and TCP encapsulation bytes.

Now all you have to do is save your stream. Make sure you save as RAW; otherwise you’ll end up with your data encoded in another format.

WS_1_1-1

Step 3 Carve Object(s) from Stream Data:

So now you should have a file of the TCP Stream, but you just wanted one of the HTTP files in it.

Open your file in your Hex Editor program so that we can remove the excess data.

This is where you find your file’s “Entry Bytes” (also known as the File Signature) and its End of File bytes.
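If you would rather locate the signature offsets with a script before opening the hex editor, something like the following may help. It is only a sketch: the SIGNATURES table and the find_object helper are names made up for this example, the table covers just two well-known formats (JPEG and PNG), and it assumes the stream holds a single object of that type.

```python
# Illustrative signatures only; extend this table for the file types you expect.
SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff", b"\xff\xd9"),
    "png": (b"\x89PNG\r\n\x1a\n", b"IEND\xaeB`\x82"),
}


def find_object(raw: bytes, kind: str):
    """Return (start, end) offsets of a `kind` object in raw, or None."""
    header, trailer = SIGNATURES[kind]
    start = raw.find(header)
    if start == -1:
        return None
    # Use the LAST trailer occurrence, because the trailer bytes can also
    # show up inside the file. Assumes one object of this type per stream.
    end = raw.rfind(trailer)
    if end == -1 or end < start:
        return None
    return start, end + len(trailer)


if __name__ == "__main__":
    sample = b"HTTP/1.1 200 OK\r\n\r\n" + b"\xff\xd8\xff\xe0demo\xff\xd9" + b"\r\n"
    offsets = find_object(sample, "jpeg")
    print(offsets, sample[offsets[0]:offsets[1]])
```

The offsets it prints are where you would cut in the hex editor; treat them as a starting point and confirm them by eye.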

So we will remove all the data that ISN’T our file:

Removing data before the file:

WS_1_1-2

Removing data after the file:

WS_1_1-3

Note: Hex values 0D (Carriage Return) and 0A (Line Feed) may be part of your file, so if Content-Length is in your header, make sure your byte count matches it.
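The byte-count check can also be done in code. The sketch below is one possible approach, not the only one: carve_first_response_body is a hypothetical helper that looks for the first HTTP/1.x status line in a stream saved as RAW, reads Content-Length, and slices out exactly that many bytes after the blank line that ends the headers. It assumes the response is not chunked or compressed.

```python
import re


def carve_first_response_body(raw: bytes) -> bytes:
    """Return the body of the first HTTP response in a raw stream dump."""
    # A status line looks like "HTTP/1.1 200 ..."; a request line does not
    # have a three-digit code after the version, so requests are skipped.
    status = re.search(rb"HTTP/1\.[01] \d{3}", raw)
    if status is None:
        raise ValueError("no HTTP response found")
    header_end = raw.find(b"\r\n\r\n", status.start())
    if header_end == -1:
        raise ValueError("incomplete HTTP header")

    length = None
    for line in raw[status.start():header_end].split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("no Content-Length header in this response")

    body_start = header_end + 4  # skip the CR LF CR LF after the headers
    body = raw[body_start:body_start + length]
    if len(body) != length:
        raise ValueError(f"expected {length} bytes, found {len(body)}")
    return body
```

Because the slice is driven by the header value and not by searching for 0D 0A, any CR or LF bytes inside the file are kept intact.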


Extracting Chunked Objects from TCP Streams:

Most dynamic web content is chunked (sent as a series of pieces of varying length).

Each chunk indicates its length with a hexadecimal number (in between CR LFs) preceding the chunk’s data, and the chunk’s data is followed by another CR LF.

How To:

This is basically the same procedure as normal extraction of objects from TCP streams except for step 3.

Additionally we will have to remove the “Chunk Delimiter Data”.

Step 1 Find the Desired Data

Step 2 Follow the Stream (Save as RAW)

Step 3 Carve Object(s) from Stream Data:

So now you should have a file of the TCP Stream (containing your chunked file)

Open your file in your Hex Editor program so that we can remove the excess data.

Removing data before the file:

WS_1_2-1

You should notice hexadecimal numbers scattered at intervals throughout your file.

In this example the number is 5a8 (1,448 in decimal).

This indicates that the chunk that follows contains exactly 1,448 bytes of data before the next chunk delimiter.

Removing Chunked Delimiter Data:

The easiest way to remove the chunk delimiters is with wildcard replace (available in some hex editors).

You simply search for the pattern 0D 0A * 0D 0A

Then you remove every 0D 0A (Hexadecimal Number) 0D 0A match throughout the file.

Note: If you see something besides a hexadecimal number between your 0D 0A bytes, it is NOT to be deleted (it isn’t a chunk delimiter).

Removing data after the file:

WS_1_2-2

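The last chunk of the file should be followed by a final 0D 0A “0” 0D 0A, which is a zero-length chunk that marks the end. (Make sure you remove that as well, along with the 0D 0A that closes it.)

Note: Take note of your Content-Length if it is in your header (match up your byte count). Chunked responses usually omit it; in that case the chunk sizes should add up to the length of your carved file.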
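If your hex editor has no wildcard replace, or you are worried about deleting a 0D 0A pair that is really file data, the chunk sizes can be walked in order. The following is a minimal sketch under a few assumptions: dechunk is a name invented here, its input is the response body only (everything after the blank line that ends the HTTP headers), and any trailer headers after the final chunk are ignored.

```python
def dechunk(body: bytes) -> bytes:
    """Strip HTTP chunked-encoding delimiters and return the joined data."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("truncated chunk-size line")
        # The size is hexadecimal, e.g. b"5a8" -> 1448. Optional chunk
        # extensions (";name=value") may follow it and are ignored here.
        size = int(body[pos:line_end].split(b";")[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # the zero-length chunk marks the end of the object
        chunk = body[pos:pos + size]
        if len(chunk) != size:
            raise ValueError("truncated chunk data")
        out += chunk
        pos += size + 2  # skip the 0D 0A that closes this chunk
    return bytes(out)


if __name__ == "__main__":
    sample = b"5\r\nhello\r\n6\r\n\r\nworl\r\n0\r\n\r\n"
    print(dechunk(sample))
```

In the sample, the second chunk’s data itself begins with 0D 0A. Counting bytes by the declared size keeps those two bytes, which a blind search-and-delete could damage.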


Extracting Objects from HTTP 206 Responses (Partial Content):

HTTP 206 responses are valid, successful file downloads; each one carries only a byte range of the file.

This occurs when a file is sent across multiple streams (commonly used by downloaders for efficiency).

How To:

This is basically the same procedure as normal extraction of objects from TCP streams except for step 3.

Additionally we will have to “Re-assemble” the data.

Step 1 Find the Desired Data

Step 2 Follow the Streams (Save ALL as RAW)

Step 3 Carve Object(s) from Stream Data

So now you should have multiple streams saved as different files appearing to contain the same file.

Note: Pay attention to the Content-Range bytes in the HTTP headers; you will need them.

Compiling the file:

WS_1_3-1

Here are my awesome paint skills showing the file going across multiple streams.

This diagram depicts three streams carrying the same file (but each stream has a different byte range of the file).
Normally it will not be as clean as this diagram. The scenario I’m going to show is actually over two streams.

Figure out how file is separated:

WS_1_3-2

Here we see both streams’ HTTP headers. You will need the Content-Range values (and Content-Length, if present).

Stream 1 = 0-4,178,039/4,178,040

This says the ENTIRE file is in this stream (byte offsets start at zero, so 0-4,178,039 covers all 4,178,040 bytes). It lies: notice the entire stream is only 2,170,552 bytes.

Stream 2 = 2,048,000-4,178,039/4,178,040

This says we have roughly the last half of the file.

So from this we can ASSUME that we have roughly the first-half in Stream 1 and the second-half in Stream 2.
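The range arithmetic above can be double-checked with a few lines of code. This is just a sketch: parse_content_range and claimed_length are hypothetical helper names, and the header values are written without the thousands separators used in this article for readability (a real header has no commas).

```python
import re


def parse_content_range(value: str):
    """Parse 'bytes first-last/total' into (first, last, total) integers."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last, total = map(int, match.groups())
    return first, last, total


def claimed_length(value: str) -> int:
    """Number of bytes a response claims to carry."""
    first, last, _ = parse_content_range(value)
    return last - first + 1  # both offsets are inclusive


if __name__ == "__main__":
    for header in ("bytes 0-4178039/4178040", "bytes 2048000-4178039/4178040"):
        print(header, "->", claimed_length(header), "bytes claimed")
```

Comparing the claimed length with the number of bytes actually saved from each stream shows which stream was cut short.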

Now all we have to do is put the data together (without overlapping bytes).

Joining Stream Data:

WS_1_3-3 WS_1_3-4

The simplest way to combine your data is to take the first 16-32 bytes of the file data in the second stream and search for them in stream 1.
(Hope that 0D 0A isn’t part of the data…)

(In my example, the starting bytes of Stream 2 are shown in the right picture, and the same bytes are found in Stream 1 on the left.)

Now all you do is copy ALL of the file bytes from Stream 2 and paste them into Stream 1, starting right where the matching data begins.

(In my example, you copy from offset 0x1AD to the end of Stream 2, then paste that data at offset 0x1F72CE in Stream 1.)

Note: Don’t forget to remove the HTTP headers before and after the file.
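As an alternative to matching bytes by eye, the pieces can be placed by their Content-Range start offsets. The sketch below is an assumption-laden illustration, not a replacement for checking the result: reassemble is a made-up helper, each part is expected to be a (first_offset, body_bytes) pair with the HTTP headers already removed, and total is the file size after the slash in Content-Range.

```python
def reassemble(parts, total: int) -> bytes:
    """Join (first_offset, body_bytes) pieces into one file of `total` bytes."""
    out = bytearray(total)
    covered = bytearray(total)  # 1 where a byte has been filled in
    for first, body in sorted(parts, key=lambda part: part[0]):
        if not 0 <= first < total:
            raise ValueError(f"offset {first} is outside the file")
        body = body[:total - first]  # ignore anything past the declared size
        out[first:first + len(body)] = body
        covered[first:first + len(body)] = b"\x01" * len(body)
    if 0 in covered:
        raise ValueError(f"gap in file at offset {covered.index(0)}")
    return bytes(out)


if __name__ == "__main__":
    # Toy data: the first stream was cut short, the second holds the tail.
    pieces = [(4, b"efgh"), (0, b"abcdef")]
    print(reassemble(pieces, 8))
```

Where two pieces overlap, the later range simply overwrites the same offsets, so a stream that stopped early (like Stream 1 here) is still usable for the bytes it did deliver. A gap raises an error instead of silently leaving zeros in the file.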


Missing Bytes in Stream:

What do you do if you’ve followed these procedures and your file is still not right (length, hash, something doesn’t match)?

Depending on various things (the capture device not keeping up with the traffic, the capture not running long enough, a duplex mismatch), you might see this message:

[# bytes missing in capture file]

It will appear where the bytes are missing, so you might have to search for it.

So always be sure to check the TCP flags on the last packets of your stream to verify it terminated properly.
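Searching a large saved stream for that marker by hand is tedious, so here is a small sketch that does it. It assumes the marker text is exactly as quoted above with a decimal number in place of the #; the wording may differ between Wireshark versions or save formats, so treat a clean result as a hint and not as proof. The names MISSING and find_missing are invented for this example.

```python
import re

# Marker text as quoted in this article; adjust if your version differs.
MISSING = re.compile(rb"\[(\d+) bytes missing in capture file\]")


def find_missing(raw: bytes):
    """Return a list of (offset, missing_byte_count) for each marker found."""
    return [(m.start(), int(m.group(1))) for m in MISSING.finditer(raw)]


if __name__ == "__main__":
    sample = b"abc[1460 bytes missing in capture file]def"
    for offset, count in find_missing(sample):
        print(f"{count} bytes missing at offset {offset:#x}")
```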

How To “Fix”:

You sort of don’t…you will have to re-pull the capture somehow, or replay the traffic.

It might be possible the file was transferred again (but it is doubtful).


Summary:

Now you should be able to manually carve some HTTP objects from TCP streams.

Happy Hunting!

