The file must begin with a Section Header Block. Below is a typical configuration, with a single Section Header Block that covers the whole file.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SHB v1.0  |                      Data                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Typical configuration with a single Section Header Block
The Section Header Block is mandatory. It identifies the beginning of a section of the capture dump file. Its format is shown in Figure 3.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Byte-Order Magic                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Major Version         |         Minor Version         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                        Section Length                         |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                                                               /
/                      Options (variable)                       /
/                                                               /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Section Header Block format.
The meaning of the fields is:
- Byte-Order Magic: magic number, whose value is the hexadecimal number 0x1A2B3C4D. This number can be used to distinguish sections that have been saved on little-endian machines from the ones saved on big-endian machines.
- Major Version: number of the current major version of the format. Current value is 1. This value should change if the format changes in such a way that tools that can read the new format could not read the old format (i.e., the code would have to check the version number to be able to read both formats).
- Minor Version: number of the current minor version of the format. Current value is 0. This value should change if the format changes in such a way that tools that can read the new format can still automatically read the old format, but code that can only read the old format cannot read the new format.
- Section Length: 64-bit value specifying the length in bytes of the following section, excluding the Section Header Block itself. This field can be used to skip the section, for faster navigation inside large files. A Section Length of -1 (0xffffffffffffffff) means that the size of the section is not specified, and the only way to skip the section is to parse the blocks that it contains.
- Options: optionally, a list of options (formatted according to the rules defined in Section 4) can be present.
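The field layout above can be sketched in Python. This is only a minimal sketch: it assumes the bytes passed in are exactly the fixed fields drawn in Figure 3 (Byte-Order Magic, the two version numbers and Section Length), with Options ignored. The helper name parse_shb_body and the sample bytes are hypothetical, made up for illustration.

```python
import struct

BYTE_ORDER_MAGIC = 0x1A2B3C4D

def parse_shb_body(body: bytes) -> dict:
    """Decode the fixed fields of Figure 3 (hypothetical helper)."""
    if len(body) < 16:
        raise ValueError("need at least 16 bytes of fixed fields")
    # Try both byte orders; the one that yields the magic value
    # is the byte order of the machine that wrote the section.
    for prefix, name in (("<", "little"), (">", "big")):
        (magic,) = struct.unpack(prefix + "I", body[:4])
        if magic == BYTE_ORDER_MAGIC:
            break
    else:
        raise ValueError("Byte-Order Magic not found")
    # Two 16-bit versions followed by a signed 64-bit Section Length.
    major, minor, section_length = struct.unpack(prefix + "HHq", body[4:16])
    return {
        "byte_order": name,
        "major": major,
        "minor": minor,
        # -1 means the section length is not specified.
        "section_length": None if section_length == -1 else section_length,
    }

# Made-up sample: little-endian writer, version 1.0, unspecified length.
sample = struct.pack("<IHHq", BYTE_ORDER_MAGIC, 1, 0, -1)
print(parse_shb_body(sample))
```

Reading Section Length as a signed value makes the "not specified" case show up directly as -1, which the sketch maps to None.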
Interface Description Block (mandatory)
The Interface Description Block is mandatory. This block is needed to specify the characteristics of the network interface on which the capture has been made. In order to properly associate the captured data to the corresponding interface, the Interface Description Block must be defined before any other block that uses it; therefore, this block is usually placed immediately after the Section Header Block.
An Interface Description Block is valid only inside the section to which it belongs. The structure of an Interface Description Block is shown in Figure 4.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           LinkType            |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            SnapLen                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                                                               /
/                      Options (variable)                       /
/                                                               /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: Interface Description Block format.
The meaning of the fields is:
- LinkType: a value that defines the link layer type of this interface. (TODO)
- SnapLen: maximum number of bytes dumped from each packet. The portion of each packet that exceeds this value will not be stored in the file.
- Options: optionally, a list of options (formatted according to the rules defined in Section 4) can be present.
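As a rough illustration of the Interface Description Block fields, the following Python sketch decodes the 8 fixed bytes drawn in Figure 4. It assumes the byte order is already known from the Section Header Block and that Options, if any, are simply whatever follows. The name parse_idb_body and the sample values (127 and 1518) are assumptions chosen for the example.

```python
import struct

def parse_idb_body(body: bytes, byte_order: str = "<") -> dict:
    """Decode LinkType, Reserved and SnapLen (hypothetical helper)."""
    if len(body) < 8:
        raise ValueError("need at least 8 bytes of fixed fields")
    # 16-bit LinkType, 16-bit Reserved, 32-bit SnapLen.
    link_type, _reserved, snap_len = struct.unpack(byte_order + "HHI", body[:8])
    return {
        "link_type": link_type,
        "snap_len": snap_len,
        "options": body[8:],  # left undecoded in this sketch
    }

# Made-up sample: link type 127, snapshot length 1518, no options.
sample = struct.pack("<HHI", 127, 0, 1518)
print(parse_idb_body(sample))
```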
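Now let us see how a capture file actually looks, using an example. The file examined below is in the classic pcap format, which begins with a fixed-size global header rather than the block structure described above.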
Global Header – The global header has a fixed size of 24 bytes.
We captured some packets in Wi-Fi monitor mode and saved them to a file named understand_pcap_file.pcap. Now let us print the bytes of the global header in hex:
$ hexdump -n 24 -C understand_pcap_file.pcap | cut -c 11-59
d4 c3 b2 a1 02 00 04 00 00 00 00 00 00 00 00 00
ee 05 00 00 7f 00 00 00
The first 4 bytes, d4 c3 b2 a1, are the magic number that identifies pcap files. The next 4 bytes, 02 00 04 00, are the Major Version (2 bytes) and Minor Version (2 bytes), in our case 2.4. Why is 2 stored in 2 bytes as 02 00 and not 00 02? This is little-endian byte order, in which the least significant byte is stored first (at the lowest address), so the 16-bit value 2 is written as 02 00. How do we know that the file is not big-endian instead? The magic number is also used to distinguish the two byte orders. Its “real” value is 0xa1b2c3d4: if the bytes in the file read a1 b2 c3 d4, the file is big-endian; if they read d4 c3 b2 a1, as here, it is little-endian.
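The byte-order check described above can be sketched in a few lines of Python. The 8 bytes are copied from the hexdump output; the function name detect_byte_order is hypothetical, and the sketch only handles the two magic layouts discussed here.

```python
import struct

# First 8 bytes of the global header, copied from the hexdump above.
header_start = bytes.fromhex("d4c3b2a1" "02000400")

def detect_byte_order(magic_bytes: bytes) -> str:
    """Return the struct prefix matching the file's byte order."""
    if magic_bytes == bytes.fromhex("a1b2c3d4"):
        return ">"  # bytes appear in "real" order: big-endian
    if magic_bytes == bytes.fromhex("d4c3b2a1"):
        return "<"  # bytes appear reversed: little-endian
    raise ValueError("not a recognised pcap magic number")

order = detect_byte_order(header_start[:4])
major, minor = struct.unpack(order + "HH", header_start[4:8])
print(order, major, minor)  # < 2 4
```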
Now let us look at the next 8 bytes:
$ hexdump -n 24 -C understand_pcap_file.pcap | cut -c 11-59
d4 c3 b2 a1 02 00 04 00 00 00 00 00 00 00 00 00
ee 05 00 00 7f 00 00 00
The first 4 of these bytes give the offset, in seconds, between GMT and the timezone used for the timestamps in the packet headers. The other 4 bytes give the accuracy of the timestamps in the capture. Both fields are set to 0 most of the time, which gives us 00 00 00 00 00 00 00 00.
Now let us look at the last 8 bytes:
$ hexdump -n 24 -C understand_pcap_file.pcap | cut -c 11-59
d4 c3 b2 a1 02 00 04 00 00 00 00 00 00 00 00 00
ee 05 00 00 7f 00 00 00
The first 4 bytes (ee 05 00 00) are the Snapshot Length field, which indicates the maximum number of bytes saved from each captured packet. In our file it is set to 1518 (0x000005ee), as defined in the C source code by
#define SNAP_LEN 1518
which is then passed to pcap_open_live:
/* open capture device */
handle = pcap_open_live(dev, SNAP_LEN, 1, 1000, errbuf);
if (handle == NULL) {
    fprintf(stderr, "Couldn't open device %s: %s\n", dev, errbuf);
    exit(EXIT_FAILURE);
}
The last 4 bytes of the global header specify the Link-Layer Header Type. Our file has the value 0x0000007f, which is 127 in decimal. According to the list of link-layer header types at http://www.tcpdump.org/linktypes.html, this is the radiotap header:
LINKTYPE_IEEE802_11_RADIOTAP | 127 | DLT_IEEE802_11_RADIO | Radiotap link-layer information followed by an 802.11 header. |
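Putting the pieces together, the whole 24-byte global header can be decoded in one step. The sketch below hard-codes the bytes shown by hexdump and assumes little-endian order, as established from the magic number. The dictionary keys are descriptive labels chosen for this example, not names taken from any library.

```python
import struct

# The 24 bytes of the global header, copied from the hexdump above.
raw = bytes.fromhex(
    "d4c3b2a1 02000400 00000000 00000000"
    "ee050000 7f000000"
)

# magic (4), major (2), minor (2), timezone offset (4, signed),
# timestamp accuracy (4), snapshot length (4), link type (4)
names = ("magic", "major", "minor", "tz_offset", "accuracy", "snaplen", "linktype")
header = dict(zip(names, struct.unpack("<IHHiIII", raw)))

print(hex(header["magic"]))                  # 0xa1b2c3d4
print(header["major"], header["minor"])      # 2 4
print(header["snaplen"], header["linktype"]) # 1518 127
```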
After the Global header, we have a certain number of packet header / data pairs.
$ hexdump -s 24 -n 16 -C understand_pcap_file.pcap | cut -c 11-59
10 30 60 56 5e 5d 0d 00 89 01 00 00 89 01 00 00
This prints the 16 bytes that follow the global header. Let us go through them.
The first 4 bytes are the timestamp in Seconds. This is the number of seconds since the start of 1970, also known as Unix Epoch. The value of this field in our pcap file is 0x56603010. An easy way to convert it to a human readable format:
$ calc 0x56603010
1449144336
$ date --date='1970-01-01 1449144336 sec GMT'
Thu Dec 3 17:35:36 IST 2015
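The same conversion can be done without calc, for instance in Python. This sketch takes the 4 timestamp bytes from the hexdump, decodes them as a little-endian unsigned integer and prints the time in UTC, which is 5 hours 30 minutes behind the IST value shown above.

```python
import struct
from datetime import datetime, timezone

# First 4 bytes of the packet header, copied from the hexdump above.
(ts_sec,) = struct.unpack("<I", bytes.fromhex("10306056"))
print(ts_sec)  # 1449144336

# Seconds since the Unix Epoch, shown as a UTC date and time.
utc_time = datetime.fromtimestamp(ts_sec, tz=timezone.utc)
print(utc_time.isoformat())  # 2015-12-03T12:05:36+00:00
```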
Now let us look at the next 4 bytes:
$ hexdump -s 24 -n 16 -C understand_pcap_file.pcap | cut -c 11-59
10 30 60 56 5e 5d 0d 00 89 01 00 00 89 01 00 00
The second field (4 bytes, 5e 5d 0d 00) is the microseconds part of the time at which the packet was captured. In our case it equals 0x000d5d5e, or 875870 microseconds.
Now let us look at the next 4 bytes:
$ hexdump -s 24 -n 16 -C understand_pcap_file.pcap | cut -c 11-59
10 30 60 56 5e 5d 0d 00 89 01 00 00 89 01 00 00
This field is 4 bytes long and contains the size, in bytes, of the packet data saved in our file. In hex it is 0x00000189, which equals 393 in decimal.
$ hexdump -s 24 -n 16 -C understand_pcap_file.pcap | cut -c 11-59
10 30 60 56 5e 5d 0d 00 89 01 00 00 89 01 00 00
The next field is also 4 bytes long and contains the length of the packet as it appeared on the wire. In hex it is 0x00000189, which equals 393 in decimal. Since both lengths are equal, this packet was saved in full.
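The four fields of this packet header can also be decoded together. The sketch below uses the 16 bytes from the hexdump and assumes the little-endian order found earlier. The variable names are illustrative, and the offset calculation assumes that the next packet header follows directly after the saved packet data.

```python
import struct
from datetime import datetime, timedelta, timezone

# The 16-byte packet header, copied from the hexdump above.
record = bytes.fromhex("10306056 5e5d0d00 89010000 89010000")

# seconds, microseconds, saved length, length on the wire
ts_sec, ts_usec, saved_len, wire_len = struct.unpack("<IIII", record)

captured_at = (datetime.fromtimestamp(ts_sec, tz=timezone.utc)
               + timedelta(microseconds=ts_usec))
print(captured_at.isoformat())  # 2015-12-03T12:05:36.875870+00:00
print(saved_len, wire_len)      # 393 393

# 24-byte global header + 16-byte packet header + saved packet data
next_header_offset = 24 + 16 + saved_len
print(next_header_offset)       # 433
```

The computed offset can be passed to hexdump with -s to inspect the following packet header in the same way.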