First authored Feb. 2023
Inventory/Catalog files Creating an inventory of evidentiary files
Forensic file copying Article tests over 40 "forensic" file copiers
Forensic Hashing Article tests over 30 "forensic" hash programs.
ZIP-IT for forensic retention Article test a few zipping programs and
ZIP_IT_TAKE2 More tests for your zipping capabilities.
ZIP FILE/container Hashing your zip container reliably
MATCH FILE HASHES Demonstrates hash matches using Maresware.
A HASH software buffet How-to use Maresware hash software
Read this article and raise your forensic intelligence level a few points. 😄
A little background. If you haven't already read ZIP-IT above, please read it before reading this article.
Some preliminary information:
Did you ever stop to think about this situation, or something close to it? You created a zip container to provide to the opposition. Then at a later date, for whatever reason, you created another zip container with the identical files, and again provided it to the opposition. However, your opponent pulled a fast one and hashed both containers. Guess what, the hashes are different. This started a major discussion: WHY, if the contents of both containers are supposed to be identical, are the container hashes different? Answer that one in court. Well, here is a possible answer.
I want to remind you that all the testing I have done for this article, for the articles mentioned above, and for any other testing-related article referenced here was done using Windows on an NTFS file system on a desktop computer. NTFS was used as the test environment because I believe that a significant number of corporate and other forensic investigations take place on NTFS. Also, the ability to alter a file's last access date, and to use long filenames and alternate data streams, adds to the forensic and evidentiary complexity.
The test computer used for other forensic evidentiary tests has the last access update registry key turned on, so that any action on a file causes the file's last access date to be updated. In the following tests of various zipping programs that capability does not come into question, although in some instances it is referenced; it has nothing to do with the overall testing process for this article. However, as stated below, when a zipping program altered (did not reset) the source last access date, the date was reset so that all subsequent tests and comparisons were not affected by altered source access dates.
This short article was written as a result of my performing tests on some of the recognized file zipping programs. They include programs that are routinely recommended and used by most people processing evidence for retention, attorney discovery, or court adjudication. The programs used are: WINrar, 7-Zip, PKzip, and WINzip. During the testing, WINzip proved to be the hardest to use and the least forensically suitable program for an evidentiary environment, so minimal testing was done using WINzip.
Now, let the tests begin.
TEST OVERVIEW:
These tests were to confirm or deny (like my governmentese?) whether, when a zipping program zips up suspect files to a container, the container
ends up with a specific, repeatable hash value.
(NOTE: because some of the zipping programs routinely alter the source last access date when they include a file in the container, after
each run the file dates, especially last access, were ALWAYS reset to the initial setting to make sure subsequent zipping was looking at the
same source file date/times.)
The theory is that at a later time, maybe a few minutes or a few hours or days, if those same suspect files are zipped to a new container, that new
container hash is completely different from the prior one(s).
This change/update of the container hash each time a new container is created came up in a discussion between me and another old investigator
(don't tell him I called him old). So I decided to test that theory.
I set up the following test data as described here:
A top level directory called TEST21 (the name is irrelevant) was set up with six files, each containing 50 bytes of a single hex value as indicated by its name.
HEX00 contains 50 hex 00 values, HEX01 contains 50 hex 01 values, etc.
HEX00.TXT | 50 | 01/01/2019 07:34:56:789c | 01/01/2019 07:34:56:789w | 01/01/2019 07:34:56:789a EST
HEX01.TXT | 50 | 01/01/2019 07:34:56:789c | 01/01/2019 07:34:56:789w | 01/01/2019 07:34:56:789a EST
HEX02.TXT | 50 | 01/01/2019 07:34:56:789c | 01/01/2019 07:34:56:789w | 01/01/2019 07:34:56:789a EST
HEX03.TXT | 50 | 01/01/2019 07:34:56:789c | 01/01/2019 07:34:56:789w | 01/01/2019 07:34:56:789a EST
HEX04.TXT | 50 | 01/01/2019 07:34:56:789c | 01/01/2019 07:34:56:789w | 01/01/2019 07:34:56:789a EST
HEX05.TXT | 50 | 01/01/2019 07:34:56:789c | 01/01/2019 07:34:56:789w | 01/01/2019 07:34:56:789a EST
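For anyone wanting to rebuild a similar test set, here is a minimal Python sketch. It is not how the files above were made; the names make_test_files, EST and STAMP are my own for illustration, and EST is assumed to be a fixed UTC-5 offset. Note that os.utime only sets the last access and last write times, so the creation time would have to be set with some other tool.

```python
import os
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Assumption: EST treated as a fixed UTC-5 offset, matching the "EST" label in the listing.
EST = timezone(timedelta(hours=-5))
STAMP = datetime(2019, 1, 1, 7, 34, 56, 789000, tzinfo=EST).timestamp()


def make_test_files(directory):
    """Create HEX00.TXT .. HEX05.TXT, each holding 50 copies of one byte value."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    created = []
    for value in range(6):
        path = target / f"HEX{value:02d}.TXT"
        path.write_bytes(bytes([value]) * 50)
        # Sets last access and last write only; creation time is not touched.
        os.utime(path, (STAMP, STAMP))
        created.append(path)
    return created
```

Calling make_test_files("TEST21") would create the six files in a TEST21 directory under the current directory.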
These files were then hashed to determine the correct hash of each file:
E:\...\HEX00.TXT | 871BDD96B159C14D15C8D97D9111E9C8 | 50 |01/01/2019 07:34:56:789c 01/01/2019 07:34:56:789w 01/01/2019 07:34:56:789a EST
E:\...\HEX01.TXT | 76E7E36462E7E73C6D8D927BA0E78F73 | 50 |01/01/2019 07:34:56:789c 01/01/2019 07:34:56:789w 01/01/2019 07:34:56:789a EST
E:\...\HEX02.TXT | 85F7588E2D312BBD69E927CD3701AF2E | 50 |01/01/2019 07:34:56:789c 01/01/2019 07:34:56:789w 01/01/2019 07:34:56:789a EST
E:\...\HEX03.TXT | 07EE8AEA7E9AA3EFAC64666095EC4876 | 50 |01/01/2019 07:34:56:789c 01/01/2019 07:34:56:789w 01/01/2019 07:34:56:789a EST
E:\...\HEX04.TXT | AE86B1B5EE54B541BCF64E2C3743D00D | 50 |01/01/2019 07:34:56:789c 01/01/2019 07:34:56:789w 01/01/2019 07:34:56:789a EST
E:\...\HEX05.TXT | 37953A8A9A6E70A349875E4B69DCFF1C | 50 |01/01/2019 07:34:56:789c 01/01/2019 07:34:56:789w 01/01/2019 07:34:56:789a EST
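The listing above came from the Maresware hash program. As a rough stand-in only, the Python sketch below produces a name-to-hash table for a directory. It assumes the 32-digit values are MD5 (the length suggests it, but the article does not say), and the function names are hypothetical. Be aware that reading a file to hash it can itself update the last access date on a system where last access updating is turned on.

```python
import hashlib
from pathlib import Path


def md5_hex(data: bytes) -> str:
    """Return the MD5 of data as upper-case hex (MD5 is an assumption here)."""
    return hashlib.md5(data).hexdigest().upper()


def hash_directory(directory, pattern="HEX*.TXT"):
    """Return {file name: hash} for every file in directory matching pattern."""
    return {
        path.name: md5_hex(path.read_bytes())
        for path in sorted(Path(directory).glob(pattern))
    }


if __name__ == "__main__":
    for name, digest in hash_directory(".").items():
        print(f"{name} | {digest}")
```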
Then for each zipping program I compressed the files into a "container" using mostly default settings. In some instances I did it only 2 or 3 times; in other
instances a few more containers were created. Between runs, as mentioned before, because the system had last access update turned on, I
reset the file date/times to the correct values seen above before creating the next container. This way all the containers contained
identical data.
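The reset step between runs can be sketched in Python as a snapshot taken before zipping and a restore done afterwards. This is only an illustration under my own function names, not the tool used for these tests, and like any os.utime approach it covers last access and last write but not the creation time.

```python
import os


def snapshot_times(paths):
    """Record (last access, last write) in nanoseconds for each path."""
    times = {}
    for path in paths:
        info = os.stat(path)
        times[path] = (info.st_atime_ns, info.st_mtime_ns)
    return times


def restore_times(times):
    """Put back the times recorded by snapshot_times."""
    for path, (atime_ns, mtime_ns) in times.items():
        os.utime(path, ns=(atime_ns, mtime_ns))
```

The idea would be to call snapshot_times once on the source files, run the zipping program, then call restore_times before the next run.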
The output container file names are generally of the form: TEST21_xx_DD_HHHH.ext,
where the xx is generally (except for the rar) replaced by 7z (7-zip), zp (pkzip), or wz (winzip),
DD is the day, e.g. 22 or 23, and
HHHH is the time of the run.
So 7z containers created at two different times would be named TEST21_7z_22_0929.7z and TEST21_7z_22_1005.7z, each appropriately identified.
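If you script your own runs, the naming convention is easy to generate. A small Python sketch, with container_name being a hypothetical helper of mine and an empty tag standing in for the rar case where no xx is used:

```python
from datetime import datetime


def container_name(tag, when, ext, base="TEST21"):
    """Build base_xx_DD_HHHH.ext; pass an empty tag to leave out the xx part."""
    parts = [base]
    if tag:
        parts.append(tag)
    parts.append(when.strftime("%d"))    # DD: day of the month
    parts.append(when.strftime("%H%M"))  # HHHH: time of the run
    return "_".join(parts) + "." + ext


if __name__ == "__main__":
    print(container_name("7z", datetime(2023, 1, 22, 9, 29), "7z"))
```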
WINRAR
First things first. Reading my other zipping articles you will see that winrar is the only zipping program I have found that is truly compatible with all my
evidentiary tests, and is the best one to use in my humble opinion. That being said, let's begin.
WINrar Hash of the test container files at different times:
TEST21_19_1556.rar | 1CD6A6B06BC093893C3BD3DA65FDD130 | 550 | 01/19/2023 15:56:53:234c 01/19/2023 15:56:53:234w 01/19/2023 15:56:53:234a EST
TESt21_19_1559.rar | 9C381BB508D0E9A76D7A9555DBFF85CC | 538 | 01/19/2023 15:59:28:576c 01/19/2023 15:59:28:576w 01/19/2023 15:59:28:576a EST
TEST21_19_1604.rar | B8E5D4573CE01D993AD99F11A1328C2A | 550 | 01/19/2023 16:04:12:441c 01/19/2023 16:04:12:457w 01/19/2023 16:04:12:457a EST
It appears that even as little as a 3 minute delay will cause the rar file to have a different hash. It is thought (I know thinking is dangerous) that this difference arises because somewhere in the header the WINRAR program maintains some sort of date/time reference as to when the program was run and the rar file created. Other administrative data may also be maintained in the container header, but I'm not concerned with that. What is of interest is that each run created a different container hash.
7-ZIP
7-zip runs made from about half an hour to a few hours apart show an obvious hash difference in the 7z output file. The same alteration is to be
expected in any future zip creation. Most likely it is, as mentioned before, the result of some type of header information.
TEST21_7z_22_0929.7z | 507396E4262FD18A3A5AA9D226E23057 | 262 | 01/22/2023 09:29:54:623c 01/22/2023 09:05:10:875w 01/22/2023 09:29:54:639a
TEST21_7z_22_1005.7z | 7362015616B8FBE14989C6DA9525FA2E | 259 | 01/22/2023 10:05:32:400c 01/22/2023 10:01:46:325w 01/22/2023 10:05:32:416a
TEST21_7z_22_1414.7z | 1EA8F6D1FCA90C54D628956778AC8CF1 | 228 | 01/22/2023 14:14:00:890c 01/22/2023 14:14:00:890w 01/22/2023 14:14:00:890a
PKZIP:
PKzip alters the last access date of the files being zipped, so after each run the last access date was reset before the next run.
TEST21_zp_22_0942.zip | 5DFD6A149863C66A7975B082034154ED | 1076 | 01/22/2023 09:44:44:207c 01/22/2023 09:44:44:394w 01/22/2023 09:44:44:394a EST
TEST21_zp_22_1034.zip | 9B737AEA124D90B52E4691EABF5AE35E | 1076 | 01/22/2023 10:34:39:636c 01/22/2023 10:34:39:761w 01/22/2023 10:34:39:761a EST
TEST21_zp_22_1053.zip | D08A6686FA5F24C561933ACDBC2DD9AC | 1076 | 01/22/2023 10:53:38:736c 01/22/2023 10:53:38:876w 01/22/2023 10:53:38:876a EST
TEST21_zp_22_1425.zip | 7668D7FF9E36C31ADEDF270FF67E7805 | 1224 | 01/22/2023 14:26:11:075c 01/22/2023 14:24:01:257w 01/22/2023 14:26:11:075a EST
TEST21_zp_23_0635.zip | 91EB0C67203CBC69A71CCA4B2DC1648B | 1072 | 01/23/2023 06:36:07:370c 01/23/2023 06:35:26:552w 01/23/2023 06:36:07:370a EST
WINzip
Winzip is a terrible GUI program with which to conduct evidentiary zipping, etc. So not too much was done except to confirm that it too, like all the others, created different
container hashes after each run. See below.
TEST21_wz_22_1000.zip | 14488AFD9393867E48EBDF1187FC2047 | 1042 | 01/22/2023 10:00:56:697c 01/22/2023 09:59:05:197w 01/23/2023 07:50:15:404a EST
TEST21_wz_23_0738.zip | 3692BC8D73DC19BF4C05A4A6D45CBBFA | 1042 | 01/23/2023 07:38:27:440c 01/23/2023 07:37:04:236w 01/23/2023 07:38:27:456a EST
TEST21_wz_23_0752.zip | 4E61DD7BF881AB9D114981361C48073A | 1042 | 01/23/2023 07:52:13:357c 01/23/2023 07:51:26:049w 01/23/2023 07:52:13:379a EST
FINAL REVIEW CONFIRMATION
All the compressed files were extracted to separate directories, each named for the HHHH of its container, to keep all the extracted data separate.
Then a hash was done (see the command line used below) of all the HEXnn.TXT files, for a total of 15 directories and 90 files. An excerpt from the log file
created is shown here. The directory names follow the HHHH names of the zipped containers:
DIR 0635 DIR 0738 DIR 0752 DIR 0929 DIR 0942 DIR 1000 DIR 1005 DIR 1023 DIR 1034 DIR 1053 DIR 1414 DIR 1425 DIR 1556 DIR 1559 DIR 1604
Maresware HASH command line used to obtain hashes of the 90 extracted files, and log file count:
c:\maresware\hash.exe -f hex*.txt -w 50 -tw -d "|" -v -o RESTORED_HASHES.TXT -1 logfile
Number of files processed: 90
The 90 hashes were then sorted and counted to see that all the extracts were as they should be, and that no erroneous hash values showed up. Below is the total count for each of the file hashes. You can see the final hashes match the original inputs, with no unusual hashes showing up, indicating that all extracts are as they should be. Had an unusual hash shown up in the extraction of the containers, this would mean that the zipping program caused an alteration in the data upon extraction. That would not be nice!!!!
HEX03.TXT | 07EE8AEA7E9AA3EFAC64666095EC4876 | 50|01/01/2019|07:34:56:789w|EST| +15
HEX05.TXT | 37953A8A9A6E70A349875E4B69DCFF1C | 50|01/01/2019|07:34:56:789w|EST| +15
HEX01.TXT | 76E7E36462E7E73C6D8D927BA0E78F73 | 50|01/01/2019|07:34:56:789w|EST| +15
HEX02.TXT | 85F7588E2D312BBD69E927CD3701AF2E | 50|01/01/2019|07:34:56:789w|EST| +15
HEX00.TXT | 871BDD96B159C14D15C8D97D9111E9C8 | 50|01/01/2019|07:34:56:789w|EST| +15
HEX04.TXT | AE86B1B5EE54B541BCF64E2C3743D00D | 50|01/01/2019|07:34:56:789w|EST| +15
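The sort-and-count step can also be sketched in Python. This is not the program used above. It assumes each record in the hash output is pipe-delimited with the file name (or full path) in the first field and the hash in the second, and count_hashes and unexpected are names I made up for the illustration.

```python
from collections import Counter


def count_hashes(lines, delimiter="|"):
    """Count (file name, hash) pairs found in pipe-delimited records."""
    counts = Counter()
    for line in lines:
        fields = [field.strip() for field in line.split(delimiter)]
        if len(fields) < 2 or not fields[1]:
            continue  # skip blank or malformed records
        # Keep only the file name when the first field is a full path.
        name = fields[0].replace("\\", "/").rsplit("/", 1)[-1]
        counts[(name, fields[1])] += 1
    return counts


def unexpected(counts, expected, copies):
    """Return entries whose hash is not the expected one or whose count is not copies."""
    return {
        key: seen
        for key, seen in counts.items()
        if expected.get(key[0]) != key[1] or seen != copies
    }
```

With the original six hashes as the expected table and copies set to 15, an empty result from unexpected would correspond to the clean outcome shown above.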
Final Conclusion regarding container hash values.
(Reminder: some zipping programs DO NOT allow for resetting of the original access date of the suspect file when including it into a container,
so any subsequent container creation would obviously have alterations in its hash. This is not a situation that is to be considered here.
Again, ALL subsequent container creations were done on IDENTICALLY dated suspect files. Any container creation from files with different access
dates, or any other differing date, would not fit into the logic of these tests.)
When using a common zipping program to create a container for the evidence, each time the same evidence is included into a new container, the
hash of that new container will be different from that of the previously created container. It was not studied, but a practical explanation as to why
each subsequent container has a different hash value might be that the zipping program itself embeds something within the header of the
container which is different for each run. The most logical candidate (not tested) might be the date/time the container is created. This would
account for the fact that even an interval of a few minutes results in a different hash value of the container.
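To illustrate the general idea only, here is a Python sketch using the standard zipfile module. It does not show what WINrar, 7-Zip, PKzip or WINzip actually write into their containers. It simply shows that when the same six 50-byte payloads are zipped twice with the same stored time the container bytes match, and when one stored time field differs by three minutes the container hash changes. The stored time used here is the per-file time in the ZIP headers, set by hand, which is an assumption made for the demonstration, as is the use of MD5.

```python
import hashlib
import io
import zipfile


def build_zip(members, date_time):
    """Build a ZIP in memory, stamping every member with the same date_time tuple."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in members:
            info = zipfile.ZipInfo(name, date_time=date_time)
            archive.writestr(info, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def md5_hex(data):
    """MD5 of data as upper-case hex (MD5 is an assumption here)."""
    return hashlib.md5(data).hexdigest().upper()


# Same payloads as the test set: six files of 50 identical bytes each.
MEMBERS = [(f"HEX{value:02d}.TXT", bytes([value]) * 50) for value in range(6)]

first = build_zip(MEMBERS, (2019, 1, 1, 7, 34, 56))
repeat = build_zip(MEMBERS, (2019, 1, 1, 7, 34, 56))
later = build_zip(MEMBERS, (2019, 1, 1, 7, 37, 56))

print("same stored time, same hash:", md5_hex(first) == md5_hex(repeat))
print("stored time 3 minutes later, same hash:", md5_hex(first) == md5_hex(later))
```

Whether the real programs store a run time, an access time, or something else entirely is still the open question; the sketch only shows that one changing time field is enough.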
I would be interested in knowing if anyone reading this article does their own test and can identify what changes in the header are made from run to run. That might identify why the hashes are different.
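For anyone who wants to take up that challenge, a byte-by-byte comparison of two containers is a reasonable first step. The Python sketch below (differing_ranges is my own name for it) lists the offset ranges where two byte strings differ. It is most useful on containers of the same size, such as the 1076-byte PKzip files or the 1042-byte WINzip files above; when the sizes differ, everything after an inserted byte shifts and the ranges stop being meaningful.

```python
def differing_ranges(a: bytes, b: bytes):
    """Return (start, end) offset pairs, end exclusive, where a and b differ."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for offset in range(common):
        if a[offset] != b[offset]:
            if start is None:
                start = offset  # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, offset))
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        # Everything past the shorter input counts as one differing range.
        ranges.append((common, max(len(a), len(b))))
    return ranges
```

Reading two containers with pathlib.Path(...).read_bytes() and passing both to differing_ranges gives the offsets to go look at in a hex viewer.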
A fun fact for you real forensicators: decode the time of all the source files, 07:34:56:789w|EST|, and figure out what it is in GMT.
I would appreciate any comment or input you have regarding this article. Thank you. dan at dmares dot com