Read this article and raise your forensic intelligence level a few points. 😄
First authored May 2019.
By the time you read this article, a lot of time may have passed, and the software that was tested may have been
updated and now just might pass the tests. So you should conduct tests of your own to see if the current version
passes your tests and meets your needs.
Recently (10/2022) I offered to provide some test data to persons who wished to test their zipping programs for evidentiary
reliability and accuracy. From one of the testers I received a response which all should consider: "this has been a very eye opening exercise".
Let's just hope that your opposition didn't have the same response. Because if they did, and followed through, your next evidentiary
presentation may go out the window.
One thing that really upsets me: during my testing of the zip programs, and in discussions with other forensicators who use various zipping tools, I
found that some of them believe that when they zip up the evidence, whether it be the original evidence from the server or their work product
for the court/reviewer/whatever, it is enough to be able to explain what their "provided/final" evidence is. They don't
really care that during their process of zipping, unzipping, zipping, unzipping (you get the picture) they may have missed something, or lost something in
the translation, so to speak. They argue only that their final evidence proves XYZ, and are not concerned that they may have corrupted, lost, or just missed
evidence during the process. They feel that the evidence in their final report is what proves the case, and that anything lost or otherwise not
processed is of no consequence. Believe it or not, though not in these exact words, this is what I have heard.
Before you get into this article, here are a few associated articles you might like to read, in the order listed. But before reading them, think about this small difference:
the difference between "processing the evidence", and "conducting the forensic investigation". I think these
articles are more targeted to the processing of the evidence rather than the direction you use to conduct the
investigation. They may be very similar but no cigar.
Start here:
Inventory/Catalog files Creating an inventory of evidentiary files
Forensic file copying Article tests over 40 "forensic" file copiers
Forensic Hashing Article tests over 30 "forensic" hash programs.
The one you are on: ZIP-IT for forensic retention Article tests a few zipping programs, and
ZIP_IT_TAKE2 More tests for your zipping capabilities.
ZIP FILE/container/container Hashing your zip container reliably
MATCH FILE HASHES Demonstrates hash matches using Maresware.
A HASH software buffet How-to use Maresware hash software
Zip-it: That’s what my mother used to say to me when I was bad. But that’s not what we are talking about here. However, we will talk about “bad” zipping software. Yes, zipping software doesn’t always perform as expected. What a revelation.
This article and its companion, ZIP_IT_TAKE2, talk about the forensic and evidentiary uses of zipping software. I have set out some simple requirements on an NTFS file system which show that when you use your zipping software to zip and store evidence, you may not be retaining all the evidentiary data you think you are. And when you think about it, isn't zipping from pointA to pointB another method of forensic copying? About 75% of the zipping software I tested failed one or more of my forensic/evidentiary requirements. Is this what you want? Try explaining its shortcomings to a defense person.
Preliminary case information which determines why I chose the items to test.
First, if you have a situation where you can seize the entire computer, or make a full bit image of the drive, then some of these
test requirements will be easily met using a suite. See Suite stuff below. However, there are
situations which will be more restrictive, and which will force you (or rather your software) to be more restrictive in
what and how you process the evidence. That situation will be explained here, and again below, just so you get the idea behind
the topics I chose to perform the tests around. I think (I know, thinking is bad) that testing software under these more restrictive
scenarios will show that the software can perform not only in a restrictive environment, but also in one in which you have
complete control.
So let's begin:
The tests were performed on an NTFS file system because I believe that is the most common file system used by corporations today.
It also offers more of the items with which we will perform the tests.
So number one: the software must be able to find Unicode file names. Not necessarily display them in full Unicode
format, but merely find and process those items. And more importantly, when unzipping, restore the correct name.
Then, second, because we are on an NTFS file system, we must be able to find and process long filenames: those filename paths
greater than 255 characters. You will be surprised at how many programs can't do that. In some cases, the long filenames may be
missed altogether, or their names stored in the 8.3 form, so when unzipped you lose part of the real war and peace filename.
Third, again because of NTFS, we will assume that the owner of the computer system (usually a corporation) has last access
update turned on. The last access update may or may not be important to your investigations, but if it is turned on, your program
should be able to NOT tamper with the evidence's last access date. Will your zip/unzip maintain all three original MAC dates, and
more importantly, when unzipping, will it set the original dates? Wouldn't that process be a nice evidentiary step?
Also, when performing the zip/unzip operation from the suspect machine to a work drive for transmittal to your office, or
to/from the reviewer or prosecutor, you must consider whether the zip program retains ALL original file dates, so as not to corrupt or influence the
analysis.
Fourth and final: again because of NTFS, we should be able to find, identify, and process where necessary any alternate data
streams. Consider a porn investigation where the user downloads porn from various sites. Did you know that some browsers (I'm
not telling you which, that's for you to find out) actually store in ADS's the original URL and other information about the download?
Might be very interesting in porn or other internet investigations. Will the zipping program store and restore the ADS evidence?
Have you tested it?
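The date requirement above is easy to check for yourself. Here is a minimal sketch in Python (my choice for illustration, not a tool used in the original tests) that records the modified and accessed times of a tree, round-trips it through Python's own standard `zipfile` module, and reports which restored files no longer carry the original times. The helper names `snapshot_times`, `changed_times` and `demo` are made up for this example. As far as I know, `zipfile` records only a modified time and does not put any dates back on extraction, so it serves as an example of a zipper that fails this test.

```python
import os
import tempfile
import zipfile
from pathlib import Path


def snapshot_times(root):
    """Map each file's relative path to its (modified, accessed) times in ns."""
    root = Path(root)
    times = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            st = path.stat()  # stat() reads metadata only, not file content
            times[path.relative_to(root).as_posix()] = (st.st_mtime_ns, st.st_atime_ns)
    return times


def changed_times(before, after):
    """Return the relative paths whose recorded times differ or are missing."""
    return sorted(name for name in before if after.get(name) != before[name])


def demo():
    """Zip and unzip one back-dated file, then list files whose dates changed."""
    with tempfile.TemporaryDirectory() as work:
        src = Path(work, "source")
        src.mkdir()
        evidence = src / "evidence.txt"
        evidence.write_text("sample evidence")
        # Back-date the file (early January 2019) so a lost date is obvious.
        old = 1_546_400_000  # seconds since the epoch
        os.utime(evidence, (old, old))
        before = snapshot_times(src)

        archive = Path(work, "evidence.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.write(evidence, arcname="evidence.txt")
        restored = Path(work, "restored")
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(restored)
        return changed_times(before, snapshot_times(restored))


print(demo())
```

To test the source side of the requirement, take the snapshot of the source tree before zipping and again after, and compare the two. Creation dates are not covered by this sketch.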
If you perform a bit-image of the drive using a suite, most of the items above will easily be identified and located as
evidence. The bit-image copy is in fact a true and accurate copy. However, in our test scenario, we are sitting at a corporate
server where we can ONLY process/examine/image/copy/zip-unzip (call it what you will) the directory tree belonging to the suspect. This
restriction must be considered when testing our software. In effect, a zip/unzip process is a fancy
copy procedure. Also, the zip/unzip process is often used as a long term storage process. So wouldn't it be nice if this process
saved and restored all pertinent evidentiary information? I would like to think so.
Another thought (I know, thinking is not good), this time about long term storage. Years down the road, will you have a program which can
faithfully unzip the evidence when needed? How many times have you performed a task with a program which worked years ago, and
now, for whatever reason, you cannot get it to perform any longer? Meaning the original zip program is no longer available, and
you don't have a program which can unzip your evidence.
Before we start:
A challenge
(6/2020) for you to test your forensic hash/copy/zip software for forensic and evidentiary reliability.
As an aside, you might want to check out: Sullivan strickler and their tape archiving solutions. They have been in business quite a long time.
Some preliminary information: I want to remind you that all the testing I have done and referenced in this and any other testing-related article was done using Windows on an NTFS file system on a desktop computer. The NTFS file system was used as the test environment because I believe that a significant number of corporate and other forensic investigations take place on the NTFS file system. Also, the ability in this environment to alter a file's last access date, and to use long filenames and alternate data streams, adds to the forensic and evidentiary complexity.
    PIECE_OF_TAIL.mp4 72,788,245 01/29/2022 08:19:20:653w

Looks like an interesting piece of evidence.

    PIECE_OF_TAIL.mp4 72,788,245 01/29/2022 08:18:56:678c A.....
    PIECE_OF_TAIL.mp4:Zone.Identifier 95 01/29/2022 08:18:56:678c .adata

Let's see what's inside that alternate data stream, which was created by the Firefox browser when the movie was downloaded.

    D:\TEMP>type "PIECE_OF_TAIL.mp4[Zone.Identifier]"
    [ZoneTransfer]
    ZoneId=3
    HostUrl=https://www.dmares.com/maresware/graphics/PIECE_OF_TAIL.mp4

Don't you think knowing the information within that alternate data stream might be helpful to your investigation?
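Once you have the text of a Zone.Identifier stream, pulling the fields out is simple. The following is a minimal sketch in Python, assuming the stream text has the INI-style layout shown above; the function name `parse_zone_identifier` is made up for this example. On Windows with NTFS, I believe the stream itself can be read by opening the file name followed by `:Zone.Identifier`, but treat that as something to verify on your own system.

```python
import configparser


def parse_zone_identifier(text):
    """Parse the INI-style text of a Zone.Identifier stream into a dict."""
    # interpolation=None so a '%' inside a URL is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, such as "HostUrl"
    parser.read_string(text)
    if not parser.has_section("ZoneTransfer"):
        return {}
    return dict(parser["ZoneTransfer"])


# Sample text taken from the stream shown above.
sample = (
    "[ZoneTransfer]\n"
    "ZoneId=3\n"
    "HostUrl=https://www.dmares.com/maresware/graphics/PIECE_OF_TAIL.mp4\n"
)
info = parse_zone_identifier(sample)
print(info["ZoneId"], info["HostUrl"])
```

A ZoneId of 3 is the value Windows uses for the Internet zone, which is why this stream is a useful pointer to where a file came from.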
Even though zipping of files/data is a routine, normal occurrence, those who are conducting security or forensic exams and "save" their results or reports for delivery to management or attorneys should consider the capabilities of these zipping programs in relation to what they are zipping. Some forensic examiners zip image files, which would work fine. Others zip extracted evidence after processing with a forensic suite. And still others massage the initial extracts and get the data down to a point where they will provide it to management or legal persons for review.
Also consider that many of you will be asked to work a case where the suspect has a directory (that's a folder for you millennials) on a much larger server. When you are assigned the case, either criminal or civil, you find out that the search warrant, or the company, will not allow you to physically image the 100 Terabyte server if you are only looking at the suspect's 500G directory. So you must either find an "imaging" tool that can properly capture that 500G, find a zipping tool that can zip that 500G, or find a copy tool that can forensically copy that 500G. Whichever option you choose, don't you think it wise to make sure the tool you use will properly capture (zip) and restore (unzip) the evidence when you are ready to start your analysis? How many of you have actually tested your "imaging" or zipping software to see that it captures all the data in a tree without leaving or missing any evidentiary crumbs?
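One way to check for missed crumbs is to inventory the source tree and the restored tree and compare the two. This is a minimal sketch in Python, not the Maresware tooling referenced elsewhere in these articles; the function names `inventory` and `compare` are made up for this example. Two loud caveats: reading a file to hash it may itself update the source last access date where that feature is turned on, and this sketch does not see alternate data streams at all.

```python
import hashlib
from pathlib import Path


def inventory(root):
    """Map each file's relative path to (size, SHA-256 hex digest)."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Hash in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            name = path.relative_to(root).as_posix()
            result[name] = (path.stat().st_size, digest.hexdigest())
    return result


def compare(source, restored):
    """Return (missing, extra, altered) relative paths between two inventories."""
    missing = sorted(set(source) - set(restored))
    extra = sorted(set(restored) - set(source))
    altered = sorted(
        name for name in set(source) & set(restored) if source[name] != restored[name]
    )
    return missing, extra, altered
```

Usage would be along the lines of `compare(inventory("D:/suspect_tree"), inventory("E:/unzipped_tree"))`, where both paths are placeholders. Three empty lists mean every file came back with the same size and hash.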
Next
The problem(s) you should consider is that during your process, you may inadvertently create some unusual data files. The forensic suite may extract alternate data streams, or extract files and put them in a folder which has a long filename. I have seen forensic suites extract data to long filenames as a matter of course. Especially when you are asking it to recreate the original path/folder structure, and you yourself are outputting the data to a subdirectory which itself starts multiple levels down the tree. You then don't know how long the ultimate extracted path will be, and when you go to zip up the data for delivery, you may miss one or two important files. Or totally miss an alternate data stream which was hiding behind an important file. Last, but not least, does the zipping program maintain the source and ultimately unzipped file dates? If you have never tested your zipping program to see how/what it does in unusual situations, you probably should. Test not only your version, but the version that the opposition is using.
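Before zipping an extraction folder, it can help to know whether any long paths are in it at all. A minimal sketch in Python follows; the function name `long_paths` and the default limit of 255 are my assumptions for illustration. The length is counted on the path as walked, including the starting folder. On Windows, walking very long paths may require long path support to be enabled or the extended-length path prefix, so treat a clean result there with some suspicion.

```python
import os


def long_paths(root, limit=255):
    """Return (length, path) for every file or folder whose full path exceeds limit."""
    hits = []
    for folder, subdirs, files in os.walk(root):
        for name in subdirs + files:
            full = os.path.join(folder, name)
            if len(full) > limit:
                hits.append((len(full), full))
    # Longest paths first, since those are the most likely to be dropped.
    return sorted(hits, reverse=True)
```

Any path this reports is one to look for by name after the unzip, to confirm it was restored with its full name and not an 8.3 form.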
Recently brought to my attention was the question of how investigative agencies handle the storage and retention of their evidence, whether it be work product, or final reports and evidence that must be stored for future (maybe long time future) reference and restoring. It would be embarrassing if your organization stored evidence in zip formats (or what you thought was a forensic copy of the evidence: see COPY_THAT article), and a few years down the road, when it was time to make it available for court or other review, the unzipped data couldn't show any original meta-data, or the original zipping process had missed some data, or the unzip process failed to restore the data. Your IT staff may have a completely different goal in mind when handling and storing YOUR evidence.
First, let me state: the tests I ran are not at all scientific. I consider them practical.
Second: I am not using names here because I do not want to point fingers. I’m just pointing out what I found in my unscientific tests. If my minimal tests show a
failure, why proceed further?
Third: The test suite I used was purely arbitrary. Set up for an NTFS file system, and seeded with items
which, from other tests, I knew might cause problems with zipping software, but would not be unheard of in a
forensic or backup environment. NTFS was used because most forensic analysis and reports are created on
computers running Windows and NTFS.
Fourth: I DID NOT test any encryption capabilities of the software. I use separate PGP encryption of the stand-alone zipped file when necessary. Finally: Run some
tests yourself. Don’t take my word for it.
I began by selecting three of the most popular zipping software packages. My versions may not be the most current, because I’m CHEAP and don’t spend money needlessly. Most had both GUI and command line capability. However, for consistency, and because most people prefer the GUI interface, I only used the GUI in my tests. A very important thing to remember with the GUI versions, is that unless all the correct boxes are checked, you may not get the results you expect, and may not obtain results similar to mine. But if the program hid the option so much that I couldn't find it, I may not wish to use that program in the future.
Some required in-depth choices of operations which I would consider required, so I had to look for those options I wished to implement. Each package showed different options for the same operation, and some had no option for a needed capability. I MAY have missed an important option. If I did, it only means the option was so far buried, that a normal person might also forget to look for it, and find it. I tried as best I could to locate and check all the boxes which would cover the items I tested for (ie: Long filenames, Alternate Data Streams, Date retention).
Then I created a folder with the following parameters: First, files containing Unicode characters in their filename (ie: CYRILLIC names). Second, some files with long filenames (path/filename > 255 characters). Third, some Alternate Data Streams (ADS) inserted in a number of the files, both those with normal length names and those with long filenames.
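If you want to build a similar test folder of your own, here is a minimal sketch in Python. It is not the procedure used to build the original test data; the function names `extended` and `seed_test_tree`, the file names, and the stream name `test_stream` are all made up for this example. The alternate data stream is only written when running on Windows, and it only becomes a real ADS when the target folder is on NTFS.

```python
import os


def extended(path):
    """Return an absolute path, with the Windows extended-length prefix on Windows."""
    text = os.path.abspath(path)
    if os.name == "nt" and not text.startswith("\\\\?\\"):
        text = "\\\\?\\" + text
    return text


def seed_test_tree(root):
    """Create a Cyrillic-named file and a file whose path exceeds 255 characters."""
    root = extended(root)
    os.makedirs(root, exist_ok=True)

    # A file whose name contains Cyrillic (Unicode) characters.
    unicode_file = os.path.join(root, "доказательство.txt")

    # Five nested folders of 57 characters each push the path past 255.
    parts = ["long_folder_name_" + "x" * 40] * 5
    deep_dir = os.path.join(root, *parts)
    os.makedirs(deep_dir, exist_ok=True)
    long_file = os.path.join(deep_dir, "long_path_file.txt")

    for path in (unicode_file, long_file):
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("test data")
        if os.name == "nt":
            # On NTFS, "name:stream" addresses an alternate data stream.
            with open(path + ":test_stream", "w", encoding="utf-8") as stream:
                stream.write("hidden test data")
    return unicode_file, long_file
```

Run it against an empty work folder, zip that folder with the program under test, unzip to a second location, and then look for each of the seeded items by hand or with your own inventory tool.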
I created those three types of files because, in a backup scenario, or in an investigation whose evidentiary output will probably be sent to an opposing party (attorney), one or all of these types of data (files) might need to be produced. Forensic suites and other forensic software operations may routinely export files with any or all of these items.
Also, I have seen discussions, where persons have been asking which methods people use to store data for "posterity". That’s a long time, not your rear end.
The common methods of storing or delivering data are in a zip format. Not only for space saving, but also for inclusion into a single file for distribution to a requesting party.
So, let's test some of the zip(ping) capability of these programs.
Again, I’m not going to name names, or identify which program failed in which area so here are the general results. If you think some of the items may belong to your processes, you might test the software yourself. What a novel idea.
First: All zipped and unzipped all filenames correctly (UNICODE, etc.).
Second: Two of three processed Long Filenames correctly (see ADS below).
Third: Only one processed ADS’s in normal and LFN filenames.
Fourth: Only one of them reset the last access date of the original files after the zip process.
Fifth: Two of three properly reset the last access date of the restored file to the original access date.
Sixth: Two of three properly reset all the MAC dates of the restored file. The third only reset the original ‘M’odified date.
Here is a quick and dirty spreadsheet showing which program did what. If you want to know the real name of the program contact me at: dm at dmares.com
Program # | Unicode | LFN | ADS | Reset Src Access | Reset Dest Access | Reset MAC Dest. |
---|---|---|---|---|---|---|
File #1, even though it captured the ADS files in both the normal and Long Filenames,
the options to obtain that capture proved to be very confusing. I had to try and create
the zip file over 5 times before the ADS's were properly captured. File #2, the GUI
interface was nothing less than horrible to work with. So much so, I uninstalled it
as soon as the tests were completed.
Special note on Program #4, which is WINRAR and is used in both Linux and Windows.
It is quite inexpensive (actually I think it's shareware, but I paid for a license).
When I first tested it, the program did not have the ability to reset the source
last access date. However, with one simple request, and what I think was a reasonable
evidentiary explanation, the programmers agreed to include the reset of source last
access in the next version. Well, as of December 11, 2019, version 5.80 has all
the capabilities which I tested, and it has passed all my tests.
I personally prefer the command line, since I have more control. Just take a look at all the available commands and options with the command line version of WinRar (called rar.exe). "Technically" there is a limitation to the length of the path/filename in WinRar. But it is normally not a concern. If you have an exceedingly long path/filename (>2047 characters) I suggest you get to reading war and peace. (just a joke). The 2047 limit should be enough for most instances. In the next section I have provided a command line that seems to work very well at creating a self extracting exe which passes all my tests. You may want to check it out.

    "c:\path_to_winrar\rar" a -sfx -r -ts+ -tsp -os _DEMO.EXE -zc:\"program files"\winrar\comment.txt folders/files-to-ad -ppassword

The content of the comment.txt file, which contains routine required options, is shown below. The comment contains SFX script commands which will cause the extraction of the .exe to be silent (not ask for user input) and overwrite any existing files during the extraction. The other items, which begin with a semi-colon, are unnecessary and are included for other purposes not needed at this time.

    Silent=2
    Overwrite=1
    ;Setup=setup.exe
    ;SETUP=setup16.exe not needed
    ;Presetup=hello.exe
    ;Path=C:\temp\default_unzip_path
    ;PATH=.\.
    ;SavePath

The subsequent extraction/unzip command line (which is easily included in a batch file) is:

    C:DEMO.exe -s2 -tsp -tp+ -os -ppassword
Even though the above process seems to work and passes all my forensic, evidentiary preservation requirements, this mention is in NO WAY an endorsement of WINRAR. Don't take my word for it, and test for yourself any zipping program you use and be comfortable with its operation. I have tested and use WINRAR when preparing all my test data and it has worked admirably.
Consider any or all of the above shortcomings when you are archiving your files, or preparing them for discovery.
In short, only one of the zip programs tested in this minimal test process passed all the tests. The tests included: Unicode FileName retention, LFN, ADS, reset ALL appropriate MAC dates (of source and restored files).
AND: When you actually think about it, isn't a "zipping" process a sort of copy method for retention, discovery, safe data saving? What evidence might you be missing in the zip process? Also, is next year's version of the zipping program going to be able to unzip last year's version? Or is product 'A' capable of processing the zip file of product 'B'?
A final thought, but not included in the above list. I tested a recent "free" version of PGP (v8). It compresses, and lo and behold, also had failures. However, since I don't use PGP to compress, only encrypt, I didn't include it in the statistics.
So, which zipping program are you using to store and restore your legacy data or evidentiary file data?
Associated articles and programs of interest:
Inventory/Catalog files Creating an inventory of evidentiary files
Forensic file copying Article tests over 40 "forensic" file copiers
Forensic Hashing Article tests over 30 "forensic" hash programs.
ZIP_IT_TAKE2 More tests for your zipping capabilities.
MATCH FILE HASHES Demonstrates hash matches using Maresware.
A HASH software buffet How-to use Maresware hash software
I would appreciate any comment or input you have regarding this article. Thank you. dan at dmares dot com