What is File Compression?

Introduction

File compression is commonly used when sending a file from one computer to another over a connection that has limited bandwidth. Compression makes the file smaller, so it transfers faster. Of course, the receiving computer has to have a program that can decompress the file so it can be returned to “normal” and used.
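As a rough illustration of the compress, send, decompress cycle described above, here is a minimal Python sketch using the standard zlib module. The function names and the sample data are made up for this example, and how much any real file shrinks depends heavily on its contents.

```python
import zlib


def compress_bytes(data: bytes) -> bytes:
    """Return a compressed copy of data (what the sender would transmit)."""
    return zlib.compress(data, 9)  # 9 = best compression, slowest


def decompress_bytes(blob: bytes) -> bytes:
    """Return the original bytes (what the receiver does after download)."""
    return zlib.decompress(blob)


if __name__ == "__main__":
    # Hypothetical sample: highly repetitive text compresses very well.
    original = b"the same phrase over and over " * 200
    packed = compress_bytes(original)
    restored = decompress_bytes(packed)
    print(len(original), "bytes before,", len(packed), "bytes after")
    print("round trip intact:", restored == original)
```

The receiver only gets the original back if it has a matching decompressor, which is the point made above.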

The next step beyond compressing a single file is combining multiple files into a single compressed archive. The archive both makes transmission faster for all the files and keeps them together for convenience.

Finally, the next step after combining multiple files into a single archive is to maintain the organization of those files inside the archive. If, for example, the files need to be in particular directories (folders) in order to work correctly on the receiving computer, then the archive should keep that directory structure intact and the decompression program should recreate it when decompressing the archive.
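The idea of an archive that keeps its directory structure can be sketched with Python's standard zipfile module. This is only one way to do it, using the .zip format as an example; the function names are invented for illustration.

```python
import zipfile
from pathlib import Path


def make_archive(source_dir: Path, archive_path: Path) -> None:
    """Zip every file under source_dir, storing paths relative to it.

    archive_path should live outside source_dir so the archive does not
    try to include itself.
    """
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # The relative path (e.g. "docs/img/a.txt") is what
                # preserves the folder layout inside the archive.
                zf.write(path, arcname=path.relative_to(source_dir).as_posix())


def extract_archive(archive_path: Path, target_dir: Path) -> None:
    """Unpack the archive, recreating its folders under target_dir."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target_dir)
```

Calling make_archive on one machine and extract_archive on another should leave the files in the same relative folders on both.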

Why So Many Formats?

You may have noticed, in files you’ve encountered, been sent, or just seen written about, that there are many different compression formats, each with its own file extension and compression algorithm. Why so many? There are a variety of reasons. One is simply that different formats were developed for different operating systems over time, and these legacy formats continue on today. Another is competition: companies try to develop new and “improved” formats that squeeze the last little bit of fat out of a file in an attempt to become the “next standard.” The type of file being compressed also has some bearing on the best format; pictures and text often get better compression ratios from different compression techniques. And some people just like to develop something “new” in order to leave their stamp on the industry. There are probably as many other reasons as there are formats :-).

How to Download and Decompress Files

  • Download a file simply by clicking on its filename (or on whatever link the author has provided).
  • In Internet Explorer, choose the save to disk option from the download dialog. In Firefox you will also be given an option to save to disk. If you are using Netscape, it may say “No Viewer Configured for File Type…”; just choose the save or save to disk option.
  • It is best to save the file to a temporary directory that you’ve created just for this purpose. For example, I have a subdirectory on my hard drive named \TEMPDOWN which I use strictly for downloading and decompressing files. That way you always know where the file is and you can always move it from there to a more permanent location when you are finished with it.
  • Use a decompressing utility on the downloaded file. Here you have to use some thought. If the compressed file has a folder structure inside it, you may have to let the utility create this folder structure on your hard drive. Most archives with multiple files in them include a text file with directions, usually called README.TXT, INSTALL.TXT, or something similar. Most decompression programs can display the contents of the archive and extract or view single files in it. If you see an instruction file, we recommend you read it first to determine what steps are required to actually dearchive, install, and/or launch the program.
  • Some programs do not require any additional installation process. In these cases, you can simply create a permanent directory for the program, and copy the files from your temporary directory to the permanent program directory.
  • Once you’ve completely installed the program, and you are sure that it works properly, you can go back and delete the various installation files from your temporary subdirectory. However, you should keep a copy of the original archive file just in case you need to install it again in the future.
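The “look inside before you extract” step can also be done from a script. The sketch below assumes the download is a .zip file and uses Python's standard zipfile module; the helper names and the set of instruction file names are illustrative choices, not a fixed convention.

```python
import zipfile

# File names treated as instructions; this set is an assumption.
INSTRUCTION_NAMES = {"readme.txt", "install.txt"}


def find_instruction_files(archive_path):
    """Return archive members whose file name looks like an instruction file."""
    with zipfile.ZipFile(archive_path) as zf:
        return [
            name
            for name in zf.namelist()
            # Compare only the last path component, ignoring case.
            if name.rsplit("/", 1)[-1].lower() in INSTRUCTION_NAMES
        ]


def read_member_text(archive_path, member):
    """Read one file out of the archive as text without extracting the rest."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.read(member).decode("utf-8", errors="replace")
```

Reading a single member this way leaves the rest of the archive untouched, so nothing lands on the hard drive until you decide how to extract it.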

Compression/Decompression Software

There are a variety of programs that compress and decompress files. Some are operating system dependent and others have versions for multiple operating systems. The major programs/formats include:

  • 7-Zip (.7z file) – A popular archive format. The free 7-Zip program can also handle many other formats.
  • GNU Zip (.gz file) – Used on many *NIX operating systems. Many programs support this archive type.
  • LHA (.lha or .lzh file) – This is now used on multiple operating systems and is a standard on Amiga systems. Free unarchivers exist.
  • RAR (.rar file) – A proprietary format second only to .zip on Windows systems. WinRAR is a popular program to use although free unarchivers exist.
  • StuffIt (.sit file) – A popular archive format for the Macintosh although it can be found on other operating systems.
  • Tape archive (.tar file) – Used on many *NIX operating systems. Many programs support this archive type.
  • WinAce (.ace file) – A format often used for CD/DVD images. The WinACE program is not free but free dearchivers exist for older versions of the format and the commercial version has a free trial period.
  • Zip (.zip file) – Probably the single most popular archive format out there. Many programs support this archive type (both free and commercial). Even Windows itself can create and dearchive .zip files.

What is goog-malware-shavar?

If you happen to look in firewall logs or perhaps browse with Fiddler running [Fiddler is a proxy that automatically adds itself to the WinINET chain and logs requests and responses so you can see what is and isn’t working] or some other program that logs HTTP information, then you may very well see some things that sound nasty. One that seems to appear often on many systems is “goog-malware-shavar.” In particular, the “malware” part of the entry may give one pause. But this is one case where bad-sounding is not the same as bad.

goog-malware-shavar is the name of a list used by Google’s anti-phishing (Safe Browsing) API.

Google uses it to identify malicious sites, including malware and phishing sites. Google provides the data for the anti-phishing feature implemented in Firefox and Google Desktop. These clients get their blacklist and whitelist data using an “update protocol.”

The protocol supports many different blacklists or whitelists. List names are in the form “provider-type-format”, e.g. “goog-phish-shavar”. Each item in a list represents an expression that matches a malicious URL; the exact format depends on the list type, and how the content is used is application-specific.
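To make the “provider-type-format” naming concrete, here is a small Python sketch that splits a list name into its three parts. The class and function names are invented for this example and are not part of any Google library.

```python
from typing import NamedTuple


class ListName(NamedTuple):
    provider: str     # e.g. "goog"
    list_type: str    # e.g. "malware" or "phish"
    list_format: str  # e.g. "shavar"


def parse_list_name(name: str) -> ListName:
    """Split a "provider-type-format" list name into its three fields."""
    parts = name.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not in provider-type-format form: {name!r}")
    return ListName(*parts)


if __name__ == "__main__":
    print(parse_list_name("goog-malware-shavar"))
```

Read this way, “goog-malware-shavar” is simply Google’s malware list in the shavar format.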

For the “shavar” list format, hash prefixes are used to reduce bandwidth. A hash prefix is some number of the most significant bytes of a full-length, 256-bit hash.
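A hash prefix can be sketched in a few lines of Python. The text above only says the full hash is 256 bits long; using SHA-256 and a 4-byte prefix here are assumptions made for illustration, and the function names are hypothetical.

```python
import hashlib


def hash_prefix(expression: str, prefix_len: int = 4) -> bytes:
    """Return the first prefix_len bytes of the expression's 256-bit hash.

    SHA-256 and the default length of 4 bytes are assumptions.
    """
    digest = hashlib.sha256(expression.encode("utf-8")).digest()  # 32 bytes
    return digest[:prefix_len]


def might_be_listed(expression: str, known_prefixes: set) -> bool:
    """True if the prefix is in the local list.

    A hit is only a hint: different expressions can share a prefix, so a
    client would still need to confirm against the full-length hash.
    """
    return hash_prefix(expression) in known_prefixes
```

Storing 4 bytes per entry instead of 32 is where the bandwidth saving comes from, at the cost of occasional false matches on the prefix alone.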

So, when you see the goog-malware-shavar entry what follows it is information relating to the anti-phishing built into the Firefox and Chrome browsers and/or the Google Toolbar.

What is a Folder With File Extension {ED7BA470-8E54-465E-825C-99712043E01C}?

What is a folder with the file extension {ED7BA470-8E54-465E-825C-99712043E01C} and what can you do with it? In 32-bit Vista and Windows 7 (32- or 64-bit) you can create an empty folder with that name and suddenly have instant access to many configuration options in one place.

Sometimes finding the exact link to some configuration change you want to make in Windows can be a frustrating experience. There sometimes seems to be no rhyme or reason to the organization and nothing that tells you where to look. Use this little trick and you can have many of those configuration links in a single place so all you have to do is scroll down to find the one you want.

This option in Windows is commonly called “God mode” because it puts you in complete control in one place. I can go with that so here’s how to create the GodMode folder.

First, highlight the following text and then hit Control-C to put it onto your Clipboard…

GodMode.{ED7BA470-8E54-465E-825C-99712043E01C}

OK, now that you’ve done that, right-click on an empty space on your Desktop and select New and then Folder from the context menu. When the new folder appears with its name highlighted, just press Control-V to paste the above name in place of the default folder name. Press Enter and you’re done. The result should look like this…

[Image: God Mode Icon]

Note that the icon picture has changed from the generic folder to the above stylized computer display. If that happens you’ve done things right. [Note: It’s the extension that matters. If you don’t like “GodMode” then feel free to change that part to whatever you want.]
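The same folder can be created from a script instead of by hand. The Python sketch below only builds the name and makes the directory; the function names are made up for this example, and the special icon and behavior appear only on the Windows versions mentioned above.

```python
from pathlib import Path

# The extension is what matters; the label in front of it is free-form.
GOD_MODE_CLSID = "{ED7BA470-8E54-465E-825C-99712043E01C}"


def god_mode_folder_name(label: str = "GodMode") -> str:
    """Build the folder name: any label, a dot, then the CLSID."""
    return f"{label}.{GOD_MODE_CLSID}"


def create_god_mode_folder(parent: Path, label: str = "GodMode") -> Path:
    """Create the specially named folder inside parent and return its path."""
    folder = parent / god_mode_folder_name(label)
    folder.mkdir(exist_ok=True)
    return folder
```

For example, create_god_mode_folder(Path.home() / "Desktop") would put it on the Desktop, assuming the Desktop lives at that location for your account.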

Now, just open the folder and look at the configuration options available to you (243 on my computer in 2010)…

[Image: Full Control Panel Display]

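The magic is in the specific file extension given to the folder. That extension is a CLSID, an identifier that tells Windows to treat the folder as a view of the Control Panel’s configuration tasks rather than as an ordinary folder.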

Enjoy.

Under The Hood

OK, but what’s really going on here? To find out we need to look in the registry. Opening the program REGEDIT and then searching for ED7BA470-8E54-465E-825C-99712043E01C brings us here…

[Image: God Mode Registry Entry]

The registry key found is the one shown in the graphic. As you can see from the yellow highlighted line the action associated with this registry entry is to open all tasks in the Control Panel for the operating system. So, the folder is simply a pointer into the Control Panel but, unlike the normal Control Panel display, this display shows all the tasks one can perform via the Control Panel in one place.

So, there is no magic involved; just a pointer to a process that already is built into the operating system. Hope that doesn’t take the fun out of it though. 🙂

How to Use WinDirStat to Analyze Your Hard Disk Contents

One of the things you should do to figure out how your hard disk or network drive is organized is to make a list of all of the various file types on the disk and organize that list by file size. This gives you an instant view of what’s taking up the most space on your disk.

While there are many tools available that can do this, CKnow still likes to use the free program WinDirStat because of its speed and excellent graphical representation of the various file types on a hard disk.

It’s an older program that has not gained significant new features, but it has been updated to keep up with the various versions of Windows. CKnow has not tested it on Windows 7 and the WinDirStat Web site does not mention Windows 7, but it should work fine and, if not, could be run in compatibility mode for Windows Vista or even XP if needed. [Added: Tested on Windows 7 in February 2013 and it ran just fine on a 3 TB drive.]

You get the program from the WinDirStat website, which is hosted at SourceForge.

Run the installer and then start the program. When you do, a selection box will pop up and you pick the drive(s) you wish to analyze. Having done that, click on the OK button (at the top of the box, not the bottom where you’d usually find one) and the program runs through the various folders on the disk(s) you selected. While doing so, a clever little animation shows you the progress (you can turn this off in the options if you don’t like clever little animations)…

[Image: WinDirStat Working]

The magic, however, is in the display after the analysis is done…

[Image: WinDirStat Output]

In the top left you’ll see a directory tree very much like any other directory tree. You can navigate through the various folders using that if you wish. More importantly however, note that this directory tree shows you exactly what percentage of the total used space on the disk is used by that folder and any folders under it. This gives you a much more useful look at the directory tree than Windows itself gives you.
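The per-folder percentages in that tree can be approximated with a short Python sketch. This is not how WinDirStat itself is implemented; it simply walks a directory with the standard os module, and the function names are invented for illustration.

```python
import os
from pathlib import Path


def folder_sizes(root):
    """Map each directory under root to the bytes used by it and its subfolders."""
    sizes = {}
    # topdown=False visits subfolders before their parents, so every
    # child's total is already known when the parent is processed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        here = Path(dirpath)
        total = 0
        for name in filenames:
            try:
                total += os.lstat(here / name).st_size
            except OSError:
                continue  # unreadable file: skip it
        for sub in dirnames:
            total += sizes.get(here / sub, 0)
        sizes[here] = total
    return sizes


def folder_percentages(root):
    """Express each folder's size as a percentage of the root's total."""
    sizes = folder_sizes(root)
    grand_total = sizes.get(Path(root), 0)
    return {
        folder: (100.0 * size / grand_total if grand_total else 0.0)
        for folder, size in sizes.items()
    }
```

Sorting the result by percentage gives a quick text-only answer to which folders are taking up the most space.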

Now, look to the right. This is a particularly helpful display. It shows the various file types on your system and, while it’s not shown above, the percentage of the used disk space that each file type takes up. It also tells you the total space on the disk used by each file type and how many files of that type there are.
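A similar per-type summary can be produced with a few lines of Python. This sketch treats the file extension as the “file type,” which is an assumption made for simplicity; the function name is hypothetical and the code is not taken from WinDirStat.

```python
import os
from collections import defaultdict


def extension_stats(root):
    """Return rows of (extension, total_bytes, file_count, percent), largest first."""
    totals = defaultdict(lambda: [0, 0])  # extension -> [bytes, file count]
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # unreadable file: skip it
            # Lower-case so ".TXT" and ".txt" count as the same type.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            totals[ext][0] += size
            totals[ext][1] += 1
    grand_total = sum(entry[0] for entry in totals.values())
    rows = [
        (ext, size, count, 100.0 * size / grand_total if grand_total else 0.0)
        for ext, (size, count) in totals.items()
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

Each row carries the same three figures the display shows: percentage of used space, total bytes, and number of files.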

The fun part of the program is the graphic display of this file type information. The graphic represents the entire hard disk(s) analyzed. Each file type is represented in this diagram by a color, so as you look at the various colors you can see where files of each type sit on the disk in relation to all of the other files. Moreover, each very small square in the diagram represents a single file, and these squares are grouped into larger squares, those again into ever larger squares, and so on. Each of these larger squares is a subdirectory. So, the entire diagram is the disk, each very large square is a folder off the root of the drive, and each smaller square inside it is another subdirectory under the folder above. This continues down to the smallest square, which represents a file.
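The nesting of squares inside squares can be illustrated with a tiny layout routine. The Python sketch below uses a simple “slice-and-dice” scheme, splitting each rectangle among its children in proportion to their sizes and alternating direction at each level. This is a stand-in chosen for brevity, not the layout WinDirStat actually uses, and the nested-dictionary input format is invented for the example.

```python
def node_size(node):
    """Size of a file (a number) or of a folder (a dict of children)."""
    if isinstance(node, dict):
        return sum(node_size(child) for child in node.values())
    return node


def layout(node, x, y, w, h, path="", horizontal=True, out=None):
    """Return a list of (path, x, y, width, height) rectangles, one per file."""
    if out is None:
        out = []
    if not isinstance(node, dict):
        out.append((path, x, y, w, h))  # a file: the smallest rectangle
        return out
    total = node_size(node)
    offset = 0.0  # fraction of this rectangle already handed out
    for name, child in node.items():
        share = node_size(child) / total if total else 0.0
        child_path = f"{path}/{name}" if path else name
        if horizontal:
            # Split left to right; the next level down splits top to bottom.
            layout(child, x + offset * w, y, share * w, h, child_path, False, out)
        else:
            layout(child, x, y + offset * h, w, share * h, child_path, True, out)
        offset += share
    return out


if __name__ == "__main__":
    # Hypothetical disk: one folder with two files, plus one file at the root.
    disk = {"a": {"x.txt": 30, "y.txt": 10}, "b.bin": 60}
    for rect in layout(disk, 0, 0, 100, 100):
        print(rect)
```

Each file’s rectangle ends up with an area proportional to its size, and the rectangles of files in the same folder always sit together inside that folder’s rectangle.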

Click on a square in the colored area and the explorer display in the upper left will indicate the file that the square represents and the file type display will highlight the particular file type of that file.

For those who think this display looks familiar, it is based on the KDE program KDirStat.

The program has a number of options to change the display but the default is quite enough for most users.