[xplorer²] - The Registry part II: File classes
25 May 2008

In a previous article we saw how a whole section of the registry called HKEY_CLASSES_ROOT (HKCR for short) is dedicated to shell file types, used by explorer-type programs to open, print and perform all sorts of interesting feats with your documents. Today we'll have a closer look at this important registry hive ("folder").

In Windows, document types are determined by their filename extension. A plain text file called PLAIN.TXT is recognized as such by its 3-letter extension TXT. (If you can't see extensions as part of the filename in xplorer² you can easily turn them on.) Image files have extensions like JPG, GIF and PNG. When a folder explorer presents a file to you, it checks the filename extension to determine e.g. what sort of icon to use in the file listing, or what program to start when you double-click on the icon (notepad, irfanview etc).

All this document-specific information is under the HKCR registry key. If you open this key with regedit (the Windows registry editor), you'll see a massive list of all known filename extensions. Feel free to browse and expand various familiar keys like HKCR\.txt to see what's underneath, taking care not to change anything! Each extension corresponds to a file type, e.g. TXT falls under the more descriptive type name txtfile, as seen in the default value of HKCR\.txt.
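The extension-to-type lookup described above can be sketched in a few lines of Python. This is only an illustration: the dictionary is a made-up stand-in for HKCR so the code runs anywhere, and the function names (file_class_for, read_default_sample, read_default_windows) are my own. On a real Windows machine the standard winreg module can read the same default values, strictly read-only.

```python
import os

# Tiny in-memory stand-in for HKCR (assumption, for illustration only).
# Keys are subkey paths; values are the keys' default values.
SAMPLE_HKCR = {
    ".txt": "txtfile",
    "txtfile": "Text Document",
}

def read_default_sample(subkey):
    """Default value of a key in the sample data, or None if absent."""
    # Registry key names are case-insensitive, so compare in lowercase.
    return SAMPLE_HKCR.get(subkey.lower())

def read_default_windows(subkey):
    """Default value of HKCR\\<subkey> on a real Windows machine."""
    import winreg  # standard library, available on Windows only
    try:
        return winreg.QueryValue(winreg.HKEY_CLASSES_ROOT, subkey)
    except OSError:
        return None  # key does not exist

def file_class_for(filename, read_default):
    """Map a filename to its file class name via its extension key."""
    ext = os.path.splitext(filename)[1]
    if not ext:
        return None  # no extension, nothing to look up
    return read_default(ext.lower())

print(file_class_for("PLAIN.TXT", read_default_sample))
```

Passing read_default_windows instead of read_default_sample performs the same lookup against the live registry without writing anything to it.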

Further down in HKCR you will find the definition of the txtfile class itself, with even more shell-related information. If you examine HKCR\txtfile you will see subkeys with self-descriptive functions, e.g. DefaultIcon tells explorer what icon to use when listing text files. More self-evident details are under HKCR\txtfile\shell, e.g. what program is associated with opening and printing text files (notepad in my case). This is the information you can change using Control Panel's Folder Options applet (File types tab) for various canonical verbs like "open" and "print".
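To make the verb layout concrete, here is a rough sketch of how a file manager might turn a verb into a command line. The sample values are illustrative guesses rather than a dump of any real registry, and command_for and build_command are hypothetical helper names. Each verb is assumed to keep its command line in the default value of a "command" subkey, with %1 marking where the document path goes.

```python
# Illustrative stand-in for a few HKCR\txtfile entries (assumed values).
SAMPLE_HKCR = {
    r"txtfile\DefaultIcon": r"%SystemRoot%\system32\shell32.dll,-152",
    r"txtfile\shell\open\command": r"%SystemRoot%\system32\NOTEPAD.EXE %1",
    r"txtfile\shell\print\command": r"%SystemRoot%\system32\NOTEPAD.EXE /p %1",
}

def read_default(subkey):
    """Default value of a key in the sample data, or None if absent."""
    return SAMPLE_HKCR.get(subkey)

def command_for(file_class, verb, read_default):
    """Command template registered for a verb such as 'open' or 'print'."""
    return read_default("\\".join((file_class, "shell", verb, "command")))

def build_command(template, document):
    """Substitute the document path for the %1 placeholder.

    The sample templates use a bare %1, so the path is quoted here to
    survive spaces in folder names.
    """
    return template.replace("%1", f'"{document}"')

template = command_for("txtfile", "print", read_default)
print(build_command(template, r"C:\docs\PLAIN.TXT"))
```

A verb that is not registered (say "edit" in this sample) simply yields None, which is how a file manager would know not to offer that action.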

More advanced functionality like thumbnail and infotip handlers for each document type is registered under its shellex key. The key names here get a bit muddy; the one for thumbnail extractors is called {BB2E617C-0920-11d1-9A0B-00C04FC2D6C1}. Say again? If you browse the registry you'll see a lot of these funny names, but they are just identifiers, like names you cannot pronounce <g>, built to guarantee that they are unique. The generic term is GUID (globally unique identifier); the ones that name shell objects are called class identifiers (CLSIDs).

Another strangeness with shellex handlers is that instead of a program name you get another class id, representing the object that is responsible for the work. For example the thumbnail extractor for AVI files is a cryptic object called {c5a40261-cd64-4ccf-84cb-c394da41d590}. What on earth is that? The mystery is solved under another very important key called HKCR\CLSID, where all the shell objects are declared. Locating our AVI thumbnail extraction object under HKCR\CLSID we find the all-important InProcServer32 key, whose default value tells us which DLL file is responsible for the shell feature.
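The two-step hop from a shellex entry to a DLL can be sketched as follows. The two identifiers are the ones quoted above, but everything else is assumed for illustration: the "avifile" class name, the DLL path, and the helper names make_reader and handler_dll are placeholders, not values read from a real machine.

```python
# Name of the shellex subkey for thumbnail extractors, as quoted above.
THUMBNAIL_KEY = "{BB2E617C-0920-11d1-9A0B-00C04FC2D6C1}"

# Stand-in for the relevant corner of HKCR (assumed layout and values).
SAMPLE_HKCR = {
    # Hypothetical class name "avifile" for AVI documents.
    r"avifile\shellex\{BB2E617C-0920-11d1-9A0B-00C04FC2D6C1}":
        "{c5a40261-cd64-4ccf-84cb-c394da41d590}",
    # Hypothetical DLL path, for illustration only.
    r"CLSID\{c5a40261-cd64-4ccf-84cb-c394da41d590}\InProcServer32":
        r"C:\Windows\System32\example_thumbs.dll",
}

def make_reader(data):
    """Build a case-insensitive default-value reader over sample data."""
    lowered = {key.lower(): value for key, value in data.items()}
    return lambda subkey: lowered.get(subkey.lower())

def handler_dll(file_class, handler_key, read_default):
    """Follow shellex -> CLSID -> InProcServer32 and return the DLL path."""
    clsid = read_default("\\".join((file_class, "shellex", handler_key)))
    if not clsid:
        return None  # no handler of this kind registered for the class
    return read_default("\\".join(("CLSID", clsid, "InProcServer32")))

reader = make_reader(SAMPLE_HKCR)
print(handler_dll("avifile", THUMBNAIL_KEY, reader))
```

The reader lowercases key names because the registry does not distinguish case, so {c5a4...} and {C5A4...} refer to the same key.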

All this must be inducing a headache, and it is only the beginning (try to get your head around the way MS Office document types are registered if you dare!). If you take something away from this introduction, let it be that document type behavior is spelled out in the registry. You get a lot of standard functionality under the shell subkey, furnished by normal programs (executables). More exotic functionality is serviced by objects declared under the shellex subkey, which correspond to DLLs. All the executables, DLLs and registry entries are set up when you first install a program on your computer, e.g. acrobat reader for PDF documents.

I conclude this topic with a live demonstration of registry browsing in action. xplorer² can search for text in complex document types like PDF and Office files. This plain text extraction functionality is all in the registry, and xplorer² can find it from the persistent handlers declared for PDF and similar documents. Sometimes when you install and uninstall a lot of programs, the registry associations can become corrupted and text extraction breaks. Let's walk the registry to find the DLL responsible for PDF text extraction.
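For readers who prefer code to clicking through regedit, the walk can be approximated like this. It is a sketch under assumptions: the chain follows my understanding of how text filters (IFilter objects) are registered, with the IFilter interface identifier naming the hop between persistent handler and filter object. The two placeholder GUIDs, the DLL path, and the helper names are invented for illustration and will not match a real installation.

```python
# Interface identifier of IFilter, used as a subkey name in the chain.
IFILTER_IID = "{89BCB740-6119-101A-BCB7-00DD010655AF}"

# Placeholder GUIDs and DLL path (hypothetical, for illustration only).
HANDLER = "{AAAAAAAA-0000-0000-0000-000000000001}"
FILTER = "{BBBBBBBB-0000-0000-0000-000000000002}"

SAMPLE_HKCR = {
    r".pdf\PersistentHandler": HANDLER,
    "\\".join(("CLSID", HANDLER, "PersistentAddinsRegistered", IFILTER_IID)):
        FILTER,
    "\\".join(("CLSID", FILTER, "InProcServer32")):
        r"C:\Program Files\Example\pdffilter.dll",
}

def make_reader(data):
    """Build a case-insensitive default-value reader over sample data."""
    lowered = {key.lower(): value for key, value in data.items()}
    return lambda subkey: lowered.get(subkey.lower())

def text_filter_dll(extension, read_default):
    """Walk extension -> persistent handler -> filter object -> DLL.

    Returns None as soon as a link is missing, which is what a
    corrupted association looks like from the outside.
    """
    handler = read_default("\\".join((extension, "PersistentHandler")))
    if not handler:
        return None
    filter_clsid = read_default("\\".join(
        ("CLSID", handler, "PersistentAddinsRegistered", IFILTER_IID)))
    if not filter_clsid:
        return None
    return read_default("\\".join(("CLSID", filter_clsid, "InProcServer32")))

print(text_filter_dll(".pdf", make_reader(SAMPLE_HKCR)))
```

Deleting any one of the three sample entries makes the function return None, mirroring the broken text search you get after an untidy uninstall.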

© 2002—2008 Nikos Bozinis, all rights reserved