■ To be or not to be, that is the file search question
There was this wise old cowboy that everybody looked up to. At a mere glance he could give an accurate count of the cattle in a herd. Amazed bystanders would ask, "How can you do that?" and the answer was: "Easy: I count the legs, then divide by four" (!). This cowboy went to the wrong school, but the rules of logic are there to make our lives easier; sometimes a yes on the negative is easier than a no.
What I'm driving at with these "state the obvious" αμπελοφιλοσοφίες (idle philosophizing) comes from a user support question. This guy wanted to find out if xplorer² could help him find all those PDF documents that didn't have any text in them, that is, they were just pictures of text that required OCR to get to the actual search-friendly content. It turns out that the mighty xplorer² search engine cannot directly answer this question. You can search for files that have some text in them, but there's no way to express the negative query. Still, you can identify the text-less PDFs with a 3-step approach:
- First create a list of all PDFs, proper text and image alike, e.g. searching just for *.pdf
- Then do a local search within the previous results (e.g. press <Ctrl+G>), searching for the regular expression [a-z] (matches any single letter, indicative of text content)
- Now the logical coup de grâce: use the Mark > Invert selection menu command and you will have a selection of PDFs that don't contain any text.
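For the programmers in the audience, the three steps above boil down to a set complement. Here is a minimal Python sketch of the idea, not of how xplorer² works internally. The function name `find_textless` and the `extract_text` callback are hypothetical (getting text out of a real PDF needs a PDF library, which is left out here), and treating [a-z] as case-insensitive is an assumption.

```python
import re
from typing import Callable, Iterable, List

# Same pattern as in the local search step; IGNORECASE is an assumption.
HAS_LETTER = re.compile(r"[a-z]", re.IGNORECASE)

def find_textless(paths: Iterable[str],
                  extract_text: Callable[[str], str]) -> List[str]:
    """Return the files whose extracted text contains no letters."""
    all_pdfs = list(paths)                      # step 1: list everything
    with_text = {p for p in all_pdfs            # step 2: the easy "yes" query
                 if HAS_LETTER.search(extract_text(p))}
    return [p for p in all_pdfs                 # step 3: invert the selection
            if p not in with_text]

# Toy data standing in for real PDF contents (hypothetical file names).
fake_contents = {
    "report.pdf": "Quarterly results",
    "scan.pdf": "",
    "fax.pdf": "  \n",
}
print(find_textless(fake_contents, fake_contents.get))
# prints ['scan.pdf', 'fax.pdf']
```

The point is that the code never asks "does this file have no text?" directly; it asks the easy positive question and takes everything else.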
Searching for text in files is an exceptional property within the xplorer² search interface, and that's why it is "problematic": it cannot be negated directly. If you wanted to check something else, e.g. which PDF files didn't have Author information, it would be simpler:
- Insert an additional rule on Authors column
- Search for ? (question mark) which is a wildcard meaning "any single character"
- Specify NOT (negative) context, meaning match anything without author information
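The NOT rule above is just a negated predicate. Here is a rough Python sketch, assuming each file comes with a dictionary of properties; the `Authors` key, the helper names and the `*?*` pattern (the wildcard wrapped so that it matches anywhere in the value) are illustrative assumptions, not the actual xplorer² matching rules.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

def has_author(props: Dict[str, str]) -> bool:
    # "?" needs at least one character, so an empty or missing value fails.
    return fnmatchcase(props.get("Authors", ""), "*?*")

def without_author(files: Dict[str, Dict[str, str]]) -> List[str]:
    # NOT context: keep the files for which the positive rule does not match.
    return [name for name, props in files.items() if not has_author(props)]

# Hypothetical property data for three files.
files = {
    "a.pdf": {"Authors": "Jane Doe"},
    "b.pdf": {"Authors": ""},
    "c.pdf": {},
}
print(without_author(files))
# prints ['b.pdf', 'c.pdf']
```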
Things are much easier with DeskRule, a tool written specifically for file search, where boolean algebra applies to all search rules, be it filename, property or file content, or any combination thereof. But I digress. It's like having both yes and no. Not-not-yes? Who shaves the barber? And so on :)