One of the things I enjoy about working at Jobster is that there are so many interesting problems on our roadmap, far too many in fact for us to accomplish given the current size of our development team. If you’re someone who is passionate about working on interesting problems in areas like search, we’d like to talk.
Two of these interesting problems are job and people search. I’ve written on job search before, so my focus here is people search.
The internet is rapidly becoming a place that holds a vast amount of meaningful information about people (mostly early adopters these days).
But when it comes to being able to search that information across the entire internet (vs. a walled garden) and return people that are relevant to a particular intention, for example hiring, internet search services have only begun to scratch the surface of what could be accomplished.
Resume search is an interesting and easier subset of the full people search problem. For the relatively small number of people who post resumes online, a resume is a useful but far from exhaustive source of information about that person.
The first and easiest step is finding pages that are likely to contain resumes. Rather than simply crawling sites that are known to contain resumes (like thejobspider.com), a fairly simple full-text query suffices to return a set of pages that are quite likely to contain resumes.
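As a rough illustration of what such a full-text query might look like, here is a minimal Python sketch. The function name, the particular phrases, and the query operators (quoted phrases, OR, and a leading minus for exclusion) are all assumptions chosen for illustration; they are not the query used in the demo, and real search engines differ in the syntax they accept.

```python
def build_resume_query(keywords, location=None):
    """Compose a full-text query string that favors resume-like pages.

    The operator syntax here is generic and hypothetical; adapt it to
    whatever the target search engine actually supports.
    """
    # At least one of these words should appear somewhere on a resume page.
    doc_terms = '("resume" OR "curriculum vitae" OR "cv")'
    # Section headings that most resumes contain.
    section_terms = '"education" "experience"'
    # Phrases typical of job postings and resume-advice pages, to exclude.
    noise = ("apply now", "job description", "resume writing")
    exclusions = " ".join(f'-"{phrase}"' for phrase in noise)

    parts = [doc_terms, section_terms]
    parts.extend(f'"{k}"' for k in keywords)
    if location:
        parts.append(f'"{location}"')
    parts.append(exclusions)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_resume_query(["nursing"], location="Seattle"))
```

The idea is simply to require resume vocabulary, require the searcher's own keywords, and filter out the pages that talk about resumes without being one.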
I’ve put together an extremely simple resume search demo that shows this approach.
You’ll notice that I used Windows Live Search rather than Google for this simple demo.
[It’s interesting to note that the sophistication of Google’s query language is lagging behind some of the competition; consider for example the prefer: qualifier and MSN Live Search Macros. This is probably because end users, by and large, do not use sophisticated queries and because Google has shown mixed interest in being a search platform that others can build value on top of.
I think the experiments Windows Live is doing to replace paging with an Ajax scrollbar are pretty interesting; the place I would really love to see this functionality is in a web email inbox.]
Thanks to services like the Alexa Web Search Platform plus their search API, this first stage can leverage comprehensive full text indexes of the web that have already been constructed.
The next, more interesting step is crawling the hits, extracting and indexing structured or semi-structured data from those pages, and using that data to improve the relevance, presentation, and searchability of the data.
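To make the extraction step concrete, here is a minimal sketch that splits the plain text of a resume into sections by looking for common headings on their own line. The heading list and the function name are assumptions for illustration; none of the services discussed here publish how they do this, and real resumes are far messier than this handles.

```python
import re

# Hypothetical list of headings; real resumes use many more variants.
SECTION_HEADINGS = ("objective", "education", "skills", "experience")

# A heading is one of the words above alone on a line, optionally
# followed by a colon.
HEADING_RE = re.compile(
    r"^\s*(" + "|".join(SECTION_HEADINGS) + r")\s*:?\s*$",
    re.IGNORECASE,
)


def extract_sections(text):
    """Return a dict mapping each section name to the text found under it."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            # Start collecting lines under a new section.
            current = match.group(1).lower()
            sections[current] = []
        elif current and line.strip():
            sections[current].append(line.strip())
        # Lines before the first heading (name, contact details) are skipped.
    return {name: " ".join(lines) for name, lines in sections.items()}


if __name__ == "__main__":
    sample = "Jane Doe\nObjective\nStaff nurse position\nEducation:\nBSN, 2001\n"
    print(extract_sections(sample))
```

Once fields like these exist, they can be indexed separately and shown directly in the search results instead of a generic text snippet.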
Ziggs extracts names, locations, and companies, but this information alone is not enough to determine whether a resume is relevant, and a lot of pogo-sticking between the results page and the individual resumes is required to interpret the results.
Pagebites improves on Ziggs by also extracting education, objective, and skill fields and promoting those in the search results.
In both cases relevance remains a challenge. Neither understands that nursing is a profession and that the search results should favor actual nurses over congressmen who happened to include nursing as a keyword.
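One simple way to move in this direction is field-weighted scoring: a match in the objective or experience section counts for more than a match in unstructured text. The sketch below is only that, a sketch. The weights, field names, and sample documents are made up for illustration, and it does not actually understand professions; it just rewards keywords that appear where a practitioner would put them.

```python
import re

# Hypothetical weights: matches in career-defining fields count the most.
FIELD_WEIGHTS = {
    "objective": 3.0,
    "experience": 2.0,
    "education": 2.0,
    "skills": 1.0,
}
DEFAULT_WEIGHT = 0.2  # any field not listed above, e.g. free-form body text


def score(document, term):
    """Sum weighted occurrences of `term` across a document's fields.

    `document` maps field names to their extracted text.
    """
    term = term.lower()
    total = 0.0
    for field, text in document.items():
        words = re.findall(r"\w+", text.lower())
        total += FIELD_WEIGHTS.get(field, DEFAULT_WEIGHT) * words.count(term)
    return total


if __name__ == "__main__":
    nurse = {
        "objective": "Registered nurse seeking nursing position",
        "experience": "Nursing staff at county hospital",
    }
    congressman = {
        "body": "Sponsored nursing home reform and nursing education bills",
    }
    ranked = sorted(
        [("nurse", nurse), ("congressman", congressman)],
        key=lambda item: score(item[1], "nursing"),
        reverse=True,
    )
    print([name for name, _ in ranked])
```

With these made-up weights the nurse outranks the congressman even though the congressman's page mentions the keyword just as often.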
Both Ziggs and Pagebites rely primarily on resumes. An alternative approach is seen in ZoomInfo, which crawls sources like SEC filings and press releases and uses natural language parsing to attempt to extract and collate a virtual bio of the person involved from multiple sources. (See for instance this automatically constructed bio of Guy Kawasaki.) The current limitations of this approach are accuracy (the natural language technology is not perfect) and comprehensiveness (many people do not appear in press releases or SEC filings). The open ZoomInfo service only allows searching by name, so it’s difficult to evaluate relevance, but in principle they should be able to do good things with relevance.
As I was saying, all of these sites just begin to scratch the surface of constructing a complete, accurate, and easily searchable picture of someone based on their online presence. If this is the sort of problem that attracts you, please get in touch with me at the email address listed on this blog, or check out labs.jobster.com/jobs.html.