Wednesday, February 25, 2009

Science Fiction/Fantasy Taxonomy

In the last blog post I mentioned how building a kitchen sink taxonomy ends up deadening search capabilities because it dulls specificity. Below is a taxonomy I built that includes several different facets for the different genres within science fiction and fantasy literature. Rather than simply labeling all books as fiction, the taxonomy includes the different sub-genres that make up each niche genre.

Taxonomy Key:
  • [TT]: “Top Term”: The highest level in the hierarchy, representing a larger group.
  • [PT]: “Primary Term”: The official term for a concept within the taxonomy.
  • [G]: “Group”: One of the facets that make up the taxonomy. Not really a specific term, but more a collection of like terms.
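
As a rough illustration of how the three labels in the key might fit together, here is a minimal Python sketch. The class name, field names, and sample terms are assumptions made for illustration only; they are not the actual contents of the taxonomy.

```python
from dataclasses import dataclass, field
from typing import List

# Labels taken from the taxonomy key; everything else is a hypothetical sketch.
LABELS = {"TT": "Top Term", "PT": "Primary Term", "G": "Group"}


@dataclass
class Node:
    name: str
    label: str  # one of "TT", "PT", "G"
    children: List["Node"] = field(default_factory=list)

    def __post_init__(self):
        # Reject labels that are not defined in the key.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label}")

    def outline(self, depth=0):
        """Return the hierarchy as indented lines, one per term."""
        lines = ["  " * depth + f"[{self.label}] {self.name}"]
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines


# Hypothetical sample terms, not the real taxonomy.
sample = Node("Science Fiction", "TT", [
    Node("Sub-genre", "G", [
        Node("Military science fiction", "PT"),
        Node("Near-future science fiction", "PT"),
    ]),
])

print("\n".join(sample.outline()))
```

The group node sits between the top term and the primary terms, which matches the key's description of a group as a collection of like terms rather than a term in its own right.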

Wednesday, February 11, 2009

Kitchen Sink Taxonomy

Online shopping sites have revolutionized the retail market. It could be argued that these sites changed the way Americans and the world consume. Rather than traveling to the store and browsing, consumers can browse a virtual warehouse and order from the comfort of their armchair. However, lost amid the development of these sites is the value in browsing. These sites are perfect for situations where the consumer understands what they are looking for or has a product in mind. Currently, though, most shopping sites do a mediocre if not terrible job of helping people find things they aren’t looking for.

A perfect example of this phenomenon is Amazon. While Amazon sells everything from action figures to appliances, I mostly shop there for books. I tend to read mostly within the fantasy and science fiction genres, of which Amazon carries a decent selection. Amazon does do some interesting things to help users find similar material. It shows what percentage of people who bought item X also bought items Y and Z. It also allows users to tag entries with keywords that reflect their perceptions of the item.* However, these are all quantitative efforts that churn out results based on a secret-sauce algorithm lurking on a server buried somewhere deep below Amazon HQ. The site’s qualitative attempts to match users with new material or items are not particularly thought-provoking.

I recently found myself staring at the entry on Amazon for Hardwired.

Hardwired is a science fiction novel by Walter Jon Williams about a cybernetically enhanced warrior battling it out on a future Earth. I found the book through a roundabout method that has only been available since the internet crept out of its servers in a DARPA mainframe to extend throughout the world. I had been searching for books similar to Old Man’s War by John Scalzi. I stumbled onto that book about a year ago by way of a friend of a friend and quickly gobbled up the entire series and Mr. Scalzi’s other works. Finding books similar to Scalzi’s work in both quality and genre has been a difficult task (save for Heinlein’s Starship Troopers, which I had already read). Thankfully, Mr. Scalzi provides a venue on his site, which he calls the “Big Idea,” for new authors to display their work, and I have found it an excellent place to find recommendations. There I ran across a recommendation for The Forever War, which in turn gave me the link to Hardwired.

Scrolling down the page, I came across the subject headings for the title. For any non-librarian types, subject headings are essentially the categories (mystery, fiction, etc.) that a book falls into, and they give clues about the book’s content.

[Image: Hardwired Cataloging.png]

The folks at Amazon described the book with eight different subject headings that either basically said the same thing or attributed formats (graphic novel) that don’t apply to the work. Clearly, anyone looking at the subject headings would not get much idea of what the book was about beyond science fiction.* I don’t blame the cataloger or whoever decided to ascribe these traits to the novel. Even the best craftsman can only do so much with substandard tools. Since Amazon sells and catalogs so many items, its system for categorization has to cover so many subjects that the system’s granularity is blunted. Rather than simply science fiction, why not create a further level in the taxonomy to include military science fiction, near-future science fiction, etc.?

Rather than having a single large taxonomy that tries to cover everything at the expense of the granular concepts, it would be better to have several smaller, more specific taxonomies that can then be mapped together if need be. Right now keyword searches work because the amount of information online is manageable. However, anyone who looks past the first two pages of results from any popular search engine will see that the quality of the results varies. Customized and specialized taxonomies will help users find not just resources, but the best resources for their needs.
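
To make the idea of mapping smaller taxonomies together a little more concrete, here is a minimal Python sketch. Every term, dictionary name, and the crosswalk itself are hypothetical examples, not Amazon's actual categories or any real system.

```python
# Each taxonomy maps a term to its broader term.
# All terms below are hypothetical examples.
SCIENCE_FICTION = {
    "Military science fiction": "Science fiction",
    "Near-future science fiction": "Science fiction",
}
GENERAL_STORE = {
    "Science Fiction & Fantasy": "Books",
    "Appliances": "Home",
}

# Crosswalk: where the specialized taxonomy attaches to the general one.
CROSSWALK = {"Science fiction": "Science Fiction & Fantasy"}


def broader_path(term, specific, general, crosswalk):
    """Walk from a specific term up through both taxonomies."""
    path = [term]
    # Climb the specialized taxonomy first.
    while path[-1] in specific:
        path.append(specific[path[-1]])
    # Cross over to the general taxonomy if a mapping exists.
    if path[-1] in crosswalk:
        path.append(crosswalk[path[-1]])
        while path[-1] in general:
            path.append(general[path[-1]])
    return path


print(broader_path("Military science fiction",
                   SCIENCE_FICTION, GENERAL_STORE, CROSSWALK))
```

The specialized taxonomy keeps its fine-grained terms, and only its top-level term needs an entry in the crosswalk, so the general taxonomy never has to absorb every niche sub-genre.
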
*If you are looking for a better example of tagging, see LibraryThing (though “better” is simply my opinion, and in the interest of full disclosure, I have met the LibraryThing Librarian, and the company is headquartered not far from where I grew up; therefore, a bias on my part should be assumed).

*Some might answer that the only people who look at subject headings/categorization are librarians. While I would tend to agree with that assessment, I would also argue that if subject headings/categorization gave more information or were more relevant, then maybe more non-librarian users might employ them.