Better (but worse) Search
February 6th, 2009

Posted by Paul

Ever since we switched over to paginated video archives (a single page of 300+ items, each with a thumbnail image, was getting very large), I have been trying to figure out a good way to make the videos just as accessible as before. When they were all on one page, you could just use the search function of your browser to find any video you wanted, so I figured adding our own search function would solve the problem.

It was a great search function too: you could use boolean operators (like +, -, AND, OR), you could search for a specific phrase “using quotation marks”, and you could even use wildcard characters to find parts of words (like cheese*). To top it all off, it was backed by a super fast fulltext index that was all set to scale up to hundreds of thousands of records without any slowdown. The problem is that it wasn’t very useful. Turns out that when you only have 500 records, no one cares about boolean operators, full phrase matching and wildcards. And that specially optimised index that can scale up to 500,000 records does so by excluding any words shorter than four characters. Back to the drawing board.
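To see why that last point hurts, here is a toy sketch in Python of an index with a minimum word length. It is not the code the site actually ran; the names `MIN_WORD_LEN`, `build_index` and `lookup` are made up for illustration, and the only thing taken from the real system is the four-character cutoff.

```python
import re
from collections import defaultdict

# Assumption: words shorter than this never make it into the index,
# mirroring the four-character cutoff described above.
MIN_WORD_LEN = 4


def build_index(titles):
    """Map each indexed word to the set of positions of titles containing it."""
    index = defaultdict(set)
    for position, title in enumerate(titles):
        for word in re.findall(r"[a-z0-9]+", title.lower()):
            if len(word) >= MIN_WORD_LEN:
                index[word].add(position)
    return index


def lookup(index, titles, word):
    """Return the titles that contain the given whole word, if it was indexed."""
    return [titles[i] for i in sorted(index.get(word.lower(), ()))]


titles = ["Those Games We Play", "Three PS3s", "CSI:CSI"]
index = build_index(titles)
```

With this index, `lookup(index, titles, "games")` finds Those Games We Play, but `lookup(index, titles, "csi")` finds nothing, because "csi" is only three characters long and was never indexed.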

LoadingReadyRun’s new search function is the simplest possible system. Type in your search term and it will find every video that has that term anywhere in the title or description. No AND, no OR, no quotation marks. Type in “the” and you are going to get a whole hell of a lot of results. It will even find partial words. Doing it this way is not only less flexible, it is also way less efficient, but with the number of videos we have, it seems more useful. In 10 years, when we have a couple of thousand videos in the database, we may need to revisit the problem, but at the moment it seems to run just as fast as the old system.
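For the curious, here is a minimal sketch of the idea in Python. This is not the site’s actual code: the record layout, the function name `simple_search` and the placeholder descriptions are all assumptions made for illustration.

```python
# Hypothetical records; the real database schema is not shown in this post.
VIDEOS = [
    {"title": "Those Games We Play", "description": "placeholder description"},
    {"title": "Three PS3s", "description": "placeholder description"},
    {"title": "CSI:CSI", "description": "placeholder description"},
]


def simple_search(term, videos):
    """Return every video whose title or description contains the term.

    The match is a case-insensitive substring test, so partial words match
    and no operators, quotation marks or wildcards are interpreted.
    """
    needle = term.strip().lower()
    if not needle:
        return []
    return [
        video
        for video in videos
        if needle in video["title"].lower()
        or needle in video["description"].lower()
    ]
```

Searching for `csi` or the partial word `game` each finds its video. Every record gets scanned on every search, which is the inefficiency mentioned above, and a multi-word query is treated as one literal string, so word order matters.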

The LRRcasts use the same system, BTW.

As always, let me know if you experience any problems or weirdness.



8 Comments »

  1. I realize how the old search was great in theory and on the back-end, but from the end user’s point of view, it was far from useful. I used Boolean operators and wildcard characters all the time in my searches, because that was the most effective way I had of searching. However, the Achilles Heel of the whole system was that exclusion of all words under 4 characters. Many of your videos, especially some of your popular ones (CSI:CSI), used key words that people would remember, that were less than 4 characters, which made the search kind of, well, useless. As you grow, I hope you’ll find some solution to your problems and strike the right balance, and I wish I could help, but I’m useless in search optimization, so unless you could somehow make two databases (one of words less than 4 characters, one of words more than…which would be just as slow, if not slower when using wildcards…..)…….yeah I’ve kind of lost my train of thought here.

    FIVE MORE YEARS!

    Comment by Master Gunner — February 6, 2009 @ 6:41 pm

  2. works like a charm paul :3

    Comment by Daco — February 6, 2009 @ 6:51 pm

  3. Paul this is a great improvement. One comment though. When you search for multiple terms such as ‘those games’ the search only matches instances where the words are in the same order. For example ‘those games’ and ‘games we’ both return Those Games We Play but the queries ‘we games’ or ‘those we’ do not return anything. Would it be too complicated to break down multiword queries and search for each word individually?

    Another even more minor thing. Should not the copyright notices across the site be updated to 2009?

    Comment by Lord Chrusher — February 6, 2009 @ 8:04 pm

  4. Going out on a limb, if video information is stored in MySQL there are some fairly decent fulltext search functions built in that will handle most if not more of these cases.

    Comment by alice — February 7, 2009 @ 4:33 am

  5. I can finally find the Three PS3s video. Before, nothing I searched would bring it up, and I had to watch it on YouTube.

    Comment by egsef — February 8, 2009 @ 10:36 am

  6. I don’t know what it would have been under the old search, but now if you search for “paul”, 3 PS3s is the third search result. I’m pretty sure the search has always done titles and descriptions.

    Also, I like this new search because it makes it easier to find the 64k videos.

    Comment by Yaxley — February 10, 2009 @ 8:14 pm

  7. so you have a couple of thousand “vdieos” in the database
    lol

    we even get a hit when searching Porn. but its not the porn i was wanting

    Comment by ieatninjaz — February 16, 2009 @ 10:42 pm

  8. 1 word: Regular Expressions

    How can a geek site have search without regex capabilities? come on guys :p

    Comment by BYTE-Smasher — March 13, 2009 @ 10:45 am
