Monday, December 8, 2008

Improvements in Blog Search

One of the biggest challenges in indexing a blog is getting the correct date for a posting. Blog software generates web pages dynamically, so the Web server usually reports a "last modified" date of today, whenever the page is requested. We continue to expand the number of blogging systems for which the Blossom indexer can identify the posting date separately from the date reported by the Web server. Please let us know if your search index includes a blog and the posting date is not handled correctly.
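To illustrate the general idea, here is a minimal sketch that prefers a date found in the page markup over the date the Web server reports. It is not the Blossom indexer's actual logic. The markup it looks for (an `article:published_time` meta tag or a `time` element with a `datetime` attribute) and the names `PostDateParser` and `posting_date` are assumptions chosen for the example, and real blogging systems vary widely in how they expose the posting date.

```python
from datetime import datetime
from html.parser import HTMLParser


class PostDateParser(HTMLParser):
    """Records the first posting date found in the page markup (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.post_date = None

    def handle_starttag(self, tag, attrs):
        if self.post_date is not None:
            return  # keep the first date found
        attrs = dict(attrs)
        raw = None
        # Assumed markup conventions; each blogging system needs its own rules.
        if tag == "meta" and attrs.get("property") == "article:published_time":
            raw = attrs.get("content")
        elif tag == "time":
            raw = attrs.get("datetime")
        if raw:
            try:
                self.post_date = datetime.fromisoformat(raw)
            except ValueError:
                pass  # unparseable value: keep looking


def posting_date(html, last_modified):
    """Return the date found in the markup, else fall back to the server-reported date."""
    parser = PostDateParser()
    parser.feed(html)
    return parser.post_date or last_modified
```

The fallback matters: when no posting date can be recognized, the server-reported date is the only one available, which is exactly the case where the indexed date may be wrong.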

We have also been working to improve spidering of blogs by distinguishing archival posts from currently active posts. Archival posts are now spidered less frequently, which significantly reduces the load that spidering places on a blog.
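One simple way to picture this kind of scheduling is sketched below. It is only an illustration: the 30-day cutoff, the two revisit intervals, and the name `recrawl_interval` are hypothetical values chosen for the example, not the settings the spider actually uses.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, for illustration only.
ARCHIVE_AGE = timedelta(days=30)       # posts older than this count as archival
ACTIVE_INTERVAL = timedelta(days=1)    # revisit active posts daily
ARCHIVE_INTERVAL = timedelta(days=30)  # revisit archival posts monthly


def recrawl_interval(post_date, now):
    """Return how long to wait before spidering a post again, based on its age."""
    if now - post_date > ARCHIVE_AGE:
        return ARCHIVE_INTERVAL
    return ACTIVE_INTERVAL
```

Because most of a blog's pages are older posts, lengthening only the archival interval removes most of the repeat requests while recent posts are still picked up promptly.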