The search engine factors in the age of a document when determining its position in the search results list. Naive treatment of dynamically generated documents, e.g., pages produced by ASP or PHP scripts, degrades the quality of search results because web servers do not report the age of such documents accurately. The heart of the problem is that the web server does not know the real last-modified date of a dynamically generated document, so it tells the Blossom spider that the document was last modified "today".
The solution is to ignore the date reported by the web server and encode the real last-modified date within the document itself. For many years, the Blossom indexer has used the meta tag http-equiv="Last-Modified" to override the date reported by the web server (see the Search Guide FAQ for details).
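To make the override concrete, here is a minimal Python sketch of the idea. It is an illustration only, not the Blossom indexer's code. The names LastModifiedMetaParser and effective_last_modified are hypothetical, and the sketch assumes the meta tag's content uses the same date format as the HTTP Last-Modified header; check the Search Guide FAQ for the format the indexer actually accepts.

```python
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser


class LastModifiedMetaParser(HTMLParser):
    """Hypothetical helper: records the first <meta http-equiv="Last-Modified">."""

    def __init__(self):
        super().__init__()
        self.last_modified = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.last_modified is not None:
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; compare the value case-insensitively.
        if (attr.get("http-equiv") or "").lower() == "last-modified":
            self.last_modified = attr.get("content")


def effective_last_modified(html, server_header):
    """Prefer the date in the document; fall back to the server's header.

    Assumption: both values are HTTP-style dates such as
    "Fri, 27 Feb 2015 00:00:00 GMT". Returns a datetime, or None.
    """
    parser = LastModifiedMetaParser()
    parser.feed(html)
    for candidate in (parser.last_modified, server_header):
        if not candidate:
            continue
        try:
            return parsedate_to_datetime(candidate)
        except (TypeError, ValueError):
            continue  # unparseable date: try the next source
    return None
```

Given a page whose head contains the meta tag, effective_last_modified returns the date from the document even when the server header claims the page was modified today. Without the tag, the server's date is the only one available.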
With the advent of standards for semantic mark-up of web pages, we are adding other encodings for the last-modified date. Currently we are testing the Open Graph protocol. In particular, the indexer will look in the HTML head section for
<meta property="article:published_time" content="YYYY-MM-DD">
or
<meta property="article:modified_time" content="YYYY-MM-DD">
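For readers who want to check what their pages expose, the lookup can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the indexer's implementation. The names OpenGraphDateParser and open_graph_date are hypothetical, and preferring article:modified_time over article:published_time when both are present is our assumption for this sketch only.

```python
from datetime import date
from html.parser import HTMLParser

# Assumed precedence for this sketch: modified time first, then published time.
OG_DATE_PROPERTIES = ("article:modified_time", "article:published_time")


class OpenGraphDateParser(HTMLParser):
    """Hypothetical helper: collects Open Graph article dates from <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.dates = {}

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attr = dict(attrs)
            prop = (attr.get("property") or "").lower()
            content = attr.get("content") or ""
            if prop in OG_DATE_PROPERTIES and prop not in self.dates:
                try:
                    # Keep only the YYYY-MM-DD part; a longer ISO 8601
                    # timestamp is truncated to its date.
                    self.dates[prop] = date.fromisoformat(content[:10])
                except ValueError:
                    pass  # ignore values that are not valid dates

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def open_graph_date(html):
    """Return the first available Open Graph date from the head, or None."""
    parser = OpenGraphDateParser()
    parser.feed(html)
    for prop in OG_DATE_PROPERTIES:
        if prop in parser.dates:
            return parser.dates[prop]
    return None
```

Tags outside the head section and values that do not parse as dates are skipped, so a page without usable tags yields None.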
We will add other protocols to the indexer as the need arises. Let us know if you are using a different protocol by sending email to Blossom Support.
Friday, February 27, 2015