{"id":912,"date":"2009-04-13T00:00:04","date_gmt":"2009-04-12T23:00:04","guid":{"rendered":"http:\/\/www.jurecuhalev.com\/blog\/?p=912"},"modified":"2009-04-13T22:01:28","modified_gmt":"2009-04-13T21:01:28","slug":"revcanonical-and-extra-burden-on-services","status":"publish","type":"post","link":"https:\/\/www.jurecuhalev.com\/blog\/revcanonical-and-extra-burden-on-services\/","title":{"rendered":"rev=&#8221;canonical&#8221; and extra burden on services"},"content":{"rendered":"<p>With the <a href=\"http:\/\/joshua.schachter.org\/2009\/04\/on-url-shorteners.html\">debate about rev=&#8221;canonical&#8221;<\/a> being the next big thing in the land of <a class=\"zem_slink\" title=\"Twitter\" rel=\"homepage\" href=\"http:\/\/twitter.com\">Twitter<\/a> and URL shortening services, I wanted to throw in two extra things to consider:<\/p>\n<div class=\"zemanta-img zemanta-action-dragged\" style=\"margin: 1em; display: block;\">\n<div>\n<dl class=\"wp-caption alignright\">\n<dt class=\"wp-caption-dt\"><a href=\"http:\/\/www.flickr.com\/photos\/96586445@N00\/129286848\/\"><img decoding=\"async\" title=\"eggs 2\" src=\"http:\/\/farm1.static.flickr.com\/52\/129286848_fe9c3bf045_m.jpg\" alt=\"eggs 2\" \/><\/a><\/dt>\n<dd class=\"wp-caption-dd zemanta-img-attribution\" style=\"font-size: 0.8em;\">Image by <a href=\"http:\/\/www.flickr.com\/photos\/96586445@N00\/129286848\/\">Dystopos<\/a> via Flickr<\/dd>\n<\/dl>\n<\/div>\n<\/div>\n<blockquote><p>How can we trust the rev=&#8221;canonical&#8221; URL? Whose burden is it to prove that these URLs are correct? 
What should we do with misconfigured rev=&#8221;canonical&#8221; targets?<\/p><\/blockquote>\n<p><a href=\"http:\/\/shiflett.org\/blog\/2009\/apr\/save-the-internet-with-rev-canonical\">The proposal<\/a> states that they should return a <a class=\"zem_slink\" title=\"URL redirection\" rel=\"wikipedia\" href=\"http:\/\/en.wikipedia.org\/wiki\/URL_redirection\">301 redirect<\/a>, but this means a service like Twitter has three things to check:<\/p>\n<p>1. Fetch the original URL and parse its HTML<br \/>\n2. Fetch the new URL and check that it returns a 301 redirect<br \/>\n3. (optional) If there is no 301 redirect, or it is some other type of 3xx, go and check whether it still leads to the original URL.<\/p>\n<p><span style=\"text-decoration: line-through;\">What to do when rev=&#8221;canonical&#8221; is the same URL that was just parsed, just like <\/span><a class=\"zem_slink\" title=\"Ars Technica\" rel=\"homepage\" href=\"http:\/\/arstechnica.com\"><span style=\"text-decoration: line-through;\">ArsTechnica<\/span><\/a><span style=\"text-decoration: line-through;\"> does now?<\/span> <em>Do we say fine, let&#8217;s use that long URL, or do we then decide on a 3rd-party URL shortener?<\/em> (Marko points out that they&#8217;re using the correct rel= and not rev.)<\/p>\n<p>What do you do when you can&#8217;t resolve the domain or something goes wrong in our oh-so-stable interwebs? Does the HTML need to be valid, or do we just use a regular expression to find the rev=&#8221;canonical&#8221; part?<\/p>\n<blockquote><p>The second question is: do we really expect services to accept this extra burden?<\/p><\/blockquote>\n<p>Off-loading tiny URL generation to a 3rd-party service like <a class=\"zem_slink\" title=\"bit.ly\" rel=\"homepage\" href=\"http:\/\/www.bit.ly\">bit.ly<\/a> gives you a URL, but doesn&#8217;t guarantee that there&#8217;s anything behind it. 
You can easily shorten http:\/\/foo.foo into a <a href=\"http:\/\/bit.ly\/1IUZeK\">bit.ly link<\/a>.<\/p>\n<p>This means that an operation that once took a single call to bit.ly suddenly takes at least a few orders of magnitude more CPU and network resources, as pages need to be fetched, parsed and checked for validity. While this might be possible for smaller services, I highly doubt Twitter wants to implement this any time soon.<\/p>\n<blockquote><p>Any alternatives?<\/p><\/blockquote>\n<p>There might be a cheat Twitter and other services could use. If we&#8217;re so afraid that we&#8217;ll lose the links, it seems that they should be kept in a database under the control of the service.<\/p>\n<p>This doesn&#8217;t fully solve the problem of long-term URL maintenance, but at least the links are under the control of the same provider who stores the original context (e.g. tweets), enabling them to give you nice exports and faster expansion, together with one less (perceived) liability.<\/p>\n<h6 class=\"zemanta-related-title\" style=\"font-size: 1em;\">Related articles by Zemanta<\/h6>\n<ul class=\"zemanta-article-ul\">\n<li class=\"zemanta-article-ul-li\"><a href=\"http:\/\/www.techcrunch.com\/2009\/04\/06\/are-url-shorteners-a-necessary-evil-or-just-evil\/\"> john rocker: Are URL Shorteners A Necessary Evil, Or Just Evil? 
(via TechCrunch) <\/a> (techcrunch.com)<\/li>\n<li class=\"zemanta-article-ul-li\"><a href=\"http:\/\/mashable.com\/2009\/04\/05\/url-shorteners\/\"> 5 Reasons Why URL Shorteners Are Useful <\/a> (mashable.com)<\/li>\n<li class=\"zemanta-article-ul-li\"><a href=\"http:\/\/www.techcrunch.com\/2009\/04\/02\/diggbar-keeps-all-digg-homepage-traffic-on-digg\/\"> DiggBar Keeps All Digg Homepage Traffic On Digg <\/a> (techcrunch.com)<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>With the debate about rev=&#8221;canonical&#8221; being the next big thing in the land of Twitter and URL shortening services, I wanted to throw in two extra things to consider: Image by Dystopos via Flickr How can we trust the rev=&#8221;canonical&#8221; URL? Whose burden is it to prove that these URLs are correct? 
What to do with misconfigured rev=&#8221;canonical&#8221; [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[14],"tags":[713,714,34,715,716,75],"class_list":["post-912","post","type-post","status-publish","format-standard","hentry","category-tech","tag-bitly","tag-revcanonical","tag-twitter","tag-uniform-resource-locator","tag-url-redirection","tag-web"],"acf":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/posts\/912","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/comments?post=912"}],"version-history":[{"count":7,"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/posts\/912\/revisions"}],"predecessor-version":[{"id":918,"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/posts\/912\/revisions\/918"}],"wp:attachment":[{"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/media?parent=912"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/categories?post=912"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.jurecuhalev.com\/blog\/wp-json\/wp\/v2\/tags?post=912"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}