It is similar to the normalization of feed URLs, but with some differences. Most importantly, relative URLs have to be translated to absolute ones.
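
As a rough illustration of the relative-to-absolute step, the sketch below resolves a link against the URL of the feed it came from. The helper name `make_absolute` and the example URLs are hypothetical and not taken from this change; only the standard-library `urljoin` is assumed.

```python
from urllib.parse import urljoin


def make_absolute(base_url, link):
    """Resolve a possibly relative link against the feed's own URL.

    Hypothetical helper for illustration; absolute links pass through unchanged.
    """
    return urljoin(base_url, link)


# Example with made-up URLs: a path relative to the feed's directory.
absolute = make_absolute("http://example.com/feeds/podcast.xml", "episodes/1.mp3")
```

Resolving against the feed URL keeps links that are already absolute intact, so the same call can be applied to every link.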
Also, bring back re-encoding from ISO-8859-1 to UTF-8, where necessary, for urls and fetch_url. This had been removed in a previous commit, but it is actually necessary if the URL is in Latin-1.
Also changed the order in which URL normalization happens, to make sure that the URL string is valid UTF-8 by the time it is passed to the normalizer.
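
The ordering described above might look roughly like the following sketch: re-encode first, normalize second. The names `to_utf8` and `prepare_url`, and the `normalize` callable, are assumptions made for illustration and are not the functions touched by this change. The sketch also assumes that any byte string that is not valid UTF-8 is Latin-1.

```python
def to_utf8(raw):
    """Return the bytes as valid UTF-8.

    Assumption: input that fails to decode as UTF-8 is ISO-8859-1 (Latin-1).
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1").encode("utf-8")


def prepare_url(raw, normalize):
    """Re-encode first, then hand the decoded string to the normalizer.

    `normalize` is a hypothetical callable standing in for the real normalizer.
    """
    return normalize(to_utf8(raw).decode("utf-8"))
```

Doing the re-encoding before normalization means the normalizer never sees a string that cannot be decoded as UTF-8.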