errors fetching blogs
We've recently started getting more errors while fetching feeds again. Unlike #7 (closed), this happens both on my home computer and on our cloud runners, so it looks like a different problem from that one.
From what I can tell, there seem to be two issues:
The err.no blog contains something that planet rejects as an invalid IPv6 URL:
ERROR:planet.runner:Error processing https://err.no/personal/blog/index.xml
ERROR:planet.runner:ValueError: Invalid IPv6 URL
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/spider.py", line 480, in spiderPlanet
writeCache(uri, feed_info, data)
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/spider.py", line 167, in writeCache
scrub.scrub(feed_uri, data)
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/scrub.py", line 129, in scrub
node.value, node.base, 'utf-8', node.type)
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/vendor/feedparser.py", line 2332, in _resolveRelativeURIs
p.feed(htmlSource)
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/vendor/feedparser.py", line 1723, in feed
sgmllib.SGMLParser.feed(self, data)
ERROR:planet.runner: File "/usr/lib/python2.7/sgmllib.py", line 104, in feed
self.goahead(0)
ERROR:planet.runner: File "/usr/lib/python2.7/sgmllib.py", line 138, in goahead
k = self.parse_starttag(i)
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/vendor/feedparser.py", line 1709, in parse_starttag
j=sgmllib.SGMLParser.parse_starttag(self, i)
ERROR:planet.runner: File "/usr/lib/python2.7/sgmllib.py", line 296, in parse_starttag
self.finish_starttag(tag, attrs)
ERROR:planet.runner: File "/usr/lib/python2.7/sgmllib.py", line 338, in finish_starttag
self.unknown_starttag(tag, attrs)
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/vendor/feedparser.py", line 2324, in unknown_starttag
attrs = [(key, ((tag, key) in self.relative_uris) and self.resolveURI(value) or value) for key, value in attrs]
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/vendor/feedparser.py", line 2318, in resolveURI
return _urljoin(self.baseuri, uri.strip())
ERROR:planet.runner: File "/builds/freedesktop/planet.freedesktop.org/planet/vendor/feedparser.py", line 369, in _urljoin
uri = urlparse.urlunparse([urllib.quote(part) for part in urlparse.urlparse(uri)])
ERROR:planet.runner: File "/usr/lib/python2.7/urlparse.py", line 143, in urlparse
tuple = urlsplit(url, scheme, allow_fragments)
ERROR:planet.runner: File "/usr/lib/python2.7/urlparse.py", line 210, in urlsplit
raise ValueError("Invalid IPv6 URL")
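For reference, `urlsplit` raises this error when the host part of a URL contains a `[` without a matching `]` (or the reverse), so a single malformed link in a post may be enough to abort the whole feed. I have not checked what the err.no feed actually contains. Below is a minimal sketch of a guard that could wrap the join done in `resolveURI`. The name `safe_resolve` and the broken href are made up for illustration, and it uses Python 3's `urllib.parse`, whereas the runner above is on Python 2.7's `urlparse` module.

```python
from urllib.parse import urljoin  # Python 2.7 equivalent: from urlparse import urljoin


def safe_resolve(base_uri, uri):
    """Resolve uri against base_uri; hand back uri untouched if it cannot be parsed.

    Hypothetical helper, not part of planet or feedparser.
    """
    try:
        return urljoin(base_uri, uri.strip())
    except ValueError:
        # urlsplit raises ValueError("Invalid IPv6 URL") for unbalanced
        # brackets in the host; keep the original value instead of failing.
        return uri


base = "https://err.no/personal/blog/"
print(safe_resolve(base, "post.html"))
print(safe_resolve(base, "http://[broken/"))  # made-up malformed href
```

With a guard like this, a bad link would be left unresolved in the output rather than causing the feed to be dropped. Whether that is the right trade-off for planet is open for discussion.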
...and blogs.igalia.com seems to be rejecting our requests with HTTP 403:
ERROR:planet.runner:Error 403 while updating feed https://blogs.igalia.com/itoral/feed/
ERROR:planet.runner:Error 403 while updating feed https://blogs.igalia.com/nroberts/feed/
ERROR:planet.runner:Error 403 while updating feed https://blogs.igalia.com/apinheiro/feed/
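To narrow the 403s down, a quick check like the one below could show whether the response depends on the User-Agent header we send. That is only a guess at the cause, not something the logs confirm. The function name `feed_status` and both agent strings are hypothetical, and the sketch is Python 3 rather than the 2.7 the runner uses.

```python
import urllib.error
import urllib.request


def feed_status(url, user_agent, opener=urllib.request.urlopen):
    """Return the HTTP status code the server answers with for this User-Agent.

    Hypothetical diagnostic helper; `opener` is injectable so the function
    can be exercised without touching the network.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses surface as HTTPError; the status is in err.code.
        return err.code


if __name__ == "__main__":
    url = "https://blogs.igalia.com/itoral/feed/"
    # Both agent strings are placeholders, not what planet really sends.
    for agent in ("planet-test/0.1", "Mozilla/5.0"):
        print(agent, feed_status(url, agent))
```

If the two agents get different status codes, the block is probably header-based. If both get 403 from the runners but not from elsewhere, it is more likely tied to the source address.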