Channel: Active questions tagged feedparser - Stack Overflow
Viewing all articles
Browse latest Browse all 106

Python feedparser's bozo is of bool type instead of int

Disclaimer first: I am a Python newbie.

I am using feedparser 6.0.8 (checked with pip freeze | grep feedparser) to parse Twitter feeds from random Nitter instances (via twiiit.com). The Python version is 3.9.2 on MX Linux. According to the docs (https://pythonhosted.org/feedparser/reference-bozo.html), the variable that indicates whether a parsed RSS feed is well-formed XML is an integer, but in my case it is (almost) always a bool. Even more troubling, when bozo is a bool it does not correctly indicate whether the parsed feed is well-formed XML.
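For an independent cross-check of well-formedness that does not depend on bozo at all, something like the sketch below could be used. It relies on the standard library's xml.etree.ElementTree rather than on anything from feedparser, and the helper name is_well_formed_xml and the two sample documents are made up for illustration. Note that well-formed XML is a weaker condition than "a feed reader accepts it".

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(data: bytes) -> bool:
    """Return True if `data` parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        # Raised for mismatched tags, truncated documents, empty input, etc.
        return False
    return True


# Made-up sample documents, not real Nitter output.
good = b"<rss version='2.0'><channel><title>t</title></channel></rss>"
bad = b"<rss><channel><title>t</channel></rss>"  # <title> is never closed

print(is_well_formed_xml(good))
print(is_well_formed_xml(bad))
```

In the example from the question, resp.content could be passed to this helper and the result compared with d.bozo for the same response.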

At first I thought this had something to do with implicit int-to-bool conversion (0 -> False and 1 -> True), but it does not. Most of the time the result is a bool with value False and the feed is valid (at least Thunderbird can parse it correctly). However, this is not a rule: sometimes bozo is False and Thunderbird cannot parse the feed*.
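For reference, here is a small sketch of how bool and int relate in plain Python, independent of feedparser. In Python, bool is a subclass of int, so False compares equal to 0 and a truthiness test behaves the same for either type; only an exact type check tells them apart. The helper describe_flag is a made-up name for illustration.

```python
def describe_flag(flag):
    """Return (exact type name, truthiness) for a flag value (hypothetical helper)."""
    if type(flag) is bool:
        kind = "bool"
    elif type(flag) is int:
        kind = "int"
    else:
        kind = "other"
    return kind, bool(flag)


# bool is a subclass of int, so isinstance() cannot distinguish them.
print(issubclass(bool, int))
print(isinstance(False, int))

# Equality and truthiness agree between the two types.
print(False == 0, True == 1)

# Only an exact type check reports the difference.
print(describe_flag(False))
print(describe_flag(0))
```

So a check written as `if d.bozo:` should give the same answer whether the library returns 0/1 or False/True; the type difference alone would not explain a wrong value.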

Minimal working example:

    import requests
    import feedparser

    url = 'http://twiiit.com/nws/rss'
    print("Requesting rss feed for:" + url)
    resp = requests.get(url)
    rss_content = resp.content
    d = feedparser.parse(rss_content)
    print('bozo value:', d.bozo)
    if type(d.bozo) is int:
        print('bozo as int')
    elif type(d.bozo) is bool:
        print('bozo as bool')
    else:
        print('bozo is of unknown type')

What I expect is something like

    bozo value: 0
    bozo as int

but instead I have this:

    bozo value: False
    bozo as bool

*This is a simplification: for Thunderbird to update the feeds, I need to modify them so that IDs from different Nitter instances are the same when they refer to the same tweet.

