Channel: Active questions tagged feedparser - Stack Overflow

How do I pass raw untrusted text to feedparser.parse method in Python?

  • I am trying to use feedparser to parse feed text that I download with the asyncio-based aiohttp library.
  • The feed text is available HERE (it is a large document, so I am not pasting it here).
  • The documentation of the feedparser.parse method, HERE on GitHub, says that you should not pass an untrusted string to it directly.

So here is my code, where I wrap the text in an io.StringIO object:

    import feedparser
    import io

    def read():
        import os
        name = os.path.join(os.getcwd(), 'extras', 'feeds','zycrypto.com_1596955288219')
        f = open(name, "r")
        text = f.read()
        f.close()
        return text

    text = read()
    parsed = feedparser.parse(io.StringIO(text))
    for i in parsed.entries:
        print(i.summary, '\n')

However, I keep getting this error:

    Traceback (most recent call last):
      File "./server/python/test.py", line 14, in <module>
        parsed = feedparser.parse(io.StringIO(text))
      File "/Users/zup/.local/share/virtualenvs/myapp_v3-kUGnE3_O/lib/python3.7/site-packages/feedparser.py", line 3922, in parse
        data, result['encoding'], error = convert_to_utf8(http_headers, data)
      File "/Users/zup/.local/share/virtualenvs/myapp_v3-kUGnE3_O/lib/python3.7/site-packages/feedparser.py", line 3574, in convert_to_utf8
        xml_encoding_match = RE_XML_PI_ENCODING.match(tempdata)
    TypeError: cannot use a bytes pattern on a string-like object
  • How do I pass untrusted text to the Python feedparser.parse method so that the sanitizer runs on it? My feed contains script tags that have not been removed. Thank you in advance.
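
One possible workaround, offered as a sketch rather than a confirmed fix: the traceback shows convert_to_utf8 applying a bytes regular expression to whatever the stream returns, so a text stream (io.StringIO) yields a str and fails, while a binary stream (io.BytesIO) should get past that step. The helper names as_byte_stream and parse_untrusted below are hypothetical and not part of feedparser, and the sketch assumes the feed text can be encoded as UTF-8 without contradicting the encoding declared in its XML prolog.

```python
import io


def as_byte_stream(text, encoding="utf-8"):
    """Wrap feed text in a binary stream (hypothetical helper, not part of feedparser)."""
    # Encode first: the traceback shows a bytes regex being applied to the data
    # read from the stream, so the stream has to yield bytes rather than str.
    return io.BytesIO(text.encode(encoding))


def parse_untrusted(text):
    """Parse feed text via a stream so it is not interpreted as a URL or file path."""
    # Imported here so as_byte_stream stays usable without feedparser installed.
    import feedparser
    return feedparser.parse(as_byte_stream(text))
```

In the script above, this would mean replacing feedparser.parse(io.StringIO(text)) with parse_untrusted(text). An alternative under the same assumption is to open the file in "rb" mode and wrap the raw bytes in io.BytesIO, which avoids re-encoding altogether. Whether the script tags are then stripped depends on feedparser's sanitizer settings, which this sketch does not change.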
