Channel: Active questions tagged feedparser - Stack Overflow

Python news aggregator - stopping repeated RSS entries


In short, I wrote a news aggregator that fetches the three latest entries from a given feed. The script works, but I have been trying to change it so that it does not print articles it already printed the last time I ran it.

    import feedparser

    class News:
        def __init__(self, url):
            self.url = url
            self.newsfeed = feedparser.parse(self.url)

        def get_news(self):
            print("##########################################")
            try:
                print("Publication: ", self.newsfeed.feed.title)
            except:
                print("Publication has no title")
            print("##########################################")
            for i in range(3):
                print("-------------------------------------------------------------", end="")
                print("-------------------------------------------------------------------")
                entry = self.newsfeed.entries[i]
                with open("/home/ramel/Projects/news_aggregator/read.txt", "a+") as done:
                    if entry.link in done.read():
                        continue
                    else:
                        done.write(entry.link + "\n")
                print("Title: ", entry.title)
                try:
                    print(entry.published)
                except:
                    print("Date Unknown")
                print(entry.link)

As you can see, I tried keeping a separate text file that stores the link of each article. Before printing an entry, I check whether its link is already in the file; if it is, the loop should continue to the next article. Any help is greatly appreciated.
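
One likely reason the check never matches: a file opened in "a+" mode starts positioned at its end, so done.read() returns an empty string unless done.seek(0) is called first. Below is a minimal sketch of the same "remember the links in a text file" idea. It is not the original code; the helper name filter_unseen and its parameters are hypothetical. It compares whole lines rather than substrings, so one link that happens to be a prefix of another is not mistaken for a repeat.

```python
from pathlib import Path


def filter_unseen(links, seen_path):
    """Return the links not yet recorded in seen_path, and record them.

    Hypothetical helper: the name and signature are illustrative only.
    """
    path = Path(seen_path)
    # Read previously seen links as exact lines; a missing file means none yet.
    seen = set(path.read_text().splitlines()) if path.exists() else set()

    fresh = []
    for link in links:
        if link not in seen:
            fresh.append(link)
            seen.add(link)  # also guards against duplicates within one run

    # Append only the new links so the history file keeps growing.
    if fresh:
        with path.open("a") as done:
            for link in fresh:
                done.write(link + "\n")
    return fresh
```

Inside get_news, this could be called with something like [e.link for e in self.newsfeed.entries[:3]], printing only the entries whose links come back. That wiring is an assumption about how the class would be adapted.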

