I have a lot of Python RSS/Atom feeds in my aggregator and entries are
duplicated all over the place.
I couldn't find any tool out there that would merge entries from several
sources in a smart way, by trying to detect duplicates, so I wrote one.
It calculates the diff ratio on the title and content of each entry to
decide whether it's the same entry. When the ratio is <= 0.2 it's the
same entry (hopefully :) )
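
A minimal sketch of this kind of check, not the actual code of the tool. It assumes the "diff ratio" is a distance (0.0 for identical text, 1.0 for completely different text), computed here as one minus the similarity given by the standard library's `difflib.SequenceMatcher`. The names `diff_ratio`, `same_entry` and `merge_entries`, and the dict layout of an entry, are hypothetical.

```python
from difflib import SequenceMatcher

# Threshold taken from the post: a ratio <= 0.2 means "same entry".
THRESHOLD = 0.2


def diff_ratio(a, b):
    """Return 0.0 for identical strings, up to 1.0 for unrelated ones.

    Assumption: the ratio is a distance, so the similarity returned
    by SequenceMatcher is inverted.
    """
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def same_entry(e1, e2, threshold=THRESHOLD):
    """Treat two entries as duplicates when both the titles and the
    contents are close enough (an assumption about how the two
    ratios are combined)."""
    return (diff_ratio(e1["title"], e2["title"]) <= threshold
            and diff_ratio(e1["content"], e2["content"]) <= threshold)


def merge_entries(*feeds):
    """Merge several lists of entries, keeping the first copy of each."""
    merged = []
    for feed in feeds:
        for entry in feed:
            if not any(same_entry(entry, kept) for kept in merged):
                merged.append(entry)
    return merged


feed_a = [
    {"title": "Python 2.5 released",
     "content": "The final version of Python 2.5 is out."},
]
feed_b = [
    {"title": "Python 2.5 released!",
     "content": "The final version of Python 2.5 is out!"},
    {"title": "Zope 3 sprint report",
     "content": "Notes from the sprint in Paris."},
]
merged = merge_entries(feed_a, feed_b)
print([entry["title"] for entry in merged])
```

Comparing every new entry against every kept entry is quadratic, which is probably fine for a personal aggregator but would need some pre-filtering on a large number of feeds.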
Here's an example run on these:
The result is here
(It's a one-shot XML file, made today, so it's not a real feed;
it is still readable by any client though.)
Now I've been told that this is pretty useless, and that I would be
better off cleaning up my feeds and doing more interesting stuff in my
spare time.
But I can't help it: every time I see a feed related to Python I just
add it to my client :'). So for an unorganized person like me, a CPRSS
personal website with this merging capability, where I can drop tons of
feeds, would be perfect.
(Post originally written by Tarek Ziadé on the old Nuxeo blogs.)