Improved news selections - part II

To follow up on the last blog post, we made changes to increase news relevance per coin. The main change is how relevance is calculated per post. Previously this was based on how we received the post (i.e. by following a subreddit or a certain Twitter user). Now we apply simple topic detection to the discussion on the post and calculate a weighted total relevance.
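
As a rough illustration of the idea (not our actual implementation), the sketch below detects topics per post in a discussion and combines them into a weighted total relevance per coin. The keyword table, the function names and the weights (root post counts double) are all assumptions made up for this example.

```python
from typing import Dict, List, Set

# Hypothetical keyword table; the real topic detection is not shown here.
COIN_KEYWORDS: Dict[str, Set[str]] = {
    "BTC": {"bitcoin", "btc"},
    "ETH": {"ethereum", "eth"},
}


def detect_topics(text: str) -> Set[str]:
    """Return the coins whose keywords appear as whole words in the text."""
    words = set(text.lower().split())
    return {coin for coin, kws in COIN_KEYWORDS.items() if words & kws}


def weighted_relevance(
    posts: List[str],
    root_weight: float = 2.0,   # assumed weight for the first post
    reply_weight: float = 1.0,  # assumed weight for every reply
) -> Dict[str, float]:
    """Weighted share of the discussion that mentions each coin (0.0 to 1.0)."""
    totals = {coin: 0.0 for coin in COIN_KEYWORDS}
    weight_sum = 0.0
    for i, text in enumerate(posts):
        weight = root_weight if i == 0 else reply_weight
        weight_sum += weight
        for coin in detect_topics(text):
            totals[coin] += weight
    if weight_sum == 0:
        return totals  # empty discussion: nothing is relevant
    return {coin: total / weight_sum for coin, total in totals.items()}


discussion = ["Bitcoin hits new high", "eth is lagging", "nice weather"]
relevance = weighted_relevance(discussion)
print(relevance)
```

With these assumed weights, a coin mentioned only in a reply ends up with a lower relevance than a coin mentioned in the root post, and a discussion that never mentions a coin scores zero for it.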

This greatly reduces the amount of irrelevant news per coin, as well as in the Crypto overview: say goodbye to non-crypto news appearing in the Crypto overview, or big BTC/ETH events popping up on charts of smaller coins. (The small amount of irrelevant news that still gets through is deleted manually, which helps us tweak the topic detection parameters.)

List of technical changes:

  • Reduce context propagation: calculate relevance per post in a
    discussion instead of applying context to the whole discussion
  • Reduce false positives when scanning for keywords
  • Calculate and index scores per context instead of only indexing
    total score
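
To make the list above a bit more concrete, here is a minimal sketch of scoring each post separately, matching keywords on word boundaries to cut false positives, and keeping a score per context next to the total. The keyword lists, the names `score_post` and `index_entry`, and the shape of the index entry are assumptions for illustration only.

```python
import re
from typing import Dict, List

# Hypothetical keyword lists per context (coin).
CONTEXT_KEYWORDS: Dict[str, List[str]] = {
    "BTC": ["btc", "bitcoin"],
    "ETH": ["eth", "ethereum"],
}

# \b anchors make "eth" match only as a whole word, so "method" or
# "together" no longer count as Ethereum mentions.
PATTERNS = {
    ctx: re.compile(
        r"\b(?:" + "|".join(re.escape(kw) for kw in kws) + r")\b",
        re.IGNORECASE,
    )
    for ctx, kws in CONTEXT_KEYWORDS.items()
}


def score_post(text: str) -> Dict[str, int]:
    """Count whole-word keyword hits per context for a single post."""
    return {ctx: len(pattern.findall(text)) for ctx, pattern in PATTERNS.items()}


def index_entry(post_id: str, text: str) -> dict:
    """Build an index record holding per-context scores and the total."""
    scores = score_post(text)
    return {"id": post_id, "scores": scores, "total": sum(scores.values())}


entry = index_entry("post-1", "ETH and Ethereum rally, BTC flat")
noise = score_post("This method brings us together")
print(entry, noise)
```

Keeping the per-context scores in the record means a chart for a small coin can filter on its own score instead of the total, which is what keeps big BTC/ETH stories off it.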

Other fixes:

  • Fix zooming in on historical chart ranges making the browser
    unbearably slow
  • Fix unlinked discussions from ~ Oct 2018
  • Fetch more of the tweets for which we have replies
  • Be more fault tolerant when dealing with badly formatted URLs
  • Fix some coin icons
  • Fix some newlines in tweets
  • Fix links in tweets