Below you will find pages that utilize the taxonomy term “Machine-Learning”
A Twitter bot to create threads from web pages
@ThreaderBot listens for Twitter mentions. When someone sends it a link, the bot will:
- Download the content of that link
- Extract and clean the text
- Summarize the text into up to five sentences using the TextRank algorithm
- Answer the Twitter mention with a thread containing the generated summary
The bot is developed in PHP with Symfony and runs as a cron job on a Raspberry Pi.
Implement TextRank algorithm in TypeScript
The TextRank algorithm was introduced by Rada Mihalcea and Paul Tarau in their 2004 paper “TextRank: Bringing Order into Texts”. It applies the same principle that Google’s PageRank uses to find relevant web pages.
The idea is to split a text into sentences and then score each sentence by its similarity to the others. TextRank treats words shared by two sentences as a link between them (like hyperlinks between web pages), and weights that link by how much the sentences have in common. ts-textrank uses Sørensen-Dice similarity for this weight.
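The scoring idea can be sketched in a few lines of TypeScript. This is a minimal illustration, not the ts-textrank API: the function names (`tokenize`, `diceSimilarity`, `rankSentences`, `summarize`) are hypothetical, the similarity is assumed to be computed over sets of distinct lowercase words, and the damping factor of 0.85 and the fixed iteration count are assumptions borrowed from the usual PageRank setup.

```typescript
// Hypothetical helper names for illustration; not the ts-textrank API.

// Lowercase a sentence and return its distinct words.
function tokenize(sentence: string): string[] {
  const words: string[] = sentence.toLowerCase().match(/[a-z0-9']+/g) || [];
  return words.filter((word, i) => words.indexOf(word) === i);
}

// Sørensen-Dice coefficient between two word lists: 2|A ∩ B| / (|A| + |B|).
function diceSimilarity(a: string[], b: string[]): number {
  if (a.length === 0 || b.length === 0) return 0;
  const shared = a.filter(word => b.indexOf(word) !== -1).length;
  return (2 * shared) / (a.length + b.length);
}

// Score sentences with a PageRank-style iteration over the similarity graph.
function rankSentences(sentences: string[], damping = 0.85, iterations = 30): number[] {
  const n = sentences.length;
  const tokens = sentences.map(tokenize);
  // weights[i][j] is the link weight between sentences i and j (0 on the diagonal).
  const weights = tokens.map((a, i) =>
    tokens.map((b, j) => (i === j ? 0 : diceSimilarity(a, b)))
  );
  // Total link weight leaving each sentence, used to normalize its votes.
  const outSum = weights.map(row => row.reduce((sum, w) => sum + w, 0));
  let scores = sentences.map(() => 1);
  for (let step = 0; step < iterations; step++) {
    scores = scores.map((_, i) => {
      let incoming = 0;
      for (let j = 0; j < n; j++) {
        if (outSum[j] > 0) incoming += (weights[j][i] / outSum[j]) * scores[j];
      }
      return 1 - damping + damping * incoming;
    });
  }
  return scores;
}

// Keep the highest-scoring sentences, returned in their original order.
function summarize(sentences: string[], maxSentences = 5): string[] {
  const scores = rankSentences(sentences);
  return sentences
    .map((text, index) => ({ text, index, score: scores[index] }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxSentences)
    .sort((a, b) => a.index - b.index)
    .map(entry => entry.text);
}
```

Calling `summarize(sentences, 5)` on an array of sentences returns at most five of them. A sentence that shares no words with any other receives only the baseline score of `1 - damping`, so it is the first to be dropped from the summary.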