Generates new Hacker News headlines from a model trained on several years of real ones. It uses a Markov chain over trigrams to produce mostly human-sounding headlines. An older bigram version is also included for comparison.
Follow the Twitter bot here!
By Sasha Laundy and David Lundgren at Hacker School.
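For readers new to the technique, here is a minimal sketch of a trigram Markov chain headline generator. It is not the code in this repo: the function names `build_trigram_model` and `generate_headline`, the whitespace tokenizer, and the `max_words` cutoff are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_trigram_model(headlines):
    """Map each pair of consecutive words to the words seen following it.

    Hypothetical helper: splits on whitespace only, so punctuation stays
    attached to words.
    """
    model = defaultdict(list)
    starts = []  # first two words of every training headline, used as seeds
    for headline in headlines:
        words = headline.split()
        if len(words) < 3:
            continue  # too short to contain a trigram
        starts.append((words[0], words[1]))
        for a, b, c in zip(words, words[1:], words[2:]):
            model[(a, b)].append(c)
    return model, starts


def generate_headline(model, starts, max_words=15, rng=random):
    """Walk the chain from a random seed pair until a dead end or max_words."""
    a, b = rng.choice(starts)
    out = [a, b]
    while len(out) < max_words:
        followers = model.get((a, b))
        if not followers:
            break  # this pair only ever appeared at the end of a headline
        c = rng.choice(followers)
        out.append(c)
        a, b = b, c
    return " ".join(out)
```

Because followers are stored with repeats, `rng.choice` picks each next word in proportion to how often it followed that pair in the training data. A bigram version would key the model on a single word instead of a pair.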
TODO:
- Fix regex to not split on apostrophes & fix title casing
- Set it to pick a common starting word for the first seed
- Recapitalize sentences before output
- Use pickle to only generate the matrix once
- Turn into a Twitter bot on Heroku
- Fix length of tweets
- Add seed function
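One way the pickle item above could work, as a rough sketch: build the transition table once, save it to disk, and load it on later runs. The helper name `load_or_build` and the cache path are hypothetical, and this assumes the table is a picklable object such as a dict or a `defaultdict(list)`.

```python
import os
import pickle


def load_or_build(cache_path, build_fn):
    """Return the object cached at cache_path, building and saving it if absent."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    obj = build_fn()  # the expensive step, run only when no cache exists
    with open(cache_path, "wb") as f:
        pickle.dump(obj, f)
    return obj
```

The cache has to be deleted by hand whenever the training headlines change, since this sketch never checks whether it is stale. Only unpickle files you created yourself, because loading an untrusted pickle can run arbitrary code.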
TODO SOMEDAY:
- Cut off the long tail of sentence seeds, as they're less likely to lead to a new headline
- Mix in real HN headlines so the Twitter bot becomes an "is this real or not?" stream
- Implement check to make sure generated lines aren't coincidentally real ones
- Toward the end of a sentence, transition into bigrams instead of trigrams
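The check for coincidentally real headlines could look something like the sketch below. It assumes the real headlines fit in memory as a set, and that two titles count as the same when they match after lowercasing and dropping punctuation. The names `normalize` and `is_novel` are made up for this example.

```python
import re


def normalize(headline):
    """Lowercase and drop punctuation so near-identical titles compare equal."""
    return " ".join(re.findall(r"[a-z0-9']+", headline.lower()))


def is_novel(candidate, real_normalized):
    """True if the generated headline does not match any real one."""
    return normalize(candidate) not in real_normalized


# Build the lookup set once from the training data (sample data for illustration).
real = {normalize(h) for h in ["Show HN: A tiny Markov bot"]}
```

A generator loop would then discard any candidate for which `is_novel` returns `False` and draw another.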