This project is currently in the design and analysis phase. We've gathered a fairly large database of comments (225K+), primarily from YouTube, that ever-inspiring font of stupidity. We've implemented a web-based comment ranking system to seed our stupidity corpus, and that's proceeding nicely. Moderator applications are now open, and we're going through them as quickly as possible. We're testing CRM114 as a classification platform; initial tests with the bit entropy and correlative classifiers are pretty promising. Additionally, we've moved to a new dedicated server better suited to the heavy database work we're doing. Next on our to-do list:

  • Train filter on seed corpus
  • Fine-tune filter, begin self-training database using seed data
  • Test filter on real-world data, tune filter, rinse, repeat
  • Implement a web front end for the classifier for demonstration purposes
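
To make the train-then-self-train steps above a bit more concrete, here's a minimal Python sketch. It is not CRM114 and not the actual filter: it swaps in a tiny naive Bayes word model purely for illustration, and the label names, the 0.9 confidence threshold, and every class and function name are hypothetical.

```python
import math
import re
from collections import Counter

LABELS = ("stupid", "ok")  # hypothetical label names


def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Toy word-count classifier standing in for the real filter."""

    def __init__(self):
        self.words = {label: Counter() for label in LABELS}
        self.docs = Counter()

    def train(self, text, label):
        """Add one labeled comment to the model."""
        self.words[label].update(tokenize(text))
        self.docs[label] += 1

    def scores(self, text):
        """Return a normalized probability per label for one comment."""
        vocab = len(set().union(*self.words.values())) or 1
        total_docs = sum(self.docs.values())
        logp = {}
        for label in LABELS:
            n = sum(self.words[label].values())
            # Add-one smoothing on both the label prior and the word counts.
            lp = math.log((self.docs[label] + 1) / (total_docs + len(LABELS)))
            for tok in tokenize(text):
                lp += math.log((self.words[label][tok] + 1) / (n + vocab))
            logp[label] = lp
        # Subtract the max before exponentiating to avoid underflow.
        top = max(logp.values())
        exp = {k: math.exp(v - top) for k, v in logp.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}

    def classify(self, text):
        """Return (best_label, confidence) for one comment."""
        s = self.scores(text)
        label = max(s, key=s.get)
        return label, s[label]


def self_train(model, unlabeled, threshold=0.9):
    """Feed the model its own confident guesses; return how many were added."""
    added = 0
    for text in unlabeled:
        label, confidence = model.classify(text)
        if confidence >= threshold:
            model.train(text, label)
            added += 1
    return added
```

The idea is to call train() on each hand-ranked seed comment first, then run self_train() over the unranked comments so that only high-confidence guesses get folded back in. Setting the threshold too low lets the model reinforce its own mistakes, which is why the testing and tuning step follows.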

Check the downloads section for the alpha release.

Additionally, try out the live demo.

Wanna help out?