Sentence Generation Using n-grams

So as I was in class today browsing my documents folder, I stumbled across a whole bunch of my old chat logs from 2005.

Since they were taking up space on my hard drive, I thought a fun application would be to generate sentences from them using n-grams. Just like back at the science fair! (Except the other way around.) I hacked together some Python scripts to generate a 3MB file of everything I said in chat for a while, and used it as the corpus. The results were pretty nonsensical, as I had somewhat ‘loose’ spelling conventions when I was younger.
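For anyone curious how n-gram sentence generation works, here is a minimal sketch in Python using trigrams (n = 3). It is not the toolkit mentioned in this post; the function names `build_trigram_model` and `generate` and the tiny placeholder corpus are made up for illustration. The idea: remember which words followed each pair of words in the corpus, then take a random walk through those pairs.

```python
import random
from collections import defaultdict

def build_trigram_model(text):
    """Map each pair of consecutive words to the words seen right after it."""
    words = text.split()
    model = defaultdict(list)
    for first, second, third in zip(words, words[1:], words[2:]):
        model[(first, second)].append(third)
    return model

def generate(model, length=25, seed=None):
    """Random walk over the model, starting from a random word pair."""
    rng = random.Random(seed)
    first, second = rng.choice(list(model))
    output = [first, second]
    while len(output) < length:
        followers = model.get((first, second))
        if not followers:  # dead end: nothing ever followed this pair
            break
        third = rng.choice(followers)
        output.append(third)
        first, second = second, third
    return " ".join(output)

# Placeholder corpus (an assumption); in practice this would be the
# full text of the chat log file.
corpus = "i like cats and i like dogs and i like cats that like dogs"
print(generate(build_trigram_model(corpus), length=12, seed=1))
```

Because followers are stored in a list with repeats, a word that followed a pair more often is proportionally more likely to be picked, which is what makes the output sound vaguely like the source.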

After shopping around a bit on the web, I found a nice Python toolkit to do sentence generation using n-grams:

a point where people run out of space when my bookshelf is full of hot air tommorow us cool people like kids too reall sure was
IS BOTHERING ME feel like taking a felt pen to my screen now pretty picture up there instead way it could seem like I’m pulling a..
sucks spanish of all languages be nice be great maybe enjoy it tonight then get it to wash my hands going to watch some friends with
nope thats not him know him as well as my real life identity is a very good album definetly a highlightARGH your response matter, you’re windowspamming..

So armed with this, I assembled another corpus from every assignment that I could find on my hard drive (from university to high school). This corpus yielded more interesting results:

Though Philosophy describes that the Stroop interference task cannot be directly perceived, rather that consciousness arises from a low-level domain.
Elephants must eat plants which only grow in a poorly-designed car and being the role of somebody.
Grammar allows thematic roles to be a saying about ‘Lottery in June, corn be heavy soon.’ First thing you know, we’d all be eating stewed chickweed
Animals can be run in two major complexity classes: P and NP.
This pumps sodium out of simple predicates extracted from data.
Clans of elephants rarely interact, and they drive the most powerful country in Europe, also because the cost structure for the formation of representations.
Canada’s healthcare policy was created to allow respiratory gasses to diffuse into the concept activations.
These words and concepts have no place in his discussion about happiness and works to solve complex problems.

So yeah, pretty fun sort of trick. My corpus is somewhat lacking in that I haven’t got a whole lot of data, so the output ends up being a mix of two sentences which I wrote somewhere. However, the consequences are HILARIOUS. (And you can tell that I write a lot about cognitive science stuff.)
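The "mix of two sentences" effect is easy to reproduce on purpose. Below is a small Python sketch, assuming a trigram model and two made-up example sentences that share one word pair ("must eat"). The helper names `trigram_followers` and `all_sentences` are hypothetical. Instead of sampling randomly, it enumerates every sentence the model could produce from a given starting pair.

```python
from collections import defaultdict

def trigram_followers(sentences):
    """Map each word pair to the set of words seen after it.
    None is used as an end-of-sentence marker."""
    followers = defaultdict(set)
    for sentence in sentences:
        words = sentence.split() + [None]
        for first, second, third in zip(words, words[1:], words[2:]):
            followers[(first, second)].add(third)
    return followers

def all_sentences(followers, start):
    """Enumerate every sentence reachable from a starting word pair.
    Assumes the model has no loops (true for the tiny example below)."""
    results = []
    stack = [list(start)]
    while stack:
        words = stack.pop()
        for nxt in followers.get((words[-2], words[-1]), ()):
            if nxt is None:
                results.append(" ".join(words))
            else:
                stack.append(words + [nxt])
    return sorted(results)

# Two made-up sentences that overlap only on the pair ("must", "eat").
sentences = [
    "the elephants must eat plants every day",
    "the students must eat lunch at noon",
]
for s in all_sentences(trigram_followers(sentences), ("the", "elephants")):
    print(s)
```

Starting from "the elephants", the model can produce the original sentence and exactly one hybrid, "the elephants must eat lunch at noon". With a small corpus most word pairs have only one possible follower, so the generator copies a sentence verbatim until it reaches a shared pair and then may switch to the other sentence.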
