“Templates may seem like a terrible way to produce sentences until you consider the alternatives.” –Chris Pressey

How can you build an algorithm that writes ‘human-readable’ novels? While there have been several attempts, humans have never stumbled on a good “answer” to that particular question… yet.

However, people are still trying to sort out this “AI-hard” problem. NaNoGenMo (National Novel Generation Month) is held every November…to enter, all you have to do is write an algorithm that can generate 50,000 words and then make your source code publicly available. We previously discussed techniques for providing “structure” to computer-generated novels, to make these novels easier for humans to read.

But what if we take a different approach to text generation? What if, instead of trying to induce structure in computer-generated works, we directly teach a computer how to write a novel?

There are rules that we memorise and use when writing stories; we can call these patterns “grammars”. Just as we have to teach a human being how to write, we can ‘teach’ a computing machine how to compose too…by essentially hardcoding in the rules of literature. We can call this model a template. We can also write ‘nested’ templates, with rules-within-rules, to add more variation to the generated stories.

Calyx is a Ruby gem that can be used to quickly define templates. Here’s an example of how one such template can be defined:

require 'calyx'

class HelloWorld < Calyx::Grammar
  start '{greeting} {world_phrase}.'
  greeting 'Hello', 'Hi', 'Hey', 'Yo'
  world_phrase '{happy_adj} world', '{sad_adj} world', 'world'
  happy_adj 'wonderful', 'amazing', 'bright', 'beautiful'
  sad_adj 'cruel', 'miserable'
end

HelloWorld.new.generate
# => "Hello bright world."

Calyx was originally built to generate the NaNoGenMo novel The Gamebook of Dungeon Tropes, a Choose-Your-Own-Adventure with procedurally-generated dungeon rooms. I have personally used it as a prototyping tool for generating stories. Somewhere, Something is another computer-generated novel based on templates, though the programmer used Python instead of Ruby. In fact, it is far easier to find examples where templates are used in some fashion…than it is to find examples where they are not used at all.

Templates are probably the easiest way to produce computer-generated literature, and have already been used commercially by companies such as Narrative Science, Automated Insights, and Yseop. These companies produce automated reports and news stories that are based on real-life data and are consumed by human beings. I have also attempted to produce a similar type of system when I wrote an algorithm to generate the novel The Atheists Who Believe In God.
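To make the idea of data-driven reporting concrete, here is a minimal sketch in the spirit of those commercial systems: a hand-written template whose slots are filled in from structured data, with one conditional rule choosing a phrase. The team names, scores, and the threshold are invented for illustration, not taken from any real product.

```ruby
# A hand-written report template with named slots.
REPORT = '%{winner} beat %{loser} %{ws}-%{ls}, %{margin_phrase}.'

# A simple rule mapping a numeric fact to a phrase.
def margin_phrase(margin)
  margin >= 10 ? 'a comfortable win' : 'a close finish'
end

# Fill the template from a hash of structured data.
def report(data)
  margin = data[:ws] - data[:ls]
  format(REPORT, data.merge(margin_phrase: margin_phrase(margin)))
end

puts report(winner: 'The Rovers', loser: 'The Wanderers', ws: 24, ls: 10)
# => "The Rovers beat The Wanderers 24-10, a comfortable win."
```

Real systems layer many such rules and templates, but the core mechanism is the same: data in, slotted prose out.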

Templates have also been very popular with “black hat SEO” specialists. These specialists are interested in quick content generation to appease the search engines, no matter how spammy or repetitive this content is. Therefore, these specialists resort to article spinning: taking a prewritten article and then replacing most of the words with equivalent words. As a result, one article can be used as a basis to generate hundreds of “unique” content pieces. There’s even a special syntax used for article-spinning called Spintax…and there are many parsers for this format written in languages such as PHP, JavaScript, and Ruby. Spintax has also been used to generate spam blog comments, varying the possible replies in the hopes of tricking spam filters and human beings into treating the comments as genuine.
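Spintax itself is a very small format: alternatives are written as {option1|option2|option3} groups, which may be nested. Below is a minimal spinner in Ruby, written for this post rather than taken from any existing gem, that resolves the innermost group first and repeats until no groups remain.

```ruby
# A minimal Spintax spinner: repeatedly pick one option from the
# innermost {a|b|c} group until no braces are left.
def spin(text, rng: Random.new)
  # Innermost groups contain no nested braces, so resolve those first.
  while text =~ /\{([^{}]*)\}/
    text = text.sub(/\{([^{}]*)\}/) { $1.split('|').sample(random: rng) }
  end
  text
end

puts spin('{Hello|Hi} {world|there}, this is {completely|fully} {unique|original} content.')
```

Each call picks a different combination, which is exactly why one spun article can yield hundreds of “unique” variants.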

The main trouble with templates is the ‘manual labor’ involved. After all, you still need a human being to produce the templates that the computer uses to compose stories. Yet, this ‘manual labor’ can be reduced with the use of automation. Spammers have written algorithms to “automatically” generate spintax based on a human-written article. While the output of the resulting templates may be ugly, it can be cleaned up later by human beings. Thomson Reuters also received a patent in 2015 to use machine learning to generate templates based on a corpus of preexisting news material, with the results then cleaned up by human beings.
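A toy sketch of that automation: wrap each word that has known alternatives in a {word|synonym|…} group. The hardcoded SYNONYMS table here is a stand-in for a real thesaurus lookup, which is where the ugliness of automatic spinning usually creeps in.

```ruby
# Toy synonym table; a real spinner would query a thesaurus instead.
SYNONYMS = {
  'quick' => %w[fast rapid speedy],
  'happy' => %w[glad cheerful pleased]
}.freeze

# Convert a plain article into spintax by wrapping known words
# in {original|alternative|...} groups.
def to_spintax(article)
  article.split(/\b/).map { |token|
    alts = SYNONYMS[token.downcase]
    alts ? "{#{([token] + alts).join('|')}}" : token
  }.join
end

puts to_spintax('The quick fox was happy.')
# => "The {quick|fast|rapid|speedy} fox was {happy|glad|cheerful|pleased}."
```

Feeding the result through a spintax spinner then yields many mechanical variants of the original article.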

Templates can also be criticized for avoiding the problem of actually adding ‘creativity/intelligence’ to machines. The machine is not really being inspired to write in the same way as a human is…all it really does is follow orders encoded within a template. But other methods of text generation, such as Char-RNN and Markov chains, have failed to produce long-form stories, though they are technically impressive and are evocative to read in short bursts. The smarter the algorithm, the dumber the creative output…at least, for now.

Since templates are so effective in generating human-readable works, they will most likely be used in the near-future in a variety of different fields. It is likely that the first human-readable computer-generated novel will be based on templates used in aggregation with other algorithmic techniques.

Appendix

Even this blog post has been generated using ‘spintax’, and this ‘spintax’ was generated using tools that can be easily found using Google. You can see the source Spintax here. The generated spintax was rough and required a lot of work to clean up…but AI is always advancing. Who knows what may come next?

I used spintax only as a proof of concept of templating. I strongly disapprove of article spinning, and so does Google. That’s why Google has attempted to penalize the practice during the Panda updates to its search engine algorithm.
