Probability-based random text generation

11 months ago

In this video I show you a Python implementation of the technique described in a 1983 article by Brian Hayes in "Scientific American" magazine, called "A progress report on the fine art of turning literature into drivel".

Using weighted probabilities and random number selection, it is possible to generate a mashed-up version of an input text.
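As a rough illustration of the first-order case, here is a minimal sketch (not the code from the video). The alphabet, the `normalize` helper and the `first_order_text` function are assumptions made up for this example: each character is drawn with a probability proportional to how often it appears in the input.

```python
import random
from collections import Counter

# Assumed alphabet for this sketch: lowercase letters, space and apostrophe.
ALPHABET = "abcdefghijklmnopqrstuvwxyz '"


def normalize(text):
    """Lowercase the text and drop every character outside ALPHABET."""
    return "".join(c for c in text.lower() if c in ALPHABET)


def first_order_text(text, length, rng=None):
    """Draw `length` characters, each weighted by its frequency in `text`."""
    rng = rng or random.Random()
    counts = Counter(normalize(text))  # frequency counting
    symbols = list(counts)
    weights = [counts[s] for s in symbols]
    # random.choices performs the weighted random selection.
    return "".join(rng.choices(symbols, weights=weights, k=length))


if __name__ == "__main__":
    print(first_order_text("To be, or not to be, that is the question", 40))
```

The output has the same letter frequencies as the input but no structure yet, because each character is chosen without looking at the previous ones.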

The simplest data structure to use in this case is a matrix; however, as you'll find out in the video, it has certain limitations.
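To give an idea of the limitation, here is a minimal sketch, again not the code from the video. It assumes a 28-symbol alphabet (`ALPHABET_SIZE` is my assumption) and uses hypothetical helpers: `dense_cells` counts the cells of a dense frequency table, while `build_sparse_table` and `generate` store only the contexts that actually occur in the text, using a dictionary in place of a sparse matrix.

```python
import random
from collections import Counter, defaultdict

ALPHABET_SIZE = 28  # assumption: 26 letters, space and apostrophe


def dense_cells(order):
    """Number of cells in a dense frequency table of the given order."""
    return ALPHABET_SIZE ** order


def build_sparse_table(text, order):
    """Map each (order - 1)-character context to a Counter of next characters."""
    table = defaultdict(Counter)
    k = order - 1
    for i in range(len(text) - k):
        table[text[i:i + k]][text[i + k]] += 1
    return table


def generate(text, order, length, seed, rng=None):
    """Extend `seed` up to `length` characters using weighted random choices."""
    k = order - 1
    if len(seed) < k:
        raise ValueError("seed must contain at least order - 1 characters")
    rng = rng or random.Random()
    table = build_sparse_table(text, order)
    out = seed
    while len(out) < length:
        context = out[len(out) - k:]  # last k characters ("" when k == 0)
        followers = table.get(context)
        if not followers:  # context never seen in the text: stop early
            break
        symbols = list(followers)
        weights = list(followers.values())
        out += rng.choices(symbols, weights=weights)[0]
    return out


if __name__ == "__main__":
    for order in (1, 2, 3, 7):
        print(order, dense_cells(order))
    sample = "the cat sat on the mat and the rat sat on the hat"
    print(generate(sample, 3, 40, "th"))
```

The dense table grows exponentially with the order, while the dictionary grows at most linearly with the length of the input text, since it can never hold more contexts than the text contains.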

Links to the article:

- https://www.semanticscholar.org/paper/A-progress-report-on-the-fine-art-of-turning-into-Hayes/066a55a3ca5fa9b38106bb29bcb5c4dd2b30d9e7
- http://bit-player.org/wp-content/extras/bph-publications/SciAm-1983-11-Hayes-drivel.pdf
- https://www.scientificamerican.com/article/computer-recreations-1983-11/

Source code:

- https://codeberg.org/frnmst/solve-computer-science-extras/src/branch/master/computer-recreations/probability-based-random-text-generation

Example using "The Complete Works of William Shakespeare" from Project Gutenberg, run on a machine with 32 GB of RAM:

```
order: 7
length: 512
seed: 'FAST E'
---
fast eyes into his grave died for a scottish spring my lord are in the banquo o banquo lennox no indeed and he so thou meets and clarence warwick out ripe in my heels for this shall did makes me it sufficient kneels caesars prevent my lips have our dancer speaking forth the word edward of my noble duke's best save my lord a black neer lusts were buried away confidence with the glass yet he hath he bids the marketplace have been is stopp'd protest in thy affairs fram'd and they're welcome my name of hogs antonio y
```

Another example using "Alice's Adventures in Wonderland" by Lewis Carroll:

```
time ./generative_web.py 7 256 "https://raw.githubusercontent.com/ElizaLo/Machine-Learning/master/Text%20Generator/alice_in_wonderland.txt"
order: 7
length: 256
seed: 'ACK TO'
---
ack to the thought alice's right eagerly that she was at the fifth bend about like' said the considered audibly what would do to be treacle from the three or might as usual come back at last word two as there's an atom of a tremble all moved on the trial' stupid
---

real 0m1.858s
user 0m1.486s
sys 0m0.220s
```

CHAPTERS

0:00 Intro
0:35 Alphabet selection
1:00 Frequency counting
1:16 Random number selection based on weighted probability (1st order)
1:31 2nd order
1:58 nth order and memory problems
2:43 Sparse matrix as a possible solution to the memory problem
2:58 generative.py script explanation
5:08 generative.py script execution
5:31 generative_web.py script explanation
6:24 generative_web.py script execution
6:34 Outro

#textgeneration #python #matrix
