FW as TSP

Finnegans Wake ReUnDeconstructed

Joyce pix goes here

riverrun, past Eve and Adam’s, from swerve of shore to bend of bay, brings us by a commodius vicus of recirculation back to Howth Castle and Environs. … A way a lone a last a loved a long the

Above we have James Joyce’s mighty masterwork Finnegans Wake, from beginning to end, with only 99.99% of the text omitted in the middle. For those of you who have often seen them and immediately recognize them but don’t remember their name or pedagogically correct textual function, the three dots “…” are called an ellipsis and indicate omitted material.

If Finnegans Wake were a piece of music, as perhaps it is, it might have the words da capo written over the end of the last line. Those words might remind the reader, who has forgotten the beginning during the long and tortuous trip downriver (down the Liffey to Dublin, through the beautiful poetry of Anna Livia Plurabelle and friends) to the end of the book, that this book is itself a “commodius vicus of recirculation”, an unbroken circle of text which might just as well be read starting near the printed end, with the unbroken musical sentence:
A way a lone a last a loved a long the riverrun, past Eve and Adam’s, from swerve of shore to bend of bay, brings us by a commodius vicus of recirculation back to Howth Castle and Environs.

Finnegans Wake is thus topologically a circle, or in more correct modern graph-theoretic terms a cycle. Creating a cycle from a set of vertices and various possible edges has been a vital part of mathematics since Leonhard Euler began topology with his “Seven Bridges of Königsberg” problem.

If we specify a metric and insist that the cycle should be as short as possible while joining all vertices, we are restating the notorious Travelling Salesman Problem — that is to say the notoriously hard mathematical problem of the travelling salesman, not the hard ethical problem of the notorious travelling salesman, protagonist of many off-colour jokes involving a sequence of innocent farmer’s daughters.
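To make the restatement concrete, here is a minimal sketch of the problem in Python. It assumes the vertices are points in the plane and the metric is ordinary Euclidean distance; both are illustrative choices, and the function name `shortest_cycle` is made up for this example. It simply tries every ordering, which is only feasible for a handful of vertices and is exactly why the problem is notorious.

```python
from itertools import permutations
from math import dist  # Euclidean distance, Python 3.8+


def shortest_cycle(points):
    """Return (length, order) of the shortest closed tour through all points.

    Brute force: fix the first point and try every ordering of the rest.
    The work grows factorially with the number of points.
    """
    start, rest = points[0], points[1:]
    best_length, best_order = float("inf"), None
    for perm in permutations(rest):
        order = (start, *perm)
        # Sum every leg, including the closing leg back to the start.
        length = sum(
            dist(order[i], order[(i + 1) % len(order)])
            for i in range(len(order))
        )
        if length < best_length:
            best_length, best_order = length, order
    return best_length, best_order


# Four corners of a unit square, deliberately listed out of order.
corners = [(0, 0), (1, 1), (1, 0), (0, 1)]
length, order = shortest_cycle(corners)
print(length, order)
```

For the square, the shortest cycle walks around the perimeter rather than crossing the diagonals.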

Finnegans Wake and other texts that go on da capo from the end back to the beginning need a name, so for the moment let’s call them text cycles. Text cycles are slightly more interesting than the more linear texts we are used to because they are so evocative of the notorious TSPs. But the more linear ones are only a very little bit easier to solve. Whether we do want a text cycle, or will be content with the more usual linear texts, creating such texts from scattered and poorly organized snippets of text is the essential core functionality to be provided by System Leibniz.

System Leibniz deals with many kinds of textual or linguistic objects, but its most important functions involve objects that are pairs or triplets of shorter objects. The best results, so far, seem to be in work with sentence-pair objects, that is to say with pairs of sentences. It is also very interesting and useful to deal with word-pair objects and paragraph-pair objects. Since such problems produce rather lengthy results, many of the examples and demonstrations given below deal instead with letter-pair (or phoneme-pair) objects. Such examples are even more interesting than the longer ones, but somewhat less useful, so far. The results of applying (the crude early prototype versions of) System Leibniz to letter-pair objects are word-like things that are of linguistic or poetic interest but are often not to be found in dictionaries and so not useful for practical written communication.

2 + 2 = 3

If we have the letter-pairs ‘bo’ and ‘ot’, the common letter ‘o’ permits these two pairs to be combined in the triplet (or word) ‘bot’.

We might also think of this with the aid of the mnemonic device 3 = 2 + 2 as the notion that the word ‘bot’ is made up of the two letter pairs ‘bo’ and ‘ot’, so we can break down longer sequences of letters into letter-pairs then recombine them later. Included below is a short Python program for breaking words down into letter-pairs, but recombining them is not so easily done.
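As a rough illustration of the 3 = 2 + 2 idea, the sketch below breaks a word into overlapping letter-pairs and glues an already-ordered chain of pairs back together. It is not the author's program; the function names `letter_pairs` and `join_pairs` are invented here.

```python
def letter_pairs(word):
    """Break a word into its overlapping letter-pairs, in order."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def join_pairs(pairs):
    """Rebuild a word from pairs that are already in chain order.

    Each pair must begin with the letter on which the previous pair ended.
    """
    if not pairs:
        return ""
    word = pairs[0]
    for pair in pairs[1:]:
        if pair[0] != word[-1]:
            raise ValueError(f"{pair!r} does not overlap {word!r}")
        word += pair[1]  # the shared letter is counted only once
    return word


print(letter_pairs("bot"))       # ['bo', 'ot']
print(join_pairs(["bo", "ot"]))  # bot
```

Note that `join_pairs` only works when the pairs arrive in the right order. Finding that order is the hard part.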

Let us state the problem this way: while the word-breaking, letter-pair-extracting program is short and simple, because there is only one way to break down a word into overlapping pairs, the inverse problem is potentially much harder since there may be many ways to recombine the letters into valid words, and some of these recombinations may be better than others.
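The asymmetry can be seen in a small sketch of the inverse problem. The version below is a naive brute force written for illustration, with the hypothetical name `recombinations`: it tries every ordering of the pairs and keeps those in which each pair starts with the last letter of the one before. It is not claimed to be how System Leibniz does it, and it becomes impractical beyond a few pairs.

```python
from itertools import permutations


def recombinations(pairs):
    """Return every distinct string formed by chaining all pairs, each used once."""
    if not pairs:
        return []
    results = set()
    for order in permutations(pairs):
        # Valid only if each pair starts with the previous pair's last letter.
        if all(a[1] == b[0] for a, b in zip(order, order[1:])):
            results.add(order[0] + "".join(p[1] for p in order[1:]))
    return sorted(results)


# The pairs of 'tot' can be chained in two different ways.
print(recombinations(["to", "ot"]))              # ['oto', 'tot']
# The pairs of 'stats' form a loop, so any pair can come first.
print(recombinations(["st", "ta", "at", "ts"]))  # ['atsta', 'stats', 'tatst', 'tstat']
```

Even these tiny inputs give several candidates, most of which are word-like rather than words, so some measure of quality is needed to choose among them.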

You can download here (coming soon) a file of the most common English words, which has been ordered according to word-frequency in some text corpus, with the most common words near the top of the file. Let us define the quality or goodness of a letter-pair-recombination word by its position in this word-frequency list. For illustrative purposes, here is a small sample of these words and a small set of words which are all different recombinations of the same letter pairs. Below are included links to short programs for finding such sets of words.
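The quality measure just defined might be sketched as follows. The word list in the example is an invented placeholder, not the author's file and not real frequency data, and `rank_candidates` is a hypothetical name. The one assumption carried over from the text is that the list is ordered with the most common words first.

```python
def rank_candidates(candidates, frequency_list):
    """Order candidates by their position in a frequency-ranked word list.

    An earlier position means a more common word, hence a better candidate.
    Candidates that do not appear in the list are dropped.
    """
    position = {}
    for i, word in enumerate(frequency_list):
        position.setdefault(word, i)  # keep the first (best) position
    known = [c for c in candidates if c in position]
    return sorted(known, key=position.__getitem__)


# Placeholder list for illustration only, most common word first.
placeholder_list = ["the", "of", "and", "tot", "stats"]

print(rank_candidates(["oto", "tot"], placeholder_list))  # ['tot']
```

With a real frequency file, the list would be read from disk one word per line, assuming that is how the file is laid out.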

(to be continued ad nauseam)



Copyright © 2001 Douglas P. Wilson
