j
jaipkg.dev
packages / library / Unfurl

Unfurl

second-releaselibrary

Turing complete language with one operator.

No license · updated 9 months ago

Installation

Unfurl is just one executable, unfurl (unfurl.exe on windows). It can be downloaded here.
You may need to give it executable permissions on linux:

chmod +x ./unfurl

After that you can just run the executable in a terminal and it will tell you what to do.

The executable contains explainers on every subject, under various -explain flags. The rest of this readme consists of those explainers (generated by unfurl itself!), in case you're not near your computer at the moment. The -explain flags have syntax highlighting (which this readme does not have), so I definitely recommend using them.

1- Unfurl Explanation

Unfurl is really nothing more than an elaborate substitution machine.
You input rules:

a := b;

And queries:

bbb:

And the evaluator applies the rules to the queries:

aaa

Interestingly enough, there is an enormous wealth of complexity that can be extracted from this simple paradigm.
For example, some rules and inputs have multiple different combinations. The evaluator always evaluates every single possible combination.
As an example:

aa := b;
aaa:

Has two possible solutions:

ba
ab

A more elaborate example:

aa := u;
aa := v;
aaa:

Results in four possible solutions:

ua
au
va
av

The evaluation continues applying rules until no more changes happen. This has two implications.

  1. Infinite evalution is possible:
n := nn;
n:
  1. The rules are applied to 'intermediate' results. This is where most of the real useful complexity of Unfurl comes from.
    As a basic example:
N0 := 1;
N1 := 2;
N2 := 3;
NNN0:

Results in:

3

When running the evaluator with -all-states, it prints all the intermediate steps as well.

NNN0, intermediate
NN1, intermediate
N2, intermediate
3, solved

1.1- Writing .furl files (aka my parser is not so smart so don't shake it too much it'll throw up)

The only valid input is ASCII.
All whitespace is ignored and removed (so spaces will not show up in rules, queries, or solutions).
Comments can be placed between parentheses (like so (they can nest)).

Rules consist of an argument, followed by := and a result, ending with ;:

Arg := Res;

Queries consist of the query string, ending with :;

Query:

Arguments, results, and queries can be any ASCII string besides whitespace, nor can they contain :, =, ;, (, ).

When calling the evaluator on multiple files at once, all their rules and queries are combined into one larger evaluation pool.

2- Turing Completeness

Unfurl is turing complete. The reason why anything is Turing complete is always somewhat vague, since it comes down to "Can you do the things that a Turing machine can do?"
The standard way to go about proving this is to provide a method to simulate any Turing machine, or to simulate something else that can simulate any turing machine.
In Unfurl's case, both have been done. The latter via an implementation of rule 110, and the former will be outlined in here. Furthermore, we will look at some important qualities of Unfurl that make these things possible.

2.1- Important Qualities of Equivalence

2.1.1- The Readhead

Turing machines sit on a tape and move cell-by-cell. Unfurl evaluates one tape in full, but in order to apply rules, it subdivides the tape into smaller 'subtapes' to compare against. In fact, it subdivides in all possible ways, which means that there is always a set of 'subtapes' that look like a 'scanning window' of the full tape: just like the one a Turing machine sees. Let's take the following rule and query:

a := >A;
aaa:

When we look at the intermediate results, we can see that the first three applications appear as if there is a readhead going through the tape, capitalizing the letter it looks at.

>Aaa, intermediate
a>Aa, intermediate
aa>A, intermediate
...

Of course, the evaluator continues, because we did not fully constrain it.
To constrain it, we can include the readhead in the query, as well as including it in every rule, so that evaluation can only happen wherever the readhead is.

>a := >A;
>aaa:
>Aaa

Now this readhead doesn't move on its own, but making it 'walk' can be easily done from here:

>a := A>;
>aaa:
>aaa, intermediate
A>aa, intermediate
AA>a, intermediate
AAA>, solved

2.1.2- The Unbounded Tape

Turing machines sit on an infinite tape. At least, they are not principially bounded. Unfurl begins with a finite tape, but there are many possible rulesets that expand the tape during evaluation, since the length of a rule's result can be larger than its input:

a := bbb;
aaa:
bbbbbbbbb

A good way to make use of this is to have 'end markers' on the tape, and some rules to expand the tape when required. Using our readhead from the previous section, we can make an orderly and infinitely expanding tape:

>a := a>;
>| := >a|;
>aaa|:

2.1.2- The Internal State

Finally, Turing machines have internal state. Unfurl does not have this at all. However, that is not the end of the line; really the important part is the state of the whole system, tape and 'internal state' (or M-configuration) combined. This all-encompassing state can be easily transposed onto an Unfurl tape. The most straightforward way is to include the M-configuration next to the earlier defined readhead. In the following example, 1, 2, 3, and 4 are M-configurations.

1> a := a 2>;
2> a := a 3>;
3> a := a 4>;
1> aaaaa:
aaa4>aaa

A careful reader might notice that the M-configuration can serve as the readhead itself, but it is good for readability to give an indication of which symbol is currently 'under' the readhead.
Notably, this last example is already a pretty direct transcription of a Turing machine. Indeed, we now have all the tools to start generalizing.

2.2- General technique for converting a Turing machine to Unfurl substitution rules

There is of course more than one way to do this, but the technique laid out here approaches optimal.

2.2.1- Query formatting

The query can consist of any number of symbols from the alphabet, but should have 'end-markers' on both ends. For example:

|001010|:

Taking q to be the initial M-configuration, the readhead on the tape is formatted like so:

{q>:
|001 {q>0 10|:

The spacing is only to help us see the current symbol 'under' the readhed. { is added to the readhead for purposes that will become clear later, but for now it also serves to seperate between readhead and tape even more.
A tape consisting of only empty cells can be formatted as follows:

|{q>|:

2.2.2- Transcribing Rules

Essentially, it is a simple translation of the 5-tuple state table.
Take the M-configuration (q) and the scanned symbol (s) on one side, the symbol to print (p) and the next M-configuration (n) on the other.

{q> s := {n> p;

However, the readhead often also needs to move. When this is a move to the right, this is done simply by moving the readhead right of the printed symbol in the rule result:

{q> s := p {n>;

When moving left, we flip the readhead. The next section will explain how that is dealt with.

{q> s := <n} p;

2.2.3- Boilerplate

Next to the state-table, there are two (and a half) more things that need to be arranged before a functional Turing machine can be implemented.

The first is the tape expansion, as outlined in 2.1.2, including expansion to the left:

>| := > 0|;
|< := |0 <;

We use 0 as the 'empty cell' symbol here, which can of course be changed.

The second thing to be arranged is the flipped readhead from earlier. To define any move of the readhead, that move has to be defined for every symbol in the alphabet of the Turing machine, for every M-configuration.

s <q} := {q> s;

The amount of rules here is the same as the amount of entries in the 5-tuple state table, but these left-moves place a lot less strain on the mind, as there is no state change or printed symbol.

The last half thing is to define what halting looks like, as well as producing intermediate steps for the final solution set. Defining the following rule for every M-configuration will split off the intermediate steps:

{q> := [q>;

This convention can also be used to extra clearly denote when the machine has halted:

{f> s := [H> s;

f here is the final M-configuration that results in halting.
But we can optionally choose for halting to mean the complete deletion of the readhead:

{f> s := s;

2.2.4- Total Amount of Rules and Limits Thereof

Let Q be the total amount of M-configurations, and let G be the total amount of symbols in the alphabet (including the 'empty cell' symbol).
The given technique requires R amount of rules for such a machine:

R = 2QG + 2

This does not count the rules required to split off intermediate sets (as that is more of a debugging feature), but it would add Q amount of rules.

The given value of R is also the minimum amount of Unfurl substitution rules that a transcribed Turing machine can have.
There are proofs and some arguable exceptions for this, but for now this is presented as fact, with proof by intimidation.
Feel free to contact me with any questions or objections.

3- What is this good for?

I wanted to make a really simple language that I could practice nontrivial multithreading on.
After I'd made the reference interpreter, I realized that it's really fascinating to have a single-operator turing complete language, and it's a fun sort of puzzle activity to play with.

This being said, at some point I would like to make a 'program searcher', that can find rulesets to solve problems on its own.