Variables and variable naming in penguin
----------------------------------------

In the original implementation, all the variable names in penguin were
just strings, like in any other interpreter. But it turned out that
the vast majority of them (especially at the SMT level) were pasted
together from components. So rather than just arranging for the
pasting to produce unique strings, I decided to make the naming
explicit and carry around the qualifiers.

At the logic level there are three kinds of variable names:

- "plain" variable names are absolute, global, unqualified, etc. They
are just a string.

- "control" variable names are an int and a string. The int is the
instruction number; the string is the per-instruction name. This can
be thought of as N different variables, one for each instruction; it
can also be thought of as a single variable instantiated separately
for each instruction. Operand values for instructions, if symbolic,
are control variables and have control variable names. These variables
are per-target-program (hence the name) and are not repeated across
start states in the guess phase of synthesis.

- "state" variable names are also an int and a string, and the int is
still the instruction number. The difference is that these refer to
values related to or derived from machine state, and need to be
instantiated separately for each start state in the guess solver runs.
They are indexed by instruction number rather than program point.
Apparently they were originally indexed by program point, but that
didn't work very well: because we process programs by treating
instructions as transfer functions, we tend to generate scratch
variables related to executing them on a per-instruction basis. And
the state variables that appear at the logic level are exclusively
such scratch variables. Variables for machine state only appear when
lowering to SMT. (And for that matter, these scratch state variables
only come into existence during symbolic execution; they have no place
in specs.)

The type "varname" in the logic has one case for each of these.
(And a VarMap is a map whose key type is "varname".)
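
As a rough illustration of the three cases, here is a minimal sketch.
Only the names "varname" and "VarMap" come from the text above; the
constructor names and the use of Map.Make are assumptions made for the
example and may not match the real definitions.

```ocaml
(* Hypothetical sketch; constructor names are assumptions. *)
type varname =
  | PLAINVAR of string           (* global, unqualified *)
  | CONTROLVAR of int * string   (* instruction number, per-instruction name *)
  | STATEVAR of int * string     (* instruction number, scratch name *)

(* A map keyed by varname, using the polymorphic structural ordering. *)
module VarMap = Map.Make (struct
  type t = varname
  let compare = Stdlib.compare
end)

(* The same int/string pair under different cases is a different key. *)
let example : string VarMap.t =
  VarMap.empty
  |> VarMap.add (PLAINVAR "x") "plain"
  |> VarMap.add (CONTROLVAR (3, "rd")) "control"
  |> VarMap.add (STATEVAR (3, "tmp")) "state"
```

Keeping the qualifiers as structured data means lookups compare the
instruction number and the string separately, instead of relying on
pasted-together strings being unique.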


In symbolic.ml there is a type for symbolic values: symbolic, which is
either a concrete value or a symbol name. This is used in the
representation of symbolic instructions, and the symbol name is always
a control variable name. Consequently the symbol name ("symi") is a
pair with a number and a string; the number is the instruction
number. (The string is generally the name of an instruction field.)

In the long run when things are better organized this symbolic type
should probably be moved out of support/ because it's really part of
the instruction representation and not a standalone thing.

Anyhow, the type "symi" (symbol with int) is the non-type-tagged part
of a control variable name.
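
A minimal sketch of what this might look like follows. The names
"symbolic" and "symi" come from the text; the constructor names, the
type parameter, the "insn" record, and the "describe" helper are all
made up for illustration.

```ocaml
(* Hypothetical sketch; only "symbolic" and "symi" are names from the text. *)
type symi = int * string   (* instruction number, instruction field name *)

(* Either a concrete value or the name of a control variable. *)
type 'a symbolic =
  | CONCRETE of 'a
  | SYMBOLIC of symi

(* A made-up instruction with one register operand and one immediate. *)
type insn = { rd : int symbolic; imm : int symbolic }

(* Instruction 2 with a symbolic destination and a concrete immediate. *)
let example : insn = { rd = SYMBOLIC (2, "rd"); imm = CONCRETE 4 }

(* Render an operand for debugging. *)
let describe (show : 'a -> string) (v : 'a symbolic) : string =
  match v with
  | CONCRETE x -> show x
  | SYMBOLIC (n, field) -> Printf.sprintf "insn %d, field %s" n field
```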


At the SMT level there are five kinds of variable names:

 - "plainvar", like logic-level plain names and derived from them;
just a string.

 - "controlvar", like logic-level control names and derived from them:
an instruction number and string.

 - "statevar". These have a start state id as well as an instruction
number and string, and come from the logic-level state variable names.
They are instantiated separately for each start state, unlike plain
and control variables.

 - "machinestate". These have a start state id and a program point and
a string. They arise from ordinary state references in the logic.
These are instantiated separately for each start state.

 - "golemvar". These have a start state id and a string. They arise
from golem state references in the logic. They are instantiated
separately for each start state, but only once across the whole
symbolic program.

A start state id can be one of:
- INIT (an initial state for beginning synthesis)
- STATE (an ordinary, concrete, start state for a guess phase)
- VERIFY (the abstract/symbolic start state for a check phase)

There can be multiple initial states because, to get an additional
distinct initial start state, the initial start states already found
have to be fed in. This was the subject of experimentation at one
point; the experimentation wasn't successful, so perhaps the support
should be ripped out.
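
The five SMT-level kinds and the start state ids could be sketched
roughly as follows. The constructor names mirror the kind names above,
but the exact payloads are assumptions; in particular, whether INIT and
STATE carry an index, and the "instantiate_statevar" helper, are
invented for the example.

```ocaml
(* Hypothetical sketch; payload layouts are assumptions. *)
type startstate =
  | INIT of int    (* one of possibly several initial states *)
  | STATE of int   (* a concrete start state for a guess phase *)
  | VERIFY         (* the abstract/symbolic start state for a check phase *)

type smtvarname =
  | PLAINVAR of string
  | CONTROLVAR of int * string                 (* instruction number, name *)
  | STATEVAR of startstate * int * string      (* start state, insn number, name *)
  | MACHINESTATE of startstate * int * string  (* start state, program point, name *)
  | GOLEMVAR of startstate * string            (* start state, name *)

(* Instantiate one logic-level state variable once per start state. *)
let instantiate_statevar (states : startstate list) (insn : int)
    (name : string) : smtvarname list =
  List.map (fun s -> STATEVAR (s, insn, name)) states
```

Plain and control names carry no start state id, so they come out the
same no matter how many start states are in play.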

Note that the previous (as of 11/2020) incarnation of SMT-level
variable naming also had a "qualifier" field for many (perhaps all)
variable name types. This was used by the code in smtcanon.ml that
turns math ints and machine ints into explicit systems of booleans.
However, smtcanon.ml is currently disabled and I don't think it was
ever used. If it's ever brought back, additional support for the
qualifiers will be needed, because they need to be read out of the
solver results. (Basically the code in question converts each variable
into one variable per bit, so it appends the bit number to the
variable name.)


Note also that currently the namespaces in the variable names output
to the solver do not seem to be protected. For example, if the name of
a variable begins with "S0", that's treated as the start state ID, but
there's nothing to escape or distinguish user-supplied variable names
that begin with "S0". This is probably waiting to break the first
time someone uses silly variable names in an input spec. XXX.
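
To make the hazard concrete, here is a small sketch. The output format
("S<id>_<insn>_<name>") and all four functions are made up for
illustration and are not the real printing code. The tagged variants
show one possible guard, which is not something the code currently
does.

```ocaml
(* Hypothetical name printers; the real output format may differ. *)
let naive_plain (name : string) : string = name

let naive_state (stateid : int) (insn : int) (name : string) : string =
  Printf.sprintf "S%d_%d_%s" stateid insn name

(* One possible guard: give each kind of name its own reserved prefix,
   so a user-supplied plain name can never look like a state name. *)
let tagged_plain (name : string) : string = "P_" ^ name

let tagged_state (stateid : int) (insn : int) (name : string) : string =
  "V_" ^ naive_state stateid insn name

(* A silly user-supplied plain name collides with a state variable... *)
let collides = naive_plain "S0_3_tmp" = naive_state 0 3 "tmp"

(* ...but not once every name carries its kind prefix. *)
let still_collides = tagged_plain "S0_3_tmp" = tagged_state 0 3 "tmp"
```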
