Friday, June 7, 2019

The Semantic Trinity

I've been thinking a lot recently about what people mean when they use the word "semantics", specifically in relation to programming languages. If you go on Wikipedia, you'll find many different kinds of semantics described that are used in computer science.

A macroscopic view of the work that compilers do is that it's simply a translation of semantics that are ergonomic for humans into semantics that are ergonomic for machines. From this view, we get a 2-Semantic model. To put it more concretely, we have one set of semantics for source code (let's call it HS, for high-level semantics) and one set for executing on hardware (let's call it LS, for low-level semantics).

One of the problems of computer science (and also a reason why it's exciting) is that the field is constantly changing. This means that both HS and LS designs change. With the 2-Semantic model we are constantly throwing away old code that works, either because it makes assumptions about hardware that are no longer true, or because it's written in old languages that don't interface with new languages.

Many modern languages have moved to more of a 3-Semantic model that utilizes a semantic layer between HS and LS, which we can call MS. We've entered an era of virtual machines and abstract runtimes. This buffer layer makes it easier to swap out the semantics on the human side or the machine side while still preserving the usefulness of legacy code. But what about the design of the semantics for this middle layer?

There are many modern choices for MS. Most of them come from a clear Turing machine heritage, like the JVM, CLR, and LLVM. Some functional languages choose a lambda-calculus-based MS, although there doesn't seem to be any standardized MS shared between functional languages (hopefully GRIN will change that).

Semantics that are easy for humans to understand come with human baggage, and semantics that are easy for machines to understand come with machine baggage. What about developing semantics that have as little baggage as possible? That seems to be the domain of mathematics, although the mathematical community has its own baggage.

I would like to see an MS that lasts 50-100 years and becomes almost universal among languages. I don't think any of the current layers are viable for that sort of longevity or mass adoption.

So what are the problems with the current MS designs?
  1. They don't capture semantics from HS that are relevant to optimization in LS
  2. They expose too many low-level details that aren't relevant to high-level languages
  3. They are more complex than they need to be
  4. They are biased towards particular HS designs
I think it's possible to make that long-lived MS, and I'm working on an attempt to do so, although it would involve some compromises to the HS layers. Hopefully most language designers will agree with the problems I've laid out, because what I will propose as an alternative is fairly radical.

Monday, April 9, 2018

Deconstructing Lambdas, Closures and Application

New Core Grammar

I've removed App and Closure from the core grammar and in their place added SetEnv and Defer. I believe this change will make optimizations easier.

The new grammar is:
  • Zero
  • Pair
  • Env (was Var)
  • Abort
  • Defer (replacing and similar to Closure)
  • SetEnv (replacing and similar to App)
  • Gate
  • PLeft
  • PRight
  • Trace

A Naive Implementation of Lambda Calculus

SIL started off as an implementation of simply-typed lambda calculus, and to understand its current design, it's helpful to know what it was when it started.

Most naive implementations represent variable identifiers as strings in the AST, and the runtime keeps track of a stack of unbound identifiers as well as a map of bound identifiers and their values.

So, for example, in

  (\y -> (\x -> x + y)) 1 2

the runtime would:
  1. process lambda x, pushing "x" onto the stack and creating a closure
  2. process lambda y, pushing "y" onto the stack and creating a closure
  3. take 1, pop "y" off the stack, add an entry "y:1" to the bound-variable map, and return the closure from step 1
  4. take 2, pop "x" off the stack, add an entry "x:2" to the bound-variable map, and evaluate the closure
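This naive scheme can be sketched in Haskell. As a simplification, a single association list plays the role of both the stack and the bound-variable map; the names (Term, eval, and so on) are illustrative, not SIL's actual code:

```haskell
-- Naive lambda calculus: identifiers are strings in the AST, and the
-- evaluator threads an environment mapping names to values.
data Term
  = Var String
  | Lam String Term
  | App Term Term
  | Lit Int
  | Add Term Term
  deriving Show

data Value = VInt Int | VClosure String Term [(String, Value)]

eval :: [(String, Value)] -> Term -> Value
eval env (Var x)   = maybe (error ("unbound " ++ x)) id (lookup x env)
eval env (Lam x b) = VClosure x b env            -- capture the environment
eval _   (Lit n)   = VInt n
eval env (Add a b) = case (eval env a, eval env b) of
  (VInt x, VInt y) -> VInt (x + y)
  _                -> error "type error"
eval env (App f a) = case eval env f of
  VClosure x b cenv -> eval ((x, eval env a) : cenv) b  -- bind the argument
  _                 -> error "not a function"

asInt :: Value -> Int
asInt (VInt n) = n
asInt _        = error "not an int"

-- (\y -> (\x -> x + y)) 1 2  evaluates to 3
example :: Value
example = eval [] (App (App (Lam "y" (Lam "x" (Add (Var "x") (Var "y"))))
                            (Lit 1))
                       (Lit 2))
```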
I didn't want to have strings in my AST, though. My goal has always been to keep the core grammar and an interpreted runtime for SIL as simple as possible, so I don't want to store strings or integers in the AST. Since combinations of Pairs and Zeros are sufficient to represent strings, integers, and pretty much any other type of data, I stick with them.

So my initial implementation used De Bruijn indices so that variables could simply be represented by integers encoded in SIL Pair/Zero combinations.
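One possible Peano-style encoding of such integers can be sketched like this (illustrative only; SIL's actual encoding may differ):

```haskell
-- Natural numbers as Pair/Zero combinations, so De Bruijn indices need
-- no Int or String in the AST.
data SExpr = Zero | Pair SExpr SExpr deriving (Eq, Show)

encode :: Int -> SExpr
encode 0 = Zero
encode n = Pair (encode (n - 1)) Zero

decode :: SExpr -> Int
decode Zero       = 0
decode (Pair l _) = 1 + decode l
```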

Closure vs Lambda

When I first started SIL, the runtime representation was exactly the same as the input grammar, except it had Closures. In a fit of concision obsession, I wondered if I could make the two representations the same.

What is the difference between a lambda and a closure with an empty environment? I decided there wasn't any compelling difference, and removed lambdas from the grammar.

Then I decided to represent closure environments in the core grammar. The environment is just a stack of values to be assigned to variables, and because the core grammar already had a Pair representation, I could nest Pairs to create a stack, terminated with a Zero.

Once a stack of bound variables is available, referring to a particular variable is simply a matter of how you traverse the stack. Going to the right is equivalent to popping a variable off the stack; going to the left is equivalent to selecting a variable. It's similar to De Bruijn indices, where the index number is the number of rights taken in the path through the stack.
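This traversal can be sketched directly. Constructor names follow the post's grammar, but the datatypes themselves are illustrative:

```haskell
-- A De Bruijn index n becomes n PRight steps followed by one PLeft,
-- applied to the environment.
data Expr = Env | PLeft Expr | PRight Expr deriving (Eq, Show)

var :: Int -> Expr
var n = PLeft (iterate PRight Env !! n)

-- Interpreting the path against a stack encoded as nested pairs,
-- terminated with a zero value.
data Val = VZero | VPair Val Val deriving (Eq, Show)

run :: Val -> Expr -> Val
run env Env        = env
run env (PLeft e)  = case run env e of { VPair l _ -> l; VZero -> VZero }
run env (PRight e) = case run env e of { VPair _ r -> r; VZero -> VZero }
```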

So a lambda is equivalent to a closure with a Zero for the stack, representing empty.

The Mechanics of Function Application

With the closure environment represented explicitly, I could see the rules of how function application changed the stack. I noticed that rearranging the stack is a matter of traversing the existing stack and building a new stack, both of which could be largely accomplished with the existing Core Grammar.

So rather than have an instruction that both adds something to the closure stack and then evaluates the closure, why not have an instruction that simply evaluates a Pair of a closure and its environment? Then you can pass in any environment you want. This is what SetEnv does.

At this point the Closure representation is a pair of a function and an environment built for it. Since it's a pair, why not just make it a Pair and remove Closure from the grammar?

But what is the function part of what was a Closure? It turns out it can simply be something that isn't evaluated immediately, but only when later an environment is given to it. Thus we get Defer, which doesn't evaluate what is inside it, but returns it verbatim.
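These two instructions can be sketched as a toy evaluator over a reduced grammar (illustrative Haskell; SIL's real evaluator covers the full grammar):

```haskell
-- Defer returns its body unevaluated; SetEnv expects its argument to
-- evaluate to a Pair of a deferred body and an environment, and then
-- evaluates that body in that environment.
data IExpr
  = Zero
  | Pair IExpr IExpr
  | Env
  | Defer IExpr
  | SetEnv IExpr
  | PLeft IExpr
  deriving (Eq, Show)

eval :: IExpr -> IExpr -> IExpr
eval _   Zero       = Zero
eval env (Pair a b) = Pair (eval env a) (eval env b)
eval env Env        = env
eval _   (Defer e)  = e                      -- return the body verbatim
eval env (PLeft e)  = case eval env e of
  Pair l _ -> l
  _        -> Zero
eval env (SetEnv e) = case eval env e of
  Pair f newEnv -> eval newEnv f             -- run deferred code in its env
  _             -> Zero
```

For example, applying the deferred expression `PLeft Env` to an environment stack whose first value is `Pair Zero Zero` returns that first value.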

The Good Part

So what's the point? Why bother with these trivial changes?

Every step of the way through these changes, I made sure that the original lambda calculus could be represented in the new system, and still functioned correctly.

In the new grammar, a lambda is represented by a Pair containing its body wrapped in Defer, and Env. When this Pair is evaluated, it "binds" the environment by evaluating Env.

But this also leaves open the possibility for a new type of lambda: one that does not bind the existing environment. By changing Env to Zero, we can guarantee that functions are self-contained.

In fact, using this new grammar, it's very easy to tell if a given expression has unbound variables: if the AST has Envs that aren't underneath a SetEnv.

Partial evaluation is now very simple - just traverse the AST and evaluate any sections where variables are fully bound.
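Taken literally, the unbound-variable check can be sketched like this. The datatype is illustrative, and the rule implemented is exactly the one stated above (an Env counts as bound once it sits underneath a SetEnv):

```haskell
-- An expression is "closed" iff every Env occurs underneath a SetEnv,
-- since SetEnv is what supplies an environment.
data IExpr = Zero | Pair IExpr IExpr | Env | Defer IExpr | SetEnv IExpr
  deriving (Eq, Show)

closed :: IExpr -> Bool
closed Zero       = True
closed (Pair a b) = closed a && closed b
closed Env        = False        -- a bare Env is an unbound reference
closed (Defer e)  = closed e
closed (SetEnv _) = True         -- everything beneath a SetEnv is bound
```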

Errors at Runtime, or Before?

The new Abort instruction allows exceptions, but also other types of errors. Runtime exceptions are the worst kind of errors, as they either can't be handled by the program, or are often handled in some sort of ad hoc way that may not be appropriate for the specific exception that occurred.

It's clear that catching errors as early as possible is desirable. Usually programmers use unit tests or possibly sophisticated type checking to catch errors.

With a partial evaluation pass in SIL, all AST sections containing Abort instructions that don't depend on unbound variables will be processed, thus giving a sort of "compile time" error catching mechanism.

We can also lift error checking sections of the code out of the rest of the AST, so that anything that would cause a runtime error can be processed after input but before any other calculations are made, possibly returning faster feedback to the user.

With SIL frames, if we save the old processing state before processing new input, then if the new input generates an error, we can return the user to the old state and inform them that they need to provide different input to avoid the error.

Of course, there will still be situations where any possible user input will generate an error, and those situations are unsolvable (unless we want to go further back and save even older program state to restore to).

Thursday, September 28, 2017

SIL: Explorations in non-Turing Completeness


I want to demonstrate that Turing Completeness is unnecessary for general purpose computing, through something I call SIL.

SIL either stands for Stand-In Language, or Simple Intermediary Language. I intended it to stand for one thing, then forgot what it was, then backronymed the other meaning.

SIL has a very simple core grammar, based on the simply typed lambda calculus, consisting of 10 instructions. On top of this is a Prelude.sil file that defines common functions you find in functional programming languages, like fold, map, zip, etc.

SIL's type system is very simple. It's constructed of a generic "data" type, a pair type, and an arrow type (taking a type and returning a type). I'm working on adding refinement typing which will be the key to runtime optimizations.

At the stage I'm at, I believe I have demonstrated that general purpose computing is possible in SIL. For evidence, please look at the tictactoe.sil file, which implements a 2-player (no AI) game. If you're not convinced this is an adequate demonstration, please tell me something else to write!

(note: the master branch is outdated, for the best mix of up-to-dateness and non-brokeness, try the pairType branch)

The Core Grammar

  1. Zero: this can represent the number zero, or a terminator
  2. Pair: this is a pair of other expressions. We can use nested pairs to represent lists, numbers, or really any form of data.
  3. Var: this is misnamed. It actually represents the current closure environment. By manipulating it, we can pull out specific variables.
  4. App: this feeds data to a function
  5. Anno: this is a deprecated instruction, applying one expression encoding a type annotation to another expression, to be evaluated at compile time. In the refinementTypes branch, work is underway to replace this with a more generic "Check" expression that allows checking an expression against a precondition function.
  6. Gate: creates conditional behavior. What is passed in acts as a switch, which needs to be applied to a Pair. If the input is Zero, it selects the left side of the pair; anything else selects the right side.
  7. PLeft: returns the left side of a Pair
  8. PRight: returns the right side of a Pair
  9. Trace: this instruction is simply used for debugging while designing the rest of the language. It echoes the expression within to the console, then evaluates it.
  10. Closure: Originally this was a lambda expression, representing a function. For historical reasons (and possibly future reasons?), I changed this to a closure, but it mostly serves the purpose of a lambda - creating a function.
Using nested pairs we can represent a stack, and this representation is used for closure environments. Values on the stack are the left sides of pairs, and the right sides hold the remainder of the stack. Thus, PLeft Var is an expression representing the first value on the stack, and PLeft (PRight Var) represents the second value.

Functional programmers may be aware that a fixed point combinator can be used to make non-recursive functions recursive. Through this same principle, we can apply a non-recursive function to a church numeral together with a function to return a default value, and get a function that can recurse limited to the magnitude of the church numeral. This is how SIL handles recursion.
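In Haskell terms, the same principle can be sketched like this (a sketch of the idea, not SIL code):

```haskell
-- A church numeral is just n-fold application, so applying it to a
-- non-recursive step function yields recursion limited to depth n,
-- bottoming out at a default value when the budget runs out.
type Church a = (a -> a) -> a -> a

church :: Int -> Church a
church n f x = iterate f x !! n

-- A non-recursive "step" for factorial: given the function to use for
-- recursive calls, compute one layer.
factStep :: (Int -> Int) -> (Int -> Int)
factStep rec k = if k <= 0 then 1 else k * rec (k - 1)

-- Tie the knot up to a fixed depth; past the budget, the default
-- function (const 0) answers instead.
boundedFact :: Int -> Int -> Int
boundedFact budget = church budget factStep (const 0)
```

With a budget of 10, `boundedFact 10 5` computes the true factorial of 5; with a budget of 3, the recursion runs out and the default value surfaces.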


Most practical programs can potentially run forever. How can a language where all expressions terminate be useful?

The key to this, is plugging the core grammar into input/output loop "frames". Each frame takes a pair of the input and program state, and returns a pair of output and program state. If running the SIL expression generates a null program state, the program is aborted, otherwise the state is fed back into the frame, and the program continues executing.
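A pure sketch of this frame loop, driven from a list of inputs for illustration (the real runtime reads input from the outside world; names here are hypothetical):

```haskell
-- Each frame maps (input, state) to (output, next state), where
-- Nothing for the next state means the program aborts.
runFrames :: ((i, s) -> (o, Maybe s)) -> s -> [i] -> [o]
runFrames _     _  []       = []
runFrames frame st (i : is) =
  let (o, next) = frame (i, st)
  in o : maybe [] (\st' -> runFrames frame st' is) next
```

For instance, a running-total frame that aborts once the total exceeds a limit stops consuming input at that point.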


The Parser

Since constructing SIL ASTs by hand to write programs isn't fun, I've written a simple parser. It translates a simple Haskell-like syntax into the AST.

Identifiers (variables) in the syntax are translated to series of instructions that will look up that variable in the relevant closure environment.


The Type Checker

I am in the middle of rewriting the type checker to support refinement types, so I don't want to give too much information here that will be quickly out of date. With refinement types, type checking will be a two stage process.

The key function of the first stage is making sure that the expression will terminate. For each element of the core grammar, we can infer a simple type composed of 4 type elements: ZeroType, PairType, ArrowType, and TypeVariable (which should really be called EnvironmentType).

Since SIL is based on simply typed lambda calculus, when we infer a simple type for an expression, if that type is self-referential (as in the case of a fixed-point combinator), that expression will not terminate, and we should generate a type error.

With refinement typing, the whole SIL language will be made available to validate expressions. Each expression can be paired with a function written in SIL. If the expression has no free variables, it can be processed at compile time, alerting the programmer to semantic errors.

To handle expressions with free variables, we can run an optimization process on the AST to move all refinement checks to the top level of the AST. At the top level, either the refinement checks will depend on no free variables and can be run at compile time, or will be dependent upon non-deterministic runtime values.

If a user of a program written in SIL inputs data that does not pass runtime checks, the error can be reported to the user before any other execution is performed, and the user can instead input different data. This prevents the SIL program from entering many unrecoverable states. The exception (pun unintended) is when there is no possible user input that will pass validation.


Summary

Hopefully this gives a good view of the current state and near future of SIL. I'm hoping for other contributors and/or feedback.

Wednesday, December 14, 2016

A Better Model of Computation

Modern software engineering is based on problematic fundamentals.

One thing I loved about computers when I was young, was their deterministic nature. If you understood their rules, you could perfectly predict what they would do. However, as the complexity of computers and software has grown, they increasingly behave in practically non-deterministic ways. Programs often crash or suddenly run out of memory, or hang, or get really slow for no apparent reason.

Some of this is unavoidable, since complexity is the enemy of practical predictability. But we could be doing much, much better than we are now.

Computer programs have side effects that we care deeply about, such as memory use, CPU time utilization and even heat generation. But if you ask a programmer how many resources their program uses, you'll be lucky if they can give you even the vaguest of answers. We measure all these effects at a top level, and hope that our benchmark runs are representative of real world use. What we should be doing is using the computer to measure these things from the bottom up.

What we should have, is a world where we know exactly how long a computer program will take to execute down to the nanosecond, given specific inputs.

But wait, doesn't that involve solving the halting problem?

Yes, but only if we're using Turing complete languages. That's why we need to abandon Turing completeness as our foundation paradigm.

To illustrate my perspective, we can segregate the Turing Machine model into two parts, a terminating pure function taking a state and returning a state, and a simple fixed-point combinator for feeding the pure function into itself. The fixed-point combinator is trivial and not worth talking about; all the interesting logic lives in the terminating part.
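A miniature version of this segregation, with a trivial countdown machine standing in for the interesting logic:

```haskell
-- All the logic lives in a terminating pure step function; Nothing
-- signals "halt". (A countdown stands in for a real state machine.)
step :: Int -> Maybe Int
step 0 = Nothing
step n = Just (n - 1)

-- The trivial fixed-point-style driver: the only place where looping,
-- and hence potential non-termination, lives.
run :: (s -> Maybe s) -> s -> s
run f s = maybe s (run f) (f s)
```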

The pure functional programming community has been working on many such segregations of deterministic and non-deterministic code already. The FRP model (if you get past the continuous time part, which distracts from my main point) shows that we can use a state machine to bridge the IO behavior of a computer and the pure functions that actually define what we care about in software. It's a small jump from there, to eliminating non-terminating pure functions entirely.

Of course, perfect deterministic behavior relies on an open hardware model, which unfortunately is not the case in our current world. It would take the compiler having a perfect model of the CPU it was compiling for, with all its caches, branch prediction and instruction prefetch circuitry modeled in software. But even without that, the terminating paradigm should allow compilers to give us bounding predictions that are much more detailed than current big omega metrics.

So if this model of computation is truly better, how to go about demonstrating it?

My plan is to develop a simply typed lambda calculus, marry it to an FRP framework, and write a simple game like tic-tac-toe in it. After that, I want to add in refinement typing and work it into the FRP framework so that inputs that would violate preconditions are not allowed rather than generate runtime errors.

There is much more I could talk about, such as implications for operating system design, opportunities for optimization and how to transpile existing languages into a terminating model, but I'll wait until I have a working proof-of-concept to write about what comes next.

Thursday, September 18, 2014

The Stupidity of the Blockchain Fragmentation

The world is now filled with cryptocurrency fans, which would be great, except we're cannibalizing our own movement. It's Bitcoin against BitSharesX against Ethereum against Ripple... we've created a situation where we're unwilling to let the others succeed, because it threatens our investments.

Our infighting is unnecessary.

I invested in the Ethereum ICO. Shortly after making my investment, I stumbled on a proposal on bitcointalk. I'm glad that my money has gone towards rewarding Vitalik Buterin and all the rest of the development team for their efforts, but I feel I have to take a public stand now, and say I hope Æthereum succeeds, and Ethereum does not, even though I own a greater proportion of ether than bitcoins.

For those who don't have the patience to read the Æthereum proposal, basically it's all the same code, but a different blockchain. This different blockchain will contain a snapshot of all bitcoin address balances at a certain time, and a mechanism with which bitcoin holders can use their private (bitcoin) keys to retrieve a proportional amount of ether. So bitcoin holders can upgrade to a (possibly) superior technology and keep their original investment safe.

All new cryptocurrencies need this ability. If you see any new ICO that does not give bitcoin holders an automatic stake, you should consider it a scam. Maybe the developers have some quixotic scheme for a more equitable initial distribution of coins, but I'm going to make the judgement that it's unlikely to pan out in any form for any coin.

Let me make it clear that I do support developers and initial investors reaping a reward. But I think that can be safely capped at 10-20% initial stake, with bitcoin holders getting the rest.

Cryptocurrency investors should only need to invest once in the idea of cryptocurrency, in archetypal bitcoin, and be able to reap the rewards no matter how the technology changes. It's not the blockchain that matters, it's the balances; it's the weight of emotion: hope and, yes, greed, symbolized in electronic numbers.

Bitcoin sooner or later will be replaced, and we need to have an orderly exit. We can't just hack anything on top of a bitcoin blockchain in perpetuity and expect security and scalability.

We need to stop doing things like BitShares PTS. We already have a blockchain indicating interest in cryptocurrency, and it's called bitcoin.

I really, really hope this post or something similar said in a better way gets popular and permeates the entire cryptocurrency world.

Sunday, April 19, 2009

Learning Happstack

I've been trying to shoehorn Haskell into my job for a while now. Occasionally I'll use it for small command-line tools, but usually it doesn't save any time in such a scenario, so it's hard to justify.

But we do have some internal webapps that need updating. Some of them do horrible things. For instance, to create a customer account, a webapp will write a bunch of raw shell commands to a database where they will be later pulled out by a daemon and run directly. I dunno why this was done, poor-man's threading perhaps.

I'm not touching that particular webapp, since I'm afraid modifying it will unravel more than I hope to correct, but I'm hoping I could shoehorn in some happstack apps in a few other places.

So let's jump right into some code I wrote to try out a simple templated app:


> import Happstack.Server
>    (ServerPartT, Response, dir, toResponse, simpleHTTP, Conf (Conf))
> import Happstack.Helpers (HtmlString (HtmlString))
> import Control.Monad.Trans (lift)
> import Control.Monad (msum, liftM)


If you're going to use happstack, it seems you have to be pretty comfortable with navigating between different levels in composite monads.


> import Text.StringTemplate.Helpers (renderTemplateGroup)
> import Text.StringTemplate (directoryGroup)


It seems that HStringTemplate is the standard way to template HTML in happstack. I think that using '$' as brackets is pretty ugly in HTML. Ideally the template language should mix well with HTML. I've wondered if this couldn't be done with a modified version of html comments (<!--* template code *-->). It's rather verbose, but it makes for valid HTML before the template is rendered. But this is a small complaint.


> exampleMethod :: IO [(String, String)]
> exampleMethod = return $ [("hello", "Hello, World.")]
>
> methodsAssList :: [(String, IO [(String, String)])]
> methodsAssList = [("exampleMethod", exampleMethod)]


I wanted to show in this example how requests for different paths could be handled by different methods. In this example that handling is very simplistic. I think a web framework should have a standard way of handling requests, like breaking the request into a path parameter, a request method and a map of string request variables. I haven't discovered such a thing in happstack yet.


> makeMethods :: IO [(String, IO String)]
> makeMethods = do
>    templates <- directoryGroup "templates"
>    let
>       makeRenderTemplateFunction (name, assListM) = let
>             renderFun = do
>                assList <- assListM
>                return $ renderTemplateGroup templates assList name
>          in (name, renderFun)
>
>    return . map makeRenderTemplateFunction $ methodsAssList


This function is used because the HStringTemplate library can load a group of templates in a single directory at once with "directoryGroup". Then we can translate each method name to a name of a template (i.e. templates/exampleMethod.st)

So all exampleMethod above does is give an association list of variables to replace in the template, and their values.


> fillRequest :: (Monad m) => String -> m String -> ServerPartT m Response
> fillRequest name method =
>    dir name . lift .
>       liftM (toResponse . HtmlString) $ method


This simply associates a method with a path from the root.


> responses :: IO (ServerPartT IO Response)
> responses = liftM (msum . map (uncurry fillRequest)) makeMethods


Now we map each request path to its method's response, and group these responses together into one.


> main = responses >>= simpleHTTP (Conf 8080 Nothing)


Finally, we set up the server with some basic settings (i.e. the server responds on port 8080), and set the response.

The only other thing necessary is writing the template at "templates/exampleMethod.st":

<html><body><h2>$hello$</h2></body></html>