Thursday, September 17, 2015

Domain Weirdness

If you're seeing this page, you probably clicked an old link! I'm working on migrating to Hakyll, but right now things aren't quite working correctly, and all links to blog.gallabytes.com/* are getting redirected to here.

Tuesday, September 8, 2015

Types in High School: A Reflection

1. What's this post about?
2. My story
3. What advice do I have for high schoolers?
4. Anything else?

1 What's this post about?

I've been in a reflective mood lately. I'm about to move to Santa Barbara for undergrad (I'll be majoring in math in UCSB's College of Creative Studies, a wonderful program that more people ought to know about), and I've been thinking a lot about the past year.

This post grew out of a conversation with Bob Constable at OPLSS, in which he asked me to write about my experience learning type theory in high school.

2 My story

2.1 Getting into type theory

I spent junior year immersed in functional programming, but I carefully avoided the world of dependent types. They sounded shiny, but I hadn't yet sat down and read about them. This began to change when I was at the Mathematica Summer Camp, working on a project implementing different lambda calculi in Mathematica. After having extreme difficulty, I decided I should start learning about the theories behind the functional languages I so enjoyed. Thankfully, one of the instructors there also knew a fair bit about type theory, and on his advice I began to read the HoTT book.

At first, I found it to be impenetrable. Hoping to get a better understanding of the theory behind HoTT, I went searching for gentler intros. I toyed around with Oxij's Brutal Intro to Agda, and this course from Conor McBride at Cambridge, but I still didn't really know what I was doing. Eventually, I settled on working through Software Foundations. It took a long time, but SF was by far the most important book in my development as a mathematician. The interactivity, the pacing of the book, and the thoroughness of the material grew me both as a programmer and a prover. I still recommend it to programmers who haven't done any serious proof-based mathematics and to mathematicians who want a more concrete intro than the HoTT book.

After Software Foundations, I read Adam Chlipala's brilliant book Certified Programming with Dependent Types. I enjoyed the automated style of the book. Where SF taught me how to prove theorems, CPDT taught me how to do it elegantly. I was finally starting to get a strong intuition for how to actually use dependent types.

2.2 Becoming a type theoretic native

Somewhere around this point I decided I was ready to start learning about HoTT again. The correspondence hadn't stopped being mysterious to me, and I was finally ready to change that. Lacking the general mathematical background it seemed to expect, it was a bit of a slog (and I still don't really understand much of section 2). Most of my leaps in understanding came from catching the spatial meaning of concepts that I only knew abstractly. In particular, once I realized that path induction was just contracting paths, everything made sense.

I ended up giving a talk on HoTT for the SF Types, Theorems, & Programming Languages meetup in May. I had some experience teaching abstract math to non-mathematicians at my high school (an odd place that I ought to say more about at some point), but this was to be my first real lecture. Probably the most important thing I learned was the absolute necessity of giving people spatial intuition wherever possible. An algebraic characterization of path induction lacks the force of its spatial meaning - that segments are contractible. When I explained it that way, suddenly things made a lot of sense to people. In particular, it makes the unprovability of K obvious (a topic which I spent a long time confused about).

A month later I had graduated high school and was off to OPLSS, where I met a bunch of interesting people and learned a lot of interesting things.

2.3 OPLSS

I honestly don't know where to start. It was the first time I met people who were doing active research in type theory, and it was an extremely valuable experience. Whether I was learning category theory from Ed Morehouse (wonderfully complete lecture notes here), hearing about the foreign land of computational type theory, or pushing the limits of inductive recursive definitions, every minute was filled with interesting conversation and new things to learn about.

Part of the magic of the whole program had to do with the range of people there. The crowd ranged from people who'd never seen a dependent type before to professors doing research in type theory. No matter the topic, if you had a question about something type theory related there would be someone you could ask about it. I can't recommend it enough to anyone interested in learning more about type theory.

It was at OPLSS that I began talking with Nate Thomas about applying type theory to work at MIRI.

2.4 A first real project

It didn't take long for those conversations to turn into projects. A couple weeks after I got back from OPLSS I began an internship with MIRI working on systems for doing provability logic in type theory (some code for which you can find here). I was at once terrified and excited. I'd gotten very comfortable being the kid who knew more than people expected him to. It was the first time people expected me to do something new, not just to know things.

I've leveled up faster during the last few months than any other time in my life. If I had to guess why, I'd point to these reasons:

A good project - working on reflecting type theory has required me to learn more about higher inductive definitions, induction recursion, provability logic, semantics, and different ways to push the capabilities of theorem provers, among other things.
Collaborating - for most of my life, I've been self taught, and anything I've worked on has been similarly isolated. While at MIRI, I've still been fairly independent, but for the first time in years I've reliably had people to talk to about what I was doing and get feedback from. Which brings me to…
The people - even on quiet days, the MIRI office is always filled with smart people who know a lot of things I don't. There's a lot to be said for learning by social osmosis, and I've been experiencing a lot of it.
Scale - I'd never worked on a project of this size before. The closest I'd come was probably a scheme interpreter I wrote back in Junior year while learning Haskell, which, while probably larger in terms of sheer LoC, was also much more guided (by the excellent Wikibook tutorial, Write Yourself a Scheme in 48 Hours).

2.5 Today

My internship is wrapping up, and I'm preparing to move out. During the year, I'm hoping to continue working on reflective type theory in addition to some miscellaneous formalization projects for the Lean standard library.

3 What advice do I have for high schoolers?

3.1 Should you learn type theory?

I would say so. Type theory is like set theory in that the fundamental concepts are fairly simple, but their applications are deep and not yet fully understood. It doesn't take much learning before you can start working on a formalization project, and I would contend that those are some of the best ways to get started in research.

3.2 What should you read to learn type theory?

Here are the sane defaults:

If you have some programming background, start reading Software Foundations, and maybe supplement it with the Lean tutorial.
If you have general mathematical background, pick up the HoTT book
If you have neither, it's really up to you, though I'd recommend learning more about functional programming first. For that I'd recommend SICP (interactive version here).

Another interesting perspective is Bob Constable's not-quite-a-book Naive Computational Type Theory. It aims to do the job of Paul Halmos' Naive Set Theory for type theory, and does a fairly good job. One interesting exercise that I haven't tried yet is to work through all the exercises in JonPRL, a modern and fairly compact implementation of computational type theory.

3.3 General advice on learning math?

Don't trust your math classes about what math is. Unless you're in a very unusual situation (you'll already know if you are) then your math classes are almost certainly not actually math.

Oh, also, don't be afraid to not have a clue what's going on. That never really stops happening.

4 Anything else?

We really, really need better introductory materials for type theory. I'm currently working on another post motivating this and sketching out some approaches for fixing this. Another useful thing would be something along the lines of Stephen Diehl's What I Wish I Knew When Learning Haskell for type theory.

Wednesday, June 4, 2014

Liar Paradoxes and Godel Numbering: An Introduction to Incompleteness

The previous post on recursion had an ulterior motive: laying the groundwork for an explanation of Gödel's Incompleteness Theorems. It took me a long while to fully grok the underlying logic, but through no fault of the theorems themselves. Rather, I've noticed a dearth of good online explanations. The Wikipedia artile is obtuse to say the least, Gödel Escher Bach and Gödel's Proof cost money, and neither of them is web-based anyway. So let's fix that, shall we?
Just a short prewarning: the different parts of this post will feel somewhat disconnected at first, but it'll all come together by the end.

The Liar Paradox
Ah, the Liar Paradox - we've all encountered it in one form or another, but the one most useful to us is this:

> This sentence is a lie

Or, if you like recursive acronym's:

> INIAL's name is a lie

Which expands to

> "INIAL's name is a lie" is a lie

On and on to infinity. The interesting thing about this sentence is that, while it doesn't make sense to call it true, it doesn't really mean anything to call it false either. It is instead stuck in a strange oscillating limbo of contradiction where it's not clear the statement has any meaning at all. In a word, the statement is inconsistent, as is any system that can prove it. This notion isn't yet all that useful, but what we'll find by the end of this article is that it's actually quite portable.

Gödel Numbering
Let's first take a short detour though. At some level, we know we can represent any string of characters with a string of bits - after all, this is exactly what computers do to store text. And, similarly, it's pretty trivial to interpret a string of bits as a number. Now, in and of itself this doesn't seem that revolutionary, but the reducibility of strings to numbers actually leads to an incredibly important result in mathematics - namely, that you can reason about arithmetic from within itself. After all, for any string $\sigma$, "$\sigma$ is a valid theorem in Peano Arithmetic" becomes a valid predicate, as does "$\sigma$ is a true statement of number theory". Suddenly, we've found a way to get arithmetic to swallow its own tail and can begin to use it for metamathematical reasoning.
Indeed, all axiomatic systems can be embedded in some manner similar to this - all it takes is some sufficiently clever ways of doing typographical manipulations of numbers.

A Short Interlude: MIU Gödelized
You may or may not remember my post on the MIU System, but in case you don't, here are the axioms again (note: I use lowercase characters for placeholders, and upper case for MIU characters)

- If we can construct x, we can construct xU
- If we can construct Mx, we can construct Mxx
- If we can construct xIIIy, we can construct xUy
- If we can construct xUUy, we can construct xy
- We can construct MI

A very primitive way to encode this is to make a number, where every M is a 3, every I is a 1, and every U is a 0. If we do this, we can rephrase the axioms in terms of numerical operations:

- If we can construct x, we can construct $ x \cdot 10 $
- If we can construct $ 3 \cdot 10^n + x $, we can construct $ 3 \cdot 10^{2n} + x \cdot 10^n + x $
- If we can construct $ x \cdot 10^{n+3} + 111 \cdot 10^n + y$, we can construct $x \cdot 10^{n+1} + y $
- If we can construct $ x \cdot 10^{n+2} + y$, we can construct $x \cdot 10^n + y $
- We can construct $ 31 $

Of course, this method of encoding is relatively primitive and hard to reason about, but it's more than sufficient as a proof-of-concept. And, importantly, it demonstrates one of the most important facts of positional notation - typographical operations become as simple as separating the positions out and doing funny operations on the individual pieces. But, as much as I'd love to delve into a long rant about number and string encoding, we have a topic to get back to.

Bringing it all together
So, where are we going with all this nonsense? Well, we've already seen how simple it is to embed statements like "$\sigma$ is a valid theorem in Peano Arithmetic", so why not use the weaker notion "$\sigma$ is provable in Peano Arithmetic"? (Or, more formally, "$\exists\lambda$ such that $\lambda$ is a valid proof of $\sigma$"). Or, similarly, "$\sigma$ is not provable in Peano Arithmetic". Now, this notion itself isn't that useful, but we can construct a true statement $\epsilon$ that reads "$\epsilon$ is not provable in Peano Arithmetic" - suddenly, the statement is talking directly about its own provability. And, in a similar vein to the liar paradox, it can't be proven without also showing inconsistency. So, either we can prove both $\sigma$ and $\neg\sigma$ or we can't prove $\sigma$ (and therefore $\sigma$ is true). And, if we can't prove $\sigma$, then we've just constructed a statement that is both true and can't be proven. This creates a dichotomy: either a formal system is inconsistent, or it's incomplete. And so ends Hilbert's Program.

Monday, April 28, 2014

Recursion Explained

This is the first in a series of posts with a specific, targeted audience, which means there's going to be a relatively wide variation in level-of-explanation. Some of the posts are going to very clearly be me-writing-for-someone-who-already-knows-lots-of-relevant-information, and some (like today's) will be me-writing-for-someone-who-wouldn't-normally-read-this-blog. Anyway, today's topic is what recursion is and why we should care. If you already think recursion is important, this post'll be pretty surface-y and review-heavy. So, let's get started, shall we?

Recursive-Xzibit thinks Recursive-Xzibit thinks Recursive-Xzbit thinks ... that recursion is fun

So yeah, recursion. The basic idea is something that acts on itself, or something with multiple different levels. And oh, the places you can go with self reference!
The most recent/significant pop-cultural example I can think of is the movie Inception, where dreams could occur within dreams within dreams, and what happened in one level would bleed down into all the levels below it (e.g. when gravity stopped in the hotel because the truck they were dreaming in was falling off a bridge).
Recursion comes in a few different classes, and I've yet to hear any good terms to differentiate them, so I'm going to make some up:

Unidirectional recursion

Feedback only goes in only one direction
Example 1: The Fibonacci Sequence (each element is the sum of the previous two elements, the first few are $0, 1, 1, 2, 3, 5, 8, 13, 21...$) - the later parts of the list are completely unimportant in figuring out the first few
Example 2: Sorting a list (using quicksort) - one simple and efficient way to sort a list is a monotonic recursion - you pick the first number off the list, split the rest of it in two by putting all the numbers smaller than the first one on one side and bigger on the other, then sort each of those sublists

Bidirectional recursion

Feedback can go in either direction
Example: Inception - the environment you're dreaming in influences the dream, but the events within the dreams have lasting influence in the outside world (e.g. planting an idea in someone's head, taking useful information, etc)
Example 2: INT - this graph is sorta strange. It's constructed entirely out of smaller versions of itself - any single point contains the whole graph. The lower levels are exactly the same as the higher ones. Cool, right?

Mutual recursion

Two things that are defined, each one in terms of the other
This one's generally a bit more complicated than the other two
One set of good examples can be found in Hofstadter Sequences - specifically, the Figure-Figure sequence
It's pretty neatly defined in terms of itself and its complement (the set of all numbers not in the sequence)
The first number of the sequence ($R(0)$, if you will) is $1$
The first number not in the sequence (Let's call it $S(0)$) is $2$
The rest of the sequence is defined $R(n) = R(n-1) + S(n-1)$
So, for example, $R(1) = R(0) + S(0) = 3$, and $R(2) = R(1) + S(1) = (R(0) + S(0)) + S(1) = (2+1) + S(1) = 3 + S(1) = 3 + 4 = 7$
Let's write out another one just for kicks:
$R(3) = R(2) + S(2) = (R(1) + S(1)) + S(2)$
$= ((R(0)+S(0))+S(1))+S(2) = (3+4)+5 = 12$

extremely important

Monday, April 7, 2014

Playing with Probabilities

MDists.lhs

The Header

Imports, pragmas, and the module declaration

> {-# LANGUAGE TupleSections #-}
> module MDists where
> import Data.Map (Map, (!))
> import qualified Data.Map as M
> import Data.Function
> import Data.List
> import Data.Ratio

The Code

> type P = Rational
> type PDist a = Map a P

Hypotheses are getting represented as probability distributions with a probability attached to them

> data Hypothesis a = H {conditionals :: PDist a,
>                          chance :: P} deriving (Show)

Next, the distribution over hypotheses - let’s make strings our labels, just because strings are easy to use

> data HDist a = HD {hps :: Map String (Hypothesis a)} deriving (Show)

Helpers

First, we want to implement a few basic funcitons that will allow the rest of these to work nicely For example, we’ll want to be able to apply some function [0, 1] → [0, 1] to the probabilities we’re juggling here

> applyToProbability :: (P -> P) -> Hypothesis a -> Hypothesis a
> applyToProbability f (H d p) = H d $ f p

That one’s pretty simple, but on its own not that useful. Instead, let’s wrap it up so we don’t have to deal with the individual hypotheses - instead, we’ll map it over a the hypothesis distribution

> applyToProbs :: (P -> P) -> HDist a -> HDist a
> applyToProbs f (HD hps) = HD $ M.map (applyToProbability f) hps

Now, the main reason we’re defining those two functions is to write a function renormalize that takes a distribution and makes sure the probabilities actually sum to 1 without changing any of the ratios between probabilities

> renormalize :: HDist a -> HDist a
> renormalize h@(HD dist) = applyToProbs (/normalizer) h
>   where normalizer = M.foldl' (+) 0 . M.map chance $ dist

Our basic guess function is just a small wrapper around M.lookup - it fetches the likelihood of a hypothesis on its identifier If we don’t have a hypothesis in our distribution, it assigns it a probability of 0, but otherwise just fetches the probability from the hypothesis

> guess :: Ord a => HDist a -> String -> P
> guess (HD hps) str = case M.lookup str hps of Nothing -> 0
>                                               Just h -> chance h

Updating!

Now, our not-so-secret goal here was to model Bayesian updates If you’re not familiar with Bayes’ Theorem, here it is:

$$p(H|D)=p(D|H)\frac{p(H)}{p(D)}$$

To explain: the probability of a hypothesis being true, on some data, is the probability of that data given the hypothesis (p(D∣H)) multiplied by the probability we had previously assigned the hypothesis (our “prior”, or p(H)), then divided by the probability of that data under all hypotheses (p(D)). We don’t deal directly with the probability of the data under all hypotheses - instead preferring to renormalize the distribution after the fact

Of course, we want to automate this adjustment. The first step is to multiply each probability by the numerator - p(D∣H) ⋅ p(H)

> pAndCond :: Ord a => a -> Hypothesis a -> Hypothesis a
> pAndCond event (H dist prior) = H dist $ prior * likelihood
>   where likelihood = M.findWithDefault 0 event dist

With that done, we just renormalize, effecitvely dividing by p(D) We know this works because if we divided by anything other than the renormalizer, the ending probabilities would sum to some number other than 1, so, on pain of contradiction,
p(D) = renormalizer

> updateOnEvent :: Ord a => HDist a -> a -> HDist a
> updateOnEvent (HD dist) event = renormalize . HD . M.map (pAndCond event) $ dist

Of course, we want to be able to update on a series of events, so we fold over a list of events This fold ends up being fully strict, so instead of a foldr we use foldl' (hooray tail recursion!)

> update :: Ord b => HDist b -> [b] -> HDist b
> update = foldl' updateOnEvent

Construction helpers

We’ll use these utilities to construct and solve some toy problems a little further down First, a few more helpers that’ll be really important for constructing the toy problems

This one just normalizes a hypothesis, so we can think of it in terms of allocating odds instead of probabilities

> normalizeHyp :: Hypothesis a -> Hypothesis a
> normalizeHyp (H m p) = if totalP == 1 then (H m p) else H (M.map (/totalP) m) p
>   where totalP = M.foldl' (+) 0 m

Next, this one just makes a hypothesis distribution from a list

> fromList :: [(String, Hypothesis a)] -> HDist a
> fromList = renormalize . HD . M.fromList

Toy Problems

Monty Hall

We’ll start out by getting a list of doors

> drs = "abc"

Next, our function for generating a hypothesis for which door gets opened based on our guess and the correct answer

> isIn :: Char -> Char -> Hypothesis Char
> isIn guess correct = normalizeHyp . flip H 1 . M.fromList $
>                      [(dr,1) | dr <- drs, dr /= guess, dr /= correct]

And, finally, the problem itself! The identifiers are each just the single character name of the door (a, b, or c) The hypotheses are each generated assuming the identifier is the correct door according to the isIn function

> montyHall :: Char -> HDist Char
> montyHall guess = fromList [([dr],isIn guess dr) | dr <- drs]

> shouldYouSwitch = guess (update (montyHall 'a') "b") "a" < 1 % 2

And there you go! The boolean shouldYouSwitch starts from you picking door a, then Monty opens b, and checks if you have worse than even chances of it being behind a

Cookies

We’ve got two bools of chocolate and vanilla cookies The first bowl is three quarters vanilla and one quarter chocolate, while the other is half and half

> bowl1 = normalizeHyp $ flip H 1 $ M.fromList [('v',3),('c',1)]
> bowl2 = normalizeHyp $ flip H 1 $ M.fromList [(a,1) | a <- "vc"]

Next we’ll make a distribution from both the bowls

> bowls = fromList  [("b1",bowl1),("b2",bowl2)]

From this, given a stream of cookies, it’ll guess the chance you were drawing from one of the bowls

M&M’s

Some background: The mixture of M&M’s has been changed a few times In 1995, blue M&M’s were introduced Given a bundle of M&M’s, what’s the probability it came from a 1994 mix vs a 1996 mix?

color	percentage in ’94	percentage in ’96
brown	30	24
green	10	20
orange	10	16
yellow	20	14
red	20	13
tan	10	0
blue	0	13

Now, let’s encode that into hypotheses

> mix94 = normalizeHyp . flip H 1 $ M.fromList
>         [("brown",30),("green",10),("orange",10),("yellow",20),("red",20),("tan",10)]

> mix96 = normalizeHyp . flip H 1 $ M.fromList
>         [("brown",24),("green",20),("orange",16),("yellow",14),("red",13),("blue",13)]

And, finally, our hypothesis distribution

> bag = fromList [("94",mix94),("96",mix96)]

Thursday, March 13, 2014

Exporting Literate Haskell With HSColour and Pandoc

In yesterday's post I said I would detail the workflow from literate Haskell + markdown to postable html, so here we go:

Requirements: You need to have Haskell installed somehow. I generally recommend the Haskell Platform installed through your OS's package manager (pacman, apt-get, homebrew...). At a minimum, you need ghc and cabal-install Next, you need hscolour and pandoc installed. If you already have them, great! If not, run these commands:

cabal install hscolour
cabal install pandoc

Once you have those installed, it's relatively simple - just run this script:

There you go! Now you've highlighted the code bits, and left the rest to be formatted as markdown!

Isomorphism, Equality, PQ, and Other Ways of Saying the Same Thing

Note: this whole document is literate code generated by a workflow I'll talk a bit more about tomorrow. It's a short shell script that's quite useful. You can find the program as a gist on github.

So we’ll first take as an axiom that math is based on axioms. So far so good? Well, an important thing to note with this is that it’s really easy for axioms to have different significance - it all comes down to what metaphor you use in your brain to describe the axiomset
For example, MIU seems wholly abstract, but it turned out to be isomorphic to some operations on the integers modulo 3. And that isomorphism allowed some things to be made clear about it that weren’t otherwise.
In Chapter 2, Hofstadter defines some axioms that he calls the “PQ system”. These axioms are:

The only valid characters are -, P, and Q
-P-Q-- is a theorem
∀x. xP -Q -x is a theorem
∀xyz. if xP yQ z is a theorem, then xP -yQ -z is a theorem

First lets define our types Hyphen strings are essentially equivalent to natural numbers, so we’ll make that explicit

> data Nat = Zero | S Nat deriving (Show, Read)
> data Str = PQ Nat Nat Nat deriving (Show, Read)

Next we’ll define a few numbers for convenience

> one = S Zero
> two = S one

And this is all we really need Now let’s define the basic axiomset

> a1 = PQ one one two
> a2 (PQ a1 a2 s) = PQ (S a1) a2 (S s)
> a3 (PQ a1 a2 s) = PQ a1 (S a2) (S s)

Now we generate a list of all the theorems by iterating a2 starting from a1, and then mapping and iterating a3 over that list. Isn't it wonderful what you can do with lazy evaluation?

> theorems = map (iterate a3) $ iterate a2 a1

Now this is great if we just want the list of all axioms, but not really that useful if we want to check if something’s in there. You know, given the infinite axiom space and all. So right now checking for theoremhood is actually really computationally difficult - you have to produce all axioms and just check if something’s in there. Now, there are some strings where it’s obvious that it’s not a theorem - P-Q-PPP is malformed, like 3$&*(! So, how would we go about making a deterministic test that finishes in finite time?
This is your cue to try and solve the problem yourself before spoilers.
So, to review:

We’ve got an infinite list of theorems, generated by some recursive rules
Our job is to somehow figure out if any given (valid) string is going to be somewhere in that list
There are two recursive rules doing the generating, so we can’t even do a linear scan through the infinite set
And, most algorithms to do diagonal scans are annoyingly complicated for what should be a simple test
Neither of those “full scan” approaches can ever give us certainty that something is a nontheorem - just that we haven’t found it yet

Well, the simplest way is to exploit an isomorphism between PQ and arithmetic. You can see this if you interpret xP yQ z as "$x + y = z$". Hofstadter tried to make it clear with the suggestive names P and Q. I tried to make the hint stronger, by defining PQ in terms of natural numbers. But, to do a good theoremhood check, we first need to actually write out the code for natural number arithmetic
Just for kicks, lets actually implement the Peano Axioms, or at least the ones we need.
First, let’s define equality

> instance Eq Nat where

First, m = n iff S(m) = S(n) (Peano Axiom 8) An equivalent to the above is that S(m) = S(n) iff m = n

>     (S n1) == (S n2) = n1 == n2

Equality is reflexive (Peano Axiom 1)

>     Zero == Zero     = True

Now that we’ve defined both our truth patterns, we can rest assured that every other case will turn out false

>     _ == _           = False

Next we’ll define all the arithmetic required by the Num typeclass

> instance Num Nat where

Zero is the additive identity, and this definition of addition actually looks about the same as Peano Addition If we’re adding Zero, then we can just return the other number And, S(n) + a = S(n + a). The pattern match to S(n1) + S(n2) is just a way of increasing the speed at which the recursion terminates

>   (S n1) + (S n2) = S . S $ n1 + n2
>   n + Zero = n
>   Zero + n = n

Next we’ve got multiplication, if only because it’s required by the typeclass. It’s just repeated addition, so this is pretty simple. The first rule is that anything multiplied by Zero is Zero. The second rule is that for any n1*S(n2) = n1 + n1*n2.

>   n1 * (S n2) = n1 + n1 * n2
>   Zero * _ = Zero
>   _ * Zero = Zero

More num class fillers

>   negate = undefined

>   abs = id

>   signum _ = one
>   fromInteger 0 = Zero
>   fromInteger n = S $ fromInteger $ n - 1

So now we’ve got the tools to create a function that would test axiomhood, but it’d be horribly inefficient. It’d look something like this:
isTheorem (PQ a1 a2 s) = a1 + a2 == s
Now, that’s all well and good, but it first has to do the O(n) addition of a1 and a2, then the O(n) equality of a1 + a2 and s. Surely we can make this into one n, instead of two? Well, the way to do so is actually really easy - we just exploit the isomorphism between addition and subtraction.
First, we define our recursive rule

>   (S n1) - (S n2) = n1 - n2

Next, our base cases. Subtracting Zero from anything is just a null op, and we don’t have a clear definition in PQ of negative numbers - you can’t have -5 hyphens

>   n - Zero = n
>   Zero - _ = undefined

So from there our work is really easy - we can just recurse down once The algorithm below will take exactly (a1 + a2) iterations to check for theoremhood

> isTheorem (PQ a1 a2 s) = s - a1 - a2 == Zero

And with that we’re done - we’ve defined the PQ system, written out something to list all the theorems, discovered some novel properties of it, and leveraged those to write a quick theoremhood tester. Not bad, eh?

Gallabytes