Arc Forum
Why parens?
3 points by dagfroberg 3159 days ago | 8 comments
Why not skip the parens? Prefix Polish notation doesn't need them. And I think it's even possible to ignore encapsulation of function arguments; if not, just put them in an enclosure. A shunting-yard algorithm will do. Imagine how beautiful and compact such a Lisp without parentheses would be. Taste those words!
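For concreteness, here's a minimal sketch of what that shunting-yard step could look like (Python; the toy table of binary operators is an assumption for illustration): ordinary infix in, flat parenthesis-free prefix out.

```python
# A sketch of the idea above: use the shunting-yard algorithm to turn
# infix arithmetic into parenthesis-free prefix (Polish) notation.
# The operator table is a toy assumption; a real reader would need an
# entry per operator.

PREC = {"+": 1, "-": 1, "*": 2, "/": 2}  # all binary, left-associative

def infix_to_prefix(tokens):
    """Shunting-yard, except the output stack holds small prefix ASTs."""
    ops, out = [], []

    def reduce_top():
        op = ops.pop()
        right, left = out.pop(), out.pop()
        out.append([op, left, right])

    for tok in tokens:
        if tok in PREC:
            while ops and ops[-1] in PREC and PREC[ops[-1]] >= PREC[tok]:
                reduce_top()
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                reduce_top()
            ops.pop()  # discard the "("
        else:
            out.append(tok)  # operand
    while ops:
        reduce_top()
    return out[0]

def flatten(tree):
    """Print the AST as flat Polish notation: no parentheses at all."""
    if isinstance(tree, str):
        return tree
    return " ".join(flatten(t) for t in tree)

print(flatten(infix_to_prefix("a + b * c".split())))
# prints: + a * b c
```

You can write `( a + b ) * ( c - d )` and get `* + a b - c d` back: the parens vanish, exactly as advertised, as long as every operator is binary with known precedence.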


3 points by rocketnia 3159 days ago | link

Much of what you're saying sounds feasible and even familiar. Removing parens from Lisp and adding infix operators is a popular idea, popular enough that we've made a list of projects that explore the possibilities:

https://sites.google.com/site/arclanguagewiki/more/list-of-l...

---

As I've learned more about the syntax of the ML family of languages (e.g. SML, OCaml, Haskell, Elm, Agda, Idris), I've come to the conclusion that ML-style language designs treat their syntax as s-expressions anyway. Where Lisp function call expressions are nested lists, ML expressions are nested "spines." ML languages treat infix spines as sugar for prefix spines, just like we're talking about for Lisp.

  gcd (a + b) (c - square d)           (* ML *)
  (gcd (+ a b) (- c (square d)))       ; Lisp
  (gcd ((+) a b) ((-) c (square d)))   (* ML again *)
In terms of precedence, ML's prefix calls associate more tightly than infix ones. That's the other way around from Arc, where the infix syntaxes associate more tightly than the prefix ones:

  (rev:map testify!yes '(no no yes))  ; returns (t nil nil)
I think ML's choice tends to be more readable for infix operators in general. The alternative form of the gcd example would look like this:

  gcd a + b c - (square d)
In this example, the + and - look like punctuation separating the phrases "gcd a", "b c", and "(square d)", and it takes me a moment to realize "b c" isn't a meaningful expression.

So I think the ML syntax is a good start for Lisp-with-syntax projects. For the syntax I've been designing, I've stuck with Lisp so far, but I want to figure out a good way to integrate ML-style syntax into it.

---

That said, there's a reason I've stuck with Lisp for my syntax: We can't parse ML syntax into nested spines unless we know all the operators that it uses. If we want to support custom infix operators, then the parser becomes intertwined with the language of custom operator declarations, and we start wanting to control what operator declarations are created during pretty-printing. These extra moving parts seem like an invitation for extra complexity, so I want to be careful how I integrate ML-style infix into my syntax, if at all.

-----

2 points by zck 3156 days ago | link

> gcd a + b c - (square d)

> In this example, the + and - look like punctuation separating the phrases "gcd a", "b c", and "(square d)", and it takes me a moment to realize "b c" isn't a meaningful expression.

Agreed. This is possibly my main complaint with mixing infix syntax with prefix syntax.

> gcd (a + b) (c - square d)           (* ML *)

> (gcd ((+) a b) ((-) c (square d)))   (* ML again *)

This is the same in Haskell^1, but I really dislike how it's done. It conflates calling a function (by surrounding it in parentheses) with converting an infix function to prefix, and if the function is unfamiliar, you don't know which it is. I assume this is handled by the parser, so it falls into the arbitrariness of syntax rather than the clear standard of function application^2.

[1] I assume Haskell got it from ML.

[2] This is unless you assume some insane return type that's basically "if called infix with two arguments, return a number; if called with zero arguments, return a prefix function of two arguments that returns a number".

-----

2 points by rocketnia 3156 days ago | link

"I assume Haskell got it from ML."

I think so. I'm primarily familiar with Haskell, but I hope those examples at least work in SML too.

A lot of the syntactic similarity starts to break down once the examples include lambdas, pattern matching, and top-level declarations, but I think that's similar to how the similarity between Lisp dialects breaks down when we look closely enough.

---

"It conflates calling a function (by surrounding it in parentheses) with converting an infix function to prefix."

I was making it look similar to Lisp syntax for the sake of argument, but parentheses aren't actually used like Lisp function calls there. In that example, parentheses are just used for two purposes:

- Grouping.

- Referring to infix variables (+) without applying them, which would otherwise be impossible to do.

The syntax for function application is juxtaposition with left associativity, so "gcd a b" is grouped as "(gcd a) b".

Agda[1] and some notes on Epigram 2[2] track "spines" of elimination contexts for the purposes of type inference and implicit argument elaboration. I think Haskell might use spines to resolve type class instances as well. In that case, the meaning of gcd in ((gcd a) b) can depend on the types known for a and b. With this kind of trick, Haskell's instance resolution can supposedly be used to write what are effectively varargs functions[3].

[1] http://www2.tcs.ifi.lmu.de/~abel/talkIHP14.pdf

[2] http://mazzo.li/epilogue/index.html%3Fp=1079.html

[3] https://wiki.haskell.org/Varargs

-----

3 points by akkartik 3159 days ago | link

"We can't parse ... unless we know all the operators that it uses."

Yes, that was my reaction as well. I thought https://en.wikipedia.org/wiki/Polish_notation requires either a clear separation between operators and values (so no higher-order functions) or all operators to be binary. Parentheses (whether of the Lisp or ML or other variety) permit multi-ary higher-order operations.
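A sketch of that constraint (Python; the arity table and names are illustrative assumptions): a flat prefix stream is parseable only because every operator's arity is fixed and known to the reader up front.

```python
# Flat Polish notation parsed with a fixed arity table. The table is the
# "clear separation between operators and values" mentioned above: every
# name is either a known operator or an arity-0 atom.

ARITY = {"+": 2, "-": 2, "neg": 1, "gcd": 2}

def parse(tokens):
    """Parse one expression from the front of `tokens`; return (tree, rest)."""
    head, rest = tokens[0], tokens[1:]
    args = []
    for _ in range(ARITY.get(head, 0)):  # unknown names are atoms: arity 0
        arg, rest = parse(rest)
        args.append(arg)
    return ([head] + args if args else head), rest

tree, rest = parse("gcd + a b - c neg d".split())
# tree == ['gcd', ['+', 'a', 'b'], ['-', 'c', ['neg', 'd']]]
```

The moment an operator can be passed around as a value (a higher-order use), the table no longer tells the reader how many arguments to consume at a call site, and this scheme breaks down.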

Also, doesn't https://en.wikipedia.org/wiki/Shunting-yard_algorithm require some sort of delimiter between operations? I thought it didn't work for nested operations without commas or parens, or the simplifying assumptions in the previous paragraph.

-----

2 points by rocketnia 3159 days ago | link

I think those simplifying assumptions are consistent with what dagfroberg was saying. I think "Polish notation" and "shunting-yard algorithm" make sense as terminology here, even if dagfroberg and I might each have slight variations in mind.

---

"so no higher-order functions"

I think we can get read-time arity information for local variables, but only if our read-time arity information for the lambda syntax is detailed enough to tell us how to obtain the arity information for its parameter. For instance...

  -- To parse 3: We've completed an expression.
  3 =:: Expr
  
  -- To parse +: Parse an expression. Parse an expression. We've
  -- completed an expression.
  + =:: Expr -> Expr -> Expr
  
  -- To parse fn: Parse a variable. Parse an expression where that
  -- variable is parsed by (We've completed an expression.). We've
  -- completed an expression.
  fn =:: (arg : Var) -> LetArity arg Expr Expr -> Expr
  
  -- To parse letArity: Parse a variable. Parse an arity specification.
  -- Parse an expression where that variable is parsed by that arity
  -- specification. We've completed an expression.
  letArity =:: (arg : Var) -> Arity n -> LetArity arg n Expr -> Expr
  
  -- To parse arityExpr: We've completed a specification of arity
  -- (Expr).
  arityExpr =:: Arity Expr
  
  -- To parse arityArrow: Parse a specification of some arity m. Parse a
  -- specification of some arity n. We've completed a specification of
  -- arity (Parse according to arity specification m. We've completed a
  -- parse according to arity specification n.).
  arityArrow =:: Arity m -> Arity n -> Arity (m -> n)
  
  -- ...likewise for many of the other syntaxes I'm using in these arity
  -- specifications, which isn't to say they're all *easy* or even
  -- *possible* to specify in this metacircular way...
This extensible parser is pretty obviously turning into a half parser, half static type system, and now we have two big design rabbit holes for the price of one. :)
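Here is a deliberately simplified, runnable caricature of that idea (Python; the concrete syntax `fn <var> <arity> <body>` is an assumption, and the dependent arity-specification language above is flattened to a plain argument count): the reader carries an arity environment, and the lambda syntax extends it, so even bound variables can be parsed without parentheses.

```python
# A paren-free prefix reader whose lambda form declares the arity of the
# variable it binds, so the reader knows how to parse uses of that
# variable inside the body. A toy stand-in for the =:: sketch above.

def parse(tokens, arity):
    """Parse one expression; `arity` maps names to argument counts."""
    head, rest = tokens[0], tokens[1:]
    if head == "fn":
        # fn <var> <arity-of-var> <body> : bind var with the given arity
        var, n, rest = rest[0], int(rest[1]), rest[2:]
        body, rest = parse(rest, {**arity, var: n})
        return ["fn", var, body], rest
    args = []
    for _ in range(arity.get(head, 0)):  # unknown names are atoms
        arg, rest = parse(rest, arity)
        args.append(arg)
    return ([head] + args if args else head), rest

tree, rest = parse("fn f 2 + f a b c".split(), {"+": 2})
# tree == ['fn', 'f', ['+', ['f', 'a', 'b'], 'c']]
```

Inside the body, `f a b` parses as a two-argument call only because the `fn` form said so; the same token stream under a different declared arity would group differently, which is the "parser intertwined with declarations" problem in miniature.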

Notably, Telegram's binary protocol actually takes an approach that seems related to this, using dependent type declarations as part of the protocol: https://core.telegram.org/mtproto/TL

-----

1 point by zck 3156 days ago | link

I prefer the parentheses-based Lisp notation to any notation I've seen that removes them. Perhaps I haven't seen a notation that is simple enough for my taste.

The big benefit that parentheses have is that they make the correct parsing of the text obvious, especially when combined with decent automatic indentation^1. If I never have to read something like `[x * 2 if x % 2 == 0 else x for x in a_list]` or `[x * (2 - x % 2) for x in a_list]` again^2, I'll be thrilled.

So my question is: unless you can make the correct parsing obvious, why skip the parens?

[1] Hello, Emacs!

[2] Both from http://stackoverflow.com/questions/7619868/python-list-compr... .

-----

2 points by dagfroberg 3159 days ago | link

... oh, why did I mention the shunting yard, the most "unlispy" thing? Well, it ensures the code translates to "Lisp without parentheses"! Write with '(' and ')' if you want; the prettyprinter will remove them in the intermediate code anyway. And you can even write mathematical expressions in the usual way and the outcome will be prefix! Your problems are solved...

-----

1 point by dagfroberg 3159 days ago | link

... or you want it the other way? Then the next step is easy: put the parentheses back. Parse it and write parentheses back around each construct, then feed it to a Lisp interpreter. Will it run after such prettyprinter juggling?
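The round trip is mechanical as long as every arity is known, which is the assumption doing all the work. A sketch in Python (toy arity table assumed):

```python
# Strip the parens from a prefix s-expression, then put them back using
# only an arity table. The table is the hidden cost of the round trip.

ARITY = {"+": 2, "*": 2}

def strip_parens(sexp):
    """Flatten a parenthesized prefix expression into a token list."""
    return sexp.replace("(", " ").replace(")", " ").split()

def restore(tokens):
    """Re-insert parens around each construct; return (text, rest)."""
    head, rest = tokens[0], tokens[1:]
    args = []
    for _ in range(ARITY.get(head, 0)):
        arg, rest = restore(rest)
        args.append(arg)
    text = "(%s %s)" % (head, " ".join(args)) if args else head
    return text, rest

flat = strip_parens("(+ a (* b c))")
# flat == ['+', 'a', '*', 'b', 'c']
back, _ = restore(flat)
# back == "(+ a (* b c))"
```

The parenthesized form comes back intact, so yes, it would run. What breaks the round trip is exactly the objection raised earlier in the thread: any operator whose arity the table doesn't pin down.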

-----