Arc Forum | rkts's comments

6 points by rkts 6443 days ago | link | parent | on: Binary Search Trees in Arc

Bug:

  (def other (side) (case side 'l 'r 'r 'l))

should be

  (def other (side) (case side l 'r r 'l))

This was causing bst-rem to behave incorrectly when removing a node with two children.
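A rough Python model of why the quoted keys never match. It assumes that Arc's case compares against its keys as written, without evaluating them, so a key typed as 'l is read as the list (quote l) rather than the symbol l. The names case, other_buggy and other_fixed are hypothetical stand-ins, not Arc's implementation.

```python
QUOTE = "quote"  # stand-in for the reader's expansion of 'x into (quote x)


def case(value, *clauses):
    """Compare value against each key exactly as written; keys are never evaluated."""
    for key, result in zip(clauses[0::2], clauses[1::2]):
        if value == key:
            return result
    return None  # no clause matched


def other_buggy(side):
    # Keys written as 'l and 'r arrive as the lists (quote l) and (quote r),
    # which never equal the bare symbols l or r.
    return case(side, (QUOTE, "l"), "r", (QUOTE, "r"), "l")


def other_fixed(side):
    # Keys written as bare l and r match the symbols directly.
    return case(side, "l", "r", "r", "l")
```

Under that assumption the buggy version falls through every clause and returns nothing for either side, which would explain why only the two-children path of bst-rem misbehaved.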

-----

3 points by pg 6443 days ago | link

Thanks! I updated the code.

-----

1 point by rkts 6444 days ago | link | parent | on: Binary Search Trees in Arc

Here's a sloppy version 0: http://benstoker.com/code/arcbst.ml

I implemented all the bst- functions except bst-rem-edge, which seems redundant given bst-rem and bst-edge.

-----

2 points by rkts 6443 days ago | link

Update: I've rewritten my solution above to be as close as possible to the original Arc code. All the function interfaces should be the same. Equality is determined with (=), which is the closest thing OCaml has to Arc's is.

I've also created a separate version that's closer to how I would prefer to implement it:

http://benstoker.com/code/mybst.ml

The main differences are:

1. The comparison function is specified using a functor. This allows the user to specify it only once instead of in every call to insert, find, and so on. (I'd like to see something like ML's functors in Lisp someday.)

2. The comparison function handles equality as well as order.

3. edge is replaced by min_elt and max_elt.

4. rem-edge is gone.

5. The output of to_list (aka elts) isn't backwards.
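A minimal Python sketch of the same design choices, not a translation of mybst.ml. Python has no functors, so a constructor argument stands in for one here: the comparison is given once, it is three-way (negative, zero, positive) so it covers equality as well as order, and the class exposes min_elt, max_elt and an in-order to_list. The class and method names are assumptions chosen to mirror the list above.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right


class BST:
    """Binary search tree whose three-way comparison is fixed at construction."""

    def __init__(self, compare):
        self._compare = compare  # compare(a, b) returns <0, 0, or >0
        self._root = None

    def insert(self, value):
        self._root = self._insert(self._root, value)

    def _insert(self, node, value):
        # Nondestructive insert: rebuilds only the path from the root.
        if node is None:
            return Node(value)
        c = self._compare(value, node.value)
        if c < 0:
            return Node(node.value, self._insert(node.left, value), node.right)
        if c > 0:
            return Node(node.value, node.left, self._insert(node.right, value))
        return node  # equal element already present

    def find(self, value):
        node = self._root
        while node is not None:
            c = self._compare(value, node.value)
            if c == 0:
                return node.value
            node = node.left if c < 0 else node.right
        return None

    def min_elt(self):
        return self._edge("left")

    def max_elt(self):
        return self._edge("right")

    def _edge(self, side):
        node = self._root
        if node is None:
            raise ValueError("empty tree")
        while getattr(node, side) is not None:
            node = getattr(node, side)
        return node.value

    def to_list(self):
        out = []

        def walk(node):
            if node is not None:
                walk(node.left)
                out.append(node.value)
                walk(node.right)

        walk(self._root)
        return out  # ascending order under the tree's comparison
```

Usage would look like `t = BST(lambda a, b: (a > b) - (a < b))` followed by calls to `t.insert`, `t.find` and so on, with no comparison argument on any later call.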

-----

2 points by pg 6443 days ago | link

Are you sure it works the same? Bst-rem-edge is simpler than bst-rem; it doesn't rebalance the tree.

-----

2 points by rkts 6443 days ago | link

Maybe I'm not understanding your code (it's kind of cryptic...) but it does seem the same to me. When bst-rem removes a node with one child, it just grabs the other child. This is the same thing bst-rem-edge does.

Your 'bubble' function calls bst-edge to find the left/rightmost node and then calls bst-rem-edge to remove it. You could just as well call bst-edge and then bst-rem on the result; either way you traverse the tree twice. I can't imagine a situation where you'd just want bst-rem-edge by itself.

Come to think of it, I don't know why you need bst-edge either. It's simpler to just have min and max functions that return an element, as in the Haskell solution.
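A small Python sketch of that argument, using plain tuples (left, value, right) and the built-in < instead of a user-supplied f<. It is not the Arc code: remove handles the one-child case by returning the other child, and the two-children case by finding the minimum of the right subtree with min_elt and then calling remove again on that subtree, so no separate rem-edge is needed. All function names are hypothetical.

```python
def insert(tree, x):
    if tree is None:
        return (None, x, None)
    left, value, right = tree
    if x < value:
        return (insert(left, x), value, right)
    if x > value:
        return (left, value, insert(right, x))
    return tree  # already present


def min_elt(tree):
    """Return the smallest element of a non-empty tree."""
    left, value, _ = tree
    return value if left is None else min_elt(left)


def remove(tree, x):
    if tree is None:
        return None
    left, value, right = tree
    if x < value:
        return (remove(left, x), value, right)
    if x > value:
        return (left, value, remove(right, x))
    # Found the node. With at most one child, just grab the other child.
    if left is None:
        return right
    if right is None:
        return left
    # Two children: promote the minimum of the right subtree, then remove it
    # from that subtree with the same remove function (a second traversal).
    successor = min_elt(right)
    return (left, successor, remove(right, successor))


def to_list(tree):
    if tree is None:
        return []
    left, value, right = tree
    return to_list(left) + [value] + to_list(right)
```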

-----

2 points by rkts 6444 days ago | link | parent | on: Binary Search Trees in Arc

Not to mention this is a cause of potential bugs: you have to supply the right f< with every call, or you'll get weird behavior.

-----

2 points by pg 6444 days ago | link

This code is derived from the bst code I wrote for ACL. I remember when I wrote it thinking people might want to use different ordering functions when operating on the same tree. Now that doesn't seem so likely. But are there conceivable cases when one might want to?

-----

4 points by almkglor 6444 days ago | link

By definition a binary search tree has its l link as < r link. If a different f< suddenly causes the r link to be < l link, then your bst suddenly becomes incorrect and weird stuff happens.

-----

3 points by pg 6443 days ago | link

Suppose you have a bst that persists for a long time. Suppose the things you're sorting are not mere integers, but structures that represent something in the world, ranked according to some scoring function. Suppose you're able to figure out a slightly more accurate scoring function. If you're passing f< as an argument on each bst operation, you can just start using the new ranking fn. Whereas if you've been storing the comparison fn in the tree as you've been building it, you're probably going to have to rebuild the tree if the comparison fn changes.

I'm not saying this is going to be common enough to drive the design of a bst library. But it's by no means logically impossible to change f<.

-----

4 points by almkglor 6443 days ago | link

Suppose we have two f<, f<old and f<new, to be used in a binary search tree b.

f<new may safely be used to replace f<old iff:

  all x in b: all y in b: (f<old x y) == (f<new x y)

This stems from the binary search tree invariants:

  all x in b!l: (f<old x b!v)
  all x in b!r: (f<old b!v x)

And the definition of all x in b:

  all x in b = {all x in b!l, b!v, all x in b!r}
  all x in nil = {}

This is because if even a single case is wrong, that part of the binary search tree concerned with keeping track of the correct order becomes wrong. In such a case, it is necessary to rebuild the tree.

Of course, note that the above condition is necessary only for all current existing nodes, i.e. if we have a new m:

  all x in b: m != x

then it is okay that:

  all x in b: (f<old m x) != (f<new m x)

While your style allows such an edge case, it is arguably such an edgy edge case that you might as well rebuild the tree if f<old != f<new; this is largely because you have to check that f<new == f<old for all current members of the tree, and that checking is O(N^2), whereas rebuilding the tree is O(N) taking O(2N) space (and your nondestructive bst takes O(N) space for each operation anyway).
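The pairwise condition above can be written down directly. This is only a sketch of the brute-force check being described, with hypothetical names; it takes the tree's elements as a plain list and compares the two orderings on every pair, which is where the quadratic cost comes from.

```python
from itertools import product


def orderings_agree(elements, lt_old, lt_new):
    """True iff lt_old and lt_new give the same answer on every pair of elements."""
    return all(
        lt_old(x, y) == lt_new(x, y)
        for x, y in product(elements, repeat=2)
    )


def by_value(a, b):
    return a < b


def by_magnitude(a, b):
    return abs(a) < abs(b)
```

For the elements 1, 2, 3 the two example orderings agree, so swapping one for the other would be safe for that tree even though they disagree on a value such as -4 that is not in it. For 3, -4, 2 they disagree, so the tree would need rebuilding.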

-----

3 points by pg 6443 days ago | link

I was thinking about cases where you'd be willing to tolerate a small amount of misordering in the tree (since you were till then using the old scoring function, which presumably misordered things or you wouldn't have improved it). But deletion wouldn't work if you changed the scoring fn, and the number of cases where you both want to change the ordering function and never need to do deletions must be small enough that a general-purpose bst lib doesn't need to support that.

-----

3 points by almkglor 6442 days ago | link

I don't think you quite understand. A misordered binary search tree will fail lookups - (bst-find ...) will fail, even if that object is returned by (bst-elts ...). If you're going to create a general-purpose bst lib with "you can change the sort function without rebuilding the tree!" then you'd better make plenty damn sure there's a big warning saying "...but if you do (bst-find ...) might fail." Personally, I'd rebuild the tree anyway, because if the change is significant enough to change the sort function, it's big enough to completely destroy the meaning of the binary search tree.
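A small Python illustration of that failure, assuming a tuple-based tree and an ordering passed on every call as in the Arc interface. The function names echo bst-find and bst-elts but are hypothetical. The tree is built with one ordering and then searched with another.

```python
def insert(tree, x, lt):
    if tree is None:
        return (None, x, None)
    left, value, right = tree
    if lt(x, value):
        return (insert(left, x, lt), value, right)
    if lt(value, x):
        return (left, value, insert(right, x, lt))
    return tree


def find(tree, x, lt):
    while tree is not None:
        left, value, right = tree
        if lt(x, value):
            tree = left
        elif lt(value, x):
            tree = right
        else:
            return value
    return None  # fell off the tree without a match


def elts(tree):
    if tree is None:
        return []
    left, value, right = tree
    return elts(left) + [value] + elts(right)


def by_value(a, b):
    return a < b


def by_magnitude(a, b):
    return abs(a) < abs(b)


# Built with by_value: 0 at the root, -5 on the left, 3 on the right.
tree = None
for v in [0, -5, 3]:
    tree = insert(tree, v, by_value)

found_old = find(tree, -5, by_value)      # follows the left link and finds -5
found_new = find(tree, -5, by_magnitude)  # goes right twice and finds nothing
```

The element -5 is still listed by elts(tree), yet the search with the new ordering walks to the wrong side of the root and returns None.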

-----

3 points by pg 6442 days ago | link

I get that. When I think of using a bst I think of what News.YC needs: the top-ranked 100 or so stories out of 100,000, and no need for find or delete. (Though currently News.YC just truncates the list and sorts it, which would stop working at very high submission rates.)

-----

1 point by almkglor 6442 days ago | link

c/ref Mythical Man-Month, Frederick P. Brooks.

A programming systems product takes about nine times as much effort as the component programs written separately for private use. I estimate that productizing imposes a factor of three; and that designing, integrating, and testing components into a coherent system imposes a factor of three; and that these cost components are essentially independent of each other.

-----

6 points by rkts 6446 days ago | link | parent | on: Any reason not to use "?" instead of "o" for optio...

No: parameter lists in Arc are destructuring.

-----

3 points by rkts 6446 days ago | link | parent | on: Any reason not to use "?" instead of "o" for optio...

Another (fairly obvious) idea: since optional parameters can't be followed by non-optional ones, just have a single ? for all optional parameters, as in CL:

  (fn (a b ? (c 3) (d 4)) ...)

This is shorter and mirrors the dot notation for rest parameters.

-----

4 points by pg 6446 days ago | link

I wanted to leave open the option of other types of parameters besides optional ones.

-----

2 points by bogomipz 6444 days ago | link

Does the suggestion really close that door? For instance, if you wanted to add keyword arguments, they could go like this:

  (fn (a b ? (c 3) (d 4) ! (e 5) (f 6) . rest) ...)

Combining all four argument flavors in the same function is of course not a good idea.

-----

1 point by kennytilton 6442 days ago | link

Looks like how Common Lisp works. And agreed: mixing optional with keyword args would be a nasty thing to do to users. Probably they saw that it was /possible/ and said sure, why not?

-----

7 points by rkts 6446 days ago | link | parent | on: Any reason not to use "?" instead of "o" for optio...

Even more annoying (to me) is that this means o can't be a variable in a destructuring bind:

  arc> (let (x) '(blah) x)
  blah

  arc> (let (o) '(blah) o)
  Error: "cadar: expects argument of type <cadarable value>; given ((o))"

-----

3 points by rkts 6448 days ago | link | parent | on: Reference parameters

Yes:

  (let (a b) (f x) ...)

although technically this is a destructuring-bind, not a multiple-value-bind.

-----

2 points by rkts 6449 days ago | link | parent | on: Reference parameters

A Lisp variable is implicitly a pointer to an object. Passing it to a function generates a copy of the pointer, which is pass-by-value. With pass-by-reference, the function would get a reference to the pointer. So e.g. reverse could be called for side effect: you could say (rev xs) instead of (= xs (rev xs)).
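Python passes arguments the same way (a copy of the reference), so it can illustrate the distinction. This is an analogy, not Arc: rebinding the parameter inside a function is invisible to the caller, while mutating the shared object is visible. The function names are made up for the example.

```python
def rev_rebind(xs):
    # Rebinds only the local name; the caller's variable still points
    # at the original, unreversed list.
    xs = list(reversed(xs))
    return xs


def rev_in_place(xs):
    # Mutates the object that both the caller and the callee point at.
    xs.reverse()


nums = [1, 2, 3]
result = rev_rebind(nums)
after_rebind = list(nums)    # snapshot: caller's list is unchanged

rev_in_place(nums)
after_mutation = list(nums)  # snapshot: caller's list is now reversed
```

With true pass-by-reference, the assignment inside rev_rebind would have changed the caller's nums as well, which is what calling rev for side effect would require.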

-----

2 points by rkts 6449 days ago | link | parent | on: Reference parameters

Consider deleting a node from a binary tree. You want a function that looks at a node and, if it's a match, unlinks it from its parent. Passing by reference allows you to do this cleanly. The alternative is to peek ahead at each subnode (messy) or to pass information up the call stack (inefficient, as the function can no longer be tail-recursive).

I'm not advocating C++ references, which I think are too implicit. If a function call foo(x) can change the value of x, there needs to be some visual indication of this. I'd prefer something like plain C pointers, with which you can pass the address of an object: foo(&x).

Of course an alternative is to store objects wrapped in containers, and this isn't too bad a solution.

  (def ref (x) (obj contents x))
  (mac deref (x) `(,x 'contents))

But of course this is inefficient. If anything, this argues for adding arrays to Arc.
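To make the container idea concrete, here is a Python sketch in which every link in a binary tree is a Ref box, so that deleting a node means overwriting the contents of the box that points at it. There is no peeking ahead at subnodes and nothing is passed back up the call stack. The Ref, Node, find_ref, insert and delete names are hypothetical, and the sketch uses the built-in < for ordering.

```python
class Ref:
    """A one-slot box, playing the role of (obj contents x)."""

    def __init__(self, contents=None):
        self.contents = contents


class Node:
    def __init__(self, value):
        self.value = value
        self.left = Ref()   # each child link is itself a box
        self.right = Ref()


def find_ref(ref, x):
    """Return the box that holds x's node, or the empty box where x would go."""
    while ref.contents is not None and ref.contents.value != x:
        node = ref.contents
        ref = node.left if x < node.value else node.right
    return ref


def insert(root, x):
    ref = find_ref(root, x)
    if ref.contents is None:
        ref.contents = Node(x)


def delete(root, x):
    ref = find_ref(root, x)
    node = ref.contents
    if node is None:
        return False
    if node.left.contents is None:
        ref.contents = node.right.contents   # unlink via the parent's box
    elif node.right.contents is None:
        ref.contents = node.left.contents
    else:
        # Two children: copy up the minimum of the right subtree,
        # then unlink that minimum through its own box.
        succ = node.right
        while succ.contents.left.contents is not None:
            succ = succ.contents.left
        node.value = succ.contents.value
        succ.contents = succ.contents.right.contents
    return True


def to_list(ref):
    node = ref.contents
    if node is None:
        return []
    return to_list(node.left) + [node.value] + to_list(node.right)
```

The cost is the one noted above: every link pays for an extra object and an extra indirection.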

-----

1 point by rkts 6453 days ago | link | parent | on: Forum Request

Google works fine for me. I don't see a reason to reinvent the wheel here.

-----