Help to create a tree where nodes have reference to parents


#1

I have a tree

type tree =
  | Leaf(int)
  | Node(int, list(tree));

which I would like to convert to a new tree but each node should have a pointer to the parent. I tried to create something like

 type newTree =
   | NewLeaf(int, option(ref(newTree)))
   | NewNode(int, option(ref(newTree)), list(newTree))

but with no success.

My best shot was with the following type

type newTree = {
  value: int,
  parent: option(ref(newTree)),
  mutable children: list(newTree),
};

which I could use to write my conversion function

let rec convert = (input, parent): newTree =>
  switch (input) {
  | Leaf(v) => {value: v, parent, children: []}
  | Node(v, c) =>
    let node: newTree = {value: v, parent, children: []};
    let children = List.map(newInput => convert(newInput, Some(ref(node))), c);
    node.children = children;
    node;
  };

As you can see, I have children as mutable only to allow creating node before I start the recursion to create its children. Is there a better way to solve this problem?


#2

You can try it like this, but it is not as expressive as using mutable:

type tree = 
   | Leaf of int 
   | Node of int * tree list

let rec xx = Node (3, [Leaf 0; xx])

#3

My whole need is to have a pointer to the parent in each node, and your type doesn’t have that. I ended up using a similar type but I still want to know a nice solution for this problem because having pointer usually can let you write faster algorithms.


#4

I think the solution you came up with is totally reasonable. In general, getting two objects to point at each other in Reason/OCaml is tough, and mutability is often appropriate. However, I’ll put a few alternative solutions below.

For all the alternatives below, I’m going to assume you’ve already defined the following:

module Input = {
  type tree =
    | Leaf(int)
    | Node(int, list(tree));
};

Alternative 1

If, for example, you don’t like the fact that children is mutable (maybe you want to expose this type in your .rei and don’t want to allow users to mutate the children), that’s solvable with something very close to what you have. Just do two passes over the tree and fill in the parents on the second pass. Note, parent is still mutable, so we haven’t changed anything that much.

type tree = {
  value: int,
  mutable parent: option(tree),
  children: list(tree),
};

let rec firstPassNoParents = input : tree =>
  switch (input) {
  | Input.Leaf(v) => {value: v, parent: None, children: []}
  | Node(v, c) =>
    let children = List.map(firstPassNoParents, c);
    {value: v, parent: None, children};
  };

let fillParents = tree => {
  let rec fillParents = (parent, tree) => {
    tree.parent = parent;
    List.iter(fillParents(Some(tree)), tree.children);
  };
  fillParents(None, tree);
};

let create: Input.tree => tree =
  input => {
    let tree = firstPassNoParents(input);
    fillParents(tree);
    tree;
  };

Alternative 2 (doesn’t actually work)

This is what I was hoping would work, but Reason/OCaml isn’t having any of it. Note the use of let rec node = ... and cs = ...;. You can define certain values in a mutually recursive way, but apparently not these values. The compiler error is:

This kind of expression is not allowed as right-hand side of `let rec’

I don’t claim to understand this too well, but the manual talks about what values are and aren’t allowed here (http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#s%3Aletrecvalues). If you’re fluent in context-free grammars, go at it.

(again, the following does not work)

type tree =
  | Leaf(int, option(tree))
  | Node(int, option(tree), list(tree));

let create: Input.tree => tree =
  input => {
    let rec loop = (input, parent) =>
      switch (input) {
      | Input.Leaf(v) => Leaf(v, parent)
      | Node(v, children) =>
        let rec node = Node(v, parent, cs)
        and cs = List.map(input => loop(input, Some(node)), children);
        node;
      };
    loop(input, None);
  };

Alternative 3

So here’s a working version of the above code, but using Lazy.t which is basically a deferred computation stored in a ref. I’m not sure it’s different enough from Alternative #1 to post, but it has the slight benefit that it’s clear that the parent is only going to be computed once and it wouldn’t be settable by users of the api. It’s probably a bit too complicated, but if you care a lot about your exposed interface (.rei) then it might have the best properties.

type tree =
  | Leaf(int, Lazy.t(option(tree)))
  | Node(int, Lazy.t(option(tree)), list(tree));

let create: Input.tree => tree =
  input => {
    let rec loop = (input, parent) =>
      switch (input) {
      | Input.Leaf(v) => Leaf(v, parent)
      | Node(v, children) =>
        let rec node = lazy (Some(Node(v, parent, Lazy.force(cs))))
        and cs = lazy (List.map(input => loop(input, node), children));
        get_exn(Lazy.force(node));
      };
    loop(input, Lazy.from_val(None));
  };

#5

First of all, thanks @russelldmatt for such detailed reply!

I’ve learned a lot with your response and I would like to write this down to make sure that I understood it correctly.

The following code should print same instance because Some(t) is not creating a copy of t but using the proper instance t to wrap in an Option. I also checked that Js.log(t) triggers an infinite recursion. I have to admit that I misinterpreted the section about pointers in OCaml’s documentation. I think I was looking for a C-like solution and I though ref was the way to go. After reading it again and doing this experiment, I feel I got the idea. :slight_smile:

type tree = {
  value: int,
  mutable parent: option(tree),
  children: list(tree),
};

let t: tree = {
  value: 1,
  parent: None,
  children: []
};

t.parent = Some(t);

switch (t.parent) {
| Some(parent) => parent == t ? Js.log("same instance") : Js.log("not same instance")
| None => Js.log("unreachable")
};

I loved how you defined rec filParents inside fillParents! I use to create other functions (e.g. fillParentsRec) what makes the module API terrible.

It is sad that Alternative 2 is not possible. I loved it.

My current solution
After play with this problem, I realized that I was trying to solve two different problems using the same data structure (what is OK). One problem is keep the tree to be transversed later and other is to have a pointer for a node which allow operations like “go to parent”, “go to first child” or “go to next sibling” in an efficient way. I ended up using another data structure to represent an edge between the node and its parent.

type visitor =
  | Visitor(t, option(visitor)); /* (node, parentVisitor) */

#6

No problem. I think you understood my response correctly. Re pointers in OCaml: yea, it’s closer to “everything is a pointer” than not, even though it doesn’t really look that way (which is pretty nice, IMO).