Trying to understand greater-than and less-than


#1

Howdy all. I’m trying to grasp the basic behaviors of operators and whatnot. Referential and structural equality make perfect sense, but applying less-than and greater-than to data structures leads to some behavior that I don’t understand.

let obj0 : School.person = {
    name: "Dave",
    age: 20
}

let obj1 : School.person = {
    name: "Bill",
    age: 25
}

let obj2 : School.person = {
    name: "Bill",
    age: 25
}

obj0 > obj1 // Evaluates to true

I’ve been playing around with the values and can cause the comparison to evaluate to false by changing the name fields to different strings. For example, changing “Bill” to “S” results in false. If I keep the names unchanged, the result stays true no matter what value I give obj1’s age field.

    obj1 > obj2 // Evaluates to false

Identical objects result in false, which makes sense. And if I change the age of obj1 to 26, it evaluates to true, indicating that the compiler is indeed trying to coerce some sort of numeric value from the structures for the sake of the comparison. This also indicates that the > and < operators trigger a structural comparison.

I’m assuming that the comparison starts comparing from the top of each object and reaches a conclusion as soon as a field is found to be different, but how does it coerce a numeric value from a string?


#2

If you look at what it compiles to, you’ll see that the > operator uses Caml_obj.caml_greaterthan, which in turn uses Caml_obj.caml_compare. Since ML records are compiled to arrays, it boils down to this branch, I guess.
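
To make the field-by-field behavior concrete, here is a minimal sketch. The thread never shows the definition of School.person, so the type below (name declared before age) is an assumption. The results in the comments are what native OCaml's polymorphic comparison gives, and what the array representation described above should give as well.

```reason
// Assumed shape of the record; field order matters for the comparison.
type person = {
  name: string,
  age: int,
};

let dave = {name: "Dave", age: 20};
let bill = {name: "Bill", age: 25};
let olderBill = {name: "Bill", age: 26};

// The first field differs ("Dave" vs "Bill"), so name decides
// and age is never consulted.
let daveVsBill = dave > bill; // true

// Names are equal, so the comparison moves on to age: 26 > 25.
let olderVsBill = olderBill > bill; // true

// Structurally identical records are neither greater nor less.
let same = bill > {name: "Bill", age: 25}; // false
```

No number is coerced out of the record as a whole. The comparison walks the fields in order and stops at the first pair that differs.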

If you’re not cool with the default behavior, you can always define your own comparison in your module:

module School = {
  type person = // ...

  let (>) = (a, b) => ...
}

// usage

School.( obj0 > obj1 ); // uses School.(>) inside the parens

Mind that:

  1. You can only have one > per module, so if there are multiple types where you customize comparing logic, they’re gonna need to be in separate [sub]modules. Not a problem, modules are cheap.
  2. It might be better for composability to use functors and comparables, like described in the module docs (or rather even something Belt-compatible). That’s an obviously more advanced topic, well worth learning in the end, but maybe not right off… well, see for yourself how steep you want your learning curve :slight_smile:
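
For the Belt-compatible route mentioned in point 2, a rough sketch could look like the following. The person type and the choice to order by age first and then by name are illustrative assumptions, not something defined in this thread. Belt.Id.MakeComparable, Belt.Set.fromArray and Belt.Set.minimum are the Belt functions used.

```reason
// Hypothetical record type for illustration.
type person = {
  name: string,
  age: int,
};

// A comparable module that Belt collections can use as their ordering.
module PersonCmp =
  Belt.Id.MakeComparable({
    type t = person;
    // Assumed ordering: age first, name as the tie-breaker.
    let cmp = (a, b) =>
      switch (compare(a.age, b.age)) {
      | 0 => compare(a.name, b.name)
      | c => c
      };
  });

// The set is ordered by PersonCmp.cmp, not by the default comparison.
let people =
  Belt.Set.fromArray(
    [|{name: "Dave", age: 20}, {name: "Bill", age: 25}|],
    ~id=(module PersonCmp),
  );

let youngest = Belt.Set.minimum(people); // Some({name: "Dave", age: 20})
```

The ordering lives in one named module, so several orderings for the same type can coexist without shadowing > anywhere.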

#3

Thanks a bunch for the input. I’m moving through the JS code now.

I have no interest in customizing the behavior. Coming from other languages, that sort of programming voodoo is something I am explicitly against. I love default behaviors, I just want to understand how it’s doing what it does.


#4

Ok, I followed the path you set me on and it ends up as a vanilla JavaScript compare of two strings. This process just starts at index 0 of each string and does a charCode comparison until it finds a difference.

"Apple" > "banana" // false

“A” is charCode 65 and “b” is charCode 98.
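
The same check written on the Reason side, as a small sketch limited to ASCII strings, where the JS comparison and native OCaml agree:

```reason
// Char.code returns the numeric code of a character.
let codeOfA = Char.code('A'); // 65
let codeOfB = Char.code('b'); // 98

// 'A' (65) is less than 'b' (98), so the first character decides.
let appleVsBanana = "Apple" > "banana"; // false

// When one string is a prefix of the other, the shorter one is smaller.
let prefix = "Bill" < "Billy"; // true

// compare returns a negative int, zero, or a positive int.
let viaCompare = compare("Apple", "banana"); // negative
```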

My concern is that this behavior may not map over to OCaml. Since I know very little about native OCaml, I’m cautious about pitfalls from working in a too JS-oriented way.


#5

Well, IMHO, when you compare non-scalar values for anything other than equality, be it referential or deep/shallow, you’re asking for magic :slight_smile: Creating custom comparison functions/operators is much more explicit.


#6

Just to clarify something: the default behaviour is called polymorphic comparison and it is actually considered a bit of a voodoo in the OCaml community, and discouraged for custom data types. It is encouraged to define your own explicit comparison functions so that there are no surprises with comparisons. See https://blog.janestreet.com/the-perils-of-polymorphic-compare/
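
One small surprise of this kind, as a sketch: the two hypothetical modules below hold the same data but declare their fields in a different order. The results in the comments are what native OCaml gives (and the array representation of records mentioned earlier in the thread should give). Other record representations may behave differently, which is part of the reason the default is discouraged.

```reason
// Same data, different field order. Both modules are made up for this example.
module NameFirst = {
  type t = {
    name: string,
    age: int,
  };
};

module AgeFirst = {
  type t = {
    age: int,
    name: string,
  };
};

// name is compared first: "Dave" > "Bill".
let nameFirst =
  {NameFirst.name: "Dave", age: 20} > {NameFirst.name: "Bill", age: 25}; // true

// age is compared first: 20 < 25, so the names are never looked at.
let ageFirst =
  {AgeFirst.age: 20, name: "Dave"} > {AgeFirst.age: 25, name: "Bill"}; // false
```

Reordering fields in a type definition silently changes what > means, which an explicit comparison function avoids.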


#7

Excellent! That is exactly the information that I was looking for. So it appears that what the JS is doing is indeed what OCaml does. A pitfall, to be sure, but a small one.


#8

Yeah, merely human-sized :slight_smile:


#9

BuckleScript (and js_of_ocaml) sometimes go to extraordinary lengths to guarantee this. JSOO is at (or close to) 100% matching; BS intentionally breaks a few things to produce nicer JS.

For example, OCaml strings don’t support Unicode (they are plain byte sequences), which JSOO enforces through wrapping and conversion. BS compiles to raw JS strings, so Unicode characters break assumptions in the Char stdlib functions. This turns out to be fine in BuckleScript because most people building with it don’t use the OCaml stdlib for string manipulation; the native JavaScript tools are more appropriate (Js.String.length(x) compiles to x.length, not a function call).