String literals and String module issues


#1

Overview

I posted this as an issue on GitHub, but I figured this might be a better place for the discussion. I’ve been working with strings a bunch, both natively and within JavaScript, and there seem to be some edge cases with both string literals and the String module itself:

  • odd exceptions & errors when retrieving substrings
  • parsing issues with escaped strings (esp \")

This isn’t the first time I’ve run into this with Reason, but it is the first time it’s been a bit of a show stopper for me. (I’m currently working around it in JavaScript by calling JavaScript’s String.prototype.substring method; I’ll figure out a workaround for native.) These all seemed pretty surprising to me, but perhaps I’m just misunderstanding something pretty fundamental.

Thank you for your help!

Substring issues:

Reason # let src = "\"test\"\n\"foo\"\n"
;
let src: string = "\"test\"\n\"foo\"\n";
Reason # String.length(src)
;
- : int = 13
Reason # String.sub(src, 8, 11)
;
Exception: Invalid_argument("String.sub / Bytes.sub").
Reason # String.sub(src, 1, 4)
;
- : string = "test"

Parsing issues:

I ran into this in a different context, where escaped strings caused the compiler to complain about missing braces, but I ran into a similar edge case today:

Reason # let src1 = "\"test\""foo\"";
let src1: string = "\"test\"";
Reason # String.length(src1);
- : int = 6

Substrings

The length of the string is 11, but String.sub returns an error; when we start to slice more towards the beginning of the string, we see some odd behavior:

Reason # let src2 = "\"test\"\"foo\"";
let src2: string = "\"test\"\"foo\"";
Reason # String.length(src2);
- : int = 11
Reason # String.sub(src2, 7, 9);
Exception: Invalid_argument("String.sub / Bytes.sub").
Reason # String.sub(src2, 5, 7);
Exception: Invalid_argument("String.sub / Bytes.sub").
Reason # String.sub(src2, 4, 6);
- : string = "t\"\"foo"
Reason # String.sub(src2, 3, 5);
- : string = "st\"\"f"
Reason # String.sub(src2, 4, 6);
- : string = "t\"\"foo"
Reason # String.sub(src2, 3, 6);
- : string = "st\"\"fo"
  • I wouldn’t have expected an Invalid_argument for anything between 5 and 9 in this string.
  • I also wouldn’t have expected String.sub(src2, 4, 6) to return "t\"\"foo", which looks almost like it’s returning the tail of a list or the like.
  • The behavior isn’t consistent: String.sub gives different results for 3 to 5, 4 to 6, and 3 to 6.

#2

Hi, check the API documentation: https://reasonml.github.io/api/String.html#VALsub

I think you are expecting the arguments of String.sub to be the start and end indexes?

They are actually the start and length. That’s why you’re getting exceptions in some of these cases.
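To make the difference concrete, here is a minimal sketch using the src2 string from the first post. The `substring` helper is hypothetical (it is not part of the standard library); it just converts a start index and an exclusive end index into the start and length that String.sub expects.

```reason
/* Hypothetical helper, not part of the standard library: takes a start index
   and an exclusive end index and converts them to (start, length). */
let substring = (s, start, stop) => String.sub(s, start, stop - start);

let src2 = "\"test\"\"foo\"";

/* String.sub(src2, 7, 9) raises Invalid_argument because 7 + 9 is past the
   end of an 11-character string. Asking for 3 characters from index 7 works. */
let viaSub = String.sub(src2, 7, 3);

/* The same slice expressed as indices 7 (inclusive) to 10 (exclusive). */
let viaHelper = substring(src2, 7, 10);
```

Both bindings should hold "foo", which is the slice the original calls were presumably after.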

Btw, here’s a tip to make your string literals easier to read. You can use Reason’s verbatim string syntax:

let src = {|"test"
"foo"
|}
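As a quick check, assuming the snippet above is meant to be the same string as the escaped literal from the first post, the two forms can be compared directly. The names `escaped` and `verbatim` are just for illustration.

```reason
/* The escaped literal from the first post. */
let escaped = "\"test\"\n\"foo\"\n";

/* The same text in verbatim syntax: quotes need no escaping and the
   line breaks inside the delimiters are part of the string. */
let verbatim = {|"test"
"foo"
|};

/* Structural equality on strings; both are 13 characters long. */
let same = escaped == verbatim;
let len = String.length(verbatim);
```

Note that escape sequences such as \n are not interpreted inside the verbatim form, so a newline has to be written as an actual line break.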

#3

D’oh. You know, I read that 5-6 times, and I still missed that, thank you!

Thank you for the tip on literals; I actually was just simplifying the case that I had here, which was iterating over output from another process.

The parsing issue with string escapes is still odd, however. This caused some consternation when I was hunting down the initial issue in another code base.


#4

Yeah that one is weird. When I try it in the Reason Playground it’s actually a parse error, so I’m not sure how it’s being parsed in the REPL. Are you targeting JavaScript using BuckleScript/ReScript? Or targeting native with esy/opam and dune?


#5

I’ve generally been targeting JavaScript, but when creating the issue I was sanity checking myself in rtop, to see if I was just doing something else weird, so it seemed the same across both…


#6

If you’re targeting JavaScript, I assume you’re using BuckleScript? I think if you can, it’s a good idea to upgrade to ReScript (both compiler and syntax), as that is the supported toolchain by that team going forward. If JSOO, then you may want to double-check if the string parsing issue is still surfacing with the latest published Reason version (3.7.0), and if so, narrow down the GitHub issue to just that.