Iterate over a string / pattern match on a string


#1

I have been doing some programming exercises lately that all involve some sort of string parsing or iterating over a string input. Unless I use String.map, which forces me to get back a new string, or the impure String.iter, I have to write my own explode function (I haven’t found one in any of the standard libraries) to get a list of chars from the string input. And «exploding» the string seems to be the only way to apply pattern matching to a string.

I wonder if others have experience parsing / pattern matching on strings, and if they have other solutions than what I described here?


#2

If you want to iterate over the characters in a string, you can get a sequence from it with String.to_seq and then map/filter/iter/whatever over it via the Seq module.
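
As a minimal sketch of the Seq approach (count_vowels is a made-up name for illustration, and String.to_seq assumes OCaml 4.07 or later), counting characters without building an intermediate list could look like this:

```ocaml
(* Count the lowercase vowels in a string by walking it as a Seq.
   count_vowels is a hypothetical helper, not part of the standard library. *)
let count_vowels (s : string) : int =
  s
  |> String.to_seq
  |> Seq.filter (fun c -> String.contains "aeiou" c)
  |> Seq.fold_left (fun n _ -> n + 1) 0

let () = Printf.printf "%d\n" (count_vowels "pattern matching")
```

The sequence is lazy, so the characters are only visited once, when the fold runs.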

If you want an explode function though it’s pretty trivial to make:

let explode input = input |> String.to_seq |> List.of_seq

Or more efficiently:

let explode input =
  let rec aux idx lst =
    if idx<0 then lst else aux (idx-1) (input.[idx] :: lst)
  in aux (String.length input - 1) []
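
Once the string is a char list, ordinary list patterns apply, which is what the original question was after. A rough sketch, assuming the explode above (read_int and classify are hypothetical names, not library functions):

```ocaml
(* Same explode as above, repeated so the snippet stands alone. *)
let explode input =
  let rec aux idx lst =
    if idx < 0 then lst else aux (idx - 1) (input.[idx] :: lst)
  in
  aux (String.length input - 1) []

(* Read a run of leading decimal digits as an int and return the
   value together with the characters that were not consumed. *)
let rec read_int acc = function
  | ('0' .. '9' as c) :: rest ->
      read_int ((acc * 10) + (Char.code c - Char.code '0')) rest
  | rest -> (acc, rest)

(* Match on the first character(s), much like matching on a prefix. *)
let classify s =
  match explode s with
  | [] -> "empty"
  | '#' :: _ -> "comment"
  | ('0' .. '9') :: _ as chars ->
      let n, _rest = read_int 0 chars in
      "number " ^ string_of_int n
  | _ -> "other"
```

This allocates one list cell per character, so it is fine for exercises but probably not what you want for large inputs.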

If you are using the Core library, you can use String.to_list_rev (which returns the characters in reverse order; String.to_list keeps the order). If you are using batteries, there is String.to_list, and if you are using extlib, String.explode. The built-in OCaml library is kept super simple overall.

But no, you can’t pattern match on part of a string, only on an entire string. I’d generally opt for a proper parser if I’m parsing a string, as they are succinct, powerful, and fast. :slight_smile:
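
A small sketch of what that limitation looks like in practice (describe is a made-up name, and the guard with String.sub is just one possible workaround for prefixes, not a recommendation over a real parser):

```ocaml
(* String literal patterns compare against the entire string,
   so "help" below does not match "helper". *)
let describe = function
  | "" -> "empty"
  | "help" -> "exactly help"
  (* Prefix checks have to go through a guard instead of a pattern. *)
  | s when String.length s >= 2 && String.sub s 0 2 = "--" -> "long option"
  | _ -> "something else"
```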

If you don’t mind using PPXs, then ppx_regexp in opam adds the ability to pattern match with regexes, like:

match%pcre input with
| {|^$|} -> ""
| {|(?<t>.*:\d\d)|} -> "whatever else"
| _ -> "fallback"

The PPX also supports tyre routes for efficient routing-style matching (like for a web server) very succinctly. :slight_smile:


#3

Thanks a lot for your thorough answer! These libraries provide quite an extensive collection of functions for dealing with strings, which is really helpful.

I have only written ReasonML compiled with bucklescript so far, and from what I found, none of these libraries (Core, extlib, batteries and pcre) is compatible with bucklescript. So the code needs to be compiled to native; please correct me if I’m wrong.

I wasn’t aware of all these OCaml libraries, so I started exploring from the ones you suggested, and I found a couple more that could be helpful: humane-re and containers.

For those reading this who want to test these OCaml libraries in ReasonML, here is how I did it:

  • Followed the instructions on esy.sh to install esy and starter repo:
    • npm install -g esy
    • git clone https://github.com/esy-ocaml/hello-reason.git
    • cd hello-reason
    • esy
  • Installed the OCaml library I wanted to test from opam, e.g. batteries: esy add @opam/batteries
  • Added the newly installed library to dune file ./bin/dune:
    (executable
      (name Hello)
      (public_name Hello.exe)
      (libraries lib batteries))
    
  • Changed the content of ./bin/Hello.re to test the new library:
    open Batteries;
    
    let () = { 
      "xNxox xXx"
      |> String.to_list /* String is now from batteries library */
      |> List.filter(x => x != 'x')
      |> String.of_list
      |> print_endline
    };
    
  • Build project: esy build
  • Run code: esy test

#4

If you want to use them with bucklescript, the easiest way is to just git clone ... them and include their source in your project (bucklescript doesn’t use the OCaml ecosystem very well, so it’s often best to just bypass it all). :slight_smile:

Or find someone who already made a bucklescript package for it instead. :slight_smile: