Exenum : Exhaustive Enumerations (OCaml library)

exenum is a library that helps building exhaustive or semi-exhaustive data sets, typically for intensive unit testing.

[Collapse]API & Compilation tips

API

See the ocamldoc generated API.

How to compile a project which uses exenum

The package name is exenum. The main module name is ExEnum.
If you are using ocamlfind and ocamlbuild, just add the following in your _tags file: <**/*> : package(exenum)

[Collapse]Install exenum

  • The easiest way to install exenum is to use opam : opam install exenum
  • Otherwise, download the latest source package from the project home page then, just make ; make doc ; make install

[Collapse]A quick example

Let us consider an example.
Assume you have written a function that takes one argument, which is a list of integers. For testing purposes, you wish to generate several different integer lists. With the exenum library, building the relevant enumeration is as simple as:
Download file: ex_intlist.ml
open ExEnum

(* Build a new enumeration of type int list. *)
let enum_intlist = e_list e_int
In order to print tested values, we define a function that converts an int list to a string:
(* Helper function that builds a string from a list of items *)
let sep map sp l = List.fold_left (fun acu x -> if acu = "" then map x else acu ^ sp ^ (map x)) "" l

let string_of_intlist l = "[" ^ (sep string_of_int ", " l) ^ "]"
Finally, let us output the first terms of our enumeration:
let () = show enum_intlist string_of_intlist 0 11
Value #0 is []
Value #1 is [2]
Value #2 is [-1]
Value #3 is [1]
Value #4 is [0]
Value #5 is [-2]
Value #6 is [3, 2]
Value #7 is [-3]
Value #8 is [3]
Value #9 is [-1, 0]
Value #10 is [1, 1]
(The order of values in the enumeration is purposedly partially shuffled, in order to avoid long boring sequences of similar values.)
We note that computing values at large indices is efficient:
(* Show terms computed at a very large index. *)
let index = Big_int.power_int_positive_int 10 200 (* Indeed, this is 10^200. *)
let () = bigshow enum_intlist string_of_intlist index 2
Value #0 is [-91, 83, -78, -59, -21, 45, -97, 79, 40, 50, 30, 46, -80, -45, 57, 70, -35, -71, 47, -29, -78, 58, -68, -12, -47, -32, -16, 36, 57, -51, 19, -33, -58, -37, 56, 38, 62, 2, 66, -65, -34, 36, 28, 46, 44, 53, 6, 10, 26, -54, -18, 35, 32, 40, 49, -2, -33, -42, -12, -26, -18, 8, 33, 41, 1, 29, -26, 8, 29, 5, 28, -7, -5, 32, -10, 17, 10, -9, 17, 5, 19, 21, -15, 20, 14, -18, 8, 3, -16, -4, 7, 8, 11, -6, -3, -1, 6, 2, -7, -2, -2, 1, 0, -1]
Value #1 is [91, 33, 73, 53, 43, 12, -32, 53, -66, -69, -81, -17, -72, -86, -87, -41, 52, 54, 43, -19, -76, 24, -52, -77, -77, 0, -37, 69, 1, -5, -40, 16, -19, 68, 4, 11, -64, 15, -62, -58, 45, -49, -20, 46, -2, -45, -24, 56, -21, 41, -2, -14, 13, -5, 13, 10, 2, 41, -40, 27, -19, -20, -33, 1, -11, 29, -16, -12, -14, -3, 19, 8, 25, -11, -13, -11, 23, -24, -20, -5, -23, 24, -22, 8, -2, 9, 14, -7, 2, 2, 15, 11, -7, -1, 5, 7, 5, 5, 4, 4, 3, -1, 3, -2]
(Notice how these two values, which are adjacent in the enumeration, are significantly different.)

Typical usage

  • Build an enumeration, as shown above.
  • Use a for loop to iterate over indices from 0 to, say 4 000 000.
  • For each index in the loop, generate the corresponding test value, using let test_value = ExEnum.get enum_intlist (Big_int.big_int_of_int index)
  • Instead of printing this value, use it in your own unit test to validate your function.

As a benefit, if a unit test happens to fail, you just need to know the index of failure to be able to rerun the test immediately.

[Collapse]A few more examples

[Collapse]With a record type

We illustrate enumerations over nested, non-recursive types.
Download file: ex_record.ml
(* Consider these two types. *)
type foo = Foo | Bar of int * int | Moo of string

type bar =
    { these: foo ; 
      mist: string ;
      covered: bool list ;
      mountains: int option }
Before anything else, we need functions to print those values.
let string_of_foo = function
  | Foo -> "Foo"
  | Bar (n1, n2) -> Printf.sprintf "Bar (%d, %d)" n1 n2
  | Moo s -> Printf.sprintf "Moo (\"%s\")" s

(* Helper function that builds a string from a list of items *)
let sep map sp l = List.fold_left (fun acu x -> if acu = "" then map x else acu ^ sp ^ (map x)) "" l

let string_of_bar bar =
  Printf.sprintf "\n  { these = %s,\n    mist = \"%s\",\n    covered = [%s],\n    mountains = %s }\n"
    (string_of_foo bar.these) bar.mist (sep string_of_bool ", " bar.covered)
    (match bar.mountains with None -> "None" | Some i -> "Some " ^ (string_of_int i))
Now, let us build the enumerations corresponding to both types.
open ExEnum

(* Building an enumeration for type foo is straightforward. 
 * We just have to read the type definition: foo is a sum type, that is, an union. *)
let enum_foo = 
  union [ single Foo ;
	  map (pair e_int e_int) (fun (a,b) -> Bar (a,b)) ;
	  map e_string (fun s -> Moo s)	]

(* Building an enumeration for type bar is no more difficult. 
 * bar is a record type equivalent to a 4-tuple. *)
let enum_bar =
  map (tuple4 enum_foo e_string (e_list e_bool) (e_option e_int))
    (fun (these, mist, covered, mountains) -> { these ; mist ; covered ; mountains })
Look at the very first terms of the enumeration.
let () = show enum_bar string_of_bar 0 10
Value #0 is 
  { these = Foo,
    mist = "",
    covered = [],
    mountains = None }

Value #1 is 
  { these = Bar (-1, 1),
    mist = "",
    covered = [],
    mountains = None }

Value #2 is 
  { these = Bar (-1, -1),
    mist = "",
    covered = [],
    mountains = None }

Value #3 is 
  { these = Bar (-1, 0),
    mist = "",
    covered = [],
    mountains = Some -1 }

Value #4 is 
  { these = Bar (0, 0),
    mist = "",
    covered = [],
    mountains = Some -1 }

Value #5 is 
  { these = Bar (0, 1),
    mist = "",
    covered = [],
    mountains = Some 1 }

Value #6 is 
  { these = Bar (0, -1),
    mist = "",
    covered = [],
    mountains = Some 1 }

Value #7 is 
  { these = Moo (""),
    mist = "",
    covered = [],
    mountains = Some 0 }

Value #8 is 
  { these = Bar (1, -1),
    mist = "",
    covered = [],
    mountains = Some 0 }

Value #9 is 
  { these = Bar (1, 0),
    mist = "",
    covered = [],
    mountains = None }
And look at a very large index.
let index = Big_int.power_int_positive_int 10 100 (* This is 10^100. *)
let () = bigshow enum_bar string_of_bar index 2
Value #0 is 
  { these = Moo ("tf6_fB26=)gsFJ*=2&l5x:8T"),
    mist = "Klb`|XvA;#AIQtKs5iAdDf|g",
    covered = [true, false, true, false, true, true, false, false, false, false, true, true, true, true, true, false, false, false, true, true, true, true],
    mountains = Some 5 }

Value #1 is 
  { these = Moo ("mC+3tX'V=,(?X|wQSCa6m',^"),
    mist = "GFsne|v>,?7@vU@krz_|?aAQ",
    covered = [false, true, true, false, false, false, true, true, false, false, false, true, true, true, false, true, false, false, false, false, false, true],
    mountains = Some 5 }

[Collapse]With a recursive type

Consider the following familiar recursive type:
Download file: ex_recursive.ml
type term = Var of string | App of term * term | Lambda of string * term
(We omit the definition of a printing function).
We now build a recursive enumeration:
open ExEnum
let e_vars = from_list ~name:"variables" ["x" ; "y" ; "u" ; "v"]

let rec e_term =
  lazy begin
    (* Explicit recursion *)
    let r_term = pay e_term in
    union 
      [ map e_vars (fun v -> Var v) ;
	map (pair r_term r_term) (fun (t1,t2) -> App (t1, t2)) ;
	map (pair e_vars r_term) (fun (x,t) -> Lambda (x,t)) ]
  end

let e_term = Lazy.force e_term
The main point is to define a lazy enumeration and to invoke pay before using the recursive enumeration.
(* Let us print the first terms of the enumeration. *)
let () = show e_term to_string 0 16
Value #0 is x
Value #1 is v
Value #2 is u
Value #3 is y
Value #4 is (x x)
Value #5 is (v y)
Value #6 is (u u)
Value #7 is (y v)
Value #8 is (fun x -> v)
Value #9 is (fun y -> x)
Value #10 is (fun u -> y)
Value #11 is (y u)
Value #12 is (x u)
Value #13 is (v v)
Value #14 is (fun u -> x)
Value #15 is (fun v -> y)
let index = Big_int.power_int_positive_int 10 400 (* This is 10^400. *)
let () = bigshow e_term to_string index 2
Value #0 is ((((((fun x -> (v v)) ((fun v -> y) (fun x -> x))) ((fun v -> (x x))[...skip 20 more lines...]
Value #1 is (((((((v u) y) ((fun v -> v) (y y))) (((y u) (x v)) ((fun x -> u) (u v))))[...skip...]
Noticeably, these terms are computed very quickly on a seven-year old laptop.

[Collapse]Developpers'corner: _oasis file used to build exenum

This is an interesting excerpt of the _oasis file that I use to build library exenum:
Download file: _oasis
## Exported library
Library "exenum"
  Path:            src/
  Modules:         ExEnum
  BuildDepends:    exenum.internals
  Install:         true

## This library packs internal modules, so that they may not conflict with anything else.
Library "exenum_internals"
  Path:	           src/
  Modules:         Convenience
  InternalModules: Exen, ExtArray, Parts, Shuffle
  FindlibParent:   exenum
  FindlibName:     internals
  Pack:            true
The benefits of splitting the library in two parts are the following:
  • Only one module is visible at top-level: ExEnum.
  • The internal modules, which should not normally be used directly by the user, are packed in a single library file: exenum_internals.
  • The module namespace is not polluted by these internal modules. There is no problem if the user has a module Shuffle of his own. (No need to prepend every library module with a given prefix, e.g. Exenum_Shuffle, Exenum_Parts, ...).
  • Note also that module Convenience might be of interest for some users. Its interface is therefore installed together with the main library. This internal-but-exported module is accessible by (for instance): open Exenum_internals.Convenience.
  • When developping the library itself, internal modules are made accessible by: open Exenum_internals.