Try it online at doc.sherlocode.com!
Sherlodoc is a search engine for OCaml documentation (inspired by Hoogle), which allows you to search through OCaml libraries by names and approximate type signatures:
- Search by name: `list map`
- Search inside documentation comments: `raise Not_found`
- Fuzzy type search is introduced with a colon, e.g. `: map -> list`
- Search by name and type with a colon separator: `Bogue : Button.t`
- An underscore `_` can be used as a wildcard in type queries: `(int -> _) -> list -> _`
- Type search supports products and reordering of function arguments: `array -> ('a * int -> bool) -> array`
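To make the query syntax concrete, here is a minimal OCaml sketch of how a query could be split at the colon into a name part and an optional type part. The function `split_query` is hypothetical and written only for illustration; it is not sherlodoc's actual parser.

```ocaml
(* Hypothetical helper (not sherlodoc's real parser): split a query at the
   first colon into a name part and an optional type part. *)
let split_query (query : string) : string * string option =
  match String.index_opt query ':' with
  | None -> (String.trim query, None)
  | Some i ->
      let name = String.sub query 0 i in
      let typ = String.sub query (i + 1) (String.length query - i - 1) in
      (String.trim name, Some (String.trim typ))
```

Under this sketch, `list map` has no type part, `: map -> list` has an empty name part, and `Bogue : Button.t` has both.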
First, install sherlodoc and odig:
$ opam pin add 'https://github.com/art-w/sherlodoc.git' # optional
$ opam install sherlodoc odig
Odig can generate the odoc documentation of your current switch with:
$ odig odoc # followed by `odig doc` to browse your switch documentation
Which sherlodoc can then index to create a search database:
# name your sherlodoc database
$ export SHERLODOC_DB=/tmp/sherlodoc.marshal
# if you are using OCaml 4, we recommend the `ancient` database format:
$ opam install ancient
$ export SHERLODOC_DB=/tmp/sherlodoc.ancient
# index all odoc files generated by odig for your current switch:
$ sherlodoc index $(find $OPAM_SWITCH_PREFIX/var/cache/odig/odoc -name '*.odocl' | grep -v __)
Enjoy searching from the command-line or run the webserver:
$ sherlodoc search "map : list"
$ sherlodoc search # interactive cli
$ opam install dream
$ sherlodoc serve # webserver at http://localhost:1234
The different commands support a `--help` argument for more details/options.
In particular, sherlodoc supports three different file formats for its database, which can be specified either in the filename extension or through the `--db-format=` flag:
- `ancient` for fast database loading using mmap, but is only compatible with OCaml 4.
- `marshal` for when ancient is unavailable, with slower database opening.
- `js` for integration with odoc static html documentation for client-side search without a server.
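As an illustration of selecting the format from the filename extension, the following OCaml sketch maps an extension to one of the three formats. The type `db_format` and the function `db_format_of_filename` are assumed names for this example and may not match sherlodoc's own code.

```ocaml
(* Hypothetical names for illustration; sherlodoc's real code may differ. *)
type db_format = Ancient | Marshal | Js

(* Guess the database format from the filename extension, if recognised. *)
let db_format_of_filename (filename : string) : db_format option =
  match Filename.extension filename with
  | ".ancient" -> Some Ancient
  | ".marshal" -> Some Marshal
  | ".js" -> Some Js
  | _ -> None
```

When the extension is not recognised, a caller would presumably fall back on an explicit format flag.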
Odoc 2.4.0 adds a search bar inside the statically generated html documentation. Integration with dune is in progress; you can try it inside a fresh opam switch with the commands below (warning: this will recompile any installed package that depends on dune!):
$ opam pin https://github.com/emileTrotignon/dune.git#search-odoc-new
$ dune build @doc # in your favorite project
Otherwise, manual integration with odoc requires adding the flags `--search-uri sherlodoc.js --search-uri db.js` to every call of `odoc html-generate` to activate the search bar. You'll also need to generate a search database `db.js` and provide the `sherlodoc.js` dependency (a version of the sherlodoc search engine with odoc support, compiled to JavaScript):
$ sherlodoc index --db=_build/default/_doc/_html/YOUR_LIB/db.js \
$(find _build/default/_doc/_odocls/YOUR_LIB -name '*.odocl' | grep -v __)
$ sherlodoc js > _build/default/_doc/_html/sherlodoc.js
The sherlodoc database uses Suffix Trees to search for substrings in value names, documentation and types. During indexation, the suffix trees are compressed to state machine automata. The children of every node are also sorted, such that a sub-tree can be used as a priority queue during search enumeration.
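The idea of substring search through suffixes can be sketched with a naive, uncompressed suffix trie in OCaml. This is only an illustration of the principle: it does not perform the compression to automata described above, and all the names (`trie`, `add`, `search`) are invented for this example.

```ocaml
module CharMap = Map.Make (Char)

(* A naive suffix trie: each node lists the entries having a suffix whose
   path goes through this node. Children are kept sorted by the map. *)
type trie = { entries : string list; children : trie CharMap.t }

let empty = { entries = []; children = CharMap.empty }

(* Insert the characters of [s] starting at position [i], tagging every
   visited node with [entry]. *)
let rec insert_from (s : string) (i : int) (entry : string) (node : trie) : trie =
  let entries =
    if List.mem entry node.entries then node.entries else entry :: node.entries
  in
  if i = String.length s then { node with entries = entries }
  else
    let child =
      match CharMap.find_opt s.[i] node.children with
      | Some c -> c
      | None -> empty
    in
    let child = insert_from s (i + 1) entry child in
    { entries; children = CharMap.add s.[i] child node.children }

(* Index every suffix of [name]: any substring of [name] is then a prefix
   of some path starting at the root. *)
let add (name : string) (t : trie) : trie =
  let rec loop i t =
    if i > String.length name then t
    else loop (i + 1) (insert_from name i name t)
  in
  loop 0 t

(* Follow [query] from the root; the node reached lists all the matches. *)
let search (query : string) (t : trie) : string list =
  let rec walk i node =
    if i = String.length query then List.sort compare node.entries
    else
      match CharMap.find_opt query.[i] node.children with
      | None -> []
      | Some child -> walk (i + 1) child
  in
  walk 0 t
```

For example, after adding `map`, `filter_map` and `fold_left`, searching for `ap` reaches a node tagged with both `map` and `filter_map`. A real index would store richer entries than bare names and order them by score rather than alphabetically.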
To rank the search results, sherlodoc computes a static evaluation of each candidate during indexation. This static scoring biases the search to favor short names, short types, the presence of documentation, etc. When searching, a dynamic evaluation dependent on the user query is used to adjust the static ordering of the results:
- How similar is the result name to the search query? (to e.g. prefer results which respect the case: `map` vs `Map`)
- How similar are the types? (using a tree diff algorithm, as for example `('a -> 'b -> 'a) -> 'a -> 'b list -> 'a` and `('a -> 'b -> 'b) -> 'a list -> 'b -> 'b` are isomorphic yet point to `fold_left` and `fold_right` respectively)
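The combination of a static score with a query-dependent adjustment can be sketched as follows. The record `entry`, the cost functions and every weight are made up for this example; sherlodoc's real scoring is more elaborate, and the tree diff on types is not shown here.

```ocaml
(* Hypothetical entry type and weights, for illustration only. *)
type entry = { name : string; typ : string; has_doc : bool }

(* Static cost, computable at indexing time: lower is better.
   Favors short names, short types and documented values. *)
let static_cost (e : entry) : int =
  String.length e.name + String.length e.typ + (if e.has_doc then 0 else 10)

(* [contains ~sub s] is true when [sub] occurs somewhere in [s]. *)
let contains ~(sub : string) (s : string) : bool =
  let n = String.length sub and m = String.length s in
  let rec at i = i + n <= m && (String.sub s i n = sub || at (i + 1)) in
  at 0

(* Dynamic cost, dependent on the query: an exact-case match is free,
   a case-insensitive match is penalised, anything else heavily so. *)
let dynamic_cost ~(query : string) (e : entry) : int =
  if contains ~sub:query e.name then 0
  else if
    contains ~sub:(String.lowercase_ascii query) (String.lowercase_ascii e.name)
  then 5
  else 100

(* Order the candidates by total cost, best first. *)
let rank ~(query : string) (entries : entry list) : entry list =
  let cost e = static_cost e + dynamic_cost ~query e in
  List.sort (fun a b -> compare (cost a) (cost b)) entries
```

With these weights, the query `map` ranks an entry named `map` before one named `Map`, and the query `Map` reverses the order, all else being equal.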
For fuzzy type search, sherlodoc aims to provide good results without requiring a precise search query, on the basis that the user doesn't know the exact type of the things they are looking for (e.g. `string -> file_descr` is incomplete but should still point in the right direction). In particular when exploring a package documentation, the common question "how do I produce a value of type `foo`" can be answered with the query `: foo` (and "which functions consume a value of type `bar`" with `: bar -> _`). This should also work when the type can only be produced indirectly through a callback (for example `: Eio.Switch.t` has no direct constructor). To achieve this, sherlodoc performs a type decomposition based on the polarity of each term: a value produced by a function is said to be positive, while an argument consumed by a function is negative. This simplifies away the tree shape of types, allowing their indexation in the suffix trees. The cardinality of each value type is also indexed, to e.g. differentiate between `list -> list` and `list -> list -> list`.
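The polarity decomposition can be sketched in OCaml on a simplified type representation. The type `typ`, the functions `decompose` and `occurrences`, and the choice to give type arguments the polarity of their constructor are assumptions of this example, not sherlodoc's actual implementation; type variables are treated as wildcards.

```ocaml
(* Simplified type syntax, assumed for this example. *)
type typ =
  | Constr of string * typ list (* e.g. [int], or ['a list] with its argument *)
  | Arrow of typ * typ (* argument -> result *)
  | Tuple of typ list
  | Any (* wildcard [_] or a type variable *)

type polarity = Pos | Neg

let flip = function Pos -> Neg | Neg -> Pos

(* Flatten a type into (polarity, constructor name) pairs: the result of a
   function keeps the current polarity, its argument gets the opposite one. *)
let rec decompose (pol : polarity) (t : typ) : (polarity * string) list =
  match t with
  | Any -> []
  | Constr (name, args) -> (pol, name) :: List.concat_map (decompose pol) args
  | Tuple ts -> List.concat_map (decompose pol) ts
  | Arrow (arg, ret) -> decompose (flip pol) arg @ decompose pol ret

(* Count how many times each (polarity, name) pair occurs. *)
let occurrences (t : typ) : ((polarity * string) * int) list =
  decompose Pos t
  |> List.sort compare
  |> List.fold_left
       (fun acc key ->
         match acc with
         | (k, n) :: rest when k = key -> (k, n + 1) :: rest
         | _ -> (key, 1) :: acc)
       []
  |> List.rev
```

In this sketch, `list -> list` yields one positive and one negative `list`, while `list -> list -> list` yields one positive and two negative ones, which is the cardinality distinction mentioned above. A function shaped like `(Eio.Switch.t -> _) -> _` yields a positive `Eio.Switch.t`, because the argument of an argument flips polarity twice; this is how a type that is only reachable through a callback can still match the query `: Eio.Switch.t`.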
While the polarity search results are satisfying, sherlodoc offers very limited support for polymorphic variables, type aliases and true type isomorphisms. You should check out the extraordinary Dowsing project for this!
And if you speak French, a more detailed presentation of Sherlodoc (and Sherlocode) was given at the OCaml Users in PariS (OUPS) in March 2023.