Tools for generating workloads on Neo4j.
I use this for benchmarking and load testing Neo4j instances. It provides a framework where you can either design your own mixed read/write workload from strategies provided, or design your own reads/writes to execute against a database, while keeping track of execution stats.
You can run workloads timed (i.e. for 5,000 ms) or numbered (i.e. 5,000 runs).
Usage: run-workload.js -p password
[-a address]
[-u username]
[-n hits] how many total queries to run
[--ms milliseconds] how many milliseconds to test for
[--workload /path/to/workload.json] probability table spec
[--query CYPHER_QUERY] single cypher query to run
[--schema /path/to/schema.json] schema for generated records (only used with
--query)
[--batchsize [1000]] number of records from schema to generate per batch
[--concurrency c] how many concurrent queries to run (default: 10)
[--checkpoint cn] how often to print results in milliseconds (default: 5000)
[--fail-fast] if specified, the work will stop after encountering one
failure.
You may only specify one of the options --n or --ms.
You may only specify one of the options --workload or --query
Options:
--help Show help [boolean]
--version Show version number [boolean]
-a address to connect to [default: "localhost"]
-u username [default: "neo4j"]
-p password [required]
-d database
--schema batch schema file
--batchsize number of records per batch, usable only with schema
-n number of hits on the database
--ms number of milliseconds to execute
--workload absolute path to JSON probability table/workload
--query Cypher query to run
--concurrency [default: 10]
--checkpoint [default: 5000]
Examples:
run-workload.js -a localhost -u neo4j -p Run 10 hits on the local database
secret -n 10
Simply pass any arguments the command recognizes to the docker container.
docker run --tty --interactive mdavidallen/graph-workload:latest -a my-neo4j-host.com -u neo4j -p password 2>&1
yarn install
node src/run-workload.js -a localhost -u neo4j -p password
See the workloads
directory for the format of the probability table.
You can use the script npm run graph-workload
as a synonym for running the index.js file, but keep in mind npm requires an extra --
argument prior to passing
program arguments, as in, npm run graph-workload -- --n 20
npm run graph-workload -- -a localhost -u neo4j -p admin --query 'Unwind range(1,1000000) as id create (n);' -n 50 --concurrency 4
Fake/mock data can be generated with functions from fakerjs.
Using this technique you can generate your own data and create custom load patterns. Similar to other Neo4j utilities, the batch will be present in the query form: "UNWIND batch AS event".
npm run graph-workload -- -a localhost -u neo4j -p admin \
--query 'CREATE (t:Test) SET t += event' \
--batchsize 1000 \
--schema /absolute/path/to/schemas/myschema.json
See src/schemas/user.json
as an example of a schema you can use in this way. Keys are field names to generate, values are the faker functions used to populate that field.
If you use the --query
option, you may also use --mode READ
or WRITE
. This enables the program
to use explicit read or write transactions, so that when queries are sent to a cluster, they are routed
appropriately according to the Neo4j routing rules.
As of Neo4j 4.0, sessions support multi-database. Use the -d
or --database
argument to direct
where the workload should go. By default, the workload goes to the default database (usually neo4j
).
Example:
npm run graph-workload -- -a neo4j://my-cluster -u neo4j -p admin -d mydb
yarn run test
docker build -t mdavidallen/graph-workload:latest -f Dockerfile .
- Stress tester has a number of 'read strategies' and 'write strategies'
- There is a probability table; the stress tester rolls random numbers and picks a strategy based on the probability table.
- By tweaking which strategies are available and what their probability is, you can generate whichever kind of load you like
- You can write a new strategy to simulate any specific kind of load you like.
See workload.js for details.