[Observation]: Workers on the same process report Ready every time a different worker completes. #2911
Comments
This "ready" event is emitted when the "blocking" (dedicated) Redis connection is ready. The event is actually generated by IORedis, signalling that the connection has been correctly established and is ready to accept Redis commands (line 323 in 8204ea3).
In your case I am not sure why this happens, but there are situations where we need to force-disconnect the blocking connection and reconnect, which generates a new "ready" event (line 698 in 8204ea3).
Having said that, I suspect you are using this event for something else, since on its own it is not particularly useful. If you describe your use case, we may be able to find a better way to do it than using this event.
Thanks for the details on where the event originates. We are looking to identify the number of worker threads that are globally connected and working, broken down by Worker and Queue. For example, if there are 10 instances of Worker A on Pod A and 5 instances of Worker A on Pod B, we would observe two values, 10 and 5. If 2 workers on Pod B had issues with their Redis connections (for example memory overflow, or something limiting CPU and causing timeouts or missed heartbeats), the values would start reporting as 10 and 3. We are looking to alert when worker instance counts go outside of a boundary.
What about the "getWorkers" and "getWorkersCount" APIs, are they not useful for your case? In BullMQ you can also set a specific name for your workers if you want, and you will get this name back when calling "getWorkers": https://api.docs.bullmq.io/interfaces/v5.WorkerOptions.html#name
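As a rough illustration of the aggregation described above, here is a minimal sketch that counts entries per label. It assumes the result of a "getWorkers"-style call is an array of flat string maps (one per Redis client connection) with fields such as `name` and `addr`; the helper `countWorkersBy`, the sample data, and grouping by the host part of `addr` are all hypothetical and only for illustration. The exact contents of `name` may differ between BullMQ versions, so check what your version returns before relying on it.

```typescript
// One entry of a getWorkers()-style result: a flat map of string fields
// describing a single Redis client connection (assumed shape).
type WorkerEntry = Record<string, string>;

// Hypothetical helper: count entries per label, where the label is
// extracted by a caller-supplied function.
function countWorkersBy(
  entries: WorkerEntry[],
  labelOf: (entry: WorkerEntry) => string | undefined,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    // Entries without a usable label are grouped under "unknown".
    const label = labelOf(entry) ?? "unknown";
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample data standing in for a real getWorkers() result.
const sampleWorkers: WorkerEntry[] = [
  { name: "WorkerA", addr: "10.0.0.1:50000" },
  { name: "WorkerA", addr: "10.0.0.2:50001" },
  { name: "WorkerB", addr: "10.0.0.1:50002" },
];

// Count by worker name.
const byName = countWorkersBy(sampleWorkers, (e) => e.name);

// Count by client host (assumption: the host part of `addr` identifies a pod,
// which may not hold behind NAT or a proxy).
const byHost = countWorkersBy(sampleWorkers, (e) => e.addr?.split(":")[0]);
```

In real use the entries would come from something like `await queue.getWorkers()` instead of the sample array.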
Thank you, these are definitely the aggregations we are looking for. Will calling these operations on a timer, say every 5 seconds, be a problem?
Probably not a problem, but it depends on how many workers you have. You can run some benchmarks to see how long the call takes. I would try to increase the interval as much as possible; every 30 seconds would be even better.
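A timer-based check along these lines could be sketched as follows. This is only a sketch under assumptions: `startWorkerCountPolling`, `isOutOfBounds`, and the `Bounds` type are hypothetical names, the count source is injected as a function so the snippet does not depend on a Redis connection, and the 30 second default simply follows the suggestion above.

```typescript
// Allowed range for the number of connected workers (hypothetical type).
interface Bounds {
  min: number;
  max: number;
}

// True when the observed count falls outside the allowed range.
function isOutOfBounds(count: number, bounds: Bounds): boolean {
  return count < bounds.min || count > bounds.max;
}

// Polls `fetchCount` every `intervalMs` milliseconds and calls `onAlert`
// when the count is out of bounds. Returns a function that stops polling.
function startWorkerCountPolling(
  fetchCount: () => Promise<number>,
  bounds: Bounds,
  onAlert: (count: number) => void,
  intervalMs: number = 30000,
): () => void {
  let inFlight = false;
  const timer = setInterval(async () => {
    // Skip this tick if the previous call has not finished yet,
    // so slow calls do not pile up.
    if (inFlight) return;
    inFlight = true;
    try {
      const count = await fetchCount();
      if (isOutOfBounds(count, bounds)) onAlert(count);
    } catch (err) {
      // A failed fetch is only logged here; whether it should also
      // raise an alert is a design decision left open.
      console.error("worker count check failed", err);
    } finally {
      inFlight = false;
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

With BullMQ, `fetchCount` could be something like `() => queue.getWorkersCount()`. Measuring how long that call takes in your environment, as suggested above, is a reasonable way to pick `intervalMs`.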
Thank you for the time and feedback!
Version
v5.26.2
Platform
NodeJS
What happened?
We've added metric collection to our services and have been observing the worker.on('ready') event.
https://github.com/taskforcesh/bullmq/blob/8204ea3635b3042e8537c1d3f92584c572735a31/src/classes/worker.ts#L137C1-L143C1
While listening to this event, we can observe the following. Suppose we have a worker formation like:

| Pod A | Ready Count |
| --- | --- |
| WorkerA | 1 |
| WorkerB | 1 |
| WorkerC | 1 |

| Pod B | Ready Count |
| --- | --- |
| WorkerA | 1 |
| WorkerB | 1 |
| WorkerC | 1 |

After a single job is processed on any worker (say, WorkerC on Pod A), the counts become:

| Pod A | Ready Count |
| --- | --- |
| WorkerA | 2 |
| WorkerB | 2 |
| WorkerC | 1 |

| Pod B | Ready Count |
| --- | --- |
| WorkerA | 1 |
| WorkerB | 1 |
| WorkerC | 1 |

The other workers' ready events are triggered when WorkerC completes its job.
Are these counts expected? If the events are OK, why would the other workers report becoming unblocked?
How to reproduce
No response
Relevant log output
No response