[Observation]: Workers on the same process report Ready every time a different worker completes. #2911

Closed
1 task done
yacman opened this issue Nov 18, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@yacman
yacman commented Nov 18, 2024

Version

v5.26.2

Platform

NodeJS

What happened?

We've added metric collection to our services and have been observing the worker.on('ready') event.

https://github.com/taskforcesh/bullmq/blob/8204ea3635b3042e8537c1d3f92584c572735a31/src/classes/worker.ts#L137C1-L143C1

While listening to this event, we can observe that if we have a worker formation like:

Pod A | Ready Count
WorkerA | 1
WorkerB | 1
WorkerC | 1

Pod B | Ready Count
WorkerA | 1
WorkerB | 1
WorkerC | 1

After a single job is processed on any worker (say, WorkerC on Pod A), the counts become:

Pod A | Ready Count
WorkerA | 2
WorkerB | 2
WorkerC | 1

Pod B | Ready Count
WorkerA | 1
WorkerB | 1
WorkerC | 1

The other workers' ready event is triggered when WorkerC completes its job.

Are these counts expected? If the events are ok, the question is why would other workers report becoming unblocked?

How to reproduce.

No response

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@yacman yacman added the bug Something isn't working label Nov 18, 2024
@manast
Contributor
manast commented Nov 18, 2024

This "ready" event is emitted when the "blocking" (dedicated) Redis connection is ready. The event is actually generated by IORedis, signalling that the connection has been correctly established and is ready to accept Redis commands:

this.blockingConnection.on('ready', () =>

So in your case I am not sure why this happens, but there are situations where we need to force-disconnect the blocking connection and reconnect, which would generate a new "ready" event:

bclient.disconnect(!this.closing);

Having said that, I suspect you are using this event for something else, as this event, as is, is not particularly useful. If you describe your use case we may be able to find a better way to do it than using this event.
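To illustrate the point above, here is a minimal sketch of how a metrics listener could separate the first "ready" from later ones, which per this explanation would be reconnections of the blocking connection rather than new workers. The `ReadySource` interface and the `trackReady` helper are hypothetical names introduced for this example; a BullMQ Worker is only assumed to fit the `on('ready', ...)` shape described in this thread.

```typescript
// Minimal structural type: anything exposing on('ready', listener).
// Assumption: a BullMQ Worker fits this shape, as used in the issue above.
interface ReadySource {
  on(event: "ready", listener: () => void): unknown;
}

interface ReadyStats {
  readyCount: number; // total 'ready' events seen so far
  reconnectCount: number; // 'ready' events after the first one
}

// Hypothetical helper: attaches a listener and returns a live stats object.
function trackReady(source: ReadySource): ReadyStats {
  const stats: ReadyStats = { readyCount: 0, reconnectCount: 0 };
  source.on("ready", () => {
    stats.readyCount += 1;
    // Every 'ready' after the first means the connection was re-established.
    stats.reconnectCount = stats.readyCount - 1;
  });
  return stats;
}
```

Read this way, a Ready Count of 2 for WorkerA would suggest one reconnection of its blocking connection, not a second worker instance.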

@yacman
Author
yacman commented Nov 18, 2024

Thanks for the details on where the event originates. We want to track the number of worker threads that are connected and working across all pods, broken down by Worker and Queue. For example, if there are 10 instances of Worker A on Pod A and 5 instances of Worker A on Pod B, we would observe two values, 10 and 5. If 2 of the workers on Pod B had problems with their Redis connections (memory overflow, CPU limiting, timeouts, missed heartbeats), the values would start reporting as 10 and 3.

We are looking to alert when worker instance counts go outside of a boundary.

@manast
Contributor
manast commented Nov 18, 2024

What about the "getWorkers" and "getWorkersCount" APIs, are they not useful for your case?
https://api.docs.bullmq.io/classes/v5.Queue.html#getWorkers

In BullMQ you can also set a specific name for your workers if you want and you will get this name back when calling "getWorkers": https://api.docs.bullmq.io/interfaces/v5.WorkerOptions.html#name
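As a rough sketch of how the result of "getWorkers" could be turned into per-worker counts: the `WorkerSource` interface, `countBy`, `byName`, and `workerCounts` below are hypothetical names for this example. It assumes each returned entry is a flat string map and that a `name` field identifies the worker; the exact field that carries the worker name should be checked against the actual output for your BullMQ version.

```typescript
// One entry of the list returned by getWorkers(): a flat map of string fields.
type WorkerInfo = Record<string, string | undefined>;

// Minimal structural type for the queue; BullMQ's Queue is assumed to fit it.
interface WorkerSource {
  getWorkers(): Promise<WorkerInfo[]>;
}

// Count entries per group; `keyOf` decides how entries are grouped.
function countBy(
  workers: WorkerInfo[],
  keyOf: (w: WorkerInfo) => string,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const w of workers) {
    const key = keyOf(w);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Assumption: the worker name is exposed in a `name` field.
const byName = (w: WorkerInfo): string => w["name"] ?? "unknown";

// Fetch the worker list once and aggregate it by name.
async function workerCounts(queue: WorkerSource): Promise<Map<string, number>> {
  return countBy(await queue.getWorkers(), byName);
}
```

Passing a different `keyOf` function allows grouping by another field of the entry, for example one that identifies the pod, if such a field is available.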

@yacman
Author
yacman commented Nov 18, 2024

Thank you, these are definitely the aggregations we are looking for.

Will calling these operations on a timer, say every 5 seconds, be a problem?

@manast
Contributor
manast commented Nov 18, 2024

Probably not a problem, though it depends on how many workers you have. You can run some benchmarks to see how long the call takes. I would try to increase the interval as much as possible; every 30 seconds would be even better.
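A minimal sketch of the polling approach discussed here, using the 30 second interval suggested above as the default. `CountSource`, `Bounds`, `checkBounds`, and `startMonitor` are hypothetical names for this example; the only BullMQ call assumed is `getWorkersCount()` on the queue, as mentioned earlier in the thread.

```typescript
// Minimal structural type; BullMQ's Queue is assumed to fit it.
interface CountSource {
  getWorkersCount(): Promise<number>;
}

interface Bounds {
  min: number;
  max: number;
}

type BoundsStatus = "ok" | "below" | "above";

// Pure check so the alerting rule can be tested without Redis.
function checkBounds(count: number, bounds: Bounds): BoundsStatus {
  if (count < bounds.min) return "below";
  if (count > bounds.max) return "above";
  return "ok";
}

// Polls on a fixed interval and returns a function that stops the polling.
function startMonitor(
  queue: CountSource,
  bounds: Bounds,
  onAlert: (status: BoundsStatus, count: number) => void,
  intervalMs = 30_000,
): () => void {
  let running = false; // skip a tick if the previous call has not finished
  const timer = setInterval(async () => {
    if (running) return;
    running = true;
    try {
      const count = await queue.getWorkersCount();
      const status = checkBounds(count, bounds);
      if (status !== "ok") onAlert(status, count);
    } catch {
      // A failed poll is ignored here; real code would likely log it.
    } finally {
      running = false;
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

The `running` flag is a design choice for slow calls: if a poll takes longer than the interval, the next tick is skipped instead of stacking requests on Redis.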

@yacman
Author
yacman commented Nov 18, 2024

Thank you for the time and feedback!

@yacman yacman closed this as completed Nov 18, 2024