
Frequently Asked Questions

How does sidekiq compare to resque or delayed_job?

Essentially all three perform the same task, executing background jobs, but go about it differently:

  1. Delayed Job uses your SQL database for storage and processes jobs in a single-threaded process. It's simple to set up but the performance and scalability aren't great. I would not use delayed_job for systems processing 100,000s of jobs/day.
  2. Resque uses Redis for storage and processes messages in a single-threaded process. The redis requirement makes it a little more difficult to set up, compared to delayed_job, but redis is far better as a queue than a SQL database. Being single-threaded means that processing 20 jobs in parallel requires 20 processes, which can take a lot of memory.
  3. Sidekiq uses redis for storage and processes jobs in a multi-threaded process. It's just as easy to set up as resque but more efficient in terms of raw processing speed. Your worker code has to be thread-safe.

What kind of performance can I expect to see with sidekiq?

Performance is too variable to give a simple answer. Most server-side work is dominated by I/O time and Sidekiq shines when lots of I/O is present. You should see an order of magnitude improvement or more with one Sidekiq process vs one Resque or Delayed::Job process.

One large customer I'm aware of is processing ~500,000 jobs/min with Sidekiq with another customer reporting a peak of ~50,000 jobs/sec across several Redis shards. Note that on dedicated hardware, Redis should be able to handle anywhere from 5,000 to 20,000 jobs/sec depending on features in use. After that, you'll need to shard your application to use multiple Redis instances or use multiple independent applications.

One busy Redis (output from redis-cli info):

redis_version:4.0.9
redis_mode:standalone
os:Linux 4.15.0-1009-aws x86_64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:7.4.0
uptime_in_seconds:3279660
uptime_in_days:37
hz:10

# Clients
connected_clients:3005
blocked_clients:125

# Memory
used_memory:396704632
used_memory_human:378.33M
used_memory_rss:738783232
used_memory_rss_human:704.56M
used_memory_peak:8213446064
used_memory_peak_human:7.65G
used_memory_peak_perc:4.83%
used_memory_overhead:105619133
used_memory_startup:782448
used_memory_dataset:291085499
used_memory_dataset_perc:73.52%
total_system_memory:32655642624
total_system_memory_human:30.41G
maxmemory:26214400000
maxmemory_human:24.41G
maxmemory_policy:noeviction
mem_fragmentation_ratio:1.86
mem_allocator:jemalloc-3.6.0

# Stats
total_connections_received:577443
total_commands_processed:163849263454
instantaneous_ops_per_sec:45476
total_net_input_bytes:20173258216040
total_net_output_bytes:24490441661513
instantaneous_input_kbps:6749.39
instantaneous_output_kbps:7769.90
rejected_connections:0
keyspace_hits:10854195921
keyspace_misses:3763186978
pubsub_channels:1
pubsub_patterns:331
latest_fork_usec:20204

# Keyspace
db0:keys=266128,expires=261259,avg_ttl=153283764

Wouldn't it be awesome if Sidekiq supported {MongoDB, postgresql, mysql, SQS, ...} for persistence?

Not really. Redis provides me with an efficient set of data structures to build functionality on top of. I'd need to abstract those structures to an API, write adapters for each system, test on various versions and then document functional and performance limitations of each system. Just so some small percentage of my user base can avoid Redis. Sidekiq uses all of the data structures Redis provides: lists, sorted sets, hashes.

If you want a queueing system that uses X, use a queueing system that uses X! Sidekiq's mantra is simple and efficient. Redis is both. Abstracting data storage is neither.

Why am I seeing a lot of "Can't find ModelName with ID=12345" errors with Sidekiq?

Your client is creating the Model instance within a transaction and pushing a job to Sidekiq. Sidekiq is trying to execute your job before the transaction has actually committed. Use Rails's after_commit :on => :create hook or move the job creation outside of the transaction block.
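For example, a minimal sketch of the after_commit approach (the Order model and HardJob class are illustrative names):

class Order < ApplicationRecord
  # Enqueue only after the INSERT has committed, so the job can find the row.
  after_commit :enqueue_hard_job, on: :create

  private

  def enqueue_hard_job
    HardJob.perform_async(id)
  end
end

Alternatively, call perform_async after the transaction block has returned.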

How can I process a certain queue in serial?

You can't, by design. Sidekiq is designed for asynchronous processing of jobs that can be completed in isolation and independent of each other. Jobs will be popped off of Redis in the order in which they were pushed but there's no guarantee that Job #1 will execute fully before Job #2 is started.

If you need serial execution, you should look into other systems which give those types of guarantees.

Note that you can dedicate a Sidekiq process to a single queue and run it with a concurrency of 1. This will give you serial execution, but it's a hack.
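For example, a minimal sketch of that hack (the SerialJob class and the serial queue name are illustrative):

class SerialJob
  include Sidekiq::Job
  # Route these jobs to a queue consumed only by a single-threaded process.
  sidekiq_options queue: "serial"

  def perform(record_id)
    # work that must not run concurrently with other SerialJob instances
  end
end

Then start a dedicated process with sidekiq -q serial -c 1 so only one job from that queue runs at a time.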

Also note that you can use third-party extensions for Sidekiq to achieve this goal.

Why don't CSS/JS/IMG assets load properly when I go to Sidekiq's Web UI?

Issue with Static Asset Serving

This is always a web server configuration issue. Your web server config probably has special rules for serving static assets that don't forward the request to Rails when the asset is missing from the filesystem, returning a 404 instead. For example:

location ~ /assets/  {
  # serve from disk; return a 404 if the file is not found
  try_files $uri =404;
}

This searches the filesystem for any URI that matches /assets/, which is fine for Rails assets like /assets/application.css that live in {Rails.root}/public/assets, but it also matches requests for Sidekiq's assets under /sidekiq/assets/. Since Sidekiq's assets live inside the gem, the filesystem check fails and returns a 404. The fix is simple:

location ~ ^/assets/  {
  # serve from disk; return a 404 if the file is not found
  try_files $uri =404;
}

Use ^ to limit the scope of the match to top-level assets only.

Issue with X-SendFile

Rack uses the X-SendFile header to serve Sidekiq's web UI assets out of the gem. Since your gem installation is typically located outside of your deployed project root and some web servers do not serve files outside of the project root by default for security reasons, you may need to tweak your web server configuration.

For example, in Apache you would need something like the following:

XSendFile on
XSendFilePath /mnt/my_project-production/shared/bundle/jruby/1.9/gems/sidekiq-2.5.3/web/assets/javascripts/vendor/

NB: the asset paths will change with each gem upgrade, so you'll need to remember to update your server configuration whenever you update your Sidekiq installation or use a tool that can automate the process for you.

X-SendFile on Heroku

If you are having issues with CSS/images loading in the Sidekiq Web UI and you deploy your app to Heroku, you will need to set the following in your production.rb environment config file.

config.action_dispatch.x_sendfile_header = 'X-Accel-Redirect'

Heroku uses Nginx to serve static assets, so it needs this specific x_sendfile_header setting to serve the Sidekiq assets correctly.

How do I push a job to Sidekiq without Ruby?

If you are integrating two different systems, you might want to kick off a job without the benefit of the Sidekiq::Client API. The Sidekiq message format is quite simple and stable: it's just a Hash in JSON format. Here's the bare-bones way to do it in Ruby; you can translate it to the language of your choice. A unique identifier must be generated and provided for the Job ID.

require 'securerandom'
require 'json'
require 'redis'

redis = Redis.new(:url => 'redis://hostname:port/db')
msg = { "class" => 'MyWorker',
        "queue" => 'default',
        "args" => [1, 2, 3],
        'retry' => true,
        'jid' => SecureRandom.hex(12),  # unique Job ID
        'created_at' => Time.now.to_f,
        'enqueued_at' => Time.now.to_f }
redis.lpush("queue:default", JSON.dump(msg))

To get the queue to show up under the "Queues" tab and the "Enqueued" count to be correct, you must also add the queue name to the queues set:

redis.sadd("queues", "default")

This is the simplest setup. Adding support for namespaces or ActiveJob is more complicated.

How can I tell when a job has finished?

You have two options:

  1. Use a 3rd-party plugin to track the status of a given job; see the Related Projects page. If you want to surface the status of a job in your app's UI, have your page poll the server every few seconds, and then leverage the functionality of one of these plugins to send the status back via JSON.

  2. Sidekiq Pro's Batches feature can fire a callback in your code when one or more jobs has completed.
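For example, a rough sketch of the Batches approach, assuming Sidekiq Pro is installed (NotifyUser, SomeJob and the 'uid' option are illustrative names):

class NotifyUser
  # Called by Sidekiq Pro once every job in the batch has succeeded.
  def on_success(status, options)
    # e.g. notify the user identified by options['uid']
  end
end

batch = Sidekiq::Batch.new
batch.on(:success, NotifyUser, 'uid' => 123)
batch.jobs do
  SomeJob.perform_async(1)
  SomeJob.perform_async(2)
end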

What happens to long-running jobs when Sidekiq restarts?

By default, Sidekiq 6+ gives workers 25 seconds to shut down. (Run Sidekiq <6 with -t 25.) This value is carefully chosen because Heroku and AWS ECS give a process 30 seconds to shut down before killing it. After 25 seconds, any jobs still in progress are pushed back onto Redis so they can be restarted immediately when Sidekiq starts back up. Remember that Sidekiq will run your jobs AT LEAST once, so they need to be idempotent; this is one example of how a job can run twice.
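For example, a minimal sketch of an idempotent job (ChargeOrderJob, the Order model and its charged? flag are illustrative):

class ChargeOrderJob
  include Sidekiq::Job

  def perform(order_id)
    order = Order.find(order_id)
    # Guard so a re-run after a restart does not charge the customer twice.
    return if order.charged?

    order.charge!
  end
end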

Sidekiq Enterprise supports rolling restarts where Sidekiq will wait as long as necessary for jobs to finish, hours if necessary.

How do I ensure a job processes on a given machine?

Example: you have job which processes a large uploaded file on the filesystem but you have Sidekiq running on several machines. How do you ensure that a file uploaded to "app-1.example.com" is processed by a Sidekiq on "app-1.example.com"?

Easy, use hostname-specific queues. Start up Sidekiq with -q `hostname` -q default so Sidekiq will listen to a queue for the current hostname.

Alternatively, tell each Sidekiq process to listen to a queue named after the machine's hostname via config/sidekiq.yml:

---
:verbose: false
:concurrency: 25
:queues:
  - <%= `hostname`.strip %>

Sidekiq runs the YAML file through ERB automatically so you can easily add the queue dynamically.

In your worker, configure jobs to go to the hostname-specific queue: sidekiq_options :queue => Socket.gethostname. Just make sure that hostname returns the same value as Socket.gethostname.
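For example, a minimal sketch (ProcessUploadJob is an illustrative name):

require 'socket'

class ProcessUploadJob
  include Sidekiq::Job
  # Push jobs onto the queue named after the machine where they are enqueued.
  sidekiq_options queue: Socket.gethostname

  def perform(path)
    # process the file that was uploaded to this machine
  end
end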

How do I cancel a Sidekiq job?

Sidekiq does not provide this functionality; it's safer and better for the application to do it. You should implement something like this:

class MyJob
  include Sidekiq::Job

  def perform(args)
    return if cancelled?
    # do stuff
  end

  def cancelled?
    Sidekiq.redis { |c| c.exists("cancelled-#{jid}") == 1 } # Use c.exists? on Redis >= 4.2.0
  end

  def self.cancel!(jid)
    Sidekiq.redis { |c| c.set("cancelled-#{jid}", 1, ex: 86_400) }
  end
end
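Usage, assuming the MyJob class above:

jid = MyJob.perform_async(123)
# later, from anywhere in the app:
MyJob.cancel!(jid)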

How do I safely rename a Worker?

If you have jobs in Redis it is not safe to rename a Worker since the name is serialized into the job payload. Here's one weird trick they don't want you to know to rename Workers safely:

class MyNewWorker
  ...
end
# XXX Delete this alias in a few weeks when old jobs are safely gone
MyOldWorker = MyNewWorker

It's that easy!

How do I calculate the number of Redis connections used by Sidekiq?

web_connections = (web_dynos * (client_conn * web_threads))
concurrency  = (max_connections - web_connections - (redis_reserved * worker_dynos)) / worker_dynos
server_connections  = concurrency + redis_reserved
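For instance, a worked example in Ruby with hypothetical numbers (2 web processes each running 5 threads with 1 Redis client connection per thread, 2 Sidekiq processes, a 200-connection Redis plan, and 10 connections reserved per Sidekiq process for its internal use; the variable meanings here are my reading of the formula above):

# hypothetical numbers for illustration only
web_dynos       = 2    # web processes
web_threads     = 5    # threads per web process
client_conn     = 1    # Redis client connections per web thread
worker_dynos    = 2    # Sidekiq processes
max_connections = 200  # connection limit of the Redis plan
redis_reserved  = 10   # connections reserved per Sidekiq process

web_connections    = web_dynos * (client_conn * web_threads)       # => 10
concurrency        = (max_connections - web_connections -
                      (redis_reserved * worker_dynos)) / worker_dynos # => 85
server_connections = concurrency + redis_reserved                  # => 95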

Source, tool, additional resource

Previous: Signals Next: Testing
