
How GitLab uses Unicorn and unicorn-worker-killer

GitLab uses Unicorn, a pre-forking Ruby web server, to handle web requests from web browsers and Git HTTP clients. Unicorn is a daemon written in Ruby and C that can load and run a Ruby on Rails application; in our case the Rails application is GitLab Community Edition or GitLab Enterprise Edition.

Unicorn has a multi-process architecture, which makes better use of available CPU cores (processes can run on different cores) and gives stronger fault tolerance (most failures stay isolated in a single process and cannot take down GitLab entirely). On startup, the Unicorn 'master' process loads a clean Ruby environment with the GitLab application code, and then spawns 'workers' which inherit this clean initial environment. The master never handles any requests; that is left to the workers. The operating system network stack queues incoming requests and distributes them among the workers.
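The pre-fork idea can be sketched in a few lines of Ruby. This is only an illustration under simplifying assumptions, not Unicorn's implementation: the helper name spawn_workers and the app hash are made up for this example, and each worker merely reports back over a pipe instead of serving requests. It needs a platform that supports fork.

```ruby
# Simplified pre-fork sketch. This is not Unicorn's code; spawn_workers and
# the app hash are hypothetical names used only for illustration.
def spawn_workers(count, app)
  children = Array.new(count) do |nr|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      # The worker sees `app` without loading it again: the object was
      # created in the master before fork and inherited by the child.
      writer.puts "#{nr} #{Process.pid} #{app[:loaded_by]}"
      writer.close
      exit! 0 # leave immediately, skipping at_exit hooks inherited from the master
    end
    writer.close # the master keeps only the read end
    [pid, reader]
  end

  # The master does no request work itself; it only collects the reports
  # and reaps the workers.
  children.map do |pid, reader|
    nr, worker_pid, loaded_by = reader.read.split.map(&:to_i)
    reader.close
    Process.wait(pid)
    { nr: nr, pid: worker_pid, app_loaded_by: loaded_by }
  end
end

app = { name: "demo", loaded_by: Process.pid } # "loaded" once, in the master
spawn_workers(3, app).each do |w|
  puts "worker #{w[:nr]} (PID #{w[:pid]}) inherited app loaded by PID #{w[:app_loaded_by]}"
end
```

Every worker reports the master's PID as the process that loaded the application, which is the point of pre-forking: the expensive load happens once and the workers inherit the result.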

In a perfect world, the master would spawn its pool of workers once, and the workers would then handle incoming web requests one after another until the end of time. In reality, worker processes can crash or time out: if the master notices that a worker is taking too long to handle a request, it terminates the worker process with SIGKILL ('kill -9'). No matter how the worker process ended, the master process replaces it with a new 'clean' process. Unicorn is designed to be able to replace 'crashed' workers without dropping user requests.
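A minimal Ruby sketch of this supervision loop follows. It is an assumption-laden illustration rather than Unicorn's code: the helper reap_or_kill is hypothetical, the "hung request" is just a sleeping child process, and the master here polls a single worker instead of tracking a whole pool.

```ruby
# Simplified sketch of a master enforcing a request timeout. Not Unicorn's
# code; reap_or_kill is a hypothetical helper.
def reap_or_kill(pid, timeout_s)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout_s
  loop do
    reaped, status = Process.wait2(pid, Process::WNOHANG)
    return status if reaped # the worker ended on its own

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      Process.kill("KILL", pid)      # same signal as `kill -9`
      return Process.wait2(pid).last # reap it so no zombie is left
    end
    sleep 0.01
  end
end

stuck_worker = fork { sleep 30; exit! 0 } # stands in for a hung request
status = reap_or_kill(stuck_worker, 0.2)
if status.signaled?
  puts "worker #{status.pid} killed by signal #{status.termsig}, spawning a replacement"
  replacement = fork { exit! 0 } # a fresh worker takes the old one's place
  Process.wait(replacement)
end
```

Because SIGKILL cannot be caught, the stuck worker gets no chance to finish or clean up; the master only learns the outcome from the exit status and then forks a replacement.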

This is what a Unicorn worker timeout looks like in unicorn_stderr.log. The master process has PID 56227 below.

[Image: unicorn_stderr.log excerpt showing a worker timeout]

 


 

GitLab has memory leaks. These memory leaks manifest themselves in long-running processes, such as Unicorn workers. (The Unicorn master process is not known to leak memory, probably because it does not handle user requests.)

To make these memory leaks manageable, GitLab comes with the unicorn-worker-killer gem. This gem monkey-patches the Unicorn workers to do a memory self-check after every 16 requests. If the memory of the Unicorn worker exceeds a pre-set limit then the worker process exits. The Unicorn master then automatically replaces the worker process.

This is a robust way to handle memory leaks: Unicorn is designed to handle workers that 'crash', so no user requests will be dropped. The unicorn-worker-killer gem is designed to terminate a worker process only in between requests, so no user requests are affected.
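The self-check described above can be sketched as a small counter that runs after each completed request. The class below is hypothetical and is not the gem's API: MemoryGuard, exit_after_request? and the injected memory_reader are names invented for this example, and a real worker would read its own resident memory from the operating system rather than from a lambda.

```ruby
# Hypothetical sketch of a periodic memory self-check; the class and method
# names are illustrative and are not taken from unicorn-worker-killer.
class MemoryGuard
  def initialize(limit_bytes:, memory_reader:, check_cycle: 16)
    @limit_bytes = limit_bytes
    @memory_reader = memory_reader # callable returning current memory use in bytes
    @check_cycle = check_cycle
    @request_count = 0
  end

  # Call once a request has been fully answered. Returns true when the
  # worker should exit so that the master can replace it.
  def exit_after_request?
    @request_count += 1
    # Measure memory only on every check_cycle-th request.
    return false unless (@request_count % @check_cycle).zero?

    @memory_reader.call > @limit_bytes
  end
end

# Simulated worker that leaks 10 MiB per request, starting at 100 MiB.
rss = 100 * 1024**2
guard = MemoryGuard.new(limit_bytes: 250 * 1024**2, memory_reader: -> { rss })
handled = 0
loop do
  rss += 10 * 1024**2 # "handle" one request and leak some memory
  handled += 1
  break if guard.exit_after_request?
end
puts "worker exits after #{handled} requests"
```

Checking only on every 16th request keeps the overhead low, at the cost that a worker can run past its limit for a few requests before it notices. Since the check happens after a response is complete, exiting at that point never interrupts a request in flight.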

This is what a Unicorn worker memory restart looks like in unicorn_stderr.log. You see that worker 4 (PID 125918) inspects itself and decides to exit. The threshold memory value was 254802235 bytes, about 250 MB. With GitLab this threshold is a random value between 200 and 250 MB. The master process (PID 117565) then reaps the worker process and spawns a new 'worker 4' with PID 127549.


[Image: unicorn_stderr.log excerpt showing a worker memory restart]
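The random threshold mentioned above can be illustrated with a short Ruby sketch. The helpers random_memory_limit and mib are hypothetical, and two assumptions are made: the value is drawn uniformly, and "MB" means 1024**2 bytes, which is the reading under which 254802235 bytes falls between 200 and 250. A per-worker random limit presumably keeps the workers from all reaching their threshold, and restarting, at the same moment.

```ruby
# Hypothetical helpers for illustration; not the gem's actual code.

# Convert mebibytes to bytes (assumes "MB" in the text means 1024**2 bytes).
def mib(n)
  n * 1024**2
end

# Pick a memory limit uniformly at random between min_bytes and max_bytes,
# both inclusive. A seeded Random can be passed in for reproducible runs.
def random_memory_limit(min_bytes, max_bytes, rng: Random.new)
  min_bytes + rng.rand(max_bytes - min_bytes + 1)
end

limit = random_memory_limit(mib(200), mib(250))
puts "this worker would restart above #{limit} bytes"

# The threshold from the log excerpt lies inside that range.
puts 254_802_235.between?(mib(200), mib(250))
```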

 
