Keep the Single Thread Unblocked: Analyzing and Handling Redis Latency Problems

By Kimberly Perry, 2015-09-05 00:28

    Redis processes its event loop in a single thread. As a single-threaded program it must keep the latency of each event handler short, so that the subsequent tasks in the event loop are not blocked.

    Once the amount of data in Redis reaches a certain level (for example, 20 GB), blocking operations hurt performance especially badly.

    Below we survey the time-consuming scenarios in Redis and how to deal with each of them.

    Long-running commands cause blocking

    Commands such as KEYS and SORT

    The KEYS command finds all keys matching a given pattern; its time complexity is O(N), where N is the number of keys in the database. When the database holds keys on the order of tens of millions, this one command can block the read/write thread for several seconds.

    Commands such as SUNION and SORT have similar costs.

    What should we do if the business really must use operations like KEYS and SORT?

    The solution:

    Architecture design has the notion of "diverting traffic": separate the requests that are fast to handle from the ones that are slow, so that the slow do not drag down the fast. This idea is very visible in the design of Redis itself: the fast parts, pure in-memory operations and epoll-based non-blocking I/O event handling, run in a single thread, while the time-consuming work such as persistence, AOF rewriting, and master-slave synchronization is forked off into a separate process, so the slow does not hold back the fast.

    By the same token, since we do need these time-consuming operations, we divert them as well: stand up a dedicated Redis slave node just for KEYS, SORT, and other expensive commands. These queries are generally not part of online real-time traffic; they may run slowly, because the point is only that they finish, and the fast, latency-sensitive online requests are no longer affected.
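    This diversion can be set up with ordinary replication. A minimal sketch of the dedicated node's redis.conf, assuming a hypothetical master at 10.0.0.1:6379:

```
# redis.conf on the dedicated slave reserved for KEYS/SORT-style queries
# (10.0.0.1:6379 is a placeholder master address)
slaveof 10.0.0.1 6379
```

    Expensive commands are then pointed at this slave through the client's connection settings, keeping them off the online master.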

    The SMEMBERS command

    The SMEMBERS command returns a set in its entirety; its time complexity is O(N), where N is the number of elements in the set.

    If a set stores elements on the order of millions, a single call can block the event-processing thread for a long time.

    The solution:

    Unlike KEYS and SORT, SMEMBERS is likely to appear in online real-time scenarios and to be called very frequently, so the diversion trick does not fit here; instead we have to handle it at the design level, by controlling the size of each set: as a rule of thumb, keep a set within about 500 elements.

    For example, where one key used to store a whole year of records and the data volume was large, we can use 12 keys holding one month of records each, or 365 keys holding one day each, keeping the size of every set within an acceptable range.
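    The split above can be sketched as a key-naming helper (the "records:" prefix and the layout are made-up conventions for illustration):

```python
from datetime import date

def monthly_key(d: date) -> str:
    # One set per month instead of one per year keeps each
    # SMEMBERS call bounded ("records:" is a hypothetical prefix).
    return f"records:{d.year}:{d.month:02d}"

def daily_key(d: date) -> str:
    # Finer split: one set per day (365 keys a year).
    return f"records:{d.year}:{d.month:02d}:{d.day:02d}"

print(monthly_key(date(2015, 9, 5)))  # records:2015:09
print(daily_key(date(2015, 9, 5)))    # records:2015:09:05
```

    Application code then runs SADD/SMEMBERS against these smaller keys and unions results on the client side when a wider range is needed.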

    If it is hard to split the data into several sub-sets and one large set must be kept, consider reading it with SRANDMEMBER key [count], which returns the specified number of random elements from the set. Of course, if you need to traverse all elements of the set, this command will not do.

    The SAVE command

    The SAVE command has the event-handling thread itself perform the persistence. With a large data set this blocks the thread for a long time (in our production environment, saving 1 GB of Redis memory takes around 12 s), during which Redis is completely blocked.

    While SAVE blocks the event-processing thread, we cannot even use redis-cli to inspect the current state of the system, so information such as "how much has been saved, when will it finish" is impossible to obtain.

    The solution:

    I can think of no reason to use the SAVE command; whenever persistence is needed, BGSAVE is the sensible choice (that command brings problems of its own, of course, discussed below).

    Blocking caused by fork

    When Redis needs to perform a time-consuming operation it creates a new process to do it, as with BGSAVE for persistence:

    With RDB persistence enabled, when the persistence threshold is reached Redis forks a new process to do the persisting, relying on the operating system's copy-on-write strategy: the child shares memory pages with the parent, and whenever the parent modifies a page (4 KB per page), it makes a private copy of that page, leaving the child process unaffected.

    Although the forked process shares the data pages and need not copy them, it does have to duplicate the parent's page tables beforehand. For a 40 GB memory space with 8-byte page-table entries, the page tables amount to about 80 MB, and copying them takes time; on a virtual machine, especially a Xen virtual server, it takes even longer.

    On our test server nodes, with 35 GB of data the fork performed by BGSAVE blocks for more than 200 ms.

    The following operations likewise involve a fork:

    - First-time master-to-slave synchronization: when the master node receives a slave node's sync request, it forks a new process, dumps the in-memory data to a file, and then transfers it to the slave node;

    - AOF log rewrite: under AOF persistence, an AOF rewrite creates a new process to do the rewriting (the rewrite does not read the existing file; it writes the log afresh straight from the data in memory);

    The solution:

    Some measures to soften the impact of copying large page tables:

    1. Cap the maximum memory of each Redis instance.

    Do not let fork become too costly; capping the memory caps the latency a fork can introduce.

    The general advice is to stay below 20 GB; pick the value according to your own server's performance (the more memory, the longer persistence and page-table copying take, and the longer the event loop is blocked).

    Sina Weibo's advice is likewise no more than 20 GB; in our tests, keeping latency spikes unnoticeable on a virtual machine may require staying under 10 GB.

    2. Use huge pages. With the default 4 KB pages, 40 GB of memory needs roughly 80 MB of page tables; enlarging each page to 4 MB shrinks the page tables to roughly 80 KB. Copying the page tables then causes almost no blocking, and the hit rate of the TLB (translation lookaside buffer) improves as well. But huge pages have their own problem under copy-on-write: as soon as a single element in a page is modified, the whole page must be copied (the granularity of the COW mechanism is one page), so much more memory can be consumed during the copy-on-write phase.
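    The 80 MB and 80 KB figures come from a simple computation (assuming 8-byte page-table entries, as above):

```python
def page_table_bytes(mem_bytes: int, page_size: int, pte_size: int = 8) -> int:
    # Size of the page tables: number of pages times one entry's size.
    return mem_bytes // page_size * pte_size

GiB = 1024 ** 3
MiB = 1024 ** 2

print(page_table_bytes(40 * GiB, 4 * 1024) // MiB)  # 80 (MB, with 4 KB pages)
print(page_table_bytes(40 * GiB, 4 * MiB) // 1024)  # 80 (KB, with 4 MB pages)
```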

    3. Use physical machines.

    If the option exists, a physical machine is of course the best solution, although less convenient.

    There are, of course, many virtualization implementations; apart from the Xen family, most modern ones can copy page tables quickly.

    But a company's virtualization platform is usually a settled matter and will not be changed for one server, so when all we face is Xen, all we can do is think about how to use it well.

    4. Stop new processes from being created at all: use no persistence on the node, and serve no queries from the primary node. Possible schemes:

    1) Use a single node with persistence off and no slave attached. Simplest of all; no new process is ever created. But the scheme only suits a cache.

    How, then, do we make such a deployment highly available?

    For high availability, put a message queue in front of the Redis writes and use the queue's pub-sub to fan each write out, guaranteeing that every write lands on at least two nodes. Since all nodes then hold the same data, only one node needs persistence enabled, and that node serves no external queries.

    2) Master-slave: enable persistence on the master node but serve no queries from it; queries are served by the slave nodes, which run without persistence. This way all the fork-heavy work happens on the master, while query requests are handled by the slaves.

    The problem with this scheme: what do we do once the master node fails?

    The simplest answer is to do nothing: after the master dies, the Redis cluster serves reads only and accepts no updates; once the master is restarted, updates resume. Write requests issued during the outage can be buffered in a message queue, to be digested by the master after the fault is over.

    If instead the official Sentinel is used to promote a slave to master, the overall implementation is rather complex: the promoted node's IP must be removed from the query-node configuration so the front-end query load no longer falls on the new master; only then is Sentinel allowed to perform the switch; and the master-slave relationships before and after must be kept consistent.

    Blocking caused by persistence

    Performing persistence (AOF / RDB snapshot) has a large impact on system performance, especially when other disk read/write activity runs on the same server node (for example, an application service deployed on the same node as the Redis service, writing its logs in real time). Avoid running Redis persistence on nodes that already carry heavy I/O.

    While a child process persists, the child's writes conflict with the main process's fsync and cause blocking

    On nodes with AOF persistence enabled, while a child process performs an AOF rewrite or RDB persistence, Redis queries can stall or even block for a long time; during this period Redis cannot serve any reads or writes.

    Cause analysis:

    The Redis service is configured with appendfsync everysec, so the main process calls fsync() every second, asking the kernel to "really" write the data to the storage hardware. But because the server is doing a large amount of I/O, the main process's fsync() is blocked, which in the end blocks the Redis main process.

    redis.conf puts it this way:

    When the AOF fsync policy is set to always or everysec, and a background saving process (a background save or AOF log background rewriting) is performing a lot of I/O against the disk, in some Linux configurations Redis may block too long on the fsync() call. Note that there is no fix for this currently, as even performing fsync in a different thread will block our synchronous write(2) call.

    An AOF rewrite generates a lot of I/O, which in some Linux configurations will cause the main process's fsync to block.

    The solution:

    Set no-appendfsync-on-rewrite yes: while a child process performs the AOF rewrite, the main process skips its fsync() calls. Note that even without fsync(), the kernel still writes the data to disk at moments of its own choosing (on Linux, by default after no more than 30 seconds).

    The downside of this setting: if a fault strikes at the wrong moment, up to 30-odd seconds of data may be lost, instead of at most 1 second.
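    The relevant redis.conf lines for this trade-off look roughly like this (a sketch; everysec is the policy assumed above):

```
appendfsync everysec
# While a child is saving or rewriting the AOF, skip fsync() in the
# main process; up to ~30 s of writes may be lost on a crash.
no-appendfsync-on-rewrite yes
```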

    During the child's AOF rewrite, the system's sync blocks the main process's write

    Let us lay out the chain of events:

    The causes: 1) there is a great deal of I/O: write(2) is called heavily while nobody explicitly calls a sync operation;

    2) as a result, a large amount of dirty data accumulates in the kernel buffers;

    3) the system triggers a sync, and with so much to flush the sync takes too long; 4) this blocks Redis's write(2) call for the AOF log;

    5) because Redis handles every event in a single thread, all of Redis blocks (Redis's event handling is done in one thread, and the write(2) for the AOF log is a blocking synchronous call, unlike the non-blocking write(2) used for the network).

    About cause 1): this was an issue before Redis 2.6.12; the AOF rewrite simply kept calling write(2) and left it to the system to trigger the sync.

    Another possible cause: the system's I/O is busy, for example because another application is writing to the disk.

    The solution:

    Control when the system calls sync: the more data there is to synchronize, the longer it takes, so reduce and bound the amount of data flushed at a time. The sync threshold can be configured either as a ratio (vm.dirty_background_ratio) or as a byte count (vm.dirty_bytes); a common setting is to sync once every 32 MB.
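    On Linux the threshold can be set through sysctl; a sketch using the 32 MB figure above (tune the value to your own workload):

```
# /etc/sysctl.conf
vm.dirty_bytes = 33554432   # flush once ~32 MB of dirty pages accumulate
# or, proportionally to total memory:
# vm.dirty_background_ratio = 5
```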

    Since 2.6.12, the AOF rewrite itself calls fdatasync proactively after every 32 MB written.

    Moreover, when Redis finds that the file it is writing is currently being fdatasync(2)'d, it skips the write(2) and keeps the data only in its buffer, to avoid being blocked. But if this situation lasts for more than two seconds, it performs the write(2) anyway, even though Redis will block.

    Blocking while merging buffered data after an AOF rewrite completes

    During bgrewriteaof, every newly arriving write request is still appended to the old AOF file and, at the same time, kept in the AOF rewrite buffer. When the rewrite completes, the main thread merges this buffered content into the temporary file, which is then renamed to become the new AOF file. That is why Redis keeps printing lines like "Background AOF buffer size: 80 MB", "Background AOF buffer size: 180 MB" during a rewrite; this part can be monitored through the log. The merge itself is blocking: if a 280 MB buffer has accumulated, then at a traditional hard disk's 100 MB/s Redis will block for 2.8 seconds.
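    The 2.8-second figure is just buffer size divided by disk throughput; a quick check:

```python
def merge_block_seconds(buffer_mb: float, disk_mb_per_s: float) -> float:
    # The rewrite-buffer merge runs on the main thread, so Redis is
    # blocked for roughly the time needed to flush the buffer to disk.
    return buffer_mb / disk_mb_per_s

print(merge_block_seconds(280, 100))  # 2.8
```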

    The solution:

    Make the disk large enough and raise the AOF rewrite threshold so that no rewrite is triggered during peak hours; use crontab to call the AOF rewrite command when the system is idle.
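    The crontab approach might look like this (a sketch; the 04:30 schedule and address are placeholders):

```
# Trigger an AOF rewrite daily at 04:30, outside peak hours
30 4 * * * redis-cli -h 127.0.0.1 -p 6379 bgrewriteaof
```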
