From the Container memory monitoring restrictions to the CPU usage restrictions

By Melvin Webb, 2015-09-13 16:35


    Recently, while operating our department's Hadoop cluster, we noticed many Jobs hitting OOM, and checking on the affected machines showed that full GC was quite serious. We all know the consequences of a full GC are significant: it "stops the world", and once a full GC takes more than a few minutes, all other activity is suspended for that long. So whenever full GCs appear abnormally often, we must find the root cause and solve it. This article discusses how we solved this kind of problem and what further optimizations we made on that basis.

    Where the Full GC comes from

    When OOM happens and frequent full GCs follow, the first question is what causes the full GC. The first association is the running Job: at runtime a Job manifests as tasks, each task as a TaskAttempt, and each TaskAttempt runs inside a container allocated to the application. Every container is a separate process; run the jps command on a DataNode machine and you will see many processes named "YarnChild": those are the containers. Tracing back to the source, one would naturally guess that the container's JVM memory is simply configured too small, causing the OOM, and that configuring a larger memory would solve the problem. In fact the problem is not that simple; the waters here are a little murkier.

    Why the Full GC happens

    I. The first and most common reason for full GC is that the memory is configured too small, through the following two items:

public static final String MAP_MEMORY_MB = "mapreduce.map.memory.mb";

public static final String REDUCE_MEMORY_MB = "mapreduce.reduce.memory.mb";

    The default is 1024 MB, i.e. 1 GB.
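For instance, these limits can be raised in mapred-site.xml (or per job); the 2048 MB values below are illustrative only:

```xml
<!-- Illustrative values only: raise the per-task memory limit to 2 GB -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>
</property>
```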

    II. The other reason is one most people probably have not considered unless they really ran into it: incorrect configuration. The two items above do not actually configure the JVM that the container launches; they are the upper limit on each Task's memory. There is a premise, though: your JVM must actually be allowed to use that much memory. If your JVM's maximum heap ceiling is only 512 MB, then however large you set the Task memory, the heap fills up anyway, and the direct result is a full GC. The two items above are closer to theoretical values. The configuration items that really control the JVM are these two:

public static final String MAP_JAVA_OPTS = "mapreduce.map.java.opts";

public static final String REDUCE_JAVA_OPTS = "mapreduce.reduce.java.opts";

    So the ideal configuration keeps the heap size given in the java.opts value at least as large as the corresponding memory.mb value; an inappropriate setting here can likewise cause frequent full GC.
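To illustrate the relationship between the two kinds of settings, here is a small standalone sketch (not Hadoop code; parseXmxMb and the sample values are our own assumptions) that flags a heap smaller than the task memory limit:

```java
// Standalone sketch: compare the -Xmx heap in a java.opts string against
// memory.mb. parseXmxMb and the sample values are illustrative assumptions.
public class JavaOptsCheck {
    // Extract the -Xmx value from a java.opts string and return it in MB.
    static int parseXmxMb(String javaOpts) {
        for (String tok : javaOpts.split("\\s+")) {
            if (tok.startsWith("-Xmx")) {
                String v = tok.substring(4).toLowerCase();
                if (v.endsWith("g")) {
                    return Integer.parseInt(v.substring(0, v.length() - 1)) * 1024;
                }
                if (v.endsWith("m")) {
                    return Integer.parseInt(v.substring(0, v.length() - 1));
                }
            }
        }
        return -1; // no -Xmx found
    }

    public static void main(String[] args) {
        int memoryMb = 1024;                             // mapreduce.map.memory.mb
        String javaOpts = "-Xmx512m -XX:+UseParallelGC"; // mapreduce.map.java.opts
        int xmxMb = parseXmxMb(javaOpts);
        System.out.println("heap=" + xmxMb + "MB, task limit=" + memoryMb + "MB");
        if (xmxMb >= 0 && xmxMb < memoryMb) {
            // The heap can fill up long before the task limit is reached:
            // the likely symptom is frequent full GC, as described above.
            System.out.println("WARNING: -Xmx is smaller than memory.mb");
        }
    }
}
```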

    The Container memory monitoring

    Fortunately, for the second problem listed above, Hadoop itself has container-level monitoring. For every container it starts, it opens an additional container-monitor thread that watches each container's pmem (physical memory) and vmem (virtual memory). The relevant configurations are as follows:

String org.apache.hadoop.yarn.conf.YarnConfiguration.NM_PMEM_CHECK_ENABLED = "yarn.nodemanager.pmem-check-enabled";

String org.apache.hadoop.yarn.conf.YarnConfiguration.NM_VMEM_CHECK_ENABLED = "yarn.nodemanager.vmem-check-enabled";

    Both are enabled by default. Memory monitoring means that once the memory used by a container exceeds its configured ceiling, the monitor can kill that container. Let us briefly analyze this at the source level; it is actually not difficult. First, open the ContainersMonitorImpl.java class.
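These switches live in yarn-site.xml. As an illustrative snippet (not a recommendation for every cluster), one could keep the physical-memory check while disabling the virtual-memory one:

```xml
<!-- Illustrative: keep the pmem check, disable the vmem check -->
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```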


 protected void serviceInit(Configuration conf) throws Exception {
     this.monitoringInterval =
         conf.getLong(YarnConfiguration.NM_CONTAINER_MON_INTERVAL_MS,
             YarnConfiguration.DEFAULT_NM_CONTAINER_MON_INTERVAL_MS);
     ...
     pmemCheckEnabled = conf.getBoolean(YarnConfiguration.NM_PMEM_CHECK_ENABLED,
         YarnConfiguration.DEFAULT_NM_PMEM_CHECK_ENABLED);
     vmemCheckEnabled = conf.getBoolean(YarnConfiguration.NM_VMEM_CHECK_ENABLED,
         YarnConfiguration.DEFAULT_NM_VMEM_CHECK_ENABLED);
     LOG.info("Physical memory check enabled: " + pmemCheckEnabled);
     LOG.info("Virtual memory check enabled: " + vmemCheckEnabled);
     ...
 }


    In the serviceInit method, it reads from the configuration whether memory monitoring should be enabled, and logs that information. Then we go straight into the monitoring thread class, MonitoringThread.


 private class MonitoringThread extends Thread {
     public MonitoringThread() {
         super("Container Monitor");
     }

     @Override
     public void run() {
         while (true) {
             // Print the processTrees for debugging.
             if (LOG.isDebugEnabled()) {
                 StringBuilder tmp = new StringBuilder("[ ");
                 for (ProcessTreeInfo p : trackingContainers.values()) {
                     tmp.append(p.getPID());
                     tmp.append(" ");
                 }
                 ...
    In the monitoring thread's run method, it iterates over all the containers being tracked and checks each of them:


 public void run() {
     while (true) {
         ...
         // Now do the monitoring for the trackingContainers
         // Check memory usage and kill any overflowing containers
         long vmemStillInUsage = 0;
         long pmemStillInUsage = 0;
         for (Iterator<Map.Entry<ContainerId, ProcessTreeInfo>> it =
             trackingContainers.entrySet().iterator(); it.hasNext();) {
             Map.Entry<ContainerId, ProcessTreeInfo> entry = it.next();
             ContainerId containerId = entry.getKey();
             ProcessTreeInfo ptInfo = entry.getValue();
             try {
                 String pId = ptInfo.getPID();


    Let us take the physical-memory monitoring as an example. First, it obtains the process's resource information from the process tree (pTree), such as memory and CPU usage:

 LOG.debug("Constructing ProcessTree for : PID = " + pId
     + " ContainerId = " + containerId);
 ResourceCalculatorProcessTree pTree = ptInfo.getProcessTree();
 pTree.updateProcessTree(); // update process-tree
 long currentVmemUsage = pTree.getVirtualMemorySize();
 long currentPmemUsage = pTree.getRssMemorySize();
 // if machine has 6 cores and 3 are used,
 // cpuUsagePercentPerCore should be 300% and
 // cpuUsageTotalCoresPercentage should be 50%
 float cpuUsagePercentPerCore = pTree.getCpuUsagePercent();
 float cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore /
     resourceCalculatorPlugin.getNumProcessors();
 // Multiply by 1000 to avoid losing data when converting to int
 int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000
     * maxVCoresAllottedForContainers / nodeCpuPercentageForYARN);
 // as processes begin with an age 1, we want to see if there
 // are processes more than 1 iteration old.
 long curMemUsageOfAgedProcesses = pTree.getVirtualMemorySize(1);
 long curRssMemUsageOfAgedProcesses = pTree.getRssMemorySize(1);

    Then it gets the memory usage limits:

 long vmemLimit = ptInfo.getVmemLimit();
 long pmemLimit = ptInfo.getPmemLimit();
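The CPU accounting above can be reproduced with a small standalone sketch; the sample numbers below (6 cores, 3 busy, NodeManager settings) are assumptions for illustration, not Hadoop code:

```java
// Standalone sketch of the monitor's CPU accounting: per-core percentage,
// total-cores percentage, and the derived milli-vcores value.
public class CpuUsageSketch {
    public static void main(String[] args) {
        // e.g. the container's processes keep 3 of 6 cores fully busy
        float cpuUsagePercentPerCore = 300f;    // reported relative to one core
        int numProcessors = 6;
        int maxVCoresAllottedForContainers = 6; // assumed NM setting
        int nodeCpuPercentageForYARN = 100;     // assumed: YARN may use all CPU

        // 300% of one core over 6 cores = 50% of the whole machine
        float cpuUsageTotalCoresPercentage =
            cpuUsagePercentPerCore / numProcessors;
        // Multiply by 1000 to avoid losing data when converting to int:
        // 50% of 6 allotted vcores = 3 vcores = 3000 milli-vcores
        int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000
            * maxVCoresAllottedForContainers / nodeCpuPercentageForYARN);

        System.out.println("totalCoresPercentage=" + cpuUsageTotalCoresPercentage);
        System.out.println("milliVcoresUsed=" + milliVcoresUsed);
    }
}
```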

    These vmemLimit and pmemLimit values are not read from the pTree itself; they are passed in from outside when the container is registered for monitoring at start time, and they derive from the container's memory configuration (the memory.mb values above):

 ContainerId containerId = monitoringEvent.getContainerId();
 switch (monitoringEvent.getType()) {
 case START_MONITORING_CONTAINER:
     ContainerStartMonitoringEvent startEvent =
         (ContainerStartMonitoringEvent) monitoringEvent;
     synchronized (this.containersToBeAdded) {
         ProcessTreeInfo processTreeInfo =
             new ProcessTreeInfo(containerId, null, null,
                 startEvent.getVmemLimit(), startEvent.getPmemLimit(),
                 ...);
         this.containersToBeAdded.put(containerId, processTreeInfo);
     }
     break;

    Then comes the core logic of the memory monitoring:


 } else if (isPmemCheckEnabled()
     && isProcessTreeOverLimit(containerId.toString(),
         currentPmemUsage, curRssMemUsageOfAgedProcesses,
         pmemLimit)) {
     // Container (the root process) is still alive and overflowing
     // memory.
     // Dump the process-tree and then clean it up.
     msg = formatErrorMessage("physical",
         currentVmemUsage, vmemLimit,
         currentPmemUsage, pmemLimit,
         pId, containerId, pTree);
     isMemoryOverLimit = true;
     containerExitStatus = ContainerExitStatus.KILLED_EXCEEDED_PMEM;


    The current memory usage is compared against the limit; the isProcessTreeOverLimit call above lands in the following method.


 /**
  * Check whether a container's process tree's current memory usage is over
  * limit.
  *
  * When a java process exec's a program, it could momentarily account for
  * double the size of it's memory, because the JVM does a fork()+exec()
  * which at fork time creates a copy of the parent's memory. If the
  * monitoring thread detects the memory used by the container tree at the
  * same instance, it could assume it is over limit and kill the tree, for no
  * fault of the process itself.
  *
  * We counter this problem by employing a heuristic check: - if a process
  * tree exceeds the memory limit by more than twice, it is killed
  * immediately - if a process tree has processes older than the monitoring
  * interval exceeding the memory limit by even 1 time, it is killed. Else it
  * is given the benefit of doubt to lie around for one more iteration.
  *
  * @param containerId
  *          Container Id for the container tree
  * @param currentMemUsage
  *          Memory usage of a container tree
  * @param curMemUsageOfAgedProcesses
  *          Memory usage of processes older than an iteration in a container
  *          tree
  * @param vmemLimit
  *          The limit specified for the container
  * @return true if the memory usage is more than twice the specified limit,
  *         or if processes in the tree, older than this thread's monitoring
  *         interval, exceed the memory limit. False, otherwise.
  */
 boolean isProcessTreeOverLimit(String containerId,
     long currentMemUsage,
     long curMemUsageOfAgedProcesses,
     long vmemLimit) {
     boolean isOverLimit = false;
     if (currentMemUsage > (2 * vmemLimit)) {
         LOG.warn("Process tree for container: " + containerId
             + " running over twice " + "the configured limit. Limit=" + vmemLimit
             + ", current usage = " + currentMemUsage);
         isOverLimit = true;
     } else if (curMemUsageOfAgedProcesses > vmemLimit) {
         LOG.warn("Process tree for container: " + containerId
             + " has processes older than 1 "
             + "iteration running over the configured limit. Limit=" + vmemLimit
             + ", current usage = " + curMemUsageOfAgedProcesses);
         isOverLimit = true;
     }
     return isOverLimit;
 }


    So there are two conditions under which memory is judged to be over limit: either usage exceeds twice the limit (the reason being that a newly started JVM performs fork()+exec(), and the fork momentarily duplicates the parent's memory), or processes older than one monitoring iteration exceed the limit even once. The comments above spell this out in detail.
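The two-condition heuristic can be condensed into a standalone sketch; the byte values in main are made up for illustration:

```java
// Standalone sketch of the two-condition over-limit heuristic.
public class OverLimitSketch {
    // Same logic as isProcessTreeOverLimit: over 2x the limit, or aged
    // processes over 1x the limit, means "over limit".
    static boolean isProcessTreeOverLimit(long currentMemUsage,
                                          long curMemUsageOfAgedProcesses,
                                          long limit) {
        if (currentMemUsage > 2 * limit) {
            return true; // killed immediately
        }
        return curMemUsageOfAgedProcesses > limit; // aged processes over limit
    }

    public static void main(String[] args) {
        long limit = 1024L * 1024 * 1024; // a 1 GB limit
        // Freshly forked tree momentarily at 1.5x the limit: tolerated once.
        System.out.println(isProcessTreeOverLimit(limit * 3 / 2, 0, limit));
        // Over twice the limit: killed immediately.
        System.out.println(isProcessTreeOverLimit(limit * 5 / 2, 0, limit));
        // Aged processes alone exceed the limit: killed.
        System.out.println(isProcessTreeOverLimit(limit, limit + 1, limit));
    }
}
```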

    Finally, if the container's memory usage is found to exceed the memory limit, a container-kill event is sent, which triggers the subsequent kill of the container. On the same basis, we added a similar check that restricts CPU (vcores) usage:


 } else if (isVcoresCheckEnabled()
     && cpuUsageTotalCoresPercentage > vcoresLimitedRatio) {
     msg = String.format(
         "Container [pid=%s,containerID=%s] is running beyond %s vcores limits."
         + " Current usage: %s. Killing container.\n", pId,
         containerId, vcoresLimitedRatio, cpuUsageTotalCoresPercentage);
     isCpuVcoresOverLimit = true;
     containerExitStatus = ContainerExitStatus.KILLED_EXCEEDED_VCORES;
 }

 if (isMemoryOverLimit) {
     // Virtual or physical memory over limit. Fail the container and
     // remove the corresponding process tree
     ...
     // warn if not a leader
     if (!pTree.checkPidPgrpidForMatch()) {
         LOG.error("Killed container process with PID " + pId
             + " but it is not a process group leader.");
     }
