6/22/2018

YARN Job Scheduling and Task Execution

Hadoop 1.X Architecture



Hadoop 2.X Architecture


Explanation:
  
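    1. The job is submitted to the job client.

    2. The job client requests a new application ID from the Resource Manager.

    3. The job client checks that the output directory does not already exist, then copies the job resources (job JAR, configuration, and input split information) to HDFS.

    4. The job client submits the job to the Resource Manager.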

    5. The Resource Manager contacts a Node Manager to allocate a container and launch an Application Master in it.

    6. The Application Master creates objects for bookkeeping and task management purposes.

    7. The Application Master retrieves the input splits and creates one map task per split.

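    8. If the job is small (an uber task), its tasks run in the same JVM as the Application Master. If the job is not uber, the Application Master requests computing resources from the Resource Manager.
    These requests are piggybacked on the Application Master's heartbeat calls and include the hosts and racks where each input split is located, so the scheduler knows where the data lives. The scheduler then allocates nodes for task execution, preferring nodes close to the data.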

    9. The Application Master contacts a Node Manager to launch a container, and the Node Manager launches the container for task execution.

    10. Each task container sends a progress report to the Application Master every 3 seconds. The Application Master aggregates the reports and sends updates directly to the job client.
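
Steps 7 and 8 above can be sketched in a few lines of Python. This is only an illustrative model, not Hadoop code: the names InputSplit, create_map_tasks and is_uber_job are hypothetical, and the 128 MB block size is an assumed value. The thresholds mirror the documented defaults of mapreduce.job.ubertask.maxmaps (9) and mapreduce.job.ubertask.maxreduces (1), and uber mode is off unless mapreduce.job.ubertask.enable is set to true.

```python
from dataclasses import dataclass

# Thresholds modeled on the documented MapReduce defaults.
UBER_MAX_MAPS = 9        # mapreduce.job.ubertask.maxmaps
UBER_MAX_REDUCES = 1     # mapreduce.job.ubertask.maxreduces
# Assumption: one HDFS block of 128 MB stands in for
# mapreduce.job.ubertask.maxbytes.
UBER_MAX_BYTES = 128 * 1024 * 1024


@dataclass
class InputSplit:
    """Hypothetical stand-in for an input split: a file path and a length in bytes."""
    path: str
    length: int


def create_map_tasks(splits):
    """Step 7: create one map task (here, just a task name) per input split."""
    return [f"map_{i:06d}" for i, _ in enumerate(splits)]


def is_uber_job(splits, num_reduces, enabled=False):
    """Step 8: decide whether the job is small enough to run in the
    Application Master's own JVM instead of requesting containers."""
    total_bytes = sum(split.length for split in splits)
    return (
        enabled
        and len(splits) <= UBER_MAX_MAPS
        and num_reduces <= UBER_MAX_REDUCES
        and total_bytes <= UBER_MAX_BYTES
    )


if __name__ == "__main__":
    mb = 1024 * 1024
    splits = [InputSplit("/in/part-0", 10 * mb), InputSplit("/in/part-1", 20 * mb)]
    print(create_map_tasks(splits))
    print(is_uber_job(splits, num_reduces=1, enabled=True))
```

When is_uber_job returns False, the real Application Master would go on to request one container per task from the Resource Manager, as described in steps 8 and 9.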


