Monitoring Mesos Resource Offers and Tasks

WHAT IS DRF AND HOW DOES IT WORK?

As described in the previous post, Mesos provides two-level resource scheduling. The first level happens at the Mesos master, which decides what resources are offered to each framework and when. The second level happens at the framework’s scheduler, which accepts or rejects the offers received from the master.

The Mesos master uses the allocation module for deciding how many resources to offer to each framework and in what order. This module is pluggable and Mesos provides a default algorithm called DRF (Dominant Resource Fairness).


The main idea behind DRF is to maximize the minimum dominant share across all users (frameworks). The dominant resource (CPU, memory, disk, …) of a framework is the one it demands most as a percentage of the cluster total. Imagine a cluster with a total of 30 CPUs and 512GB of memory, where framework A demands 5 CPUs and 16GB of memory and framework B demands 4 CPUs and 128GB of memory. Expressed as shares of the whole cluster, framework A demands 16.7% of the CPU and 3.125% of the memory, while framework B demands 13.3% of the CPU and 25% of the memory. The dominant resource of framework A is therefore CPU, and that of framework B is memory.
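
As a minimal sketch of the share calculation, the snippet below divides each demand by the cluster total and reports the larger of the two shares. The function name dominant_share is an illustrative helper, not part of Mesos, and only CPU and memory are considered.

```sh
#!/bin/sh
# dominant_share CPUS MEM_GB TOTAL_CPUS TOTAL_MEM_GB
# Prints the dominant resource and its share of the cluster in percent.
# Illustrative helper (not a Mesos tool); only CPU and memory are compared.
dominant_share() {
  awk -v c="$1" -v m="$2" -v tc="$3" -v tm="$4" 'BEGIN {
    cpu = 100 * c / tc   # CPU share of the cluster, in percent
    mem = 100 * m / tm   # memory share of the cluster, in percent
    if (cpu >= mem) printf "cpu %.1f\n", cpu
    else            printf "mem %.1f\n", mem
  }'
}

dominant_share 5 16 30 512    # framework A, prints: cpu 16.7
dominant_share 4 128 30 512   # framework B, prints: mem 25.0
```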

At each round of resource offering, the allocation module applies DRF to identify the dominant share of each framework and offers the resources first to the framework with the smallest dominant share, then to the one with the second smallest, and so on.
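
The ordering step can be sketched in a few lines of shell. This is a simplification under stated assumptions: the helper offer_order is hypothetical, it looks only at CPU and memory, and it ignores roles, weights, quotas and any filters set by declined offers.

```sh
#!/bin/sh
# offer_order TOTAL_CPUS TOTAL_MEM
# Reads lines of the form "NAME CPUS MEM" on stdin and prints the names
# ordered by ascending dominant share (smallest share is served first).
# Hypothetical helper for illustration; not the Mesos allocator itself.
offer_order() {
  awk -v tc="$1" -v tm="$2" '{
    cpu = $2 / tc
    mem = $3 / tm
    # The dominant share is the larger of the two fractions.
    printf "%.4f %s\n", (cpu > mem ? cpu : mem), $1
  }' | LC_ALL=C sort -n | awk '{ print $2 }'
}

# Frameworks A and B from the 30 CPU / 512GB example cluster.
printf 'A 5 16\nB 4 128\n' | offer_order 30 512
```

Framework A has the smaller dominant share (about 0.17 against 0.25), so it is listed first.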

Mesos provides an additional set of features to guarantee and tune the SLAs of the frameworks’ resource allocations (roles, quotas, reservations, oversubscription, weights, etc.), but they are out of the scope of this brief introduction.

TWO EASY STEPS TO MONITOR THE OFFERS AND OTHER RELEVANT EVENTS

Now that we understand the basics of DRF, we are going to introduce a way to monitor how Mesos agents publish their available resources, how the Mesos master manages resource offers, and how the frameworks’ schedulers decline offers or accept them and launch tasks, along with some other interesting events related to the lifecycle of a framework and its tasks.

The first step is to toggle the Mesos master log level to 3 for as long as we want to monitor the frameworks, offers and tasks. To do so, you only have to make the following POST request:

http://:5050/logging/toggle?level=3&duration=1mins
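
One possible way to send this request is with curl, assuming it is installed on the machine you are working from. In the sketch below the function names and the MASTER_HOST placeholder are illustrative; the port, path and query parameters are the ones shown in the request above.

```sh
#!/bin/sh
# toggle_url HOST [LEVEL] [DURATION]
# Builds the logging toggle URL; LEVEL defaults to 3 and DURATION to 1mins.
toggle_url() {
  printf 'http://%s:5050/logging/toggle?level=%s&duration=%s\n' \
    "$1" "${2:-3}" "${3:-1mins}"
}

# toggle_logging HOST [LEVEL] [DURATION]
# Sends the POST request to the master with curl.
toggle_logging() {
  curl -s -X POST "$(toggle_url "$@")"
}

# Usage (MASTER_HOST is a placeholder for the address of your Mesos master):
#   toggle_logging "$MASTER_HOST" 3 1mins
```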

At this log level we’ll start seeing some key log lines from the allocation module (the hierarchical Dominant Resource Fairness allocator) and from the master itself. For more information, see the Mesos logging documentation.

As a second step, we’ll go to the directory containing the Mesos master logs and follow them with the command below, which filters the output so that we don’t get overwhelmed by the amount of log activity:

tail -f mesos-master.INFO | grep -Ei "added|removed|allocating|accept|decline|adding|launching|forwarding"

INTERPRETING THE IDENTIFIED LOG PATTERNS

Let’s explain each state or event identified in the log pattern including one sample.

Two main events of the framework lifecycle

  • added: the moment when a framework registers with the Mesos master. The log includes the unique ID assigned to the framework; this ID is essential because it lets us correlate all the traces related to a specific framework.
I1027 00:41:36.338513  1166 hierarchical.cpp:271] Added framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080
  • removed: the moment when a framework is unregistered from the Mesos master, for example when the framework’s scheduler stops.
I1027 00:42:01.495795  1163 hierarchical.cpp:333] Removed framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080
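
Since the framework ID is the correlation key, a small filter is often enough to follow a single framework. The helper name framework_events below is hypothetical; it is a sketch that relies only on grep and reads the log from standard input.

```sh
#!/bin/sh
# framework_events FRAMEWORK_ID
# Reads master log lines on stdin and prints only those that mention the
# given framework ID (fixed-string match, no regular expressions).
# Note: the "Adding task" lines shown in this post carry the agent ID but
# not the framework ID, so this filter does not select them.
framework_events() {
  grep -F -- "framework $1"
}

# Usage:
#   framework_events 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 < mesos-master.INFO
```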

Three main events of an offer lifecycle

  • allocating: the Mesos master sends an offer to a framework’s scheduler. The log shows in detail which resources of an agent the master offers to a specific framework, identified by the framework’s unique ID.
I1027 00:41:36.338580  1166 hierarchical.cpp:1511] Allocating cpus(*):4; mem(*):14922; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080
  • accept: the framework’s scheduler accepts the offer. At this point the log does not tell us how many resources have been taken; we’ll know that when a task is launched with the accepted resources. Remember that the scheduler should take only the resources it needs, not necessarily all the resources included in the offer.
I1027 00:41:36.341940  1163 master.cpp:3342] Processing ACCEPT call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O645 ] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 at slave(1)@127.0.1.1:5051 (irodriguez) for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance)
  • decline: the framework’s scheduler declines the offer.
I1027 00:41:42.217736  1160 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O647 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance)
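
To get a quick picture of how each framework responds to its offers, the ACCEPT and DECLINE lines can be counted per framework ID. This is a minimal sketch that assumes the log line layout shown in the samples above; offer_decisions is an illustrative name.

```sh
#!/bin/sh
# offer_decisions
# Reads master log lines on stdin and prints one line per framework ID and
# call type with the number of occurrences: "FRAMEWORK_ID CALL COUNT".
offer_decisions() {
  awk '/Processing (ACCEPT|DECLINE) call/ {
    call = ""; fw = ""
    for (i = 1; i <= NF; i++) {
      if ($i == "Processing") call = $(i + 1)   # ACCEPT or DECLINE
      if ($i == "framework")  fw = $(i + 1)     # framework unique ID
    }
    count[fw " " call]++
  }
  END { for (k in count) print k, count[k] }' | LC_ALL=C sort
}

# Usage:
#   offer_decisions < mesos-master.INFO
```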

Four main events of the task lifecycle

  • adding: this may correspond to the TASK_STAGING state, in which the Mesos master has received the framework scheduler’s request to launch a task but the task has not yet started to run.
I1027 00:41:36.342283  1163 master.cpp:7447] Adding task F1-TASK with resources cpus(*):2; mem(*):128 on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 (irodriguez)
  • launching: this may correspond to the TASK_STARTING state, in which the executor has learned about the existence of the task and prepares to run it.
I1027 00:41:36.342308  1163 master.cpp:3831] Launching task F1-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance) with resources cpus(*):2; mem(*):128 on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 at slave(1)@127.0.1.1:5051 (irodriguez)
  • forwarding: allows us to identify the different task status updates. In the following traces, the first status update, TASK_RUNNING, indicates that the task has begun running successfully, and the second, TASK_FINISHED, indicates that the task has completed successfully.
I1027 00:41:36.426267  1166 master.cpp:5199] Forwarding status update TASK_RUNNING (UUID: 09444748-8f44-430e-9f8e-2ba9b4f47963) for task F1-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:42:01.493413  1164 master.cpp:5199] Forwarding status update TASK_FINISHED (UUID: b90c09e4-80c8-4f75-8610-d46f56c3af76) for task F1-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

If you are interested in the complete lifecycle of a task, it is described in “The Life Cycle of a Task”.
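
The status updates also make it possible to estimate how long each task ran. The sketch below pairs the TASK_RUNNING and TASK_FINISHED lines of each task and subtracts their timestamps. The name task_durations is illustrative, and the code assumes both updates fall on the same day, since the log prefix used here carries no year and the date field is ignored.

```sh
#!/bin/sh
# task_durations
# Reads master log lines on stdin and prints "TASK SECONDS" for every task
# that has both a TASK_RUNNING and a later TASK_FINISHED status update.
task_durations() {
  awk '/Forwarding status update/ {
    split($2, t, ":")                        # $2 is HH:MM:SS.micros
    secs = t[1] * 3600 + t[2] * 60 + t[3]
    for (i = 1; i <= NF; i++) {
      if ($i == "update") state = $(i + 1)   # e.g. TASK_RUNNING
      if ($i == "task")   task = $(i + 1)    # e.g. F1-TASK
    }
    if (state == "TASK_RUNNING") start[task] = secs
    if (state == "TASK_FINISHED" && (task in start))
      printf "%s %.1f\n", task, secs - start[task]
  }'
}

# Usage:
#   task_durations < mesos-master.INFO
```

Applied to the two sample traces above, this reports about 25.1 seconds for F1-TASK.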

PUTTING EVERYTHING TOGETHER: 3 FRAMEWORKS FIGHTING FOR RESOURCES

We are going to introduce an easy example simulating 3 frameworks (F1, F2 and F3) which will arrive at the Mesos cluster at different times demanding different amounts of resources.

Framework | Arrival time | Requested CPU | Requested memory | Duration
F1        | t0           | 2 CPUs        | 128 MB           | 25 secs
F2        | t0+5 secs    | 1.5 CPUs      | 400 MB           | 25 secs
F3        | t0+11 secs   | 1.8 CPUs      | 300 MB           | 20 secs

 

The first framework will request 2 CPUs and 128MB of memory and will run for 25 seconds. The second framework will arrive at the cluster 5 seconds later, will request 1.5 CPUs and 400MB of memory, and will run for 25 seconds. Finally, the third framework will arrive 6 seconds after the second, requesting 1.8 CPUs and 300MB of memory, and will run for 20 seconds. Each framework will run only one task, which will last the aforementioned time in each case.

The following script uses a built-in Mesos scheduler, the CommandScheduler, via the mesos-execute command and represents the scenario described above:

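#!/bin/sh
# Each mesos-execute call blocks until its task finishes, so the calls are
# sent to the background (&) to let the three frameworks overlap in time.

# F1 arrives at t0
mesos-execute --master=127.0.1.1:5050 --resources="cpus:2;mem:128" --name="F1-TASK" --command='echo "Task 1 of Framework 1";sleep 25' &

sleep 5

# F2 arrives at t0+5
mesos-execute --master=127.0.1.1:5050 --resources="cpus:1.5;mem:400" --name="F2-TASK" --command='echo "Task 1 of Framework 2";sleep 25' &

sleep 6

# F3 arrives at t0+11
mesos-execute --master=127.0.1.1:5050 --resources="cpus:1.8;mem:300" --name="F3-TASK" --command='echo "Task 1 of Framework 3";sleep 20' &

# Wait for the three frameworks to finish
wait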

You can execute the above script in your Mesos cluster and try to analyze the logs with the guidelines provided.

The following images show the execution of the above script in a Mesos cluster with a single agent that has 4 CPUs and 14922MB of memory available. They include the evolution over time of the max share of each framework; it is very interesting to see how DRF allocates the offers from the minimum share to the maximum share. Two screenshots of the Mesos UI are also included, showing the active frameworks with their assigned resources, number of tasks and max share at different moments of the execution.

[Figure: Presented scenario: frameworks, offer and tasks lifecycle I/II]

[Figure: Mesos UI t0+11]

[Figure: Presented scenario: frameworks, offer and tasks lifecycle II/II]

[Figure: Mesos UI t0+Y]

Finally, here is the result of applying our pattern to the Mesos master log, having previously toggled the log level to 3:

I1027 00:41:36.338513  1166 hierarchical.cpp:271] Added framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:41:36.338580  1166 hierarchical.cpp:1511] Allocating cpus(*):4; mem(*):14922; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:41:36.341940  1163 master.cpp:3342] Processing ACCEPT call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O645 ] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 at slave(1)@127.0.1.1:5051 (irodriguez) for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance)

I1027 00:41:36.342283  1163 master.cpp:7447] Adding task F1-TASK with resources cpus(*):2; mem(*):128 on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 (irodriguez)

I1027 00:41:36.342308  1163 master.cpp:3831] Launching task F1-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance) with resources cpus(*):2; mem(*):128 on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 at slave(1)@127.0.1.1:5051 (irodriguez)

I1027 00:41:36.426267  1166 master.cpp:5199] Forwarding status update TASK_RUNNING (UUID: 09444748-8f44-430e-9f8e-2ba9b4f47963) for task F1-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:41:41.164577  1159 hierarchical.cpp:271] Added framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:41:41.164649  1159 hierarchical.cpp:1511] Allocating cpus(*):2; mem(*):14794; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:41:41.166434  1163 master.cpp:3342] Processing ACCEPT call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O646 ] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 at slave(1)@127.0.1.1:5051 (irodriguez) for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081 (mesos-execute instance)

I1027 00:41:41.166695  1163 master.cpp:7447] Adding task F2-TASK with resources cpus(*):1.5; mem(*):400 on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 (irodriguez)

I1027 00:41:41.166718  1163 master.cpp:3831] Launching task F2-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081 (mesos-execute instance) with resources cpus(*):1.5; mem(*):400 on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 at slave(1)@127.0.1.1:5051 (irodriguez)

I1027 00:41:41.243779  1160 master.cpp:5199] Forwarding status update TASK_RUNNING (UUID: c050e32b-99a3-4b07-a792-6d539717d8a8) for task F2-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:41:42.216320  1166 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:41:42.217736  1160 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O647 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance)

I1027 00:41:46.219952  1166 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:41:46.221310  1166 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O648 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081 (mesos-execute instance)

I1027 00:41:47.220943  1159 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:41:47.222218  1171 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O649 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance)

I1027 00:41:47.784749  1159 hierarchical.cpp:271] Added framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:41:47.784827  1159 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:41:47.787749  1163 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O650 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance)

I1027 00:41:51.223963  1163 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:41:51.225608  1166 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O651 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081 (mesos-execute instance)

I1027 00:41:52.225154  1159 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:41:52.228976  1161 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O652 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance)

I1027 00:41:53.227172  1171 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:41:53.231391  1166 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O653 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance)

I1027 00:41:56.230902  1169 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:41:56.233460  1164 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O654 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081 (mesos-execute instance)

I1027 00:41:57.231640  1164 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:41:57.235337  1163 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O655 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080 (mesos-execute instance)

I1027 00:41:59.232581  1161 hierarchical.cpp:1511] Allocating cpus(*):0.5; mem(*):14394; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:41:59.233829  1161 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O656 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance)

I1027 00:42:01.493413  1164 master.cpp:5199] Forwarding status update TASK_FINISHED (UUID: b90c09e4-80c8-4f75-8610-d46f56c3af76) for task F1-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:42:01.495795  1163 hierarchical.cpp:333] Removed framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0080

I1027 00:42:02.234576  1169 hierarchical.cpp:1511] Allocating cpus(*):2.5; mem(*):14522; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:42:02.236279  1163 master.cpp:3342] Processing ACCEPT call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O657 ] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 at slave(1)@127.0.1.1:5051 (irodriguez) for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance)

I1027 00:42:02.236750  1160 master.cpp:7447] Adding task F3-TASK with resources cpus(*):1.8; mem(*):300 on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 (irodriguez)

I1027 00:42:02.236784  1160 master.cpp:3831] Launching task F3-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance) with resources cpus(*):1.8; mem(*):300 on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 at slave(1)@127.0.1.1:5051 (irodriguez)

I1027 00:42:02.343415  1160 master.cpp:5199] Forwarding status update TASK_RUNNING (UUID: 41a5806f-6c35-4ea7-ac10-a4b25141e2b0) for task F3-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:42:03.235754  1160 hierarchical.cpp:1511] Allocating cpus(*):0.7; mem(*):14222; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:42:03.236974  1160 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O658 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081 (mesos-execute instance)

I1027 00:42:06.308914  1164 master.cpp:5199] Forwarding status update TASK_FINISHED (UUID: 0eb9e99f-0bdd-4d54-afa3-2c49dac5ff6e) for task F2-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:42:06.312667  1166 hierarchical.cpp:333] Removed framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0081

I1027 00:42:07.239861  1171 hierarchical.cpp:1511] Allocating cpus(*):2.2; mem(*):14622; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:42:07.242084  1166 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O659 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance)

I1027 00:42:12.252518  1163 hierarchical.cpp:1511] Allocating cpus(*):2.2; mem(*):14622; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:42:12.253914  1166 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O660 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance)

I1027 00:42:17.257071  1163 hierarchical.cpp:1511] Allocating cpus(*):2.2; mem(*):14622; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:42:17.258216  1159 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O661 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance)

I1027 00:42:22.263494  1164 hierarchical.cpp:1511] Allocating cpus(*):2.2; mem(*):14622; disk(*):925185; ports(*):[31000-32000] on agent 9ecb851f-3841-4dbf-b167-fe608fb11a86-S0 to framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:42:22.267007  1171 master.cpp:3951] Processing DECLINE call for offers: [ 9ecb851f-3841-4dbf-b167-fe608fb11a86-O662 ] for framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082 (mesos-execute instance)

I1027 00:42:22.389632  1159 master.cpp:5199] Forwarding status update TASK_FINISHED (UUID: c97786e7-d711-468e-9957-da7777b6208a) for task F3-TASK of framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082

I1027 00:42:22.391718  1166 hierarchical.cpp:333] Removed framework 9ecb851f-3841-4dbf-b167-fe608fb11a86-0082
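
The Allocating lines of a log like the one above can be reduced to a small time series of the CPUs offered to each framework, which is a possible starting point for plotting how the available resources evolve. The sketch below assumes the line layout shown in this post (the framework ID is the last field and the CPU amount appears as cpus(*):N); offered_cpus is an illustrative name.

```sh
#!/bin/sh
# offered_cpus
# Reads master log lines on stdin and prints, for each "Allocating" line,
# "TIME FRAMEWORK_ID CPUS".
offered_cpus() {
  awk '/Allocating/ {
    # Locate the text "cpus(*):<number>" and keep only the number;
    # the prefix "cpus(*):" is 8 characters long.
    if (match($0, /cpus\(\*\):[0-9.]+/)) {
      cpus = substr($0, RSTART + 8, RLENGTH - 8)
      print $2, $NF, cpus
    }
  }'
}

# Usage:
#   offered_cpus < mesos-master.INFO
```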

CONCLUSIONS

In this post we have briefly explained what the default Mesos allocation module is and how it works. We have then introduced an easy-to-follow approach for monitoring the offers and some key events in the lifecycle of a framework and its tasks. The described method is only valid for the default Mesos allocation algorithm, the hierarchical Dominant Resource Fairness. It is a simple but powerful approach because it gives us a cluster-wide view of the lifecycle of offers, tasks and frameworks. It could also serve as the basis for building a graphical dashboard or some other kind of tool based on the log events described here and on any other events of interest not covered in this post.


Isaac Rodríguez

I'm a technical engineer in computer systems with more than 14 years of experience as a specialist in Real Time and Event Driven Architectures, having a vast background in technologies related to stream processing, complex event processing, messaging middleware, rule engines and integration solutions. Now I'm facing new challenges in leading the technical architecture area in Datio.
