How busy is your grocer?Posted: March 11, 2011
Who knew that coming up with a statistical model that fits your problem requires a bit of daydreaming, coffee and people-watching. I got my inspiration for modeling MapReduce behavior in the face of failures from my grocer, Raj. I was trying to figure out, how much free time Raj gets in between customers and is that idle time enough to say read a few pages from a book.
First, a refresher on how MapReduce handles failures: When a machine fails during a MapReduce query execution, all its work is reassigned to other machines in the cluster. In the best case scenario, all its work is reassigned uniformly to all alive machines. So if we have a query that requires x seconds of processing time on each machine in a cluster of k machines, then when one machine fails, approximately x/k seconds of work are added to all machines. We are interested in the expected query time given an exponential time to failure distribution as we blogged before.
How does MapReduce relate to Raj’s idle time? We could think of the cluster as a single entity like Raj. Failures, like customers, appear at a certain rate: λc. Raj usually takes a few seconds to service every customer; the cluster takes on average x/k seconds to process the added work caused by the failure. The longer Raj takes to service a customer, the larger the queue gets and so he stays busy for longer without breaks. Early in the morning, Raj is busiest: all the cops are lining up to buy sugar with coffee. He is busy for longer not just because the arrival rate of cops is high but because he is slower. The first cop in the line-up has to wait until Raj figures out how to turn on the sales terminal. So like the first customer, the query in MapReduce requires x seconds to process, but every failure takes only x/k seconds to process. The more time a query requires, the more failures happen and the more time MapReduce spends dealing with failures.
How long an x-second query takes to complete on MapReduce over an imperfect cluster boils down to the queueing theory problem of how long the busy period is when the first customer takes an exceptional service time. Lucky for us, engineers like Erlang have used queueing theory since the 1900’s to solve problems like how many telephone operators you need to handle different call volumes or how many barista’s you need at Starbucks to keep lines short or could employees take their union-mandated breaks without disrupting service. We could apply busy period analysis directly to our problem.
Lets get to the maths. We need some notation:
x: time required to process the query
Y0= λcx: number of failures that occur while processing the query
x/k: time required to re-process the work of a failed machine
μ = k/x: processing rate of failure
ρ = λc/μ: is the traffic intensity or the average amount of work done in a second†.
B0: busy period from processing query and failures
B: busy period from processing failures
K: number of failures processed during B
I: idle time.
We are interested in the length of the busy period, B0, from when the query enters the system, until the last failure is handled and no more work is left in the queue.
B0 = x + i=1ΣY0 Bi
E[B0 | x, Y0] = x + Y0E[B]
E[B0 | x] = x + λcxE[B]
What is the expected length of busy period due to failures only, E[B]? We know that average work done in unit time is ρ, Therefore,
ρ = E[B] / (E[B] + E[I])
E[I] = 1/λc (the time between two failures)
ρ = λc/μ = λcx/k
λc*x/k = E[B]/(E[B] + 1/λc)
E[B] = 1/(μ – λc)
The busy period caused by a query is then:
E[B0] = x/(1 – ρ)
So the expected query execution time in MapReduce is x/(1 – λcx/k) or x/(1 – λmx), where λm is the failure rate of each machine. How does this compare with the expected query time of Parallel databases? Wait for the next post.
[†] If ρ >= 1 then the system is unstable and the failures will keep piling up at a rate that the MapReduce cluster can’t handle, the query will take an infinite time to finish. In this analysis, we will assume ρ < 1