Introduction to the call center

In the year 2000, major research firms predicted that by 2003 approximately US$60 billion would be spent on call center services worldwide. That prediction is now a reality, so how much of that action does Pakistan have? How much does it deserve to have?

Enter the keywords “India Call Center” at google.com and you get over 200 meaningful results. Enter “Pakistan Call Center” and you get seven.

India was already employing 16,000 personnel for USA companies two years ago with the claim that Indian call center agents are more reliable and skilled and can operate at a much lower cost per transaction than USA call centers.

President Arroyo of the Philippines signed three major agreements with USA companies the same year. She is quoted, “To be successful, the ICT sector must be driven by the private sector, with minimal government intervention and regulation.” Today, the Philippines is a major player in the outsourced call center world.

Some nations, with the same potentially well-qualified personnel for these “back offices,” are saying that “thousands of jobs will be available in call center applications in the next few years.” Meanwhile, the competitors of those countries that are simply waiting are doing what it takes right now. In fact, not just right now: they began yesterday and before that. And it is not thousands of jobs; it is millions.

My main objective is to clear up questions you may have in your mind about call centers in Pakistan. Second, I hope to encourage every single one of you to create a plan today to take advantage of the huge and wonderful opportunities just waiting for you in call center, back office, and remote sales work for USA businesses.

Pakistan is an excellent example of a nation of many extremely fluent English-speaking and Internet-literate people who have the future-thinking entrepreneurial attitude needed by USA companies looking to outsource their customer support, technical support, telemarketing, accounting and more. So, what is wrong? Internet access must be more affordable, and bandwidth needs to be subsidized more. Such encouragement will bring these companies to the “Land of the Pure” instead of elsewhere in the world.

What is the definition of a call center anyway? A call center can take many forms. A mother and wife can work from the privacy of her home as a medical secretary, never leaving her children. She will be able to practice her religious beliefs and still help her family financially. Pizza delivery, medical insurance claims, book orders, travel itineraries… there is no limit to the number of jobs that are now available. Young, elderly, men, women, handicapped, and those with families or without will now have new opportunities in Pakistan to improve their economic status and use their often self-taught English, Internet, and people skills. Anyone can be a call center with these three skills. Manned by one or more than a thousand, it doesn’t matter.

So, what has India accomplished in comparison to Pakistan in the call center choice race? Bangalore is known worldwide as the Silicon Valley of the Indo-Pak region. It was first and quickest in capitalizing on the software market. Now, Bangalore and the rest of India are moving to the hottest market, call centers. This is not just a “dust in the wind” phenomenon. It is here to stay for several decades. Take a look at the “Dialing for Dollars” India call center article and video show at http://www.pakistancallcenters.com/. On 30 December 2002, http://www.people-one.com/searchjob/hotjobs.asp listed 250 call center positions in India and zero in Pakistan.

India supports call center operations in a huge way. See the following example.

THE INDIA ADVANTAGE (http://www.delhicall.com/why-india.html)

Language

» India has the second-largest English-speaking population in the world.
» English is the principal language for the transaction of business.

Manpower

» India has the second-largest and the fastest-growing pool of technical manpower.
» High availability of English-speaking and educated customer care professionals.
» Has the lowest manpower cost.
» High availability of computer-literate graduate manpower and technical manpower.

Reliability and Security

The work force is highly reliable and can deliver world-class quality and ensure rapid delivery of service. Indian companies are also increasingly adapting to international quality and security standards.

Infrastructure

» Well-connected telecommunication systems on a world-class scale.

» High availability of infrastructure resources.

» India's satellite-based telecommunication network enables almost instantaneous high-speed transfer of voice and data across the globe

Legislative Framework

» Highly liberal Government policies on call center operations

» Maintains high cost competitiveness in service sectors

» Proactive Government: 10-year tax holiday

» Duty-free import of capital machinery and software

Cost Benefits

Indian companies can provide call center services to clients based in the U.S. or the U.K. at one-sixth to one-fourth of what it costs in the U.S., U.K. or Australia.

Time Zone

A virtual 12-hour time zone difference with the USA and other markets for Call Center services is in India's favor.

Pro-Active Government

The department of telecommunications, Government of India has given a special thrust to the industry by reducing the prices of high-speed international private leased circuits. The recent IT boom has prompted the Government of India to announce exemptions from income tax and customs for the exports of IT enabled services. Central and the State governments have put emphasis to set up state-of-the art infrastructure for the projected boom for IT enabled services. Private Internet gateways and 100 foreign direct investments have been given approvals.

"Potentially 50-80 percent of total process costs in most IT enabled services can be out sourced offshore. As much as 70 - 80 percent costs can be reduced primarily because of wage differentials. However, in order to manage operations in remote locations, expatriate management may initially be required to support the remote locations. This together with higher telecom costs could result in additional costs of 10-20 percent. Hence there could be 50-60 percent saving on out located processes." (VCare Technologies, Market Perception Memo III, July 2000)

Why is Pakistan not one of the choice call center locations in the world? Priorities? The infrastructure needs assistance. The opportunities must be publicized to the people. Call centers can have a location as small as a corner of one’s home. Again, bandwidth must be subsidized, Internet access must be affordable, and VOIP must be viewed as a friend, not the enemy. Look at these hard facts. About 2 Mbps of bandwidth will host about 68 phone lines at 30 Kbps per phone line. This costs US$6000/month in Pakistan and only US$1000/month in the USA.
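The bandwidth figures above can be checked with a few lines of arithmetic. The sketch below assumes that the link is 2 Mbps, taken as 2048 Kbps (an assumption; the text does not spell out the unit), and the function names are purely illustrative.

```python
def phone_lines(link_kbps: float, kbps_per_line: float) -> int:
    """Number of whole phone lines that fit on the link."""
    return int(link_kbps // kbps_per_line)


def monthly_cost_per_line(monthly_cost_usd: float, lines: int) -> float:
    """Monthly bandwidth cost carried by each phone line."""
    return monthly_cost_usd / lines


LINK_KBPS = 2048     # assumed: 2 Mbps expressed as 2048 Kbps
KBPS_PER_LINE = 30   # from the text

lines = phone_lines(LINK_KBPS, KBPS_PER_LINE)
print("phone lines:", lines)
print("Pakistan, US$ per line per month:", round(monthly_cost_per_line(6000, lines), 2))
print("USA, US$ per line per month:", round(monthly_cost_per_line(1000, lines), 2))
```

Under these assumptions each line carries six times the bandwidth cost in Pakistan that it carries in the USA.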

Finally, why would a nation want to block VOIP when it should be the answer to economic improvement? Why block VOIP and not Playboy? Playboy uses 50% more bandwidth than VOIP, and it does nothing to improve the Pakistan citizenry. In fact, we all know it does a world of harm to one’s emotional, social, family, and spiritual life and sometimes even causes a person to lose everything he has. On the other hand, VOIP, something that is still being blocked by a few governments of the world, is a necessity for any type of call center. Let us see that Pakistan is the smart example and choice call center location of the world with all the tools of success and encouragement ready to get in on the action.


How good is the Erlang formula?

In this section we consider the weak points of the Erlang formula and its underlying assumptions. This will motivate some of the more sophisticated models. It might come as a surprise that the ASA is bigger than 0 although there is overcapacity. The reason for this is the variability in arrival times and service durations. If all arrivals were equally spaced and if all call holding times were constant, then no waiting would occur. However, in the random environment of the call center, undercapacity occurs during short periods of time. This is the reason why queueing occurs. The queue will always empty again if on average there is overcapacity. The Erlang formula quantifies the amount of waiting (in terms of ASA or TSF) for a particular type of random arrival and service times. The mathematical random processes that model the arrivals and departures are therefore nothing more than approximations. The quality of the approximation and the sensitivity of the formula to changes with respect to the different aspects of the model decide whether the Erlang formula gives acceptable results. We deal with the underlying assumptions one by one and discuss the consequences for the approximation.
Abandonments In a well-dimensioned call center there are few abandonments. Not modeling these abandonments is therefore not a gross simplification. However, there are call centers that behave completely differently from what the Erlang formula predicts because abandonments are not explicitly modeled. In general we can state that abandonments reduce the waiting time of other customers; thus it is good for the SL that abandonments occur! In call centers with the load a close to or even exceeding the number of agents s it is crucial to model abandonments as well. Luckily this is possible.

Retrials Abandonments are relatively well understood and the Erlang C formula can be extended to account for abandonments without too much difficulty. This is no longer true if the customers who abandoned start to call again and thus generate retrials. Little is known about the behavior of customers concerning retrials and about good mathematical models. Unfortunately retrials are a common phenomenon in most call centers.

Peaks in offered load Formally speaking, the Erlang formula allows no fluctuations in offered load. However, in every call center there are daily changes in load. As long as these changes remain limited, and, more importantly, if there are no periods with undercapacity, then the Erlang formula performs well for periods where there is little fluctuation in load and number of agents. By using the Erlang formula for different time intervals we can get the whole picture by averaging (as explained in Chapter 3). However, as soon as undercapacity occurs, the backlog of calls from one period is shifted to the next. This backlog should be explicitly modeled, which is not possible within the framework of the Erlang formula. Therefore the Erlang formula cannot be used in the case of undercapacity. For a short peak in offered traffic (e.g., reactions to a TV commercial) straightforward capacity calculations ignoring the random behavior can give quite good results. See also

Type of call durations The Erlang formula is based on the assumption that the service times come from a so-called exponential distribution. Without going into the mathematical details, we just note that all positive values are possible as call durations, thus also very long or short ones, but that most of the durations are below the average. Certain measurements on standard telephone traffic show that call durations are approximately exponential, although the results in the literature do not completely agree on this subject. A typical case where call durations are not exponential is when there are multiple types of calls with different average call lengths, or when a call always takes a certain minimum amount of time. In these cases one should wonder what the influence of the different service time distributions is on the Erlang formula. We can state that this influence decreases as the call center increases in size. With some care it can be concluded that only the average call duration is of major importance to the performance of the call center.

Human behavior Up to now we ignored the behavior of the agents, apart from the time it takes to pick up the phone. However, agent behavior is not as simple as that. Employees take small breaks to get coffee, to discuss things, etc. Explicitly modeling human behavior is a difficult task; describing and quantifying the behavior is even more difficult! In most situations these small breaks are taken when there are no calls in the queue. It can therefore be expected that they are of minor importance to the SL. In other situations they have a bigger impact, which can seriously limit the possibilities of quantitative modeling.
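The remark under “Peaks in offered load” about capacity calculations that ignore random behavior can be sketched as a deterministic backlog computation. This is only an illustration: the function name and the numbers (a commercial that triples the offered calls in one quarter of an hour) are hypothetical and do not come from the text.

```python
def fluid_backlog(arrivals, capacity):
    """Deterministic backlog of calls at the end of each interval.

    arrivals[i]: calls offered in interval i
    capacity[i]: calls the scheduled agents can handle in interval i
    """
    backlog = 0.0
    history = []
    for offered, can_handle in zip(arrivals, capacity):
        # Calls that cannot be handled are shifted to the next interval.
        backlog = max(0.0, backlog + offered - can_handle)
        history.append(backlog)
    return history


# Hypothetical quarter-hour figures: the peak occurs in the second interval.
arrivals = [100, 300, 100, 100]
capacity = [150, 150, 150, 150]
print(fluid_backlog(arrivals, capacity))
```

The backlog created during the peak is worked off only gradually in the later intervals, which is exactly the carry-over that the Erlang formula cannot represent.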

The square-root staffing rule

Up to now we saw that an increase of scale leads to advantages with respect to productivity and/or service level. These advantages can always be quantified using the Erlang formula. To obtain a general understanding we formulate a rule of thumb that relates, for a fixed service level, call volume and the number of agents. In a formula this relation can be formulated as follows:

$$\text{overcapacity in \%} \times \sqrt{s} = \text{constant}.$$

The constant in the formula is related to the service level; the formula therefore relates only overcapacity and the number of agents. The percentage overcapacity in the formula is given by 100 × (1 − a/s). From the rule of thumb we obtain results such as: if the call center becomes four times as big, then the overcapacity becomes roughly half as big. How we obtain this type of result is illustrated by the following example.

A call center with 4 agents, λ = 1 and β = 2 minutes has an average waiting time of a little over 10 seconds. For this call center the associated constant is 100 × (1 − 2/4) × √4 = 50 × 2 = 100. If we multiply s by 4, then √s doubles. Thus to keep the same service level (the same constant), we halve the overcapacity to 25%. Thus the productivity becomes 75%, and with s = 4 × 4 = 16 this gives a = 12 and λ = 6. If we verify these numbers with the Erlang formula, then we find an average waiting time of a little over 6 seconds. Closest to 10 seconds is s = 15, with approximately 12 seconds waiting time. If we multiply s again by 4, then the overcapacity can be reduced to 12.5%. This means λ = 28, with 3.2 seconds waiting time. Closest to 6 seconds is s = 62, from which we see that the rule of thumb works reasonably well.
From the example we see how simply we can get an impression of the allowable call volume if we change the occupation level. More often we prefer to calculate the number of agents needed under an increase in call volume. The calculation for this is more complex. If we denote by c the constant related to the service level divided by 100, then the formula for s is:

$$s = \left( \frac{c + \sqrt{c^2 + 4a}}{2} \right)^2.$$
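The formula for s follows directly from the rule of thumb. A short derivation, writing the overcapacity as the fraction 1 − a/s so that the constant becomes c (this is a worked step added for clarity, not part of the original text):

```latex
\begin{align*}
\left(1 - \frac{a}{s}\right)\sqrt{s} = c
  &\iff s - c\sqrt{s} - a = 0 \\
  &\iff \sqrt{s} = \frac{c + \sqrt{c^2 + 4a}}{2}
  \quad\text{(the positive root of the quadratic in } \sqrt{s}\text{)} \\
  &\iff s = \left(\frac{c + \sqrt{c^2 + 4a}}{2}\right)^2 .
\end{align*}
```

For a = 2 and c = 1 the root is (1 + 3)/2 = 2, which gives s = 4.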
As in the previous example we start with 4 agents, λ = 1 and β = 2 minutes. The number c is the constant divided by 100, thus c = 1. Filling in a = λ × β = 2 and c = 1, we indeed find s = 4. Assume that λ doubles. Then a = 4, and with c = 1 we find s ≈ 6.6. This is a good approximation: s = 7 gives a waiting time under 10 seconds, s = 6 above. If λ = 10, then we find s = 25 as approximation. One agent less would give a waiting time of 9 seconds. If λ doubles again, then we get 47 as approximation, with 45 as best value according to the Erlang formula. We see that for big values of λ doubling leads to doubling s.

If c is small with respect to a then we see that s is proportional to a. This means that the economies of scale become less for very big call centers, because productivity is already almost at the highest possible level. What “big” is in this context depends on the service level.

Using this rule of thumb should be done with care. It is only useful to relate λ and s. Next to that, one should realize that it is only an approximation: the results need to be checked with the Erlang formula before use in practice. This point was illustrated in the example.
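The rule of thumb and its check against the Erlang formula can be sketched in a few lines. This is a minimal sketch, not the author's implementation: the function names are illustrative, C(s, a) is computed through the well-known Erlang B recursion, and the average waiting time is taken to be the standard expression ASA = C(s, a) × β/(s − a), which the text uses but does not write out.

```python
import math


def sqrt_staffing(a: float, c: float) -> float:
    """Agents suggested by the rule of thumb (1 - a/s) * sqrt(s) = c."""
    return ((c + math.sqrt(c * c + 4 * a)) / 2) ** 2


def erlang_c(s: int, a: float) -> float:
    """Probability of delay C(s, a), valid for a < s."""
    b = 1.0  # Erlang B blocking probability with 0 agents
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return b / (1.0 - (a / s) * (1.0 - b))


def asa_seconds(s: int, lam: float, beta: float) -> float:
    """Average waiting time in seconds; lam per minute, beta in minutes."""
    a = lam * beta
    return 60.0 * erlang_c(s, a) * beta / (s - a)


beta, c = 2.0, 1.0
for lam in (1, 2, 10, 20):
    s_rule = sqrt_staffing(lam * beta, c)
    s = math.ceil(s_rule)
    print(f"lambda={lam}: rule gives s={s_rule:.1f}, "
          f"ASA with {s} agents = {asa_seconds(s, lam, beta):.1f} s")
```

Rounding the rule-of-thumb value up is a choice made here for illustration; as the text stresses, the result still has to be checked with the Erlang formula.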

Properties of the Erlang formula

Knowing the Erlang formula is one thing, understanding it is another. The Erlang formula has a number of properties with important managerial consequences. We will discuss these in this section.

Robustness One agent more or less can make a big difference in SL, even for big call centers. This is good news for call centers with a moderate SL: with a relatively limited effort the SL can be increased to an acceptable level. On the other hand it means that a somewhat higher load, necessitating an additional agent, can deteriorate the SL considerably. In general we can say that the Erlang formula is very sensitive to small changes in the input parameters, which are λ, β and s. This is especially the case if a is close to s, as we can see in Figures 4.1 and 4.2. The curves get steeper when λ approaches s/β, and thus small changes in the value on the horizontal axis give big changes on the vertical axis. This sensitivity can make the task of a call center manager a very hard one: small unpredictable changes in arrival rate or unanticipated absence of a few agents can ruin the SL. In Chapter 7 we discuss in detail the consequences of this sensitivity.

In our small call center with λ = 1, β = 5 and s = 8 we expect an ASA of around 17 seconds. However, suppose there are 10% more arrivals (i.e., λ = 1.1). The ASA almost doubles to over 30 seconds!
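The sensitivity in this example can be reproduced numerically. The sketch below is an illustration under stated assumptions: the function names are not from the text, C(s, a) is computed with the Erlang B recursion, and the ASA is taken to be the standard expression C(s, a) × β/(s − a).

```python
def erlang_c(s: int, a: float) -> float:
    """Probability of delay C(s, a), valid for a < s."""
    b = 1.0  # Erlang B blocking probability with 0 agents
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return b / (1.0 - (a / s) * (1.0 - b))


def asa_seconds(s: int, lam: float, beta: float) -> float:
    """Average speed of answer in seconds; lam per minute, beta in minutes."""
    a = lam * beta
    return 60.0 * erlang_c(s, a) * beta / (s - a)


base = asa_seconds(8, 1.0, 5.0)   # expected arrival rate
busy = asa_seconds(8, 1.1, 5.0)   # 10% more arrivals
print(f"ASA at lambda=1.0: {base:.1f} s")
print(f"ASA at lambda=1.1: {busy:.1f} s")
print(f"increase: {100 * (busy / base - 1):.0f}%")
```

A 10% change in one input gives an increase in ASA of roughly 80%, which is the kind of steepness visible near a = s in the figures.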
Stretching time A second property is related to the absolute and relative values of the call characteristics, i.e., λ and β. Recall that the load is defined by a = λ × β. If either λ or β is doubled, and the other is divided by two, then the load remains the same. This does not mean that the same number of agents is needed to obtain a certain service level.

A manager is working in a call center that merely connects calls, thus call durations are short. As a rule she uses a load to agent ratio of 80%. From experience with the call center she knows that this gives a reasonable service level. For parameters equal to β = 32 seconds and 15 calls per minute the load is a = 8 Erlang. Indeed, with 10 agents the average speed of answer is 6.5 seconds. After a promotion she is responsible for a telephone help desk with also a load of 8 Erlang, but with β approximately 5 minutes, more than nine times as much. She uses the same rule of thumb, only to find out that the average waiting time is now around 60 seconds!

When β is multiplied by the same number (bigger than 1) as λ is divided by, then the load remains the same but it is as if the system goes slower. Evidently, the waiting time also increases. If AWT is multiplied by the same number then the TSF remains the same; for a fixed AWT the relationship between the TSF and stretching time is more complicated.

It is like saying that the load is insensitive to the “stretching” of time. Certain performance measures depend only on s and a, but not on the separate values of λ and β. The probability of delay, C(s, a), is a good example. This no longer holds for the TSF: here the actual values of λ and β play an important role. In fact, for given a and s the service level depends only on AWT/β. Thus if time is stretched, and the acceptable waiting time is stretched with it, then the TSF remains the same. Of course, this is just theory, although we often see that the AWT is higher in call centers with long talk times compared to call centers with short talk times. For the ASA the effect of stretching time is simple: the ASA is stretched by the same factor.

Let us go back to the call center with λ = 1, β = 5, and s = 8. Then TSF = 86% for AWT = 20 seconds, and ASA = 16.7 seconds. Now stretch time by a factor 2, i.e., λ = 0.5 and β = 10. Then TSF = 83% for AWT = 20 seconds (a difference, but surprisingly small; the reason for this is explained below), TSF = 86% for AWT = 2 × 20 = 40 seconds, and ASA = 2 × 16.7 = 33.4 seconds.
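The effect of stretching time can be checked directly against the formulas. A minimal sketch with illustrative function names, assuming the Erlang C expressions TSF = 1 − C(s, a) × e^(−(s−a)AWT/β) and ASA = C(s, a) × β/(s − a), with all times in minutes:

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay C(s, a), valid for a < s."""
    b = 1.0  # Erlang B blocking probability with 0 agents
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return b / (1.0 - (a / s) * (1.0 - b))


def tsf(s: int, lam: float, beta: float, awt: float) -> float:
    """Fraction of calls answered within awt (same time unit as beta)."""
    a = lam * beta
    return 1.0 - erlang_c(s, a) * math.exp(-(s - a) * awt / beta)


def asa(s: int, lam: float, beta: float) -> float:
    """Average speed of answer, in the time unit of beta."""
    a = lam * beta
    return erlang_c(s, a) * beta / (s - a)


s = 8
original = (1.0, 5.0)    # lambda per minute, beta in minutes
stretched = (0.5, 10.0)  # time stretched by a factor 2

# The probability of delay depends only on s and a, so it is unchanged.
print(erlang_c(s, 1.0 * 5.0) == erlang_c(s, 0.5 * 10.0))
# The ASA is stretched by the same factor 2.
print(60 * asa(s, *original), 60 * asa(s, *stretched))
# The TSF is unchanged when the AWT is stretched as well (20 s versus 40 s).
print(tsf(s, *original, 20 / 60), tsf(s, *stretched, 40 / 60))
```

The last line illustrates that, for given a and s, the TSF depends on the waiting time threshold only through the ratio AWT/β.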
Economies of scale Another well-known property is that big call centers work more efficiently. This is the effect of the economies of scale: if we double s, then we can increase λ to more than twice its value while keeping the same service level, assuming that β and AWT remain constant.

A firm has two small decentralized call centers, each with the same parameters: λ = 1 and β = 5 minutes. With 8 agents the average waiting time is approximately 17 seconds in each call center. If we join these call centers “virtually”, then we have a single call center with λ = 2 and 16 agents. The average waiting time is now less than 3 seconds, and employing only 14 agents gives a waiting time of only 13 seconds. An additional advantage is that there is more flexibility in the assignment of agents to call centers, as there is only a constraint on the total number of agents (although there will probably be physical constraints, such as the number of work places in a call center).
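The numbers in the example of the two decentralized call centers can be reproduced as follows. This is a sketch with illustrative function names, assuming the standard expression ASA = C(s, a) × β/(s − a) and the Erlang B recursion for C(s, a).

```python
def erlang_c(s: int, a: float) -> float:
    """Probability of delay C(s, a), valid for a < s."""
    b = 1.0  # Erlang B blocking probability with 0 agents
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return b / (1.0 - (a / s) * (1.0 - b))


def asa_seconds(s: int, lam: float, beta: float) -> float:
    """Average waiting time in seconds; lam per minute, beta in minutes."""
    a = lam * beta
    return 60.0 * erlang_c(s, a) * beta / (s - a)


beta = 5.0
print(f"separate, lambda=1, s=8 : {asa_seconds(8, 1.0, beta):.1f} s")
print(f"merged,   lambda=2, s=16: {asa_seconds(16, 2.0, beta):.1f} s")
print(f"merged,   lambda=2, s=14: {asa_seconds(14, 2.0, beta):.1f} s")
```

With the same total of 16 agents the merged center waits far less, and it still beats the separate centers with two agents fewer.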
To give further insight into economies of scale, we plotted the two situations of the example above in a single figure, Figure 4.3. We consider the TSF, and take 7 and 14 agents. To make comparisons possible we put λ × β/s (the productivity) on the horizontal axis, and the TSF on the vertical axis. Because we need λ × β < s, the TSF becomes 0 as soon as λ × β/s reaches 1, no matter what call center we are considering.

In Figure 4.3 we see that the dashed line is more to the right: for the same productivity we see that a bigger call center has a higher TSF. Stated otherwise: to obtain a target SL, a big call center obtains a higher productivity. This is related to the steepness of the curve for productivity values close to 1, which is the sensitivity of the Erlang formula to small changes of the parameters, as discussed earlier in this section.
Figure 4.3: The TSF (in %, vertical axis) against the productivity λ × β/s (horizontal axis) for β = 5, s = 7 (solid) and s = 14 (dashed), AWT = 0.33, and varying λ.

It is important to note that the relative gain of merging call centers (i.e., relative to the size of the call center) decreases as the size increases. The absolute advantage however (slowly) increases.

Consider four call centers, each with λ = 10 and β = 2 minutes, and 80% of the calls should be served within 20 seconds. If all call centers are separate then we need 24 agents in each call center, 45 in each when they are merged two by two, and 86 when we have one single call center. Merging two centers with arrival rate 10 saves 3 agents, merging two with arrival rate 20 saves 4. But divided by the arrival rate (that is, relative to the size), the economies are higher when the small centers are merged.
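The agent counts in the example of the four call centers can be found by a simple search over s. The sketch below uses illustrative function names and assumes the Erlang C expression TSF = 1 − C(s, a) × e^(−(s−a)AWT/β), with λ per minute and β and AWT in minutes.

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay C(s, a), valid for a < s."""
    b = 1.0  # Erlang B blocking probability with 0 agents
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return b / (1.0 - (a / s) * (1.0 - b))


def tsf(s: int, lam: float, beta: float, awt: float) -> float:
    """Fraction of calls answered within awt; 0 if there is no overcapacity."""
    a = lam * beta
    if a >= s:
        return 0.0
    return 1.0 - erlang_c(s, a) * math.exp(-(s - a) * awt / beta)


def min_agents(lam: float, beta: float, target: float, awt: float) -> int:
    """Smallest s that reaches the target TSF, found by trying s = 1, 2, ..."""
    s = 1
    while tsf(s, lam, beta, awt) < target:
        s += 1
    return s


beta, target, awt = 2.0, 0.80, 20 / 60
for lam in (10, 20, 40):  # one center, two merged, four merged
    s = min_agents(lam, beta, target, awt)
    print(f"lambda={lam}: {s} agents, {s / lam:.2f} agents per unit of arrival rate")
```

The number of agents per unit of arrival rate falls with every merger, but by less each time, which is the decreasing relative gain described above.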
Variations in waiting times Consider two different call centers: one has parameters λ = 1, β = 5, and s = 8, the other has λ = 20, β = 0.333, and also s = 8. Both call centers have a TSF of around 86% for AWT = 20 seconds. Does this mean that the waiting times of both call centers are comparable? This is not the case. To make this clear, we plotted histograms of waiting times of both call centers in Figure 4.4. The level at the right of 100 denotes the percentage of callers that has a waiting time exceeding 100 seconds. We see that in the first call center, represented by the solid line, callers either do not wait at all or wait very long; there are hardly any callers that wait between 10 and 100 seconds. In the second call center (the dashed line) fewer calls get an agent right away, but very few have to wait very long. There are two conclusions to be drawn from this example. In the first place: the TSF does not say everything. But more importantly, we see that depending on the characteristics of a call center there can be more or less variation in waiting times. Only a thorough investigation of, for example, the TSF for various AWTs can reveal the characteristics of a particular call center.

[Figure 4.4: histograms of the waiting times of both call centers; horizontal axis: waiting times in 10-second intervals, vertical axis: % of calls.]

Chapter 4 — The Erlang C formula

The remaining waiting time* When we enter a queue that we can observe (as in the post office or the supermarket) then we can estimate our remaining waiting time on the basis of the number of customers in front of us. Usually our estimate of the remaining waiting time will decrease while we are waiting as we see customers in front of us leaving. But how about the remaining waiting time in an invisible queue as we encounter in call centers? The mathematics show that under the Erlang C model the remaining waiting time is constant. Thus, no matter how long we have been waiting, the average remaining waiting time is always the same. How can this at first sight counterintuitive phenomenon be explained? As we enter the queue, we expect a certain number of calls to be waiting in front of us. As we wait a while, we conclude that apparently the queue was longer than expected. From the Erlang formula it follows that, as long as we are waiting, the expected number of customers in front of us always remains the same.

A possible consequence of this fact for customers is that one should not abandon while waiting: why hang up after 1 minute if your remaining waiting time is as long as when you started waiting? In practice however there are good reasons to hang up after a while, and good reasons to stay in line. A reason to hang up is the fact that customers do not know the call center’s parameters, and therefore they do not know the average waiting time in the call center. The longer you wait, the more likely it is that you entered a call center with unfavorable parameters, and thus your remaining waiting time does increase! On the other hand, the Erlang C formula does not account for abandonments. If your patience is longer than that of the customers ‘in front of you’, then they will abandon before you and you will eventually be served. In a system where calls abandon, the average remaining waiting time decreases while waiting.

For call center managers it should be clear that, unless customers abandon quickly, very long waiting times can and will occasionally occur. Theoretically there is no upper limit to the waiting time. To protect customers against unexpectedly long waiting times I think that it is good to inform customers of expected waiting times or the number of customers waiting in the queue. Together with this the waiting customer could be pointed towards other channels to make contact, such as the internet.
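The constant remaining waiting time can be made explicit. The sketch below assumes the standard Erlang C result that the tail of the waiting time W has the same exponential form as in the TSF formula; the symbol W is introduced here only for illustration.

```latex
\begin{align*}
P(W > t) &= C(s,a)\, e^{-(s-a)t/\beta}, \qquad t \ge 0, \\
P(W > t + x \mid W > t)
  &= \frac{C(s,a)\, e^{-(s-a)(t+x)/\beta}}{C(s,a)\, e^{-(s-a)t/\beta}}
   = e^{-(s-a)x/\beta}, \\
E[\,W - t \mid W > t\,]
  &= \int_0^\infty e^{-(s-a)x/\beta}\, dx
   = \frac{\beta}{s-a}.
\end{align*}
```

The last expression does not depend on t. For the call center with a load of 5 Erlang, an average service time of 5 minutes and 8 agents it equals 5/3 minutes, i.e., 100 seconds, however long the caller has already waited.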

Using the Erlang formula

In the previous section we saw that the Erlang formula can be used to compute the average waiting time for a given number of agents, service times and traffic intensity. One would like to use the formula also for other types of questions, such as: for given β and s, and a maximal acceptable ASA or given SL, what is the maximal call volume per time unit λ that the call center can handle? Because of the complexity of C(s, a) we cannot “reverse” the formula, but by trial and error we can answer these types of questions. The question that is of course posed most often is to calculate the minimum number of agents needed for a given load and service level. This can also be done using trial and error, and software tools such as our Erlang calculator (to be found at www.math.vu.nl/~koole/ccmath/ErlangC) often do this automatically.

In our Erlang C calculator, fill in 1 and 5 at “Arrivals” and “Service time”, fill in “80” and “20” at “Service level” and select “Service level”. After pushing the “compute” button the computation shows that 8 agents are needed to reach this SL.

Most software tools will give you an integer number of agents as answer. This makes sense, as we cannot employ, say, half an agent. However, we can employ an agent half of the time. Thus when a software tool requires you to schedule 17.4 agents during half an hour, then you should schedule 17 agents during 18 minutes, and 18 agents during 12 minutes. With 17 agents you are below the SL, with 18 you are above. Thus the “bad” SL during 18 minutes is compensated by the better than required SL due to using 18 agents. In our Erlang C calculator we decided not to implement this, because we assume that the time interval is so short that a constant number of agents is required.

Let us continue the example. Selecting “Number of agents” instead of “Service level”, and pushing the “compute” button again shows that the actual service level is 86% instead of only the 80% that was required.
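The trial-and-error search and the splitting of a fractional number of agents can be sketched as follows. This is not the code behind the calculator mentioned above; the function names are illustrative, and the sketch assumes TSF = 1 − C(s, a) × e^(−(s−a)AWT/β) with λ per minute and β and AWT in minutes.

```python
import math


def erlang_c(s: int, a: float) -> float:
    """Probability of delay C(s, a), valid for a < s."""
    b = 1.0  # Erlang B blocking probability with 0 agents
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return b / (1.0 - (a / s) * (1.0 - b))


def tsf(s: int, lam: float, beta: float, awt: float) -> float:
    """Fraction of calls answered within awt; 0 if there is no overcapacity."""
    a = lam * beta
    if a >= s:
        return 0.0
    return 1.0 - erlang_c(s, a) * math.exp(-(s - a) * awt / beta)


def min_agents(lam: float, beta: float, target: float, awt: float) -> int:
    """Trial and error: increase s until the target TSF is reached."""
    s = 1
    while tsf(s, lam, beta, awt) < target:
        s += 1
    return s


def split_fractional(agents: float, minutes: float):
    """Split e.g. 17.4 agents over an interval into two integer staffing levels."""
    low = math.floor(agents)
    minutes_high = round((agents - low) * minutes, 6)
    return low, round(minutes - minutes_high, 6), low + 1, minutes_high


s = min_agents(1.0, 5.0, 0.80, 20 / 60)
print(s, f"{100 * tsf(s, 1.0, 5.0, 20 / 60):.0f}%")
print(split_fractional(17.4, 30))
```

The split keeps the average number of agents over the interval equal to the fractional requirement: (17 × 18 + 18 × 12)/30 = 17.4.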
“Garbage in = garbage out”. This well-known phrase holds also for the Erlang formula: the input parameters should be determined with care. Especially with the value of the expected call duration β one can easily make mistakes. The reason for this is that the entire time the agent is not available for taking a new call should be counted. For the Erlang formula the service starts the moment the ACD assigns a call to an agent, and ends when the agent becomes available, i.e., when the telephone switch again has the possibility to assign a call to that agent. Thus β consists not only of the actual call duration, but also of the reaction time (that can be as long as 10 seconds!), plus the wrap-up time (that can be as long as the call itself). Note that the reaction time is seen by the caller as waiting time. This should be taken into account when calculating the service levels, by decreasing the acceptable waiting time by the average reaction time.
In a call center the reaction time is 3 seconds on average, the average call duration is 25 seconds and there is no wrap-up time. During peak hours on average 200 calls per 15 minutes arrive. An average waiting time of 10 seconds is seen as an acceptable service level. We calculate first the load without reaction time. The number of calls per second is 200/(15 × 60) ≈ 0.2222 (≈ means “approximately”), and the load is 0.2222 × 25 ≈ 5.555. The Erlang formula shows that we need 7 agents, giving an expected waiting time of 8.2 seconds. This seems all right, but in reality there is an expected waiting time of no less than 27.9 seconds! This follows from the Erlang formula, with a service time of 25 + 3 = 28 seconds (and thus a load of 0.2222 × 28 ≈ 6.222), and 7 agents. The waiting time is then 24.9 seconds, to which the 3 seconds reaction time should be added. To calculate the right number of agents we start with a service time of 28 seconds, and we look for the number of agents needed to get a maximal waiting time of 10 − 3 = 7 seconds. This is the case for 8 agents, with an average waiting time of 6.5 seconds. This way the average waiting time remains limited to 9.5 seconds.
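The waiting times in this example can be recomputed in a few lines. A minimal sketch with illustrative function names, assuming the standard expression ASA = C(s, a) × β/(s − a) and working in seconds throughout:

```python
def erlang_c(s: int, a: float) -> float:
    """Probability of delay C(s, a), valid for a < s."""
    b = 1.0  # Erlang B blocking probability with 0 agents
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return b / (1.0 - (a / s) * (1.0 - b))


def asa(s: int, lam: float, beta: float) -> float:
    """Average waiting time in the queue, in the time unit of beta."""
    a = lam * beta
    return erlang_c(s, a) * beta / (s - a)


lam = 200 / (15 * 60)   # calls per second
talk, reaction = 25.0, 3.0

naive = asa(7, lam, talk)                    # reaction time ignored
queue_7 = asa(7, lam, talk + reaction)       # reaction time counted as service
queue_8 = asa(8, lam, talk + reaction)

print(f"7 agents, reaction ignored : {naive:.1f} s")
print(f"7 agents, caller experience: {queue_7 + reaction:.1f} s")
print(f"8 agents, caller experience: {queue_8 + reaction:.1f} s")
```

The caller's experience is the time in the queue plus the reaction time, which is why the target for the queueing part is lowered from 10 to 7 seconds.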
A possible conclusion of the last example could be that agents should be stimulated to react faster in order to avoid having to schedule an extra agent. However, these types of measures, aimed at improving the quantitative aspects of the call center, can lead to a decrease in the quality of the call center work, due to the increased work pressure. We will not deal with the human aspects of call center work; let it just be noted that 100% productivity is never possible, and the overcapacity calculated by the Erlang formula is one of the means for the agents to get the necessary short breaks between calls.

The Erlang C formula

The Erlang C formula is named after A.K. Erlang, the Danish mathematician who derived the formula at the beginning of the 20th century. We have a call center with only one type of calls and no abandonments, thus every caller waits until he or she reaches an agent. The number of calls that enter on average per time unit (e.g., per minute) is denoted with the Greek letter λ. The average service time of calls or average holding time is denoted with β, measured in the same unit of time. We define the load a as a = λ × β. The unit of load is called the Erlang.

Consider a call center with on average 1 call per minute, thus λ = 1, and a service time duration of 5 minutes on average, thus β = 5. The load is a = λ × β = 1 × 5 = 5 Erlang. Note that it does not matter in which time unit λ and β are measured, as long as they are the same: e.g., if we measure in hours, then we get again a = λ × β = 60 × 1/12 = 5 Erlang.
The offered traffic is dealt with by a group of s agents. We assume that the number
of agents is higher than the load (thus s > a). Otherwise there are, on average, more
arrivals than departures per time unit, and thus the number of waiting calls increases all
the time, resulting in a TSF of 0%. (In reality this won’t occur, as callers will abandon.)
We can thus consider the difference between s and a as the overcapacity of the system. This
overcapacity assures that variations in the offered load can be absorbed. These variations
are not due to changes of λ or β; they originate in the intrinsic random behavior of call
interarrival and call holding times. Remember that λ and β are averages: it occurs during
short periods of time that there are so many arrivals or that service times are so long that
undercapacity occurs. The strength of the Erlang formula is the capability to quantify the
TSF (and other waiting time measures) in this random environment with short periods of
undercapacity and therefore queueing.
The Erlang C formula gives the TSF for given λ, β, s, and AWT. For the mathematically
interested reader we give the exact formula, for a < s:
$$\mathrm{TSF} = 1 - C(s, a) \times e^{-(s - a)\,\mathrm{AWT}/\beta}.$$
Here e is a mathematical constant, approximately equal to 2.7; C(s, a) is the probability
that an arbitrary caller finds all agents occupied, the probability of delay. In case a ≥ s
then TSF = 0. The formula itself is useful for those who implement it; see Appendix C.1
for details. For a call center manager it is more important to understand it, i.e., to have
a feeling for the TSF as variables vary. For this reason we plotted the Erlang formula for
some typical values in Figure 4.1. We fixed β, s, and AWT, and varied λ. In the figure we
plotted λ on the horizontal axis, and the TSF on the vertical. The numbers in the figure
can be verified using our Erlang calculator at www.math.vu.nl/~koole/ccmath/ErlangC.
With the numbers of the example above, λ = 1 and β = 5, we got a load of 5 Erlang. Let us
schedule 7 agents, and assume that a waiting time of 20 seconds is considered acceptable,
i.e., AWT = 20 seconds. Filling in 1 and 5 and selecting “Number of agents” (20 is already
filled in at start-up) gives after computation the TSF under “Service level”. It is almost
72% (check this!). Increasing the number of agents to 8 already gives a TSF of 86%.
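For readers who prefer to check these percentages without the online calculator, the following sketch evaluates the TSF formula directly. The function names are ours, and computing C(s, a) through the Erlang B recursion is our assumption, not necessarily the method of Appendix C.1.

```python
import math


def erlang_c(s, a):
    """Probability of delay C(s, a) for s agents and load a (Erlang), a < s."""
    b = 1.0  # Erlang B blocking probability for 0 agents
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return s * b / (s - a * (1.0 - b))


def tsf(lam, beta, s, awt):
    """TSF as a fraction; lam, beta and awt must use the same time unit."""
    a = lam * beta
    if a >= s:
        return 0.0  # saturated system
    return 1.0 - erlang_c(s, a) * math.exp(-(s - a) * awt / beta)


awt_minutes = 20 / 60  # AWT of 20 seconds, expressed in minutes
tsf_7 = tsf(1, 5, 7, awt_minutes)
tsf_8 = tsf(1, 5, 8, awt_minutes)

print(f"7 agents: {100 * tsf_7:.1f}%, 8 agents: {100 * tsf_8:.1f}%")
```

With 7 agents this gives a little under 72%, and with 8 agents about 86%, in line with the values above.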
Figure 4.1: The TSF (in %, vertical axis) for β = 5, s = 7, AWT = 0.33, and varying average number of arrivals per minute λ (horizontal axis).
We follow the curve of Figure 4.1 for increasing λ. Starting at 100%, the TSF remains
close to this upper level until relatively high values of λ. As λ gets such that a = λ × β
approaches s then the TSF starts to decrease more steeply until it reaches 0 at λ = s/β =
7/5 = 1.4. From that point on, as explained earlier, the TSF, as predicted by the Erlang
formula, remains 0%.
Next to the SL in terms of the fraction of calls waiting no longer than the AWT, the TSF,
we can also derive the average speed of answer (ASA, also called average waiting time),
the average amount of time that calls spend waiting. The overcapacity assures that the
average speed of answer remains limited. How they depend on each other is given by the
Erlang formula for the ASA. This formula is given by:
$$\mathrm{ASA} = \frac{\text{Probability of delay} \times \text{Av. service time}}{\text{Overcapacity}} = \frac{C(s, a) \times \beta}{s - a}.$$
For the same input parameters as in Figure 4.1 we plotted the ASA in Figure 4.2. We
see clearly that as λ approaches the value of s/β = 1.4 then the waiting time increases
dramatically.
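The shape of both curves can be reproduced numerically. The sketch below tabulates the TSF and the ASA for a few arrival rates with β = 5 minutes, s = 7 and an AWT of 20 seconds; the function names and the chosen grid of rates are ours, and C(s, a) is again obtained from the Erlang B recursion as an implementation assumption.

```python
import math


def erlang_c(s, a):
    """Probability of delay C(s, a) for s agents and load a (Erlang), a < s."""
    b = 1.0
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return s * b / (s - a * (1.0 - b))


def tsf_and_asa(lam, beta, s, awt):
    """Return (TSF in %, ASA in the time unit of beta); saturated if a >= s."""
    a = lam * beta
    if a >= s:
        return 0.0, math.inf
    c = erlang_c(s, a)
    tsf = 100.0 * (1.0 - c * math.exp(-(s - a) * awt / beta))
    asa = c * beta / (s - a)
    return tsf, asa


rates = [0.2, 0.6, 1.0, 1.2, 1.3, 1.39, 1.5]  # arrivals per minute
curve = [tsf_and_asa(lam, 5.0, 7, 20 / 60) for lam in rates]

for lam, (tsf, asa) in zip(rates, curve):
    print(f"lambda = {lam:4.2f}: TSF = {tsf:5.1f}%, ASA = {asa * 60:7.1f} s")
```

The last rate lies beyond s/β = 1.4, so the sketch reports a TSF of 0% and an unbounded ASA there.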
Figure 4.2: Values of the ASA (in seconds, vertical axis) for β = 5, s = 7, AWT = 0.33, and varying average number of arrivals per minute λ (horizontal axis).
The probability of delay is not only an intermediate step in calculating TSF or ASA, it
is also of independent interest: it tells us how many callers are put in the queue and how
many find a free agent right away. The probability of delay can also be computed using
an Erlang calculator. By computing the TSF for an AWT equal to 0 we find 100 minus
the delay percentage. Thus we need to fill in AWT = 0 and use that
100 × probability of delay = 100 − TSF; dividing the result by 100 gives the probability of delay.
Now we continue the example. We already saw that the load is 5 Erlang. Let us schedule 7
agents; then there is 2 Erlang overcapacity. Filling in 1, 5 and 7 we find that 68% of the callers wait
18 Koole — Call Center Mathematics
less than 0 seconds. Thus the probability of delay C(s, a) is equal to 0.32. Now we can fill
in the formula for the average waiting time, in seconds:
$$\mathrm{ASA} = \frac{C(s, a) \times \beta}{s - a} \approx \frac{0.32 \times 300}{2} = 48 \text{ seconds}.$$
This corresponds with the answers of the Erlang C calculator (be careful with the units,
minutes or seconds!). Taking 8 agents gives
$$\mathrm{ASA} = \frac{C(s, a) \times \beta}{s - a} \approx \frac{0.17 \times 300}{3} = 17 \text{ seconds}.$$
Thus increasing the number of agents by 1 reduces the average waiting time by a factor of almost 3.
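The same calculation can be sketched in Python. The function names are ours and the Erlang B recursion used for C(s, a) is an implementation assumption. Note that the hand calculation rounds C(s, a) to two decimals, so the unrounded results differ slightly (about 48.6 instead of 48 seconds for 7 agents).

```python
def erlang_c(s, a):
    """Probability of delay C(s, a) for s agents and load a (Erlang), a < s."""
    b = 1.0
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return s * b / (s - a * (1.0 - b))


def asa_seconds(lam_per_minute, beta_minutes, s):
    """Average speed of answer in seconds for rates given in minutes."""
    a = lam_per_minute * beta_minutes
    return 60.0 * erlang_c(s, a) * beta_minutes / (s - a)


delay_7, delay_8 = erlang_c(7, 5), erlang_c(8, 5)
asa_7, asa_8 = asa_seconds(1, 5, 7), asa_seconds(1, 5, 8)

print(f"7 agents: C = {delay_7:.2f}, ASA = {asa_7:.1f} s")
print(f"8 agents: C = {delay_8:.2f}, ASA = {asa_8:.1f} s")
```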
Up to now we just discussed the service level aspects of the Erlang C system. Luckily,
the agent side is relatively simple. Let us consider the case that a < s, thus s − a is the
overcapacity. Because every caller reaches an agent at some point in time, the whole offered
load a is split between the s agents. This gives a productivity of a/s × 100% to each one
of them, if we assume that the load is equally distributed over the agents. If a ≥ s then
saturation occurs, and agents get a call the moment they become available. In theory, this
means a 100% productivity. In practice such a high productivity can only be maintained
over short periods of time.
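As a small illustration, the productivity rule can be written as follows. The function name is ours, and capping the value at 100% reflects the saturation case described above.

```python
def productivity(a, s):
    """Agent productivity in % for load a (Erlang) and s agents."""
    return min(a / s, 1.0) * 100.0


print(f"{productivity(5, 7):.1f}%")  # load of 5 Erlang handled by 7 agents
print(f"{productivity(8, 7):.1f}%")  # saturated: load exceeds the number of agents
```

For the running example with a load of 5 Erlang and 7 agents, each agent is busy a little over 71% of the time.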

A discussion of service level metrics

When such a complex phenomenon as service level is reduced to a few numbers, then it is
unavoidable that certain aspects are ignored.
As an example, take the waiting times of just 4 calls: 0, 10, 30, 100 seconds. Then
the ASA is 35 seconds. However, the sequence 35, 35, 35, 35 has the same ASA. This
shows that the ASA, by its very definition, does not depend on the variability: is the
ASA caused by many calls having a short waiting time or by a few calls having a very long
waiting time? Both are possible!
This is a good reason to look for other service level metrics. Consider next the TSF,
which is indeed, to a certain extent, sensitive to variability. However, in the case of a bad
SL (a low TSF) it is better to have high variability, and in the case of a good SL low
variability! This can be seen from the following examples, each with an AWT of 20 seconds: 15,
15, 15, 15 (ASA 15, TSF 100) versus 0, 30, 0, 30 (ASA 15, TSF 50); and 25, 25, 25, 25 (ASA 25,
TSF 0) versus 15, 35, 15, 35 (ASA 25, TSF 50).
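These four examples are easily recomputed from the raw waiting times. In the sketch below the function names are ours, and we assume that a call counts towards the TSF when its waiting time is at most the AWT.

```python
def asa(waits):
    """Average speed of answer: the mean waiting time."""
    return sum(waits) / len(waits)


def tsf(waits, awt):
    """Percentage of calls with a waiting time of at most awt (assumed definition)."""
    return 100.0 * sum(1 for w in waits if w <= awt) / len(waits)


examples = [
    [15, 15, 15, 15],
    [0, 30, 0, 30],
    [25, 25, 25, 25],
    [15, 35, 15, 35],
]
results = [(asa(w), tsf(w, 20)) for w in examples]

for waits, (mean_wait, level) in zip(examples, results):
    print(waits, f"ASA {mean_wait:.0f}, TSF {level:.0f}")
```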
Another disadvantage of the TSF is that it does not take into account the waiting time in
excess of the AWT: the sequences 0, 10, 30, and 100 seconds and 0, 10, 24, and 30 seconds
give the same TSF of 50, although there is a clear difference between the situations! The
difference shows up if we vary the AWT. This however would lead to a SL metric consisting
of multiple numbers, which has the disadvantage that it is harder to interpret and to
compare.
Thus we find that neither the ASA nor the TSF represents the SL well. Focusing on one
of these can lead to consequences that go against common sense: it motivates managers to
take decisions that decrease the common perception of SL.
As an example, consider a call center with two types of calls: calls with a negotiated SL in terms of a TSF that has
to be met, and “best effort” traffic where the revenue depends on the SL. Under high traffic
conditions the TSF of the first type of calls cannot be met, even when priority is given to
these calls. Therefore, the rational decision, given the contract, is to give priority to best
effort calls in case of high load and to give priority to fixed TSF calls when traffic is low to
catch up with the SL. This is in complete contradiction with the intentions behind the SL
contract. (Source: Milner & Olsen, Management Science, 2006.)
It is common practice in call centers to answer the longest waiting call first. If the SL
is only measured through the TSF, then this is not a good solution: calls waiting longer
than the AWT should not be answered at all, instead the call that waits the longest among
those that wait less than the AWT should be helped. Thus the TSF stimulates wrong
behavior.
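The difference between the two answering rules can be made concrete with a small sketch. Both function names are hypothetical, and returning None when every waiting call has already exceeded the AWT is our reading of "should not be answered at all".

```python
def longest_waiting_first(waits):
    """Index of the call that has waited longest (common practice)."""
    return max(range(len(waits)), key=lambda i: waits[i])


def tsf_driven_choice(waits, awt):
    """Index of the longest-waiting call among those still below the AWT.

    Returns None if every waiting call has already exceeded the AWT.
    """
    candidates = [i for i, w in enumerate(waits) if w < awt]
    if not candidates:
        return None
    return max(candidates, key=lambda i: waits[i])


queue = [40, 25, 18, 5]  # current waiting times in seconds
print(longest_waiting_first(queue))   # picks the call waiting 40 seconds
print(tsf_driven_choice(queue, 20))   # picks the call waiting 18 seconds
```

A rule that maximizes only the TSF thus skips exactly the callers who have already waited the longest.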
For the ASA the order in which calls are answered does not matter at all. A possible solution
would be to report both the ASA and the TSF. Still priority is given to calls waiting a
little less than the AWT, but long-waiting calls eventually get served. An alternative SL
metric consisting of a single number that motivates us to help long waiting calls first is
as follows. It takes the AWT into account, and it penalizes waiting longer than the AWT
by measuring the time that waiting exceeds the AWT. We call it the average excess time
(AET). For the 0, 10, 30, 100 sequence the waiting times in excess of 20 seconds are 0, 0,
10, and 80, giving 90/4 = 22.5 seconds as AET. For 0, 10, 24, 30 it gives 3.5, and for 35,
35, 35, 35 the AET is equal to 15. When using the AET it is clear that those calls that
wait longer than the AWT get priority.
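The AET of the three sequences can be recomputed as follows; the function name is ours.

```python
def aet(waits, awt):
    """Average excess time: mean of the waiting time beyond the AWT."""
    return sum(max(0, w - awt) for w in waits) / len(waits)


aet_a = aet([0, 10, 30, 100], 20)
aet_b = aet([0, 10, 24, 30], 20)
aet_c = aet([35, 35, 35, 35], 20)

print(aet_a, aet_b, aet_c)
```

Unlike the TSF, the AET distinguishes the first two sequences, and unlike the ASA it ignores waiting time below the AWT.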