In The API Train Wreck Part-1, I discussed API design factors such as KPI, performance measurements, monitoring, runtime stats, and usage counters. In this posting, I’ll discuss the design factors that will influence your API’s load and traffic management architecture.
Load and Traffic Management
One of the key issues that you are going to face when opening your API to the world is the adverse effect on your own front-end and back-end systems. If you are using a mixed model, where the same API is used for both internal line-of-business applications and external applications, uncontrolled bursts of traffic and spikes in data requests can cause your service to run red hot.
In a mixed model, as the number of external API users increases, it's only a question of time before you start experiencing gremlin effects, which include brownouts, timeouts, and periodic back-end shutdowns. Due to the stateless nature of SOA and the multiple possible points of failure, troubleshooting the root cause of these problems can be a nightmare.
This is why traffic management is one of the first and most important capabilities you should build into your API. A good example of this type of implementation can be found in the Twitter REST API rate limiting. Twitter provides a public API feed of their stream for secondary processing. In fact, there is an entire industry out there that consumes, mines, enriches, and resells Twitter data, but this feed is separate from their internal API, and it contains numerous traffic management features such as caching, prioritizing active users, adapting to search results, and blacklisting to ensure that their private feed will not be adversely impacted by the public API.
Two of the most common ways to achieve efficient bandwidth management and regulate the flow are via (1) throttling and (2) rate-limiting. When you are considering either option, make the following assumptions:
- Your API users will write inefficient code and only superficially read your documentation. You need to handle the complexity and efficiency issues internally as part of your design.
- Users will submit complex and maximum query ranges. You can’t tell the user not to be greedy, rather you need to handle these eventualities in your design.
- Users will often try to download all the data in your system multiple times a day via a scheduler. They may have a legitimate reason for doing so, so you can't just deny their requests because it's inconvenient for you. You need to find creative methods to deal with this through solutions like off-line batch processing or subscriptions to updates.
- Users will stress test your system in order to verify its performance and stability. You can't just treat them as malicious attackers and simply cut off their access. Take this type of access into account and accommodate it through facilities like separate test and certification regions.
In order to be able to handle these challenges, you need to build a safety fuse into your circuitry. Think of Transaction Throttling and Rate Limiting as a breaker switch that will protect your API service and backend.
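The "breaker switch" idea can be sketched with a token bucket, one of the most common throttling algorithms. The following is a minimal, hypothetical Python sketch (the `TokenBucket` class and its parameters are illustrative, not taken from any particular product); a production throttle would live in a gateway layer and share state across servers.

```python
import time

class TokenBucket:
    """Token-bucket throttle: the bucket refills at `rate` tokens per
    second up to `capacity`; a request is allowed only if enough tokens
    remain, so short bursts are absorbed but sustained floods are cut off."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost=1.0):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # breaker tripped: deny, queue, or alert per your policy

# A bucket of 5 absorbs a short burst, then starts rejecting.
bucket = TokenBucket(rate=1.0, capacity=5)
print([bucket.allow() for _ in range(7)])  # first 5 allowed, the rest denied
```

The `cost` parameter is what lets you charge heavyweight requests more than lightweight ones, which becomes relevant in the rate-limiting discussion below.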
Another reason to implement transaction throttling is that you may need to measure data consumption against a time-based quota. Once such a mechanism is in place, you can relatively easily segment your customers by various patterns of consumption (e.g. date and time, volume, frequency, etc.). This is even more relevant if you are building a tier-based service where you charge premium prices per volume, query, and speed of response.
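A minimal sketch of such a time-based quota with tier segmentation. The tier names, quota values, and `record_request` helper are all assumptions for illustration; in a real service the quotas would come from configuration and the counters would live in a shared store (e.g. Redis) with a scheduled daily reset, not in process memory.

```python
from collections import defaultdict

# Hypothetical daily quotas per service tier (records per day).
TIER_QUOTAS = {"free": 1_000, "standard": 50_000, "premium": 1_000_000}

usage = defaultdict(int)  # per-customer daily counters, reset by a scheduled job

def record_request(customer_id, tier, records):
    """Count consumption against the customer's daily quota.
    Returns True if the request fits, False if it would exceed the quota."""
    quota = TIER_QUOTAS[tier]
    if usage[customer_id] + records > quota:
        return False  # over quota: deny, bill an overage, or degrade, per policy
    usage[customer_id] += records
    return True

print(record_request("acme", "free", 900))  # within the 1,000/day quota
print(record_request("acme", "free", 200))  # would exceed it, so denied
```

The same counters double as the raw data for the consumption-pattern segmentation and tiered billing mentioned above.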
Rationing, Rate Limitations, and Quotas
OK, so now we are ready to implement some quota mechanisms that will provide rate limits. But just measuring data consumption against daily quotas can still leave you vulnerable to short bursts of high-volume traffic, and if done via 'per second' rate limits, it can be perceived as non-business-friendly.
If your customers are running a high volume SaaS solution and are consuming your API as part of their data supply chain, they will probably find it objectionable when they discover that you are effectively introducing pre-defined bottlenecks into their processing requests.
So even if you consider all data requests equal, you may find that some requests are more equal than others (due to the importance of the calling client), or that they contain high-rate transactions or large messages, so just limiting 'X requests per second' might not be sufficient.
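One common way to make heavy requests "less equal" is to charge each request a cost derived from its size and the caller's importance, and deduct that cost from a budget instead of counting raw requests. The cost function below is a purely illustrative sketch; the weights and the `internal` discount are assumptions, not recommendations.

```python
def request_cost(records_requested, message_bytes, priority):
    """Hypothetical cost function: large queries and big payloads consume
    more of the caller's budget; trusted internal clients pay half price."""
    cost = 1.0                              # flat per-request floor
    cost += records_requested / 100.0       # heavier queries cost more
    cost += message_bytes / 10_000.0        # large messages cost more
    return cost * (0.5 if priority == "internal" else 1.0)

# A bulk query burns the caller's budget far faster than a small one,
# even though both count as a single "request per second".
small = request_cost(records_requested=10, message_bytes=500, priority="external")
bulk = request_cost(records_requested=5_000, message_bytes=200_000, priority="external")
print(small, bulk)
```

A cost like this plugs directly into a token-bucket throttle's per-request cost, so one mechanism covers both request rate and payload weight.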
Here are my core 13 considerations for designing the throttling architecture.
- Will you constrain the rate of a particular client's access by API key or IP address?
- Will you limit your data flow rate by user, application, or customer?
- Will you control the size of messages or the number of records per message?
- Will you throttle your own internal apps differently from public API traffic?
- Will you support buffer- or queue-based requests?
- Will you offer business KPIs on API usage (for billing, SLA, etc.)?
- Will you keep track of daily or monthly usage data to measure performance?
- Will you define different consumption levels for different service tiers?
- How will you handle over quota conditions (i.e. deny request, double bill, etc.)?
- Will you measure data flow for billing and pricing?
- Will you monitor and respond to traffic issues (security, abuse, etc.)?
- Will you support dynamic actions (i.e. drop or ignore request, send an alert, etc.)?
- Will you provide users usage metadata so they can help by metering themselves?
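On the last point, usage metadata is typically surfaced as response headers so clients can meter themselves. Below is a hedged sketch using the `X-RateLimit-*` naming convention popularized by public APIs such as Twitter's and GitHub's, paired with the standard `Retry-After` header for over-quota (HTTP 429) responses; the helper function and its parameters are illustrative, not part of any framework.

```python
import time

def rate_limit_headers(limit, used, window_reset_epoch):
    """Build the usage-metadata headers a client needs to meter itself.
    Header names follow the common X-RateLimit-* convention; adapt them
    to whatever standard your API adopts."""
    remaining = max(0, limit - used)
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(window_reset_epoch),  # when the window resets
    }
    if remaining == 0:
        # Over quota: pair an HTTP 429 response with Retry-After so
        # well-behaved clients back off instead of hammering the API.
        headers["Retry-After"] = str(max(0, window_reset_epoch - int(time.time())))
    return headers

print(rate_limit_headers(limit=150, used=150, window_reset_epoch=int(time.time()) + 60))
```

Publishing this metadata shifts part of the throttling burden onto the client, which directly addresses the "users only superficially read your documentation" assumption above.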
Obviously, no single consideration in the list above will singlehandedly control your design. During your deliberation process, dedicate some time to go over each of the 13 points. List the pros and cons for each and try to estimate the overall impact on the development scope and overall project timeline.
For example, evaluating "1. Constrain the rate of a particular client's access by API key" will yield the following decision table:
| Pros for Key Management | Cons for Key Management |
| --- | --- |
| Conducive to per-seat and volume licensing models and can support both enterprise and individual developers. | Need to invest in the setup and development of a key lifecycle management system. Effort is estimated at x man-months and will push the project delivery date by y months. |
| Tighter control over user access and activity. | Need to provide customer support to handle key lifecycle management. Effort will require hiring/allocating x dedicated operational resources to support provisioning, audit, and reporting. |
| Conducive to a tiered licensing and billing model, including public, government, and evaluate-before-you-buy promotions. | Managing international users will require specialty provisioning and multi-lingual capabilities. |
| Will complement and support the company's market and revenue strategy. Will scale well to millions of users. | Initial estimate of the number of customers is relatively small. |
Resist the temptation to go at it 'quick and dirty'. Collect as much input as possible from your front-end, middle-tier, back-end, operations, and business stakeholders. Developing robust and effective throttling takes careful planning and, if not done correctly, can become the Achilles heel of your entire service.
The above pointers should cover the most critical aspects of planning for robust API traffic management.
© Copyright 2013 Yaacov Apelbaum. All Rights Reserved.
13 thoughts on “The API Train Wreck Part-2”
I would add these 2 considerations to your list of 13:
1. How will you handle 2+ factor authentication requirements?
2. How will you identify and handle a cyber-attack?
You stated that
“Users will stress test your system in order to verify its performance and stability. You can’t just treat them as malicious attackers and simply cut off their access. Take this type of access into account and accommodate it through facilities like a separate test and certification regions.”
I would also add that if your API is publicly accessible, you should expect a fair amount of vandalism, as some users will try to maliciously execute calls just to bring your system down.
I agree entirely. Security is a major factor in any API design. Bear with me; I will touch on it in detail in a future posting.
Do you know if there is any other architectural data publicly available for the Twitter REST API?
Try the following:
How does traffic shaping fit into the API management architecture? I've seen it used to optimize and guarantee performance, improve latency, and/or increase usable bandwidth.
I think that traffic shaping would be defined by all of the following categories:
1. Load and Traffic Management
2. Bandwidth Management
3. Rationing, Rate Limitations, and Quotas
I would also add that bandwidth throttling regulates a bandwidth-intensive device (such as a web server) by limiting the amount of data that device can accept. Bandwidth capping, on the other hand, limits the total transfer capacity, upstream or downstream, of data over any path.
Good points. In our case, "bandwidth capping" is synonymous with "throttling".
A comment regarding the selection of an internal vs. external API. Public APIs are usually used for platforms. You would use one if you want to open up your data and software to outside use. For example, Facebook, Twitter, and WordPress all have public APIs open to developers. Private APIs, on the other hand, should never be exposed to the consumer; an example would be internal bank transaction APIs.
How common is it to actually have two versions of an API?
Good points regarding bandwidth management.
You may also want to consider getting a layer 7 and 8 bandwidth allocation solution. There are several vendors like Cyberoam (http://www.cyberoam.com/) out there that offer this type of functionality commercially.
This would allow you to introduce additional components such as identity based security into the solution.