- The new mathematical model evaluates the probability of failure of links between data centers globally.
- It draws inspiration from financial risk theory.
- It can help cloud service providers better utilize their data center resources and save millions of dollars.
To keep up with the ever-growing demand for cloud storage and cloud computing, companies spend millions of dollars increasing the capacity of their WAN backbones.
One of the major challenges is maintaining a good balance between network availability and utilization. A highly utilized channel might not be able to handle a sudden traffic surge, resulting in node or link failure.
To address this issue, a research team from MIT, Microsoft, and the Hebrew University developed a new mathematical model that draws inspiration from financial risk theory, which helps stock market investors maximize their returns while minimizing losses from market fluctuations.
The new model is called TeaVar (short for Traffic Engineering Applying Value at Risk). It evaluates the probability of failure of links between data centers globally. The evaluation process is similar to forecasting the volatility of stocks.
The model then allocates traffic across different links to maximize overall network usage while minimizing loss. This is the opposite of the traditional approach, which keeps links idle to handle sudden traffic surges and wastes energy and resources.
The researchers claim that their model can help cloud service providers better utilize their data center resources and save millions of dollars.
How Does ‘TeaVar’ Work?
Major companies that provide cloud services use ‘traffic engineering’ (TE) tools to optimally allocate data bandwidth across all paths. To guarantee maximum availability, these companies keep several links at low utilization, so many network links carry far less traffic than they could.
Thus, there is a tradeoff between network utilization and network availability. Conventional TE techniques handle this tradeoff poorly: they secure availability only by leaving capacity unused.
In this analogy, chunks of data bandwidth are the ‘money’ invested in the market, and network equipment with different failure probabilities plays the role of ‘stocks’ with uncertain values. Using this concept, the research team developed a ‘risk-aware’ methodology that aims to deliver data to its destination with minimal traffic loss, even under worst-case failure conditions.
Their approach enables companies to strike the utilization-availability balance that best suits their goal. TeaVar addresses algorithmic challenges related to tractability of risk minimization, as well as operational challenges.
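To make the value-at-risk idea concrete, here is a minimal sketch in Python. It is not the researchers' actual formulation: it simply enumerates every up/down combination of two links, computes the fraction of traffic lost for a few candidate allocations, and picks the one whose loss is smallest at a chosen availability level. The failure probabilities, the demand, the candidate allocations, and all function names are hypothetical, and links are assumed to fail independently.

```python
from itertools import product

def loss_distribution(allocation, fail_prob, demand):
    """List (loss_fraction, probability) for every up/down combination of links.

    Assumes links fail independently, which is a simplification.
    """
    outcomes = []
    for state in product((True, False), repeat=len(allocation)):  # True = link up
        prob = 1.0
        delivered = 0
        for up, gbps, p_fail in zip(state, allocation, fail_prob):
            prob *= (1 - p_fail) if up else p_fail
            if up:
                delivered += gbps  # traffic on a failed link is lost
        outcomes.append(((demand - delivered) / demand, prob))
    return outcomes

def value_at_risk(outcomes, beta):
    """Smallest loss that is not exceeded with probability at least beta."""
    cumulative = 0.0
    for loss, prob in sorted(outcomes):
        cumulative += prob
        if cumulative >= beta - 1e-12:  # tolerance for floating-point rounding
            return loss
    return max(loss for loss, _ in outcomes)

def safest_allocation(candidates, fail_prob, demand, beta):
    """Pick the candidate allocation with the lowest value at risk."""
    return min(
        candidates,
        key=lambda a: value_at_risk(loss_distribution(a, fail_prob, demand), beta),
    )

# Hypothetical numbers: link A fails 0.1% of the time, link B 5% of the time.
FAIL_PROB = (0.001, 0.05)
DEMAND = 10  # Gbps to deliver between two data centers
CANDIDATES = [(10, 0), (5, 5), (0, 10)]  # Gbps placed on (A, B)

for alloc in CANDIDATES:
    var = value_at_risk(loss_distribution(alloc, FAIL_PROB, DEMAND), beta=0.99)
    print(alloc, "loss not exceeded 99% of the time:", var)
print("chosen:", safest_allocation(CANDIDATES, FAIL_PROB, DEMAND, beta=0.99))
```

With a 99% availability target, only the allocation that puts everything on the reliable link keeps the value at risk at zero. Lowering the target to 90% makes all three candidates look equally safe in this toy setup, which is the sense in which the availability level acts as a tunable knob.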
Applying TeaVar To Real-World Data
The researchers tested the model against conventional TE tools on simulated traffic sent through AT&T, IBM, and Google networks. They also generated a range of failure scenarios according to their probability of occurrence.
Finally, they applied TeaVar to real-world data and found that it can support up to twice as much traffic as conventional TE methods at the same level of availability. The model was able to keep reliable links operating at almost full capacity, while steering data clear of riskier paths.
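One plausible way to build failure scenarios from per-link failure probabilities is sketched below in Python. It is an illustration under stated assumptions, not the procedure used in the study: the link names, the probabilities, the `coverage` cutoff, and the function name are hypothetical, and links are assumed to fail independently. The sketch enumerates every combination of failed links, ranks the combinations by probability, and keeps the most likely ones until they account for the requested share of the total probability.

```python
from itertools import product

def likely_failure_scenarios(fail_prob, coverage=0.999):
    """Return the most probable (failed_links, probability) pairs.

    Scenarios are added in order of decreasing probability until their
    combined probability reaches `coverage`.
    """
    links = list(fail_prob)
    scenarios = []
    for state in product((False, True), repeat=len(links)):  # True = link failed
        prob = 1.0
        for link, failed in zip(links, state):
            prob *= fail_prob[link] if failed else 1 - fail_prob[link]
        failed_links = tuple(l for l, f in zip(links, state) if f)
        scenarios.append((failed_links, prob))

    scenarios.sort(key=lambda s: s[1], reverse=True)

    kept, mass = [], 0.0
    for scenario in scenarios:
        if mass >= coverage:
            break
        kept.append(scenario)
        mass += scenario[1]
    return kept

# Hypothetical per-link failure probabilities.
FAIL_PROB = {"A": 0.001, "B": 0.05, "C": 0.01}

for failed_links, prob in likely_failure_scenarios(FAIL_PROB):
    print(failed_links or "no failures", round(prob, 7))
```

Full enumeration grows exponentially with the number of links, so a cutoff like this (or random sampling) is needed on anything larger than a toy network.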