Improving the World of Telco with Big Data and AI

AI can drive improvement and growth in many areas, and telco is no exception. From addressing privacy issues, detecting fraud and optimizing networks to more people-oriented tasks like automating client support, segmenting customers and recognizing their behavior, AI is becoming crucial for organizations that want to keep up with the ever-changing world of technology. But one area, we think, deserves more attention: predictive maintenance. Here’s how we approached it for Slovenia’s largest telco, Telekom Slovenije (TS).

Preventing downtime

We applied predictive maintenance to TS’s business-critical applications when the company decided to tackle the challenge of correctly assessing the risk of system failures. By observing the system, we wanted to predict how likely a failure is to occur. The solution has been running successfully for many years, and it is constantly evolving and improving. Before we jump into the solution, let’s look at the biggest enemy of any large organization’s critical infrastructure: downtime.

Downtime is one of the most dreaded events in any industry, as it causes direct loss of revenue and can do huge reputational damage. We all remember what happens when popular social media services go down. As the largest telecommunications company in the country, Telekom Slovenije wants to avoid that as much as possible in order to remain the most reliable telco on the market.

  • Downtime causes direct loss of revenue and reputational damage
  • 20% increase in maintenance cost of software applications per year
  • Web application downtime can cost up to $300k per hour
  • 91% of data centers have experienced an unplanned outage in the last 24 months

The solution

Being problem-solvers at heart, we couldn’t help but ask ourselves: wouldn’t it be amazing if there were a way to prevent downtime? As soon as we noticed that most of our clients were struggling with the issue, we jumped straight in and developed a solution called Know Your System, or KYS. KYS offers a complete, real-time overview of insights. It allows enterprises to predict errors, application failures and various bottlenecks, helps them improve operational efficiency, and encourages them to make tactical, data-driven decisions.

Before you can solve a problem, you need to understand its context. We wanted to understand how the events that cause downtime are interconnected, so we had to investigate each event: why it happens and how it relates to the others. Our job was to connect seemingly unconnected events and find the reasons behind potential failures in the system. Detecting failures before they even happen is the holy grail of any system. The solution has a strong financial impact, although we are not at liberty to disclose the numbers. What we can say is that KYS reduced the time engineers spend dealing with problems and gave them more time to invest in new solutions, improvements and, of course, innovation, which has always been one of the pillars of Telekom Slovenije.

"We use Medius' solution for data collection, monitoring and analysis to predict and prevent production failures of our 170+ applications in real-time, increasing overall IT ecosystem performance and availability." - Sašo Savič, Director of OSS/BSS at Telekom Slovenia d.d.

We want to share some numbers (data from June 2021) that will help you imagine what AI is capable of. There are over 200 applications inside the TS ecosystem, with 800 million events per month and 300 business events per second, all of which adds up to 400 GB of data per day. What is all this data for, you ask?

Let’s say you want to increase the speed of your home internet connection. Your request flows through the system, “talking” to many different applications (CRM, billing, inventory, etc.), and in the end you can browse your favorite websites faster than before. Our main job was to figure out how these 200+ applications communicate with each other in real time. We needed to tune in and listen to every request and response from every single application to understand the context first. It’s like standing in a room full of people, gauging their mood (good or bad) and paying attention when something goes wrong. Let’s assume, for example, that someone starts to gossip. We need to understand the context and, more importantly, the cause of the gossip. Once the dots are connected, we have identified a “critical path”. As you have probably figured out, this can only be done with AI.

Determining critical paths (finding the sources of possible errors) is how we discover future failures before they happen. Hundreds of events occur every second, and KYS handles them all while continuously improving itself and its algorithm.
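
To make the idea of a critical path a little more concrete, here is a minimal sketch in Python. It is not how KYS works internally: it replaces the AI with plain counting, and the trace format, the application names and the functions edge_failure_rates and riskiest_path are all assumptions made up for this illustration. It simply shows how observed request flows between applications could be turned into failure rates per hop and per path.

```python
from collections import defaultdict

# Hypothetical trace records (illustrative only, not real TS data):
# the ordered applications a request passed through, and whether it failed.
TRACES = [
    (("CRM", "billing", "inventory"), False),
    (("CRM", "billing", "inventory"), True),
    (("CRM", "billing", "inventory"), True),
    (("CRM", "provisioning"), False),
    (("CRM", "provisioning"), False),
    (("portal", "billing"), False),
]


def edge_failure_rates(traces):
    """Return {(caller, callee): failure rate} over all observed hops.

    Crude assumption: a failed request counts against every hop it made.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for path, failed in traces:
        for edge in zip(path, path[1:]):
            totals[edge] += 1
            if failed:
                failures[edge] += 1
    return {edge: failures[edge] / totals[edge] for edge in totals}


def riskiest_path(traces, min_samples=2):
    """Return (path, failure rate) for the observed path that fails most often.

    Paths seen fewer than min_samples times are ignored as too noisy.
    Returns (None, 0.0) if no path has enough samples.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for path, failed in traces:
        totals[path] += 1
        if failed:
            failures[path] += 1
    rates = {p: failures[p] / totals[p] for p in totals if totals[p] >= min_samples}
    if not rates:
        return None, 0.0
    best = max(rates, key=rates.get)
    return best, rates[best]


if __name__ == "__main__":
    for edge, rate in sorted(edge_failure_rates(TRACES).items()):
        print(edge, round(rate, 2))
    print(riskiest_path(TRACES))
```

In this toy data set, the CRM, billing and inventory flow stands out because two of its three requests failed, so that is the path an engineer would look at first. A production system would of course need far richer signals than a simple failure count.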

What we did for Telekom Slovenije can be done in many other industries that work with Big Data or are in a position to start collecting it. Our team is ready when you are.

