Data / Machine Learning / Monitoring

Telemetry: What It Is, What It Isn’t, and Why It’s Important in Distributed Systems

14 Sep 2016 10:01am

In this episode of The New Stack Makers, we learn about the nuances behind software-defined infrastructure, how new approaches to telemetry are changing the way users interact with their data and the ways that distributed analytics can be put into practice in the enterprise. The New Stack founder Alex Williams spoke with Intel Software-Defined Infrastructure (SDI) Distributed Analytics Engineer Brian Womack during the 2016 Intel Developer Forum (IDF) in San Francisco to get his thoughts on these topics and more.

In his role at Intel, Womack explained, the concept of data as we recognize it today has shifted. Rather than working with traditional analytics, many of today’s platforms and services are taking a distributed approach to data and to their infrastructures. “We introduced a term here at IDF called a ‘software-defined resource.’ There’s four types: Processor, memory, fabric and storage. People who manage data centers have collected telemetry in the past to try to observe what software-defined resources are doing so that you can do something about it,” Womack said.

Womack then went on to explain how running machine learning algorithms against operational data has shifted over the years. With the ever-increasing use of wearable technology, smart city planning, and the connected home, understanding not only what telemetry is, but also what the right data points can do for one’s organization, is critical. “Everyone talks about big data, and ‘We have to store x number of petabytes.’ If you do distributed analytics, and you’re able to summarize the actions in such a way that you don’t need to store the raw data anymore, you could potentially store one-thousandth or one-millionth of the data.”
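The data-reduction idea Womack describes — summarizing a telemetry stream as it arrives so the raw samples never need to be stored — can be sketched with a simple running aggregate. The class and names below are purely illustrative, not part of Snap or any Intel tooling; they just show how an unbounded stream collapses into a few numbers:

```python
class StreamSummary:
    """Running count/mean/variance via Welford's online algorithm.

    Summarizes an unbounded telemetry stream in O(1) memory, so
    each raw sample can be discarded as soon as it is ingested.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.count if self.count else 0.0


# Ingest any number of samples, but retain only three numbers.
summary = StreamSummary()
for sample in (0.5, 0.7, 0.6, 0.9, 0.4):
    summary.add(sample)
print(summary.count, round(summary.mean, 3), round(summary.variance, 4))
```

Whether the stream holds five samples or five billion, the summary stays the same size — which is the mechanism behind storing “one-thousandth or one-millionth of the data.”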

Intel’s open source telemetry framework, Snap, aims to make collecting and organizing data simpler for its users, and creating actionable insights is a cornerstone of the project. Womack went on to explain that, above all, Intel strives to make the framework accessible. “There’s a lot of plumbing there, and if we can provide that to the community and make it easier to do, they all don’t have to be data scientists or signal processing engineers to do it,” said Womack.
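Snap structures the “plumbing” Womack mentions as three plugin roles: collectors that gather metrics, processors that transform them, and publishers that ship them to a destination. The toy sketch below mimics only that pipeline shape in plain Python; the fake CPU metric and function names are illustrative stand-ins, not Snap’s actual plugin API (real Snap plugins are wired together by the framework via task manifests):

```python
import random
import time

# Illustrative stand-ins for Snap's three plugin roles.

def collect():
    """Collector: gather a raw metric sample (here, a fake CPU load)."""
    return {"metric": "cpu.load",
            "value": random.uniform(0.0, 1.0),
            "timestamp": time.time()}

def process(sample, threshold=0.8):
    """Processor: transform or annotate the sample before publishing."""
    sample["alert"] = sample["value"] > threshold
    return sample

def publish(sample, sink):
    """Publisher: ship the processed sample to a destination."""
    sink.append(sample)

# One scheduled run of the pipeline: collect -> process -> publish.
sink = []
for _ in range(3):
    publish(process(collect()), sink)
print(len(sink), "samples published")
```

Separating the three roles is what lets Snap swap any stage independently — a different collector for a new data source, or a different publisher for a new backend — without touching the rest of the pipeline.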

As the conversation drew to a close, Womack lamented the current state of schedulers: “I think the whole scheduling space needs to get a lot smarter over the next year. I would argue that part of the problem with schedulers today is they don’t have enough observability into what’s actually going on, so they’re constantly playing catch up. It’s too coarse. We need to make it a tighter control loop. … If you can’t observe it, you aren’t going to do a good job of controlling it.”

Feature image: “f(Glitch)” by Antonio Roberts is licensed under CC BY-SA 2.0.

Intel is a sponsor of The New Stack.
