This architecture describes a solution that provides real-time monitoring and observability of systems and end-user device telemetry data. It focuses on a use case for the media industry.

Architecture

Grafana is a trademark of its respective company. No endorsement is implied by the use of this mark.

Dataflow

In the observable system shown in the diagram, raw telemetry is streamed to Azure Blob Storage via HTTP and connectors. The raw telemetry is processed, transformed, normalized, and saved in Azure Data Explorer for analysis. Systems like Grafana and Azure Metrics Advisor read data from Data Explorer and provide insights to end users.

More specifically, these are the elements of the system in the diagram:

Alternatives

Azure Data Factory and Azure Synapse Analytics provide tools and workspaces for building ETL workflows, along with the ability to track and retry jobs from a graphical interface. Note that Data Factory and Azure Synapse both have a minimum lag of about 5 minutes from the time of ingestion to persistence. This lag might be acceptable in your monitoring system. If it is, we recommend that you consider these alternatives.

Scenario details

Organizations often deploy varied and large-scale technologies to solve business problems. These systems, and end-user devices, generate large sets of telemetry data.

This architecture is based on a use case for the media industry. Media streaming for live and video-on-demand playback requires near real-time identification of and response to application problems. To support this real-time scenario, organizations need to collect a massive telemetry set, which requires a scalable architecture. After the data is collected, other types of analysis, like AI and anomaly detection, are needed to efficiently identify problems across such a large data set.

When large-scale technologies are deployed, the system and end-user devices that interact with them generate massive sets of telemetry data.
In traditional scenarios, this data is analyzed via a data warehouse system to generate insights that can be used to support management decisions. This approach might work in some scenarios, but it's not responsive enough for streaming media use cases. To solve this problem, real-time insights are required for the telemetry data that's generated from monitoring servers, networks, and the end-user devices that interact with them. Monitoring systems that catch failures and errors are common, but catching them in near real time is difficult. That's the focus of this architecture.

In a live streaming or video-on-demand setting, telemetry data is generated from systems and heterogeneous clients (mobile, desktop, and TV). The solution involves taking raw data and associating context with the data points, for example, dimensions like geography, end-user operating system, content ID, and CDN provider. The raw telemetry is collected, transformed, and saved in Data Explorer for analysis. You can then use AI to make sense of the data and automate the manual processes of observation and alerting. You can use systems like Grafana and Metrics Advisor to read data from Data Explorer to show interactive dashboards and trigger alerts.

Considerations

These considerations implement the pillars of the Azure Well-Architected Framework, a set of guiding tenets that you can use to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.

Reliability

Reliability ensures your application can meet the commitments you make to your customers. For more information, see Overview of the reliability pillar.

Business-critical applications need to keep running even during disruptive events like Azure region or CDN outages. There are two primary strategies and one hybrid strategy for building redundancy into your system:

Keep in mind that not all Azure services have built-in redundancy. For example, Azure Functions runs a function app only in a specific region.
Azure Functions geo-disaster recovery describes various strategies that you can implement, depending on how your functions are triggered (HTTP versus pub/sub).

The ingestion and transformation function app can run in active/active mode. You can run Data Explorer in both active/active and active/standby configurations.

Azure Managed Grafana supports availability zone redundancy. One strategy for creating cross-region redundancy is to set up Grafana in each region in which your Data Explorer cluster is deployed.

Cost optimization

Cost optimization is about reducing unnecessary expenses and improving operational efficiencies. For more information, see Overview of the cost optimization pillar.

The cost of this architecture depends on the number of ingress telemetry events, your storage of raw telemetry in Blob Storage and Data Explorer, an hourly cost for Azure Managed Grafana, and a static cost for the number of time-series charts in Metrics Advisor.

You can use the Azure pricing calculator to estimate your hourly or monthly costs.

Performance efficiency

Performance efficiency is the ability of your workload to scale to meet the demands placed on it by users in an efficient manner. For more information, see Performance efficiency pillar overview.

Depending on the scale and frequency of incoming requests, the function app might be a bottleneck, for two main reasons:

We recommend that you use Premium or Dedicated SKUs to:

For more information, see Select a SKU for your Azure Data Explorer cluster.

Deploy this scenario

For information about deploying this scenario, see real-time-monitoring-and-observability-for-media on GitHub. This code sample includes the necessary infrastructure-as-code (IaC) to bootstrap development and Azure functions to ingest and transform the data from HTTP and blob endpoints.

Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.

Principal authors:

Other contributors:
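The transformation step described under Scenario details, in which each raw data point is associated with context such as geography, end-user operating system, content ID, and CDN provider, could look roughly like the following. This is a minimal sketch and not the code from the GitHub sample: the event field names (`ts`, `country`, `cdnHost`, and so on), the `CDN_BY_HOST` lookup table, and the `normalize_event` function are all hypothetical and chosen only for illustration.

```python
import json
from datetime import datetime, timezone

# Hypothetical lookup from CDN edge hostname to provider name (illustrative only).
CDN_BY_HOST = {
    "edge1.cdn-a.example": "cdn-a",
    "edge7.cdn-b.example": "cdn-b",
}


def normalize_event(raw: str) -> dict:
    """Parse one raw JSON telemetry record and attach context dimensions.

    All field names are assumptions; a real pipeline would follow the
    schema emitted by its own players and servers.
    """
    event = json.loads(raw)
    # Normalize the epoch-seconds timestamp to an ISO 8601 UTC string.
    ts = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "metric": event["metric"],
        "value": float(event["value"]),
        # Context dimensions used later to slice dashboards and alerts.
        "geography": event.get("country", "unknown").upper(),
        "os": event.get("os", "unknown").lower(),
        "content_id": event.get("contentId", "unknown"),
        "cdn_provider": CDN_BY_HOST.get(event.get("cdnHost", ""), "unknown"),
    }


sample = (
    '{"ts": 0, "metric": "rebuffer_ms", "value": "120", "country": "us", '
    '"os": "Android", "contentId": "c-42", "cdnHost": "edge1.cdn-a.example"}'
)
row = normalize_event(sample)
print(row)
```

Rows shaped like this could then be written to a Data Explorer table whose columns match the dictionary keys, so that tools like Grafana can group and filter by each dimension. Missing dimensions fall back to `"unknown"` rather than failing, on the assumption that heterogeneous clients won't all report the same fields.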
Architecture
Dataflow
Components
Alternatives
Scenario details
Considerations
Reliability
Cost optimization
Performance efficiency
Deploy this scenario
Contributors
Next steps
Related resources