OpenTelemetry-7-Continued 4-Network Architecture Topology

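Earlier articles in this series built and configured a complete observability platform: visualizing and cross-linking metrics, logs, and traces, plus configuring, using, and routing alerts. This article uses the OpenTelemetry ServiceGraph to build a network architecture topology map.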

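Common open-source trace visualization tools such as jaeger and skywalking already ship with a topology view, so why have OpenTelemetry do this as well? In my view, once OpenTelemetry has unified the various observability standards, there are many backends to choose from. If the backend a user picks lacks this feature, the user can still build a custom topology from the ServiceGraph that OpenTelemetry provides.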

The ServiceGraph configuration is shown below:

otel.yaml

    receivers:
      jaeger:
        protocols:
          grpc:
          thrift_binary:
          thrift_compact:
          thrift_http:
      otlp:
        protocols:
          grpc:
          http:
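      # Placeholder receiver: the metrics/servicegraph pipeline needs a receiver,
      # but the metrics themselves are produced by the servicegraph processor.
      otlp/servicegraph:
        protocols:
          grpc:
            endpoint: localhost:12345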
    exporters:
      logging:
        loglevel: debug
      otlp:
        endpoint: tempo.tempo:4317
        tls:
          insecure: true
      loki:
        endpoint: http://loki-distributed-gateway.loki/loki/api/v1/push
        tls:
          insecure: true
        sending_queue:
          enabled: true
          num_consumers: 100
          queue_size: 10000
      prometheus/servicegraph:
        endpoint: 0.0.0.0:9099

    processors:
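      servicegraph:
        # Exporter that receives the generated service graph metrics.
        metrics_exporter: prometheus/servicegraph
        latency_histogram_buckets: [2ms, 4ms, 6ms, 8ms, 10ms, 50ms, 100ms, 200ms, 500ms, 800ms, 1s, 1400ms, 2s, 5s, 10s, 15s]
        # Extra attributes copied onto the metrics as labels.
        dimensions:
          - k8s.cluster.id
          - k8s.namespace.name
        store:
          # How long an unpaired span waits for its counterpart, and how many may wait.
          ttl: 60s
          max_items: 100000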
    
    extensions:
    
    service:
      extensions:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [servicegraph]
          exporters: [otlp]
        logs:
          receivers: [otlp]
          exporters: [loki]
        metrics/servicegraph:
          receivers: [otlp/servicegraph]
          processors: []
          exporters: [prometheus/servicegraph]
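
To make the `store` settings above less abstract, here is a minimal conceptual sketch in Python of how a service graph can be derived from spans. It is not the processor's actual implementation: the `Span` and `EdgeStore` names are hypothetical, and the sketch only assumes that a client span and the server span it triggers share a trace ID, with the server span's parent being the client span. Half-completed edges wait in a bounded store until their counterpart arrives or the TTL expires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    kind: str      # "client" or "server"
    service: str
    start: float   # seconds; used only for TTL bookkeeping


class EdgeStore:
    """Toy store that pairs client and server spans into edges."""

    def __init__(self, ttl: float = 60.0, max_items: int = 100_000):
        self.ttl = ttl
        self.max_items = max_items
        self.pending = {}   # edge key -> (half-completed edge, first-seen time)
        self.requests = {}  # (client, server) -> completed request count

    def add(self, span: Span) -> None:
        # A client span and the server span it triggered share one key:
        # the client's own span ID is the server's parent span ID.
        if span.kind == "client":
            key = (span.trace_id, span.span_id)
        elif span.kind == "server" and span.parent_span_id:
            key = (span.trace_id, span.parent_span_id)
        else:
            return  # other span kinds and root server spans form no edge

        if key in self.pending:
            edge, _ = self.pending[key]
        else:
            if len(self.pending) >= self.max_items:
                return  # store is full: drop instead of growing without bound
            edge = {}
        edge[span.kind] = span.service

        if "client" in edge and "server" in edge:
            # Both halves seen: count one request on this edge.
            self.pending.pop(key, None)
            pair = (edge["client"], edge["server"])
            self.requests[pair] = self.requests.get(pair, 0) + 1
        else:
            self.pending[key] = (edge, span.start)

    def expire(self, now: float) -> int:
        """Drop half-completed edges older than the TTL; return how many."""
        stale = [k for k, (_, seen) in self.pending.items() if now - seen > self.ttl]
        for k in stale:
            del self.pending[k]
        return len(stale)
```

In this toy model, a larger `ttl` tolerates spans that arrive late at the cost of memory, while `max_items` caps how many unpaired spans may wait at once.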

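The configuration above already exposes the service graph metrics on port 9099. We still need to create a Service and a ServiceMonitor ourselves so that Prometheus can scrape that port.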

apiVersion: v1
kind: Service
metadata:
  labels:
    opentelemetry: servicegraph
  name: otel-collector-service-graph
  namespace: opentelemetry
spec:
  ports:
  - name: servicegraph
    port: 9099
    protocol: TCP
    targetPort: 9099
  selector:
    app.kubernetes.io/component: opentelemetry-collector
    app.kubernetes.io/instance: opentelemetry.otel
    app.kubernetes.io/managed-by: opentelemetry-operator
    app.kubernetes.io/part-of: opentelemetry
  sessionAffinity: None
  type: ClusterIP

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: servicegraph
  labels:
    app: servicegraph
spec:
  selector:
    matchLabels:
      opentelemetry: "servicegraph"
  endpoints:
    - port: servicegraph
      path: /metrics
      interval: 30s
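
Before moving on to Grafana, it can help to confirm that the endpoint really serves edges. The following is a rough sketch, not an official client: it parses Prometheus text-format output and collects the `client`/`server` pairs of `traces_service_graph_request_total`, the metric used by the dashboard queries later in this article. The function names are hypothetical, and the parser handles only simple labeled sample lines.

```python
import re
import urllib.request

# One sample line: metric_name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_edges(text, metric="traces_service_graph_request_total"):
    """Return {(client, server): value} for every sample of `metric`."""
    edges = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if not m or m.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels")))
        if "client" in labels and "server" in labels:
            key = (labels["client"], labels["server"])
            # Sum across any extra dimension labels on the same edge.
            edges[key] = edges.get(key, 0.0) + float(m.group("value"))
    return edges


def fetch_metrics(url):
    """Download the raw metrics text from a scrape endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")
```

With the Service defined above, something like `parse_edges(fetch_metrics("http://otel-collector-service-graph.opentelemetry:9099/metrics"))` could be run from inside the cluster; that URL is an assumption derived from the Service name and namespace. An empty result usually means no client/server span pairs have been matched yet.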

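To display the graph, open the Tempo data source configuration page in Grafana. Because our metrics are stored in Prometheus, we simply select that Prometheus instance as the Service Graph data source.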

../images/servicegraph-1.png

After saving, open Explore.

../images/servicegraph-2.png

In a dashboard, the topology can be rendered with the NodeGraph panel. The dashboard JSON is shown below:

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": {
          "type": "grafana",
          "uid": "-- Grafana --"
        },
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "id": 14,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "datasource": {
        "type": "prometheus",
        "uid": "RHqnODP4k"
      },
      "gridPos": {
        "h": 22,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 2,
      "interval": "15s",
      "options": {
        "nodes": {
          "arcs": [
            {
              "color": "#5794F2",
              "field": "arc__color"
            }
          ]
        }
      },
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "RHqnODP4k"
          },
          "editorMode": "code",
          "exemplar": false,
          "expr": "label_join(label_join(\n(rate(traces_service_graph_request_total{}[$__interval]))\n, \"id\", \"\", \"client\")\n, \"title\", \"\", \"client\")\n\nor\n\nlabel_join(label_join(\n(rate(traces_service_graph_request_total{}[$__interval]))\n, \"id\", \"\", \"server\")\n, \"title\", \"\", \"server\")\n",
          "format": "table",
          "hide": false,
          "instant": true,
          "legendFormat": "__auto",
          "range": false,
          "refId": "nodes"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "RHqnODP4k"
          },
          "editorMode": "code",
          "exemplar": false,
          "expr": "increase(\n  (sum by (id, source, target, mainStat) \n    (\n      (\n          label_replace(\n              label_replace(\n                  label_join(\n                    (traces_service_graph_request_total{})\n                    , \"id\", \":\", \"client\", \"server\")\n                    , \"source\", \"$1\", \"client\", \"(.*)\")\n                    , \"target\", \"$1\", \"server\", \"(.*)\")\n      )\n    )\n  )[$__range:$__interval]\n) > 0",
          "format": "table",
          "hide": false,
          "instant": true,
          "legendFormat": "__auto",
          "range": false,
          "refId": "edges"
        }
      ],
      "title": "Service Map ☸️",
      "type": "nodeGraph"
    }
  ],
  "refresh": "",
  "revision": 1,
  "schemaVersion": 38,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": []
  },
  "time": {
    "from": "now-5m",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "",
  "title": "Architecture Topology",
  "uid": "k0Om62pVf",
  "version": 2,
  "weekStart": ""
}
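
The two PromQL targets in this dashboard reshape one metric into the tables that the panel reads: `nodes` turns every `client` and `server` value into an `id` and `title`, while `edges` joins them into a `client:server` id with `source` and `target` columns and keeps only edges whose increase is above zero. As an illustrative sketch of that reshaping, assuming the per-edge increase has already been computed, the same logic in Python might look like this (`to_node_graph` and the `value` key are hypothetical names, not part of Grafana):

```python
def to_node_graph(edges):
    """Reshape {(client, server): increase} into node and edge rows."""
    # Nodes: every service that appears on either side of an edge.
    names = sorted({name for pair in edges for name in pair})
    nodes = [{"id": n, "title": n} for n in names]

    # Edges: id is "client:server"; rows with no traffic are dropped,
    # mirroring the "> 0" filter at the end of the edges query.
    edge_rows = [
        {"id": f"{client}:{server}", "source": client, "target": server, "value": v}
        for (client, server), v in sorted(edges.items())
        if v > 0
    ]
    return nodes, edge_rows
```

Note that a service can still show up as a node even when its only edge is filtered out, because the nodes table is built independently of the `> 0` filter.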