All of our services run on Kubernetes, so log collection is something we have to solve.
There are two ways to collect logs with Filebeat:
1. Deploy Filebeat in the same pod as the application, as a sidecar that watches that application's logs;
2. Run Filebeat as a DaemonSet that collects the logs of every application on each node.
Since no node runs very many applications yet, a single Filebeat per node can handle the load, so we dropped the sidecar-per-pod approach.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: '44'
  creationTimestamp: '2021-12-25T13:56:32Z'
  generation: 44
  labels:
    k8s-app: filebeat
  name: filebeat
  namespace: lzzk
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      containers:
        - args:
            - '-c'
            - /etc/filebeat.yml
            - '-e'
          env:
            - name: ELASTICSEARCH_HOST
              value: 172.17.12.2
            - name: ELASTICSEARCH_PORT
              value: '9967'
            - name: ELASTICSEARCH_USERNAME
            - name: ELASTICSEARCH_PASSWORD
            - name: ELASTIC_CLOUD_ID
            - name: ELASTIC_CLOUD_AUTH
          image: 'docker.elastic.co/beats/filebeat:6.6.2'
          imagePullPolicy: IfNotPresent
          name: filebeat
          resources:
            limits:
              cpu: 300m
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 100Mi
          securityContext:
            runAsUser: 0
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /etc/filebeat.yml
              name: config
              readOnly: true
              subPath: filebeat.yml
            - mountPath: /usr/share/filebeat/inputs.d
              name: inputs
              readOnly: true
            - mountPath: /usr/share/filebeat/data
              name: data
            - mountPath: /var/lib/docker/containers
              name: varlibdockercontainers
              readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: filebeat
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      volumes:
        - configMap:
            defaultMode: 384
            name: filebeat-config
          name: config
        - configMap:
            defaultMode: 384
            name: filebeat-inputs
          name: inputs
        - hostPath:
            path: /var/lib/filebeat-data
            type: DirectoryOrCreate
          name: data
        - hostPath:
            path: /var/lib/docker/containers
            type: ''
          name: varlibdockercontainers
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
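The DaemonSet references a `filebeat` ServiceAccount that is not shown in these manifests. Because the `add_kubernetes_metadata` processor queries the API server, that account needs read access to pods and namespaces. A minimal sketch of the RBAC objects, modeled on the official Filebeat Kubernetes manifests (the object names here are assumptions):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: lzzk
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
rules:
  - apiGroups: ['']
    resources: ['namespaces', 'pods']
    verbs: ['get', 'watch', 'list']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: lzzk
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
```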
---
apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false

    # To enable hints based autodiscover, remove `filebeat.config.inputs`
    # configuration and uncomment this:
    #filebeat.autodiscover:
    #  providers:
    #    - type: kubernetes
    #      hints.enabled: true

    processors:
      - add_cloud_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    output.elasticsearch:
      ilm.enabled: false
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      pipelines:
        - pipeline: "logmessage6"
          when:
            or:
              - equals:
                  kubernetes.labels.app: "baseconfig"
              - equals:
                  kubernetes.labels.app: "strategyservice"
              - equals:
                  kubernetes.labels.app: "dataservice"
              - equals:
                  kubernetes.labels.app: "dataservicejob"
              - equals:
                  kubernetes.labels.app: "iot-data-distribute"
kind: ConfigMap
metadata:
  creationTimestamp: '2021-12-25T14:17:18Z'
  labels:
    k8s-app: filebeat
  name: filebeat-config
  namespace: lzzk
---
apiVersion: v1
data:
  kubernetes.yml: |-
    - type: docker
      containers.ids:
        - "*"
      exclude_lines: ['HEAD', 'HTTP/1.1']
      multiline.pattern: '^[[:space:]]+(at|\.{3})\b|^Caused by:'
      multiline.negate: false
      multiline.match: after
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
kind: ConfigMap
metadata:
  creationTimestamp: '2021-12-25T14:17:18Z'
  labels:
    k8s-app: filebeat
  name: filebeat-inputs
  namespace: lzzk
Configuration notes
1. Filebeat ships logs straight to Elasticsearch after collecting them. Because the message field has to be split into recognizable fields, we use an Elasticsearch ingest pipeline: when the data arrives, Elasticsearch processes it first and only then indexes it.
pipelines:
  - pipeline: "logmessage6"
    when:
      or:
        - equals:
            kubernetes.labels.app: "baseconfig"
        - equals:
            kubernetes.labels.app: "strategyservice"
        - equals:
            kubernetes.labels.app: "dataservice"
        - equals:
            kubernetes.labels.app: "dataservicejob"
        - equals:
            kubernetes.labels.app: "iot-data-distribute"
Elasticsearch runs the logmessage6 pipeline over the log data, but only for events whose app label matches one of the conditions; events that do not match are indexed as-is.
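The logmessage6 pipeline itself lives in Elasticsearch and is not part of these manifests; its real processors depend on our log format. A hypothetical sketch of how such a pipeline could be registered (the grok pattern and field names here are assumptions, not our actual pipeline):

```
PUT _ingest/pipeline/logmessage6
{
  "description": "Split message into structured fields before indexing",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:log_time} %{LOGLEVEL:level} %{GREEDYDATA:msg}"]
      }
    }
  ]
}
```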
2. Handling multi-line Java logs:
multiline.pattern: '^[[:space:]]+(at|\.{3})\b|^Caused by:'
multiline.negate: false
multiline.match: after
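With negate: false and match: after, any line the pattern matches (indented "at ..." stack frames and "Caused by:" lines) is appended to the preceding event, so a whole Java stack trace becomes one log event. A quick check of an equivalent pattern (Python's re module has no POSIX [[:space:]] class, so \s stands in for it):

```python
import re

# Equivalent of Filebeat's '^[[:space:]]+(at|\.{3})\b|^Caused by:'
# with [[:space:]] translated to \s for Python's re module.
pattern = re.compile(r'^\s+(at|\.{3})\b|^Caused by:')

lines = [
    "java.lang.IllegalStateException: boom",            # new event
    "    at com.example.Service.run(Service.java:42)",  # continuation
    "Caused by: java.io.IOException: disk full",        # continuation
    "2021-12-25 13:56:32 INFO started",                 # new event
]

for line in lines:
    # Matching lines are glued onto the previous event by Filebeat.
    is_continuation = bool(pattern.search(line))
    print(is_continuation, line)
```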
Open questions
1. If a service runs multiple replicas, how do we tell the replicas apart in the logs???
2. Elasticsearch 6.6 already supports ILM (index lifecycle management), which can assign a lifecycle to each index; but once Filebeat is running, logs only ever grow in a single index instead of rolling over into multiple indices by size.
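For question 2, note that the config above sets ilm.enabled: false in the Elasticsearch output, so no rollover policy ever applies. A sketch of the beta ILM settings the Filebeat 6.6 Elasticsearch output accepts (the alias and pattern values here are assumptions to adapt):

```yaml
output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
  ilm.enabled: true
  # Write through a rollover alias instead of a fixed index,
  # so ILM can cut over to a new index when the policy triggers.
  ilm.rollover_alias: "filebeat-lzzk"
  ilm.pattern: "{now/d}-000001"
```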



