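When a backend service responds too slowly, we proactively fail the request and return a timeout error, so that requests do not pile up and drag down other services.
Failing early reduces the resources consumed while waiting and helps avoid request backlogs and cascading failures. A timeout can also be set in application code, but then changing it means rebuilding and redeploying the service, whereas an Istio timeout is plain configuration that takes effect shortly after it is applied.
Istio also provides a retry mechanism. When the network is unstable (jitter, packet loss, or other intermittent unreachability), retrying the request is a good way to handle such transient failures.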
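2. Experiment Steps
1. Build the backend service (with a response delay and a 500 status code)
2. Deploy the demo: create the Deployment and the Service (svc)
3. Create the VirtualService and test access
4. Add a timeout and test access
5. Configure retries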
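2.1 Build the backend service
Build a backend service that delays its response, to simulate a backend under heavy load that takes a long time to handle requests. Here I use a small Flask demo written in Python.
First, prepare app-demo.py with the following content: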
# coding=utf-8
from flask import Flask
from flask import abort
import time

app = Flask(__name__)

# <testname> captures the last path segment and passes it to the view function.
@app.route('/istio/<testname>', methods=['GET'])
def hello_world(testname):
    if len(testname) == 0:
        abort(404)
    if testname == '500':
        # Simulate a failing backend.
        abort(500)
    if testname == '5s':
        # Simulate a slow backend.
        time.sleep(5)
    return testname

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
The demo handles the following URL requests:
1. /istio/5s returns the result after a five-second delay
2. /istio/500 returns a 500 status
3. /istio/xxx returns xxx immediately
Next, build the Docker image. Create a directory and put app-demo.py in it together with the Dockerfile:
admon@admon-virtual-machine:~/docker$ pwd
/home/admon/docker
admon@admon-virtual-machine:~/docker$ ls
app-demo.py  Dockerfile
The Dockerfile looks like this:
admon@admon-virtual-machine:~/docker$ cat Dockerfile
FROM python:3.9.9-slim
WORKDIR /app
ADD . /app
RUN pip install flask
EXPOSE 5000
ENV NAME demo
CMD ["python","app-demo.py"]
admon@admon-virtual-machine:~/docker$
Build the image:
admon@admon-virtual-machine:~/docker$ sudo docker build -t flask-demo .
Sending build context to Docker daemon  3.072kB
Step 1/7 : FROM python:3.9.9-slim
 ---> 8ace3a02b842
Step 2/7 : WORKDIR /app
 ---> Running in a892c87a9a9a
Removing intermediate container a892c87a9a9a
 ---> 0e14639f68cb
Step 3/7 : ADD . /app
 ---> 54c1cadd0986
Step 4/7 : RUN pip install flask
 ---> Running in b5ae6b1e78e8
Collecting flask
  Downloading Flask-2.0.2-py3-none-any.whl (95 kB)
Collecting itsdangerous>=2.0
  Downloading itsdangerous-2.0.1-py3-none-any.whl (18 kB)
Collecting Jinja2>=3.0
  Downloading Jinja2-3.0.3-py3-none-any.whl (133 kB)
Collecting click>=7.1.2
  Downloading click-8.0.3-py3-none-any.whl (97 kB)
Collecting Werkzeug>=2.0
  Downloading Werkzeug-2.0.2-py3-none-any.whl (288 kB)
Collecting MarkupSafe>=2.0
  Downloading MarkupSafe-2.0.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (30 kB)
Installing collected packages: MarkupSafe, Werkzeug, Jinja2, itsdangerous, click, flask
Successfully installed Jinja2-3.0.3 MarkupSafe-2.0.1 Werkzeug-2.0.2 click-8.0.3 flask-2.0.2 itsdangerous-2.0.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
WARNING: You are using pip version 21.2.4; however, version 21.3.1 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
Removing intermediate container b5ae6b1e78e8
 ---> e9370a584b65
Step 5/7 : EXPOSE 5000
 ---> Running in bbb45eb5701c
Removing intermediate container bbb45eb5701c
 ---> f44c9daab58e
Step 6/7 : ENV NAME demo
 ---> Running in 60639e9ad2db
Removing intermediate container 60639e9ad2db
 ---> 3563ec1fefaa
Step 7/7 : CMD ["python","app-demo.py"]
 ---> Running in f4d8c8f00982
Removing intermediate container f4d8c8f00982
 ---> 70f3b0b74ea6
Successfully built 70f3b0b74ea6
Successfully tagged flask-demo:latest
admon@admon-virtual-machine:~/docker$
admon@admon-virtual-machine:~/docker$ sudo docker images
REPOSITORY   TAG          IMAGE ID       CREATED              SIZE
flask-demo   latest       70f3b0b74ea6   about a minute ago   133MB
python       3.9.9-slim   8ace3a02b842   2 weeks ago          122MB
Push the image to a registry. A private Harbor or a public registry both work; I pushed it straight to Docker Hub so that readers can pull it later.
admon@admon-virtual-machine:~/docker$ sudo docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: madongunintelligible
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
admon@admon-virtual-machine:~/docker$ sudo docker tag flask-demo madongunintelligible/flask-demo
admon@admon-virtual-machine:~/docker$ sudo docker images
REPOSITORY                        TAG          IMAGE ID       CREATED         SIZE
flask-demo                        latest       70f3b0b74ea6   6 minutes ago   133MB
madongunintelligible/flask-demo   latest       70f3b0b74ea6   6 minutes ago   133MB
python                            3.9.9-slim   8ace3a02b842   2 weeks ago     122MB
admon@admon-virtual-machine:~/docker$ sudo docker push madongunintelligible/flask-demo
Using default tag: latest
The push refers to repository [docker.io/madongunintelligible/flask-demo]
f1c50f55f027: Pushed
4502e6bb3a3b: Pushed
41504162b4ec: Pushed
a765872192e3: Mounted from library/python
a6dfc8291750: Mounted from library/python
7e46f0272529: Mounted from library/python
5359ff267161: Mounted from library/python
2edcec3590a4: Mounted from library/python
latest: digest: sha256:46372905707f357516167bbf36d1107bac0f7d80c447119ac6ae8cec6b60b763 size: 1995
admon@admon-virtual-machine:~/docker$
2.2 Deploy the demo: create the Deployment and the Service (svc)
Create the Deployment and check the pod logs to confirm that it started successfully:
[root@k8s-master ~]# kubectl create deployment flask-demo --image=madongunintelligible/flask-demo
deployment.apps/flask-demo created
[root@k8s-master ~]# kubectl get pod
NAME                             READY   STATUS    RESTARTS        AGE
appv1-5cf75d8d8b-vdvzr           2/2     Running   4 (7d16h ago)   7d17h
appv2-684dd44db7-r6k6k           2/2     Running   4 (7d16h ago)   7d17h
flask-demo-b8fb4484f-6tjwt       2/2     Running   0               45s
fortio-deploy-687945c6dc-zjb7s   2/2     Running   0               2d12h
httpbin-74fb669cc6-5hkjz         2/2     Running   0               2d13h
[root@k8s-master ~]# kubectl logs flask-demo-b8fb4484f-6tjwt
 * Serving Flask app 'app-demo' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://100.107.114.141:5000/ (Press CTRL+C to quit)
Create the Service:
[root@k8s-master ~]# kubectl expose deployment flask-demo --name=flask-demo --port=5000 --target-port=5000 --type=NodePort
service/flask-demo exposed
[root@k8s-master ~]# kubectl describe svc flask-demo
Name:                     flask-demo
Namespace:                default
Labels:                   app=flask-demo
Annotations:
Selector:                 app=flask-demo
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.103.201.244
IPs:                      10.103.201.244
Port:                     5000/TCP
TargetPort:               5000/TCP
NodePort:                 31234/TCP
Endpoints:                100.107.114.141:5000
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
[root@k8s-master ~]#
2.3 Create the VirtualService and test access
Create the VirtualService:
[root@k8s-master timeout]# cat demo.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: flask-demo-vs
spec:
  hosts:
  - flask-demo
  http:
  - route:
    - destination:
        host: flask-demo
[root@k8s-master timeout]# kubectl apply -f demo.yaml
virtualservice.networking.istio.io/flask-demo-vs created
Create an nginx pod to test access to the Service:
[root@k8s-master timeout]# kubectl run nginx --image=nginx
pod/nginx created
[root@k8s-master timeout]# kubectl exec nginx -it /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@nginx:/# curl flask-demo:5000/istio/test
testroot@nginx:/#
root@nginx:/# curl flask-demo:5000/istio/5s
5sroot@nginx:/#
The test shows that curl flask-demo:5000/istio/test returns immediately, while curl flask-demo:5000/istio/5s returns only after a 5-second delay. This is the behavior we wanted: it simulates a heavily loaded service that responds slowly.
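For readers who would rather script this check than time curl by eye, here is a minimal sketch that measures how long a GET request takes and reports the status code. It uses only the Python standard library. The name timed_get and its parameters are my own and not part of the original demo, and the URLs in the usage note assume the script runs in a pod that can resolve flask-demo.

```python
import time
import urllib.error
import urllib.request

# Bypass any HTTP proxy configured in the environment so that the request
# goes straight to the target service (an assumption that suits in-cluster use).
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def timed_get(url, client_timeout_s=10.0):
    """Send a GET request and return (status_code, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with _opener.open(url, timeout=client_timeout_s) as resp:
            status = resp.status
            resp.read()  # drain the body so the timing covers the full response
    except urllib.error.HTTPError as err:
        # Non-2xx responses such as 500 or 504 are raised as HTTPError.
        status = err.code
        err.close()
    return status, time.monotonic() - start
```

Calling timed_get("http://flask-demo:5000/istio/test") and timed_get("http://flask-demo:5000/istio/5s") from inside the cluster should show the same contrast as the curl test: a near-instant response for the first URL and roughly five seconds for the second.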
2.4 Add a timeout and test access
Next, add the timeout setting timeout: 1s:
[root@k8s-master timeout]# cat demo.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: flask-demo-vs
spec:
  hosts:
  - flask-demo
  http:
  - route:
    - destination:
        host: flask-demo
    timeout: 1s
[root@k8s-master timeout]# kubectl apply -f demo.yaml
virtualservice.networking.istio.io/flask-demo-vs configured
[root@k8s-master timeout]# kubectl exec nginx -it /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@nginx:/# curl flask-demo:5000/istio/test -i
HTTP/1.1 200 OK
content-type: text/html; charset=utf-8
content-length: 4
server: envoy
date: Sat, 08 Jan 2022 03:19:10 GMT
x-envoy-upstream-service-time: 20
root@nginx:/# curl flask-demo:5000/istio/5s -i
HTTP/1.1 504 Gateway Timeout
content-length: 24
content-type: text/plain
date: Sat, 08 Jan 2022 03:19:16 GMT
server: envoy
upstream request timeoutroot@nginx:/#
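Testing again, curl flask-demo:5000/istio/5s now fails after 1s with a timeout message and status code 504. The service needs 5s to produce a result, but we set a 1s timeout, so the request times out. This meets the goal of the test: when the backend responds too slowly, the request is failed early with a timeout error, so that requests do not pile up and drag down other services.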
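The idea behind the timeout can be sketched in a few lines of Python. This is only a conceptual illustration and not how the Envoy sidecar is implemented: the caller waits for the handler up to a deadline and, if the deadline passes, stops waiting and reports a 504. The function name call_with_timeout is hypothetical; the 504 status and message mirror the curl output above.

```python
import concurrent.futures


def call_with_timeout(handler, timeout_s):
    """Run handler() and return (status, body), giving up after timeout_s seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(handler)
    try:
        # Wait for the result, but no longer than the configured deadline.
        return 200, future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Stop waiting and answer the caller right away, as the proxy did
        # for /istio/5s with a 1s timeout.
        return 504, "upstream request timeout"
    finally:
        # Do not block on the slow handler; let it finish in the background.
        executor.shutdown(wait=False)
```

With a handler that sleeps for five seconds and timeout_s=1, the caller gets its 504 after about one second instead of holding the connection for the full five.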
2.5 Configure retries
Add a retry policy:
[root@k8s-master timeout]# cat demo.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: flask-demo-vs
spec:
  hosts:
  - flask-demo
  http:
  - route:
    - destination:
        host: flask-demo
    retries:
      attempts: 3          # retry up to three times
      perTryTimeout: 1s    # timeout for each attempt
      retryOn: 5xx         # retry condition: only 5xx responses are retried
[root@k8s-master timeout]# kubectl apply -f demo.yaml
virtualservice.networking.istio.io/flask-demo-vs configured
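Since demo.yaml is edited by hand three times in this walkthrough, it may be convenient to generate the manifest instead. The sketch below builds the same VirtualService as a Python dictionary and serializes it to JSON, which kubectl apply also accepts. The helper names virtual_service and to_json are hypothetical; the field names (timeout, retries, attempts, perTryTimeout, retryOn) are the ones used in the YAML above.

```python
import json


def virtual_service(name, host, timeout=None, retries=None):
    """Build a VirtualService manifest with one HTTP route to `host`."""
    http_route = {"route": [{"destination": {"host": host}}]}
    if timeout is not None:
        http_route["timeout"] = timeout      # e.g. "1s"
    if retries is not None:
        http_route["retries"] = retries      # e.g. {"attempts": 3, ...}
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": name},
        "spec": {"hosts": [host], "http": [http_route]},
    }


def to_json(manifest):
    """Serialize the manifest so it can be piped to kubectl apply -f -."""
    return json.dumps(manifest, indent=2)
```

For example, printing to_json(virtual_service("flask-demo-vs", "flask-demo", retries={"attempts": 3, "perTryTimeout": "1s", "retryOn": "5xx"})) and piping the result to kubectl apply -f - should be equivalent to applying the file above. Passing timeout="1s" as well puts both settings on the same route.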
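To test this, open a second terminal window for watching the logs.
In the test window, run the following commands to request three URLs that respond normally: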
[root@k8s-master timeout]# kubectl exec nginx -it /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@nginx:/# curl flask-demo:5000/istio/test1
test1root@nginx:/# curl flask-demo:5000/istio/test2
test2root@nginx:/# curl flask-demo:5000/istio/test3
test3root@nginx:/#
Check the logs in the log window:
[root@k8s-master timeout]# kubectl logs -f flask-demo-b8fb4484f-2lrhr
 * Serving Flask app 'app-demo' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://100.97.125.9:5000/ (Press CTRL+C to quit)
127.0.0.6 - - [08/Jan/2022 03:59:28] "GET /istio/test1 HTTP/1.1" 200 -
127.0.0.6 - - [08/Jan/2022 03:59:30] "GET /istio/test2 HTTP/1.1" 200 -
127.0.0.6 - - [08/Jan/2022 03:59:32] "GET /istio/test3 HTTP/1.1" 200 -
The three requests are logged as expected.
Next, request /istio/500 in the test window to trigger retries:
[root@k8s-master timeout]# kubectl exec nginx -it /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@nginx:/# curl flask-demo:5000/istio/test1
test1root@nginx:/# curl flask-demo:5000/istio/test2
test2root@nginx:/# curl flask-demo:5000/istio/test3
root@nginx:/#
root@nginx:/#
root@nginx:/# curl flask-demo:5000/istio/500
500 Internal Server Error
Internal Server Error
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
root@nginx:/#
Check the logs in the log window:
[root@k8s-master timeout]# kubectl logs -f flask-demo-b8fb4484f-2lrhr
 * Serving Flask app 'app-demo' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://100.97.125.9:5000/ (Press CTRL+C to quit)
127.0.0.6 - - [08/Jan/2022 03:59:28] "GET /istio/test1 HTTP/1.1" 200 -
127.0.0.6 - - [08/Jan/2022 03:59:30] "GET /istio/test2 HTTP/1.1" 200 -
127.0.0.6 - - [08/Jan/2022 03:59:32] "GET /istio/test3 HTTP/1.1" 200 -
127.0.0.6 - - [08/Jan/2022 04:01:48] "GET /istio/500 HTTP/1.1" 500 -
127.0.0.6 - - [08/Jan/2022 04:01:48] "GET /istio/500 HTTP/1.1" 500 -
127.0.0.6 - - [08/Jan/2022 04:01:48] "GET /istio/500 HTTP/1.1" 500 -
127.0.0.6 - - [08/Jan/2022 04:01:48] "GET /istio/500 HTTP/1.1" 500 -
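The request was retried. Why are there four log lines? Because the three retries plus the initial request add up to four requests in total.
If the first retry had succeeded, the log would show one failure followed by one success.
If the second retry had succeeded, the log would show two failures followed by one success.
In this experiment the endpoint can never return success, so all four requests returned 500.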
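The counting above can be checked with a small simulation. This is a sketch of the retry logic as described in this post, not Envoy's actual implementation: send_with_retries is a hypothetical helper that calls a stand-in upstream once, retries up to attempts more times while the status is 5xx, and returns the status codes the backend would have logged.

```python
def send_with_retries(upstream, attempts=3):
    """Call upstream() once, then retry up to `attempts` times on 5xx.

    Returns the list of status codes seen, one per request that reached
    the backend (the equivalent of the Flask log lines).
    """
    seen = []
    for _ in range(1 + attempts):  # initial request + retries
        status = upstream()
        seen.append(status)
        if not 500 <= status <= 599:
            break  # not a retriable status, stop here
    return seen
```

An upstream that always returns 500 produces four entries, matching the four log lines above. An upstream that fails once and then succeeds produces one 500 followed by one 200, and one that fails twice produces two 500s followed by one 200.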



