
2021-09-23

Fixing a class of bug that occurs when iterating over page elements with selenium

Contents

- Fixing a class of bug that occurs when iterating over page elements with selenium
Preface
1. A simple example
Summary

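Preface

This article deals with the StaleElementReferenceException that selenium raises when iterating over page elements. In short, once the page refreshes or redirects to another page, elements previously located with find_element() and similar methods are no longer valid for the new page, so the program raises this exception.
For more detail on what causes it, see the linked article:
Element Reference Exception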
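
The mechanism described above can be imitated without a browser. The sketch below is only a toy model, not how selenium works internally: FakePage, FakeElement, StaleReference and first_failing_index are all hypothetical names made up for illustration. It assumes that every click reloads the page and that a handle is only valid for the page version it was found on.

```python
class StaleReference(Exception):
    """Stand-in for selenium's StaleElementReferenceException (toy model only)."""


class FakePage:
    """Hypothetical page whose DOM is rebuilt on every navigation."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.generation = 0  # bumped each time the page reloads

    def find_elements(self):
        # Each handle remembers which page generation it was found in.
        return [FakeElement(self, label, self.generation) for label in self.labels]

    def reload(self):
        self.generation += 1


class FakeElement:
    def __init__(self, page, label, generation):
        self.page = page
        self.label = label
        self.generation = generation

    def click(self):
        if self.generation != self.page.generation:
            raise StaleReference(f"{self.label!r} was located before the page reloaded")
        self.page.reload()  # assumption: clicking a tab navigates and rebuilds the DOM


def first_failing_index(page):
    """Click handles from a single find_elements() call.

    Returns the index of the first handle that is stale, or None if all clicks succeed.
    """
    for index, element in enumerate(page.find_elements()):
        try:
            element.click()
        except StaleReference:
            return index
    return None


page = FakePage(["Home", "Domestic", "International"])
print(first_failing_index(page))  # prints 1: the second click uses a stale handle
```

In this model the first click always succeeds and the second one fails, because all handles were collected before the first click reloaded the page.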

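1. A simple example

Consider the Baidu News page (http://news.baidu.com/):

We want to do the following: use selenium from Python to simulate clicking each of the page's navigation tabs, such as "首页" (Home), "国内" (Domestic) and "国际" (International).
Suppose we write: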

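from selenium import webdriver
import time

bro = webdriver.Chrome()
bro.get('http://news.baidu.com/')
# Locate all navigation tabs once, before any click
l_list = bro.find_elements('xpath', '//*[@id="channel-all"]/div/ul/li')
for l in l_list:
    # Every reference in l_list belongs to the page as it was when located
    l.click()
    time.sleep(1)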

Running this triggers the StaleElementReferenceException mentioned in the preface.

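Debugging shows that the error occurs on the second iteration of the for loop. In other words, the click() call in the first iteration either refreshed the page or navigated to another page. From what I have found so far, calling find_elements() again resolves the problem, so what remains is designing the program around that.
Approach: run the for loop inside a try statement; when the exception is raised, call find_elements() again to re-locate the elements, and drop the tabs that have already been clicked.
The code is as follows: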

from selenium import webdriver
from selenium.common import exceptions as ex
import time

XPATH = '//*[@id="channel-all"]/div/ul/li'

bro = webdriver.Chrome()
bro.get('http://news.baidu.com/')
clicked = 0  # number of tabs clicked successfully so far
elements_list = bro.find_elements('xpath', XPATH)
while True:
    try:
        for l in elements_list:
            l.click()
            clicked += 1
        # The for loop finished without an exception: nothing left to click
        break
    except ex.StaleElementReferenceException:
        # Re-locate the tabs on the current page and drop the ones
        # that have already been clicked
        elements_list = bro.find_elements('xpath', XPATH)[clicked:]
print('Done!')
time.sleep(100)
bro.quit()
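
Another way to apply the same "call find_elements() again" idea, shown here only as a sketch and not as the method used in this article, is to re-locate the tabs before every click and pick the next one by index, so that no try/except is needed. The function name click_tabs_by_index and the pause parameter are hypothetical; the only selenium call it relies on is driver.find_elements('xpath', ...), the same call used in the code above. It assumes the tabs appear in the same order on every page.

```python
import time


def click_tabs_by_index(driver, xpath, pause=1.0):
    """Click every element matching xpath, re-locating the elements before each click.

    driver is any object exposing find_elements(by, value), such as a selenium
    WebDriver. Returns the number of elements that were clicked.
    """
    total = len(driver.find_elements('xpath', xpath))
    clicked = 0
    for index in range(total):
        # Fresh lookup: these references belong to the page as it is right now.
        tabs = driver.find_elements('xpath', xpath)
        if index >= len(tabs):
            break  # the current page has fewer matching elements than the first one
        tabs[index].click()
        clicked += 1
        time.sleep(pause)  # crude fixed wait for the page to settle
    return clicked
```

With the browser from the example above, the call would be click_tabs_by_index(bro, '//*[@id="channel-all"]/div/ul/li'). The trade-off is one extra find_elements() lookup per tab in exchange for simpler control flow.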

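Summary

This is just my own modest take. If you have a better approach, feel free to message me. Finally, thank you for reading to the end.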

