Web Crawling -- 7: Controlling the Browser with Selenium WebDriver

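1. To drive a browser through the Selenium library you need a WebDriver driver executable, and its version must match the browser installed locally.
        Download page for the Google Chrome driver (chromedriver), all versions:
        http://chromedriver.storage.googleapis.com/index.html
        (this index only lists drivers up to version 114; drivers for newer Chrome releases are published through the Chrome for Testing site)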

        Download page for the Firefox driver (geckodriver), all versions:
        https://github.com/mozilla/geckodriver/releases/
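
As a rough illustration of what "matching versions" means for Chrome, the sketch below compares only the leading (major) number of two dotted version strings. The function names and the sample version strings are made up for this example, and the major-number rule is an assumption that applies to chromedriver; geckodriver uses its own numbering, so check its release notes for the supported Firefox versions instead.

```python
def major_version(version):
    """Return the leading integer of a dotted version string such as '114.0.5735.90'."""
    return int(version.strip().split(".")[0])


def versions_match(browser_version, driver_version):
    """True when browser and driver share the same major version number."""
    return major_version(browser_version) == major_version(driver_version)


if __name__ == "__main__":
    # Illustrative version strings, not read from a real installation.
    print(versions_match("114.0.5735.199", "114.0.5735.90"))
    print(versions_match("115.0.5790.102", "114.0.5735.90"))
```

In practice the browser version can be read from the browser's "About" page and the driver version from the name of the download folder.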

2. Manually create a directory to hold the browser drivers, for example F:\GeckoDriver, and put the downloaded driver files (for example chromedriver, geckodriver) into it.

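3. Go to My Computer -> Properties -> System Settings -> Advanced -> Environment Variables -> System Variables -> Path, and append the F:\GeckoDriver directory to the value of Path. For example: <existing Path value>;F:\GeckoDriver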
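
To check that the Path change took effect, a small standard-library sketch like the following can help. It asks `shutil.which` where each driver executable would be resolved from; `find_driver` is a hypothetical helper name, not part of Selenium. Open a new terminal before running it, because an already open one keeps the old environment.

```python
import shutil


def find_driver(names=("chromedriver", "geckodriver"), path=None):
    """Return {driver name: resolved path or None} for each driver name.

    `path` defaults to the PATH environment variable; pass a directory
    string to search somewhere else instead.
    """
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    for name, location in find_driver().items():
        print(f"{name}: {location or 'not found on PATH'}")
```

If a driver is reported as not found, the directory was probably not added to Path correctly, or the file name differs from the one being searched for.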

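# Written against the Selenium 4 API: the find_element_by_* helpers and the
# chrome_options keyword found in older tutorials have been removed there.
from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# The author's local chromedriver path; replace it with the location of your own driver.
DRIVER_PATH = r'G:安装程序安装的程序谷歌浏览器驱动chromedriver.exe'


# Extra arguments can be passed when the browser is created.
def headless():
    '''Start Chrome in headless mode (no visible window) and return the driver.'''
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    return webdriver.Chrome(service=Service(DRIVER_PATH), options=options)


def proxy():
    '''Start Chrome behind an IP proxy and return the driver.'''
    options = webdriver.ChromeOptions()
    options.add_argument('--proxy-server=http://183.154.213.117:9000')
    return webdriver.Chrome(service=Service(DRIVER_PATH), options=options)


# Connect to Chrome, giving the path of the browser driver.
chrome = webdriver.Chrome(service=Service(DRIVER_PATH))

# Open the site.
chrome.get("https://www.baidu.com")
sleep(3)
# Select elements: type into the search box, then click the search button.
element = chrome.find_element(By.ID, 'kw')
element.send_keys('你好')
element = chrome.find_element(By.ID, 'su')
element.click()
# Take a screenshot.
chrome.save_screenshot('baidu.png')
# Move the scrollbar.
js = 'document.documentElement.scrollTop=1000'  # scroll down 1000 px (the bottom, if the page is short)
chrome.execute_script(js)  # run the JavaScript statement

# Get the page source.
sleep(3)  # give the browser time to render
html = chrome.page_source

# Close the browser.
chrome.quit()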
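
The fixed `sleep(3)` calls in the script either wait too long or not long enough, depending on how fast the page renders. One alternative is to poll until a condition holds. The helper below is a hand-rolled, standard-library sketch of that idea; `wait_until` and its parameters are hypothetical names chosen for this example. Selenium ships its own explicit-wait class, `WebDriverWait` in `selenium.webdriver.support.ui`, which is the usual choice in real projects.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)
```

With the script's driver object, the pause after `chrome.get(...)` could then be written as `wait_until(lambda: chrome.find_elements('id', 'kw'))`, which returns as soon as the search box exists, since an empty result list counts as "not ready yet".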
