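There is no need to use mechanize; just send the correct form data in a POST request.
Also, using regular expressions to parse HTML is a bad idea. You would be better off using an HTML parser such as lxml.html.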

    import requests
    import lxml.html as lh


    def gender_genie(text, genre):
        url = 'http://bookblog.net/gender/analysis.php'
        caption = 'The Gender Genie thinks the author of this passage is:'

        # The same fields the HTML form would submit.
        form_data = {
            'text': text,
            'genre': genre,
            'submit': 'submit',
        }

        response = requests.post(url, data=form_data)
        tree = lh.document_fromstring(response.content)

        # The verdict is the text that directly follows the bold caption.
        return tree.xpath("//b[text()=$caption]", caption=caption)[0].tail.strip()


    if __name__ == '__main__':
        print(gender_genie('I have a beard!', 'blog'))
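
If the XPath step looks opaque, here is a small offline sketch of just that part. The HTML snippet and the `extract_verdict` helper are made up for illustration; the real page markup may well differ, so treat this as an assumption about the layout rather than a description of the site.

```python
import lxml.html as lh

# Hypothetical stand-in for the page returned by the POST request;
# the real markup on bookblog.net may differ.
SAMPLE_HTML = """
<html><body>
<p><b>The Gender Genie thinks the author of this passage is:</b> male!</p>
</body></html>
"""

CAPTION = 'The Gender Genie thinks the author of this passage is:'


def extract_verdict(html, caption=CAPTION):
    """Return the text that directly follows the bold caption."""
    tree = lh.document_fromstring(html)
    # $caption is an XPath variable; lxml binds it from the keyword argument,
    # so the caption never has to be quoted or escaped by hand.
    matches = tree.xpath('//b[text()=$caption]', caption=caption)
    if not matches:
        raise ValueError('caption not found in page')
    # .tail is the text between </b> and the next tag (None if there is none).
    return (matches[0].tail or '').strip()


if __name__ == '__main__':
    print(extract_verdict(SAMPLE_HTML))
```

Passing the caption as an XPath variable instead of formatting it into the expression avoids quoting problems, and checking for an empty match list gives a clearer error than an `IndexError` if the page layout changes.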


