How do I perform HTML decoding/encoding using Python/Django?

Given the Django use case, there are two answers to this. Here is its django.utils.html.escape function, for reference:

    def escape(html):
        """Returns the given HTML with ampersands, quotes and carets encoded."""
        return mark_safe(force_unicode(html).replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;').replace('"', '&quot;').replace("'", '&#39;'))

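To reverse this, the Cheetah function described in Jake's answer should work, but it is missing the single quote. This version includes an updated tuple, with the order of replacement reversed to avoid symmetry problems: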

    def html_decode(s):
        """
        Returns the ASCII decoded version of the given HTML string. This does
        NOT remove normal HTML tags like <p>.
        """
        htmlCodes = (
            ("'", '&#39;'),
            ('"', '&quot;'),
            ('>', '&gt;'),
            ('<', '&lt;'),
            ('&', '&amp;'),  # must come last, so "&amp;lt;" becomes "&lt;" and not "<"
        )
        for code in htmlCodes:
            s = s.replace(code[1], code[0])
        return s

    unescaped = html_decode(my_string)
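To see why the replacement order matters, here is a small sketch that decodes the same string with the ampersand handled last and then first. The names decode_with, AMP_LAST and AMP_FIRST are made up for this illustration and are not part of Django or Cheetah.

```python
# Hypothetical helper for illustration only: apply (char, entity) pairs in order.
def decode_with(pairs, s):
    for char, entity in pairs:
        s = s.replace(entity, char)
    return s

# Same pairs as html_decode above, ampersand last.
AMP_LAST = (("'", '&#39;'), ('"', '&quot;'), ('>', '&gt;'), ('<', '&lt;'), ('&', '&amp;'))
# Same pairs in the encoding order, ampersand first.
AMP_FIRST = tuple(reversed(AMP_LAST))

# What escape() produces for the literal text "&lt;p&gt;".
escaped = '&amp;lt;p&amp;gt;'

good = decode_with(AMP_LAST, escaped)   # '&lt;p&gt;' (the original text)
bad = decode_with(AMP_FIRST, escaped)   # '<p>' (decoded one level too far)
```

With the ampersand first, the freshly produced "&lt;" is picked up by the next replacement, so the text is decoded twice.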

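This, however, is not a general solution; it is only appropriate for strings encoded with django.utils.html.escape. More generally, it is a good idea to stick with the standard library: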

    # Python 2.x:
    import HTMLParser
    html_parser = HTMLParser.HTMLParser()
    unescaped = html_parser.unescape(my_string)

    # Python 3.x (HTMLParser.unescape was deprecated in 3.4 and removed in 3.9):
    import html.parser
    html_parser = html.parser.HTMLParser()
    unescaped = html_parser.unescape(my_string)

    # >= Python 3.5:
    from html import unescape
    unescaped = unescape(my_string)
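As a quick check of the standard-library route on a current Python 3, the sketch below round-trips a string through html.escape and html.unescape. The sample strings are made up for illustration.

```python
from html import escape, unescape

# Made-up sample containing every character that escape() rewrites.
raw = '<a href="/x?a=1&b=2">Tom\'s</a>'

# quote=True (the default) also encodes double and single quotes.
encoded = escape(raw, quote=True)

# unescape() reverses escape() ...
round_trip = unescape(encoded)

# ... and also handles named and numeric references that
# django.utils.html.escape never produces.
general = unescape('caf&eacute; &#8364;5')
```

Unlike html_decode, unescape is not limited to the five entities that Django emits, which is why it is the more general choice.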

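As a suggestion: it may make more sense to store the HTML unescaped in your database. If possible, it is worth looking into getting unescaped results back from BeautifulSoup and avoiding this process altogether.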

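With Django, escaping only occurs during template rendering; so to prevent escaping, you just tell the templating engine not to escape your string. To do that, use one of these options in your template: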

    {{ context_var|safe }}

    {% autoescape off %}
        {{ context_var }}
    {% endautoescape %}

