HTML sanitization

  1. Implementations

  2. See also

  3. References

{{Refimprove|date=December 2009}}

HTML sanitization is the process of examining an HTML document and producing a new HTML document that preserves only the tags designated as "safe" and desired. It can be used to protect against cross-site scripting (XSS) attacks by sanitizing any HTML code submitted by a user.

Basic text-formatting tags such as <b>, <i>, <u>, <em>, and <strong> are often allowed, while more advanced tags such as <script>, <object>, <embed>, and <link> are removed by the sanitization process. Potentially dangerous attributes, such as onclick, are also removed to prevent malicious code from being injected.
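
As an illustration of the tag and attribute filtering described above, the following is a minimal sketch built on Python's standard html.parser module. The names ALLOWED_TAGS, DROP_WITH_CONTENT, AllowlistSanitizer, and sanitize are hypothetical, and the choice to discard the body of <script> and <style> elements is an assumption of this sketch rather than a rule from any particular library. It is not a production-grade sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist: the formatting tags named in the text above.
ALLOWED_TAGS = {"b", "i", "u", "em", "strong"}
# Assumption of this sketch: these elements are dropped together with their content.
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    """Re-emits only allowlisted tags, without any of their attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            # Attributes (onclick, style, ...) are never copied to the output.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Text is re-escaped so it cannot reintroduce markup.
            self.out.append(escape(data))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(sanitize('<b onclick="evil()">hi</b><script>alert(1)</script> there'))
```

Tags that are neither allowlisted nor in DROP_WITH_CONTENT (for example <object>) are removed while their text content is kept; comments are discarded because the parser's default comment handler does nothing.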

Sanitization is typically performed using either a whitelist or a blacklist approach. Leaving a safe HTML element off a whitelist is not serious; it simply means that the feature will not be available after sanitization. If an unsafe element is left off a blacklist, however, the vulnerability will not be removed from the HTML output. An out-of-date blacklist can therefore be dangerous if new, unsafe features have been introduced to the HTML Standard.
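
The asymmetry between the two approaches can be shown with a small sketch. The two sets below are taken from the example tags mentioned earlier in this article and are assumptions for illustration only; the function names are hypothetical.

```python
# Illustrative lists only, built from the example tags named in this article.
WHITELIST = {"b", "i", "u", "em", "strong"}
BLACKLIST = {"script", "object", "embed", "link"}


def allowed_by_whitelist(tag):
    # Anything not explicitly listed is rejected.
    return tag.lower() in WHITELIST


def allowed_by_blacklist(tag):
    # Anything not explicitly listed is accepted.
    return tag.lower() not in BLACKLIST


def compare(tag):
    """Return (whitelist verdict, blacklist verdict) for a tag name."""
    return allowed_by_whitelist(tag), allowed_by_blacklist(tag)


for tag in ("b", "script", "mark", "iframe"):
    print(tag, compare(tag))
```

Here <mark>, a harmless tag missing from the whitelist, is merely lost, whereas <iframe>, an unsafe tag missing from the blacklist, passes through the blacklist check untouched.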

Further sanitization can be performed based on rules that specify which operation is to be performed on the subject tags. Typical operations include removing the tag itself while preserving its content, preserving only the textual content of a tag, or forcing certain values on attributes.[1]
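
The three operations listed above might be sketched as a per-tag rule table, again using Python's standard html.parser module. The rule names ("keep", "unwrap", "text"), the RULES table, and the forced class value are all hypothetical choices for this sketch and do not reflect the API of the cited library. Unlisted tags are unwrapped by default, so the body of a <script> element would be emitted as inert, escaped text.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical rule table: tag -> (operation, attribute values to force).
RULES = {
    "b": ("keep", {}),
    "p": ("keep", {"class": "user-content"}),  # force a value on an attribute
    "span": ("unwrap", {}),                    # remove the tag, keep its content
    "h1": ("text", {}),                        # keep only the textual content
}
DEFAULT_RULE = ("unwrap", {})  # assumption: unknown tags are unwrapped


class RuleSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.text_only_depth = 0  # greater than 0 inside a "text" element

    def handle_starttag(self, tag, attrs):
        op, forced = RULES.get(tag, DEFAULT_RULE)
        if op == "text":
            self.text_only_depth += 1
        elif op == "keep" and self.text_only_depth == 0:
            # Input attributes are ignored; only forced values are written.
            rendered = "".join(f' {k}="{escape(v)}"' for k, v in forced.items())
            self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        op, _ = RULES.get(tag, DEFAULT_RULE)
        if op == "text":
            self.text_only_depth = max(0, self.text_only_depth - 1)
        elif op == "keep" and self.text_only_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))


def apply_rules(html_text):
    parser = RuleSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(apply_rules('<p style="x"><span>Hi <b>there</b></span></p><h1>T <b>x</b></h1>'))
```

In this sketch, "unwrap" lets nested tags be processed by their own rules, while "text" flattens everything inside the element to plain text.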

Implementations

In PHP, HTML sanitization can be performed using the strip_tags() function, at the risk of removing all textual content that follows an unclosed less-than symbol or angle bracket.[2] The HTML Purifier library is another popular option for PHP applications.[3]

In Java (and .NET), sanitization can be achieved by using the OWASP Java HTML Sanitizer Project.[4]

In .NET, a number of sanitizers use the Html Agility Pack, an HTML parser.[5][6][7]

In JavaScript, there are "JS-only" sanitizers for the back end, and browser-based[8] implementations that use the browser's own DOM parser to parse the HTML (for better performance).

See also

  • Data sanitization

References

1. https://github.com/Vereyon/HtmlRuleSanitizer
2. "strip_tags". PHP.NET. http://us3.php.net/manual/en/function.strip-tags.php
3. http://www.htmlpurifier.org
4. https://www.owasp.org/index.php/OWASP_Java_HTML_Sanitizer_Project
5. http://htmlagilitypack.codeplex.com/
6. http://eksith.wordpress.com/2011/06/14/whitelist-santize-htmlagilitypack/
7. https://github.com/Vereyon/HtmlRuleSanitizer
8. https://github.com/jitbit/HtmlSanitizer