It makes no sense to adopt a one-size-fits-all approach to filtersets: therefore, users must be able to define their own sets of allowed elements, as well as switch between HTML doctypes.
Our goals are to let the user:
By default, users will use a doctype-based, permissive but secure whitelist. They must define a doctype, and this serves as the first method of determining a filterset.
The doctype is identified by the name the W3C has given to the document type, not by the DTD identifier.
This parameter is set via the configuration object:
$config->set('HTML', 'Doctype', 'XHTML 1.0 Transitional');
However, selecting this doctype doesn't mean much on its own, because if we adhered exactly to its definition we would be letting XSS and other nasties through. HTML Purifier must therefore allow only a subset of the doctype, which we shall call a filterset.
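To make the subset relationship concrete, here is a minimal sketch. The element lists are tiny illustrative samples rather than real definitions, and `applyFilterset` is a hypothetical helper, not an HTML Purifier function.

```php
<?php
// Hypothetical sketch; the lists below are small illustrative samples only.

// A few of the elements that XHTML 1.0 Transitional defines.
$doctypeElements = ['a', 'b', 'iframe', 'p', 'script'];

// Elements that a hypothetical filterset is willing to accept.
$filtersetElements = ['a', 'b', 'p', 'video'];

/**
 * Keep only the elements that the doctype defines AND the filterset allows.
 */
function applyFilterset(array $doctypeElements, array $filtersetElements): array
{
    // array_intersect() preserves the keys of the first array, so reindex.
    return array_values(array_intersect($doctypeElements, $filtersetElements));
}

$allowed = applyFilterset($doctypeElements, $filtersetElements);
echo implode(', ', $allowed), "\n"; // prints "a, b, p"
```

Here `script` and `iframe` are dropped because the filterset does not list them, and `video` is dropped because the doctype does not define it.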
By default, HTML Purifier will use the Rich filterset, which allows as many elements as can safely be permitted for untrusted sources. Other possible filtersets could be:
Extension authors would be able to define custom filtersets for other users to use.
A possible call to select a filterset would be:
$config->set('HTML', 'Filterset', 'Rich');
If this cookie-cutter approach doesn't appeal to a user, they may decide to roll their own filterset by selecting the modules, tags and attributes to allow.
This would make use of the same facilities as a filterset author would use, except that it would go under an anonymous filterset that would be auto-selected if any of the relevant module/tag/attribute selection configuration directives were non-null.
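The auto-selection rule described above could be sketched roughly as follows. This is only an illustration: the function `resolveFilterset`, the directive names `AllowedModules`, `AllowedElements` and `AllowedAttributes`, the `Plain` filterset name, and the `'(anonymous)'` marker are all hypothetical stand-ins, and a plain array stands in for the configuration object.

```php
<?php
// Hypothetical sketch: none of these names are HTML Purifier's actual API.
// A plain associative array stands in for the configuration object.

/**
 * Decide which filterset applies to a set of directives.
 * Returns '(anonymous)' when the user has hand-picked modules, tags or
 * attributes; otherwise returns the named filterset (Rich by default).
 */
function resolveFilterset(array $directives): string
{
    // Assumed names for the module/tag/attribute selection directives.
    $selectionKeys = ['AllowedModules', 'AllowedElements', 'AllowedAttributes'];

    foreach ($selectionKeys as $key) {
        // isset() is false for missing keys and for explicit nulls alike.
        if (isset($directives[$key])) {
            return '(anonymous)';
        }
    }

    // No custom selection: fall back to the named filterset.
    return $directives['Filterset'] ?? 'Rich';
}

echo resolveFilterset([]), "\n";                                  // Rich
echo resolveFilterset(['Filterset' => 'Plain']), "\n";            // Plain
echo resolveFilterset(['AllowedElements' => ['p', 'b']]), "\n";   // (anonymous)
```

Under this reading, a custom selection takes precedence over any named filterset, so a user who sets both gets the anonymous one.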