+ HTML Purifier has a fairly complex system for configuration. Users
+ interact with an HTMLPurifier_Config object to
+ set configuration directives. The values they set are validated according
+ to a configuration schema, HTMLPurifier_ConfigSchema.
+
+
+
+ The schema is mostly transparent to end-users, but if you're doing development
+ work for HTML Purifier and need to define a new configuration directive,
+ you'll need to interact with it. We'll also talk about how to define
+ userspace configuration directives at the very end.
+
+
+
Write a directive file
+
+
+ Directive files define configuration directives to be used by
+ HTML Purifier. They are placed in library/HTMLPurifier/ConfigSchema/schema/
+ in the form Namespace.Directive.txt (I
+ couldn't think of a more descriptive file extension.)
+ Directive files are actually what we call StringHashes,
+ i.e. associative arrays represented in a string form reminiscent of
+ PHPT tests. Here's a
+ sample directive file, Test.Sample.txt:
+
+
+
Test.Sample
+TYPE: string/null
+DEFAULT: NULL
+ALLOWED: 'foo', 'bar'
+VALUE-ALIASES: 'baz' => 'bar'
+VERSION: 3.1.0
+--DESCRIPTION--
+This is a sample configuration directive for the purposes of the
+<code>dev-config-schema.html</code> documentation.
+--ALIASES--
+Test.Example
+
+
+ Each of these segments has a specific meaning:
+
+
+
+
+
+
Key
+
Example
+
Description
+
+
+
+
+
ID
+
Test.Sample
+
The name of the directive, in the form Namespace.Directive
+ (implicitly the first line)
+
+
+
TYPE
+
string/null
+
The type of variable this directive accepts. See below for
+ details. You can also add /null to the end of
+ any basic type to allow null values too.
+
+
+
DEFAULT
+
NULL
+
A parseable PHP expression of the default value.
+
+
+
DESCRIPTION
+
This is a...
+
An HTML description of what this directive does.
+
+
+
VERSION
+
3.1.0
+
Recommended. The version of HTML Purifier in which this directive was added.
+ Directives that have been around since 1.0.0 don't have this,
+ but any new ones should.
+
+
+
ALIASES
+
Test.Example
+
Optional. A comma separated list of aliases for this directive.
+ This is most useful for backwards compatibility and should
+ not be used otherwise.
+
+
+
ALLOWED
+
'foo', 'bar'
+
Optional. Set of allowed values for a directive,
+ a comma-separated list of parseable PHP expressions. This
+ is only allowed for string, istring, text and itext TYPEs.
+
+
+
VALUE-ALIASES
+
'baz' => 'bar'
+
Optional. Mapping of one value to another, given as
+ a comma-separated list of key => value pairs. This
+ is only allowed for string, istring, text and itext TYPEs.
+
+
+
DEPRECATED-VERSION
+
3.1.0
+
Not shown. Indicates that the directive was
+ deprecated in this version.
+
+
+
DEPRECATED-USE
+
Test.NewDirective
+
Not shown. Indicates what new directive should be
+ used instead. Note that the two directives will not behave
+ identically, although the new one should offer the same functionality.
+ If they are identical, use an alias instead.
+
+
+
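As a rough illustration of how the TYPE, ALLOWED and VALUE-ALIASES keys fit together, the sketch below resolves a value for a directive like Test.Sample. The function resolveSampleValue() is hypothetical and is not part of HTML Purifier. It assumes that aliases are applied before the allowed-value check, and that null skips that check when the type carries the /null suffix.

```php
<?php
// Hypothetical helper, not part of HTML Purifier: shows how TYPE's /null
// suffix, VALUE-ALIASES and ALLOWED could interact for a string directive.
function resolveSampleValue($value, array $allowed, array $valueAliases, $allowsNull)
{
    if ($value === null) {
        if (!$allowsNull) {
            throw new InvalidArgumentException('null is not permitted');
        }
        return null; // null bypasses the ALLOWED check
    }
    if (!is_string($value)) {
        throw new InvalidArgumentException('expected a string');
    }
    // Rewrite an aliased value to its canonical form first.
    if (isset($valueAliases[$value])) {
        $value = $valueAliases[$value];
    }
    // ALLOWED is treated here as a lookup array: allowed values are the keys.
    if (!isset($allowed[$value])) {
        throw new InvalidArgumentException("'$value' is not an allowed value");
    }
    return $value;
}

// Values taken from the Test.Sample directive file.
$allowed = array('foo' => true, 'bar' => true);
$aliases = array('baz' => 'bar');

$fromAlias = resolveSampleValue('baz', $allowed, $aliases, true); // 'bar'
$fromNull  = resolveSampleValue(null, $allowed, $aliases, true);  // null
```

Under these assumptions, a value outside ALLOWED with no alias (for example 'qux') is rejected with an exception.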
+
+
+ Some notes on format and style:
+
+
+
+
+ Each of these keys can be expressed in the short format
+ (KEY: Value) or the long format
+ (--KEY-- with value beneath). You must use the
+ long format if multiple lines are needed, or if a long format
+ has been used already (that's why ALIASES in our
+ example is in the long format); otherwise, it's user preference.
+
+
+ The HTML descriptions should be wrapped at about 80 columns; do
+ not rely on editor word-wrapping.
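To make the short and long formats concrete, here is a minimal sketch of a parser for them. The function parseDirectiveSketch() is hypothetical; the real parsing is done by HTMLPurifier_StringHashParser, and its behavior may differ in details. The sketch assumes that once a long-format key has been seen, every later line belongs to a long-format key, which is why short-format keys cannot follow.

```php
<?php
// Hypothetical parser for illustration only; it is not the library's
// HTMLPurifier_StringHashParser.
function parseDirectiveSketch($text)
{
    $lines = preg_split('/\r\n|\r|\n/', $text);
    $hash = array('ID' => array_shift($lines)); // the first line is the ID
    $longKey = null; // set once a --KEY-- line has been seen
    foreach ($lines as $line) {
        if (preg_match('/^--([A-Z-]+)--$/', $line, $m)) {
            // Long format: the following lines belong to this key.
            $longKey = $m[1];
            $hash[$longKey] = '';
        } elseif ($longKey !== null) {
            $hash[$longKey] .= $line . "\n";
        } elseif (preg_match('/^([A-Z-]+):\s*(.*)$/', $line, $m)) {
            // Short format: KEY: Value on a single line.
            $hash[$m[1]] = $m[2];
        }
    }
    // Drop leading and trailing whitespace, including final newlines.
    return array_map('trim', $hash);
}

$sample = "Test.Sample\n"
    . "TYPE: string/null\n"
    . "DEFAULT: NULL\n"
    . "--DESCRIPTION--\n"
    . "This is a sample configuration directive.\n"
    . "--ALIASES--\n"
    . "Test.Example\n";

$hash = parseDirectiveSketch($sample);
// $hash['TYPE'] is 'string/null'; $hash['ALIASES'] is 'Test.Example'
```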
+
+
+
+
+ Also, as promised, here is the set of possible types:
+
+ The examples represent what will be returned out of the configuration
+ object; users have a little bit of leeway when setting configuration
+ values (for example, a lookup value can be specified as a list;
+ HTML Purifier will flip it as necessary.) These types are defined
+ in
+ library/HTMLPurifier/VarParser.php.
+
+ You may have noticed that your directive file isn't doing anything
+ yet. That's because it hasn't been added to the runtime
+ HTMLPurifier_ConfigSchema instance. Run
+ maintenance/generate-schema-cache.php to fix this.
+ If there were no errors, you're good to go! Don't forget to add
+ some unit tests for your functionality!
+
+
+
+ If you ever make changes to your configuration directives, you
+ will need to run this script again.
+
+
+
Errors
+
+
+ All directive files go through a rigorous validation process
+ in
+ library/HTMLPurifier/ConfigSchema/Validator.php, as well
+ as some basic checks during building. While
+ listing every error here is out of scope for this document, we
+ can give some general tips for interpreting error messages.
+ There are two types of errors: builder errors and validation errors.
+
+
+
Builder errors
+
+
+
+ Exception: Expected type string, got
+ integer in DEFAULT in directive hash 'Ns.Dir'
+
+
+
+
+ You can identify a builder error by the keyword "directive hash."
+ These are the easiest to deal with, because they directly correspond
+ with your directive file. Find the offending directive file (which
+ is the directive hash plus the .txt extension), find the
+ offending index ("in DEFAULT" means the DEFAULT key) and fix the error.
+ This particular error would occur if your default value is not the same
+ type as TYPE.
+
+
+
Validation errors
+
+
+
+ Exception: Alias 3 in valueAliases in directive
+ 'Ns.Dir' must be a string
+
+
+
+
+ These are a little trickier, because we're not actually validating
+ your directive file, or even the direct string hash representation.
+ We're validating an Interchange object, and the error messages do
+ not mention any string hash keys.
+
+
+
+ Nevertheless, it's not difficult to figure out what went wrong.
+ Read the "context" statements in reverse:
+
+
+
+
in directive 'Ns.Dir'
+
This means we need to look at the directive file Ns.Dir.txt
+
in valueAliases
+
There's no key actually called this, but there's one that's close:
+ VALUE-ALIASES. Indeed, that's where to look.
+
Alias 3
+
The value alias that is equal to 3 is the culprit.
+
+
+
+ In this particular case, you're not allowed to alias integer values to
+ string values.
+
+
+
+ The most difficult part is translating the Interchange member variable (valueAliases)
+ into a directive file key (VALUE-ALIASES), but there's a one-to-one
+ correspondence currently. If the two formats diverge, any discrepancies
+ will be described in
+ library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php.
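The translation from a member variable name to a directive file key can be sketched as a small helper. The function memberToDirectiveKey() is hypothetical and not part of HTML Purifier; it assumes the camelCase-to-hyphenated-upper-case pattern seen with valueAliases holds for the keys listed in this document. Members derived from another key (such as typeAllowsNull, which appears in InterchangeBuilder.php) would not have a key of their own.

```php
<?php
// Hypothetical helper: converts a camelCase Interchange member name into the
// upper-case, hyphenated key used in directive files.
function memberToDirectiveKey($member)
{
    // Insert a hyphen before each capital letter that is not the first
    // character, then upper-case the whole string.
    return strtoupper(preg_replace('/(?<!^)([A-Z])/', '-$1', $member));
}

$key = memberToDirectiveKey('valueAliases'); // 'VALUE-ALIASES'
```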
+
+
+
Internals
+
+
+ Much of the configuration schema framework's codebase deals with
+ shuffling data from one format to another, and doing validation on this
+ data.
+ The keystone of all of this is the HTMLPurifier_ConfigSchema_Interchange
+ class, which represents the purest, parsed representation of the schema.
+
+
+
+ Hand-writing this data is unwieldy, however, so we write directive files.
+ These directive files are parsed by HTMLPurifier_StringHashParser
+ into HTMLPurifier_StringHashes, which then
+ are run through HTMLPurifier_ConfigSchema_InterchangeBuilder
+ to construct the interchange object.
+
+
+
+ From the interchange object, the data can be siphoned into other forms
+ using HTMLPurifier_ConfigSchema_Builder subclasses.
+ For example, HTMLPurifier_ConfigSchema_Builder_ConfigSchema
+ generates a runtime HTMLPurifier_ConfigSchema object,
+ which HTMLPurifier_Config uses to validate its incoming
+ data. There is also a planned documentation builder.
+
+
+
$Id$
+
+
+
diff --git a/docs/dev-includes.txt b/docs/dev-includes.txt
index 37b9eed3..3db489df 100644
--- a/docs/dev-includes.txt
+++ b/docs/dev-includes.txt
@@ -275,3 +275,5 @@ New stuff
VERSION: Version number directive was introduced
DEPRECATED-VERSION: If the directive was deprecated, when was it deprecated?
DEPRECATED-USE: If the directive was deprecated, what should the user use now?
+REQUIRES: What classes does this configuration directive require, but are
+ not part of the HTML Purifier core?
diff --git a/docs/index.html b/docs/index.html
index 49d769a5..72bebf89 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -64,9 +64,12 @@ conventions.
Discusses when to flush HTML Purifier's various caches.
Describes config schema framework in HTML Purifier.
+
Proposals
diff --git a/docs/ref-html-modularization.txt b/docs/ref-html-modularization.txt
index e7fc0fce..c6faf721 100644
--- a/docs/ref-html-modularization.txt
+++ b/docs/ref-html-modularization.txt
@@ -1,6 +1,9 @@
The Modularization of HTMLDefinition in HTML Purifier
+WARNING: This document was drafted before the implementation of this
+ system, and some implementation details may have evolved over time.
+
HTML Purifier uses the modularization of XHTML
to organize the internals
of HTMLDefinition into a more manageable and extensible fashion. Rather
diff --git a/library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php b/library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php
index 74a53976..05ca367d 100644
--- a/library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php
+++ b/library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php
@@ -60,7 +60,7 @@ class HTMLPurifier_ConfigSchema_InterchangeBuilder
try {
$directive->default = $this->varParser->parse($hash->offsetGet('DEFAULT'), $directive->type, $directive->typeAllowsNull);
} catch (HTMLPurifier_VarParserException $e) {
- throw new HTMLPurifier_ConfigSchema_Exception($e->getMessage() . " in TYPE/DEFAULT in directive hash '$id'");
+ throw new HTMLPurifier_ConfigSchema_Exception($e->getMessage() . " in DEFAULT in directive hash '$id'");
}
}
diff --git a/library/HTMLPurifier/ConfigSchema/Validator.php b/library/HTMLPurifier/ConfigSchema/Validator.php
index bb88b1ff..a0a9f652 100644
--- a/library/HTMLPurifier/ConfigSchema/Validator.php
+++ b/library/HTMLPurifier/ConfigSchema/Validator.php
@@ -131,7 +131,7 @@ class HTMLPurifier_ConfigSchema_Validator
$this->with($d, 'allowed')
->assertNotEmpty()
->assertIsLookup(); // handled by InterchangeBuilder
- if (!isset($d->allowed[$d->default])) {
+ if (is_string($d->default) && !isset($d->allowed[$d->default])) {
$this->error('default', 'must be an allowed value');
}
$this->context[] = 'allowed';
diff --git a/maintenance/generate-includes.php b/maintenance/generate-includes.php
index b0f81090..d57674e3 100644
--- a/maintenance/generate-includes.php
+++ b/maintenance/generate-includes.php
@@ -19,11 +19,17 @@ $FS = new FSTools();
$exclude_dirs = array(
'HTMLPurifier/Language/',
+ 'HTMLPurifier/ConfigSchema/',
'HTMLPurifier/Filter/',
+ 'HTMLPurifier/Printer/',
+ /* These should be excluded, but need to have ConfigSchema support first
+
+ */
);
$exclude_files = array(
'HTMLPurifier/Lexer/PEARSax3.php',
'HTMLPurifier/Lexer/PH5P.php',
+ 'HTMLPurifier/Printer.php',
);
// Determine what files need to be included:
diff --git a/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultNullWithAllowed.vtest b/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultNullWithAllowed.vtest
new file mode 100644
index 00000000..fd44af44
--- /dev/null
+++ b/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultNullWithAllowed.vtest
@@ -0,0 +1,8 @@
+Ns
+DESCRIPTION: Namespace
+----
+Ns.Dir
+DESCRIPTION: Directive
+TYPE: string/null
+DEFAULT: null
+ALLOWED: 'a'
diff --git a/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultType.vtest b/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultType.vtest
index cc1b3668..fed53288 100644
--- a/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultType.vtest
+++ b/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultType.vtest
@@ -1,4 +1,4 @@
-ERROR: Expected type string, got integer in TYPE/DEFAULT in directive hash 'Ns.Dir'
+ERROR: Expected type string, got integer in DEFAULT in directive hash 'Ns.Dir'
----
Ns
DESCRIPTION: Namespace