diff --git a/TODO b/TODO
index 210a091a..8445f69e 100644
--- a/TODO
+++ b/TODO
@@ -19,8 +19,6 @@ IMPORTANT
  - Release candidate, because of the major changes
 
 DOCUMENTATION
- - Document new ConfigSchema setup and format; dev-includes.txt is a base
-   but we need it in HTML
  - Update French translation of README
 
 IMPORTANT FEATURES
diff --git a/docs/dev-advanced-api.html b/docs/dev-advanced-api.html
index 1cb3169b..83f82124 100644
--- a/docs/dev-advanced-api.html
+++ b/docs/dev-advanced-api.html
@@ -3,7 +3,7 @@
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-
+
 Advanced API - HTML Purifier
diff --git a/docs/dev-config-schema.html b/docs/dev-config-schema.html
new file mode 100644
index 00000000..bd3d1e1a
--- /dev/null
+++ b/docs/dev-config-schema.html
@@ -0,0 +1,368 @@
+Config Schema - HTML Purifier

Config Schema

+ +
Filed under Development
+
Return to the index.
+
HTML Purifier End-User Documentation
+ +

+ HTML Purifier has a fairly complex system for configuration. Users + interact with a HTMLPurifier_Config object to + set configuration directives. The values they set are validated according + to a configuration schema, HTMLPurifier_ConfigSchema. +

+ +

+ The schema is mostly transparent to end-users, but if you're doing development + work for HTML Purifier and need to define a new configuration directive, + you'll need to interact with it. We'll also talk about how to define + userspace configuration directives at the very end. +

+ +

Write a directive file

+ +

+ Directive files define configuration directives to be used by
+ HTML Purifier. They are placed in library/HTMLPurifier/ConfigSchema/schema/
+ in the form Namespace.Directive.txt (I
+ couldn't think of a more descriptive file extension).
+ Directive files are actually what we call StringHashes,
+ i.e. associative arrays represented in a string form reminiscent of
+ PHPT tests. Here's a
+ sample directive file, Test.Sample.txt:

+ +
Test.Sample
+TYPE: string/null
+DEFAULT: NULL
+ALLOWED: 'foo', 'bar'
+VALUE-ALIASES: 'baz' => 'bar'
+VERSION: 3.1.0
+--DESCRIPTION--
+This is a sample configuration directive for the purposes of the
+<code>dev-config-schema.html</code> documentation.
+--ALIASES--
+Test.Example
+ +

+ Each of these segments has a specific meaning: +

Key | Example | Description
ID | Test.Sample | The name of the directive, in the form Namespace.Directive (implicitly the first line)
TYPE | string/null | The type of variable this directive accepts. See below for details. You can also add /null to the end of any basic type to allow null values too.
DEFAULT | NULL | A parseable PHP expression of the default value.
DESCRIPTION | This is a... | An HTML description of what this directive does.
VERSION | 3.1.0 | Recommended. The version of HTML Purifier in which this directive was added. Directives that have been around since 1.0.0 don't have this, but any new ones should.
ALIASES | Test.Example | Optional. A comma separated list of aliases for this directive. This is most useful for backwards compatibility and should not be used otherwise.
ALLOWED | 'foo', 'bar' | Optional. Set of allowed values for a directive, given as a comma separated list of parseable PHP expressions. This is only allowed for string, istring, text and itext TYPEs.
VALUE-ALIASES | 'baz' => 'bar' | Optional. Mapping of one value to another, given as a comma separated list of key => value pairs. This is only allowed for string, istring, text and itext TYPEs.
DEPRECATED-VERSION | 3.1.0 | Not shown. Indicates that the directive was deprecated in this version.
DEPRECATED-USE | Test.NewDirective | Not shown. Indicates what new directive should be used instead. Note that the two directives will be implemented differently, although they should offer the same functionality. If they are identical, use an alias instead.
+ +
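To make the format concrete, here is a minimal sketch of how a directive file like the sample above could be read into an associative array. It is not the real HTMLPurifier_StringHashParser: the function name parse_string_hash is made up for this example, and it assumes only what the sample shows (the first line is the ID, KEY: value lines are single-line entries, and --KEY-- starts a multi-line block).

```php
<?php
// Simplified, hypothetical StringHash-style parser for illustration only.
// It is NOT HTMLPurifier_StringHashParser and ignores edge cases.
function parse_string_hash(string $text): array
{
    $hash = array();
    $multiKey = null; // name of the --KEY-- block being read, if any
    foreach (preg_split('/\r\n|\r|\n/', trim($text)) as $i => $line) {
        if (preg_match('/^--([A-Z-]+)--$/', $line, $m)) {
            // Start of a multi-line block such as --DESCRIPTION--
            $multiKey = $m[1];
            $hash[$multiKey] = '';
        } elseif ($multiKey !== null) {
            // Continuation of the current multi-line block
            $hash[$multiKey] .= ($hash[$multiKey] === '' ? '' : "\n") . $line;
        } elseif ($i === 0) {
            // The first line is implicitly the ID
            $hash['ID'] = trim($line);
        } elseif (strpos($line, ':') !== false) {
            // Single-line entry of the form KEY: value
            list($key, $value) = explode(':', $line, 2);
            $hash[trim($key)] = trim($value);
        }
    }
    return $hash;
}

$sample = <<<'TXT'
Test.Sample
TYPE: string/null
DEFAULT: NULL
ALLOWED: 'foo', 'bar'
VALUE-ALIASES: 'baz' => 'bar'
VERSION: 3.1.0
--DESCRIPTION--
This is a sample configuration directive for the purposes of the
<code>dev-config-schema.html</code> documentation.
--ALIASES--
Test.Example
TXT;

$hash = parse_string_hash($sample);
echo $hash['ID'], ' has type ', $hash['TYPE'], "\n";
```

Note that every value is still a raw string at this point; turning DEFAULT or ALLOWED into PHP values is a separate step.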

+ Some notes on format and style: +

+ + + +

+ Also, as promised, here is the set of possible types: +

Type | Example | Description
string | 'Foo' | String without newlines
istring | 'foo' | Case insensitive ASCII string without newlines
text | "A\nb" | String with newlines
itext | "a\nb" | Case insensitive ASCII string with newlines
int | 23 | Integer
float | 3.0 | Floating point number
bool | true | Boolean
lookup | array('key' => true) | Lookup array, used with isset($var[$key])
list | array('f', 'b') | List array, with ordered numerical indexes
hash | array('key' => 'val') | Associative array of keys to values
mixed | new stdClass | Any PHP variable is fine
+ +

+ The examples represent what will be returned out of the configuration + object; users have a little bit of leeway when setting configuration + values (for example, a lookup value can be specified as a list; + HTML Purifier will flip it as necessary.) These types are defined + in + library/HTMLPurifier/VarParser.php. +
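Two of the conveniences mentioned so far, the /null suffix on TYPE and the flipping of a list into a lookup, can be sketched in a few lines. The helper names split_type and to_lookup are hypothetical and exist only for this example; the real handling lives in the InterchangeBuilder and VarParser and may differ in detail.

```php
<?php
// Hypothetical helpers for illustration; not part of HTML Purifier.

// Split a TYPE value such as "string/null" into the base type
// and a flag saying whether null is also accepted.
function split_type(string $type): array
{
    $parts = explode('/', $type, 2);
    $allowsNull = isset($parts[1]) && strtolower($parts[1]) === 'null';
    return array($parts[0], $allowsNull);
}

// Turn a list such as array('b', 'i') into a lookup such as
// array('b' => true, 'i' => true). Arrays that are not plain
// lists are assumed to be lookups already and are returned as-is.
function to_lookup(array $value): array
{
    $isList = $value === array_values($value);
    return $isList ? array_fill_keys($value, true) : $value;
}

list($type, $allowsNull) = split_type('string/null');
echo $type, ' allows null: ', var_export($allowsNull, true), "\n";

$lookup = to_lookup(array('b', 'i'));
echo 'b is set: ', var_export(isset($lookup['b']), true), "\n";
```

The point of the lookup form is that membership tests become a cheap isset() call, which is why it is the form returned out of the configuration object.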

+ +

+ For more information on what values are allowed, and how they are parsed, + consult + library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php, as well + as + library/HTMLPurifier/ConfigSchema/Interchange/Directive.php for + the semantics of the parsed values. +

+ +

Refreshing the cache

+ +

+ You may have noticed that your directive file isn't doing anything + yet. That's because it hasn't been added to the runtime + HTMLPurifier_ConfigSchema instance. Run + maintenance/generate-schema-cache.php to fix this. + If there were no errors, you're good to go! Don't forget to add + some unit tests for your functionality! +

+ +

+ If you ever make changes to your configuration directives, you + will need to run this script again. +

+ +

Errors

+ +

+ All directive files are rigorously validated by
+ library/HTMLPurifier/ConfigSchema/Validator.php, and also undergo
+ some basic checks during building. Listing every error here is
+ out of scope for this document, but we
+ can give some general tips for interpreting error messages.
+ There are two types of errors: builder errors and validation errors.

+ +

Builder errors

+ +
+

+ Exception: Expected type string, got + integer in DEFAULT in directive hash 'Ns.Dir' +

+
+ +

+ You can identify a builder error by the keyword "directive hash." + These are the easiest to deal with, because they directly correspond + with your directive file. Find the offending directive file (which + is the directive hash plus the .txt extension), find the + offending index ("in DEFAULT" means the DEFAULT key) and fix the error. + This particular error would occur if your default value is not the same + type as TYPE. +
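As a rough illustration of where such a message comes from, the sketch below mimics the wording of the builder error for a directive with TYPE: string and DEFAULT: 3. The function check_string_default is hypothetical; in HTML Purifier the type check is done by the VarParser and the context ("in DEFAULT in directive hash ...") is appended by the InterchangeBuilder.

```php
<?php
// Hypothetical check that imitates the builder error quoted above.
// Illustration only; not HTML Purifier's actual implementation.
function check_string_default(string $id, $default): void
{
    if (!is_string($default)) {
        throw new UnexpectedValueException(
            'Expected type string, got ' . gettype($default) .
            " in DEFAULT in directive hash '$id'"
        );
    }
}

try {
    // Corresponds to "TYPE: string" combined with "DEFAULT: 3"
    check_string_default('Ns.Dir', 3);
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), "\n";
}
```

Quoting the default (DEFAULT: '3') or changing TYPE to int would resolve this particular mismatch.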

+ +

Validation errors

+ +
+

+ Exception: Alias 3 in valueAliases in directive + 'Ns.Dir' must be a string +

+
+ +

+ These are a little trickier, because we're not actually validating + your directive file, or even the direct string hash representation. + We're validating an Interchange object, and the error messages do + not mention any string hash keys. +

+ +

+ Nevertheless, it's not difficult to figure out what went wrong. + Read the "context" statements in reverse: +

+ +
+
in directive 'Ns.Dir'
+
This means we need to look at the directive file Ns.Dir.txt
+
in valueAliases
+
There's no key actually called this, but there's one that's close: + VALUE-ALIASES. Indeed, that's where to look.
+
Alias 3
+
The value alias that is equal to 3 is the culprit.
+
+ +

+ In this particular case, you're not allowed to alias integer values to
+ string values.

+ +

+ The most difficult part is translating the Interchange member variable (valueAliases) + into a directive file key (VALUE-ALIASES), but there's a one-to-one + correspondence currently. If the two formats diverge, any discrepancies + will be described in + library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php. +
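If the translation feels error-prone, the naming pattern can be written down mechanically. The helper below, member_to_key, is a hypothetical convenience and not something HTML Purifier provides; it assumes the one-to-one correspondence described above and simply hyphenates at capital letters. Members that are derived rather than read from a key (typeAllowsNull, for instance, comes from TYPE) are outside its scope.

```php
<?php
// Hypothetical helper: guess the directive file key for an Interchange
// member variable by inserting a hyphen before each inner capital
// letter and upper-casing the result.
function member_to_key(string $member): string
{
    return strtoupper(preg_replace('/(?<!^)([A-Z])/', '-$1', $member));
}

echo member_to_key('valueAliases'), "\n";      // VALUE-ALIASES
echo member_to_key('deprecatedVersion'), "\n"; // DEPRECATED-VERSION
echo member_to_key('default'), "\n";           // DEFAULT
```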

+ +

Internals

+ +

+ Much of the configuration schema framework's codebase deals with + shuffling data from one format to another, and doing validation on this + data. + The keystone of all of this is the HTMLPurifier_ConfigSchema_Interchange + class, which represents the purest, parsed representation of the schema. +

+ +

+ Hand-writing this data is unwieldy, however, so we write directive files. + These directive files are parsed by HTMLPurifier_StringHashParser + into HTMLPurifier_StringHashes, which then + are run through HTMLPurifier_ConfigSchema_InterchangeBuilder + to construct the interchange object. +

+ +

+ From the interchange object, the data can be siphoned into other forms + using HTMLPurifier_ConfigSchema_Builder subclasses. + For example, HTMLPurifier_ConfigSchema_Builder_ConfigSchema + generates a runtime HTMLPurifier_ConfigSchema object, + which HTMLPurifier_Config uses to validate its incoming + data. There is also a planned documentation builder. +
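The flow of data through these stages can be modelled with a toy example. Everything below is hypothetical (plain arrays, stdClass objects and the functions build_interchange and build_schema stand in for the real classes); only the order of the stages follows the description above.

```php
<?php
// Toy model of the pipeline: parsed hashes -> interchange -> runtime schema.
// All names are made up for illustration; none are HTML Purifier APIs.

// Stage 1: stand-ins for parsed directive files (string hashes).
$hashes = array(
    array('ID' => 'Test.Sample', 'TYPE' => 'string/null', 'DEFAULT' => null),
    array('ID' => 'Test.Count',  'TYPE' => 'int',         'DEFAULT' => 3),
);

// Stage 2: the "interchange" form, one plain object per directive.
function build_interchange(array $hashes): array
{
    $interchange = array();
    foreach ($hashes as $hash) {
        $parts = explode('/', $hash['TYPE'], 2);
        $d = new stdClass();
        $d->id = $hash['ID'];
        $d->type = $parts[0];
        $d->typeAllowsNull = isset($parts[1]);
        $d->default = $hash['DEFAULT'];
        $interchange[$d->id] = $d;
    }
    return $interchange;
}

// Stage 3: a runtime "schema", reduced here to one validator per directive.
function build_schema(array $interchange): array
{
    $checks = array('string' => 'is_string', 'int' => 'is_int');
    $schema = array();
    foreach ($interchange as $id => $d) {
        $check = $checks[$d->type];
        $schema[$id] = function ($value) use ($d, $check) {
            return ($value === null && $d->typeAllowsNull) || $check($value);
        };
    }
    return $schema;
}

$schema = build_schema(build_interchange($hashes));
var_dump($schema['Test.Sample'](null));   // null is accepted for string/null
var_dump($schema['Test.Count']('three')); // a string is rejected for int
```

Keeping the interchange form separate is what lets other builders, such as the planned documentation builder, reuse the same parsed data.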

+ +
$Id$
diff --git a/docs/dev-includes.txt b/docs/dev-includes.txt
index 37b9eed3..3db489df 100644
--- a/docs/dev-includes.txt
+++ b/docs/dev-includes.txt
@@ -275,3 +275,5 @@ New stuff
 VERSION: Version number directive was introduced
 DEPRECATED-VERSION: If the directive was deprecated, when was it deprecated?
 DEPRECATED-USE: If the directive was deprecated, what should the user use now?
+REQUIRES: What classes does this configuration directive require, but are
+    not part of the HTML Purifier core?
diff --git a/docs/index.html b/docs/index.html
index 49d769a5..72bebf89 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -64,9 +64,12 @@ conventions.
 Discusses when to flush HTML Purifier's various caches.
 Advanced API
-Functional specification for HTML Purifier's advanced API for defining
+Specification for HTML Purifier's advanced API for defining
 custom filtering behavior.
+Config Schema
+Describes config schema framework in HTML Purifier.
+
 Proposals

diff --git a/docs/ref-html-modularization.txt b/docs/ref-html-modularization.txt
index e7fc0fce..c6faf721 100644
--- a/docs/ref-html-modularization.txt
+++ b/docs/ref-html-modularization.txt
@@ -1,6 +1,9 @@
 The Modularization of HTMLDefinition in HTML Purifier
 
+WARNING: This document was drafted before the implementation of this
+    system, and some implementation details may have evolved over time.
+
 HTML Purifier uses the modularization of XHTML
 to organize the internals of HTMLDefinition into a more manageable
 and extensible fashion. Rather
diff --git a/library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php b/library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php
index 74a53976..05ca367d 100644
--- a/library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php
+++ b/library/HTMLPurifier/ConfigSchema/InterchangeBuilder.php
@@ -60,7 +60,7 @@ class HTMLPurifier_ConfigSchema_InterchangeBuilder
             try {
                 $directive->default = $this->varParser->parse($hash->offsetGet('DEFAULT'), $directive->type, $directive->typeAllowsNull);
             } catch (HTMLPurifier_VarParserException $e) {
-                throw new HTMLPurifier_ConfigSchema_Exception($e->getMessage() . " in TYPE/DEFAULT in directive hash '$id'");
+                throw new HTMLPurifier_ConfigSchema_Exception($e->getMessage() . " in DEFAULT in directive hash '$id'");
             }
         }
diff --git a/library/HTMLPurifier/ConfigSchema/Validator.php b/library/HTMLPurifier/ConfigSchema/Validator.php
index bb88b1ff..a0a9f652 100644
--- a/library/HTMLPurifier/ConfigSchema/Validator.php
+++ b/library/HTMLPurifier/ConfigSchema/Validator.php
@@ -131,7 +131,7 @@ class HTMLPurifier_ConfigSchema_Validator
             $this->with($d, 'allowed')
                 ->assertNotEmpty()
                 ->assertIsLookup(); // handled by InterchangeBuilder
-            if (!isset($d->allowed[$d->default])) {
+            if (is_string($d->default) && !isset($d->allowed[$d->default])) {
                 $this->error('default', 'must be an allowed value');
             }
             $this->context[] = 'allowed';
diff --git a/maintenance/generate-includes.php b/maintenance/generate-includes.php
index b0f81090..d57674e3 100644
--- a/maintenance/generate-includes.php
+++ b/maintenance/generate-includes.php
@@ -19,11 +19,17 @@ $FS = new FSTools();
 $exclude_dirs = array(
     'HTMLPurifier/Language/',
+    'HTMLPurifier/ConfigSchema/',
     'HTMLPurifier/Filter/',
+    'HTMLPurifier/Printer/',
+    /* These should be excluded, but need to have ConfigSchema support first
+
+    */
 );
 $exclude_files = array(
     'HTMLPurifier/Lexer/PEARSax3.php',
     'HTMLPurifier/Lexer/PH5P.php',
+    'HTMLPurifier/Printer.php',
 );
 
 // Determine what files need to be included:
diff --git a/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultNullWithAllowed.vtest b/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultNullWithAllowed.vtest
new file mode 100644
index 00000000..fd44af44
--- /dev/null
+++ b/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultNullWithAllowed.vtest
@@ -0,0 +1,8 @@
+Ns
+DESCRIPTION: Namespace
+----
+Ns.Dir
+DESCRIPTION: Directive
+TYPE: string/null
+DEFAULT: null
+ALLOWED: 'a'
diff --git a/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultType.vtest b/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultType.vtest
index cc1b3668..fed53288 100644
--- a/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultType.vtest
+++ b/tests/HTMLPurifier/ConfigSchema/Validator/directive/defaultType.vtest
@@ -1,4 +1,4 @@
-ERROR: Expected type string, got integer in TYPE/DEFAULT in directive hash 'Ns.Dir'
+ERROR: Expected type string, got integer in DEFAULT in directive hash 'Ns.Dir'
 ----
 Ns
 DESCRIPTION: Namespace