This help page shows how to use TidyGUI and gives brief explanations about Tidy's options. To get further details about HTML Tidy, please visit Dave Raggett's HTML Tidy page.
In a typical TidyGUI session, you first select the file you want to clean up in field (1). Then you use the action buttons to tune some option settings (2), tidy the file (3), and view the pretty-printed, tidied result (4). Finally, look at the general comments and advice Tidy gives you in pane (7), and take care of the warnings and errors shown in pane (8).
To specify the input file, either type its name in field (1), drag-and-drop it to this field from a file explorer, or use the 'Browse' button.
Click the configuration button (2) to open a dialog box allowing you to set the many options Tidy offers. (See the Configuration section for option details.)
This button (3) starts the cleaning of the input file. TidyGUI never overwrites the input file and never writes to an error file either. Tidy's results are available in the user interface (output window, see below, and error pane (8)).
Also note that the original Tidy program is embedded in TidyGUI: it does not launch a separate process to do its job.
Click this button (4) to open a window displaying the pretty-printed result of the tidying process. This output can be saved to a file (see the Output Window section).
Configuration options are distributed into 6 groups. Each option group is accessible
by selecting a tab at the top of the dialog box (1).
General operations are available in the configuration dialog. You can load or save a configuration from or to a file (2). The file format used by TidyGUI is the same text format as the one used by HTML Tidy.
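As a rough illustration of that text format, the sketch below writes settings as `name: value` lines, one option per line, and reads them back. The helpers `dump_config` and `load_config` are hypothetical and are not part of TidyGUI or HTML Tidy. The sketch assumes that booleans are written as `yes`/`no` and that lines starting with `#` or `//` are comments.

```python
def dump_config(settings):
    """Serialize a dict of option names to 'name: value' lines."""
    lines = []
    for name, value in settings.items():
        # Assumption: booleans are stored as yes/no in the file.
        if isinstance(value, bool):
            value = "yes" if value else "no"
        lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n"


def load_config(text):
    """Parse 'name: value' lines back into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and (assumed) comment lines.
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        name, _, value = line.partition(":")
        settings[name.strip()] = value.strip()
    return settings


example = {"indent": "auto", "wrap": 72, "output-xhtml": True, "tidy-mark": False}
config_text = dump_config(example)
print(config_text)
```

Values come back from `load_config` as plain strings; converting them to booleans or integers is left out to keep the sketch short.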
You can also let TidyGUI generate command-line parameters (3), based on the settings made in the configuration dialog, for use by HTML Tidy. The command-line string is exported to the Windows clipboard.
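To give an idea of what such a generated string might look like, here is a minimal sketch that turns a settings dictionary into command-line parameters. It assumes the `--name value` form that HTML Tidy accepts for configuration options on the command line. `build_command_line` is a hypothetical helper, and the exact string TidyGUI produces may differ.

```python
def quote_arg(arg):
    """Wrap an argument in double quotes if it is empty or contains whitespace.

    Simplification: embedded double quotes are not escaped.
    """
    if not arg or " " in arg or "\t" in arg:
        return f'"{arg}"'
    return arg


def build_command_line(settings, input_file=None):
    """Build a 'tidy --name value ...' string from a dict of options."""
    parts = ["tidy"]
    for name, value in settings.items():
        # Assumption: booleans are passed as yes/no.
        if isinstance(value, bool):
            value = "yes" if value else "no"
        parts += [f"--{name}", str(value)]
    if input_file:
        parts.append(input_file)
    return " ".join(quote_arg(p) for p in parts)


print(build_command_line({"indent": "auto", "wrap": 72, "output-xhtml": True}, "page.html"))
```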
Whenever you change the configuration you should click the 'Apply' button (4) before tidying a file. To reset all properties to their default values, click the 'Reset' button (5).
The following tables describe configuration options.
| TidyGUI option label | HTML Tidy option name | Values | Description |
|---|---|---|---|
| Doctype | doctype | omit, auto, strict, loose, <fpi> | Controls the doctype declaration generated by Tidy. |
| Add Tidy meta element | tidy-mark | boolean | Add a <meta> element indicating that the document has been tidied. |
| Suppress optional end tags | hide-endtags | boolean | Omit optional end-tags when generating the pretty printed markup. Ignored when outputting to XML. |
| Enclose text in BODY within <P>'s | enclose-text | boolean | Enclose any text found in the <BODY> element within a <P> element. |
| Enclose text in blocks within <P>'s | enclose-block-text | boolean | Enclose any text, found in any element that allows mixed content for HTML transitional but not HTML strict, within a <P> element. |
| New empty tags | new-empty-tags | string (space or comma separated list of tags) | Declare new empty inline tags. Useful to let Tidy tolerate unknown (i.e. non-HTML) tags. |
| New inline tags | new-inline-tags | string (space or comma separated list of tags) | Declare new non-empty inline tags. Useful to let Tidy tolerate unknown (i.e. non-HTML) tags. |
| New block-level tags | new-blocklevel-tags | string (space or comma separated list of tags) | Declare new block-level tags. Useful to let Tidy tolerate unknown (i.e. non-HTML) tags. |
| New pre tags | new-pre-tags | string (space or comma separated list of tags) | Declare new tags that are to be processed in exactly the same way as HTML's <PRE> element. Useful to let Tidy tolerate unknown (i.e. non-HTML) tags. |

| TidyGUI option label | HTML Tidy option name | Values | Description |
|---|---|---|---|
| Replace presentational tags and attrs by style rules | clean | boolean | Surplus presentational tags and attributes are replaced by style rules and structural markup as appropriate. |
| Discard <font> and <center> tags | drop-font-tags | boolean | Together with the previous option, discards <font> and <center> tags rather than creating the corresponding style rules. |
| Replace <i> by <em> and <b> by <strong> | logical-emphasis | boolean | The <i> and <b> presentational tags are replaced by the <em> and <strong> semantic ones. |
| Discard empty paragraphs | drop-empty-paras | boolean | If set to true, empty paragraphs are discarded. If set to false, empty paragraphs are replaced by a pair of <br> elements. |
| Source document is from MS Word 2000 | word-2000 | boolean | Deal with surplus stuff Microsoft Word 2000 inserts when you save Word documents as "Web pages". |
| Fix bad comments | fix-bad-comments | boolean | When Tidy comes across adjacent hyphens inside a comment, it replaces the unexpected hyphens with "=" characters. |
| Replace '\' in URLs by '/' | fix-backslash | boolean | Backslash characters "\" in URLs are replaced by forward slashes "/". |
| Default text for ALT attributes | alt-text | string | Sets the default text for the alt attribute of <img> elements. This feature is dangerous as it suppresses further accessibility warnings. |

| TidyGUI option label | HTML Tidy option name | Values | Description |
|---|---|---|---|
| Input is XML | input-xml | boolean | Use the XML parser rather than the error correcting HTML parser. |
| Output as XML | output-xml | boolean | Pretty-printed output is written as well-formed XML. |
| Output as XHTML | output-xhtml | boolean | Pretty-printed output is written as eXtensible HTML. |
| Add XML declaration | add-xml-decl | boolean | Add the XML declaration (<?xml version=...?>) when outputting XML or XHTML. |
| Assume XML processing instructions ('?>' PI terminator) | assume-xml-procins | boolean | Changes the parsing of processing instructions to require '?>' as the terminator rather than '>'. Automatically set if the input is in XML. |
| Add xml:space attribute as needed | add-xml-space | boolean | Causes Tidy to add xml:space="preserve" to elements such as <pre>, <style> and <script> when generating XML. |

| TidyGUI option label | HTML Tidy option name | Values | Description |
|---|---|---|---|
| Character encoding | char-encoding | Raw, ASCII, Latin 1, UTF-8, ISO-2022, MacRoman | Determines how Tidy interprets character streams. |
| Output numeric character entities | numeric-entities | boolean | Entities other than the basic XML 1.0 named entities are written in the numeric rather than the named entity form. |
| Output " character as `&quot;` | quote-marks | boolean | Causes " characters to be written out as `&quot;`, as is preferred by some editing environments. The apostrophe character ' is written out as `&#39;`, since many web browsers don't yet support `&apos;`. |
| Output non-breaking spaces as entities | quote-nbsp | boolean | Causes non-breaking space characters to be written out as entities, rather than as the Unicode character U+00A0. |
| Output unadorned & characters as `&amp;` | quote-ampersand | boolean | Causes unadorned & characters to be written out as `&amp;`. |
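
The effect of the quoting options above can be pictured with a small sketch. This is only an illustration of the behaviour described in the table, not Tidy's actual code. The function `escape_text`, its parameter names and its default values are assumptions made for the example, and it treats every `&` in the input as unadorned.

```python
def escape_text(text, quote_marks=False, quote_nbsp=True,
                quote_ampersand=True, numeric_entities=False):
    """Escape characters the way the table's options describe."""
    out = []
    for ch in text:
        if ch == "&":
            # Simplification: every '&' is treated as unadorned.
            out.append("&amp;" if quote_ampersand else ch)
        elif ch == '"':
            out.append("&quot;" if quote_marks else ch)
        elif ch == "'":
            # Written numerically rather than as &apos;.
            out.append("&#39;" if quote_marks else ch)
        elif ch == "\u00a0":
            if not quote_nbsp:
                out.append(ch)  # keep the raw U+00A0 character
            else:
                out.append("&#160;" if numeric_entities else "&nbsp;")
        else:
            out.append(ch)
    return "".join(out)


print(escape_text('Tom & "Jerry"', quote_marks=True))
```

With `numeric_entities=True`, the non-breaking space is written as `&#160;` instead of `&nbsp;`, while `&amp;` and `&quot;` stay named because they belong to the basic XML 1.0 entities.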

| TidyGUI option label | HTML Tidy option name | Values | Description |
|---|---|---|---|
| Indent block-level tags | indent | no, yes, auto | If set to yes, Tidy indents block-level tags. If set to auto, indentation depends on context. |
| Indent attributes | indent-attributes | boolean | If set to true, each attribute will begin on a new line. |
| Indent spaces | indent-spaces | integer | Sets the number of spaces to indent content when indentation is enabled. |
| Wrap attribute values | wrap-attributes | boolean | If set to true, attribute values may be wrapped across lines for easier editing. |
| Wrap margin | wrap | integer | Sets the right margin for line wrapping. |
| Wrap string literals in script attributes | wrap-script-literals | boolean | Allows lines to be wrapped within string literals that appear in script attributes. |
| Literal attributes | literal-attributes | boolean | Ensures that whitespace characters within attribute values are passed through unchanged. |
| Wrap lines in ASP pseudo-elements (<%...%>) | wrap-asp | boolean | If set to false, prevents lines from being wrapped within ASP pseudo-elements. |
| Wrap lines in JSTE pseudo-elements (<#...#>) | wrap-jste | boolean | If set to false, prevents lines from being wrapped within JSTE pseudo-elements. |
| Wrap lines in PHP processing instructions | wrap-php | boolean | If set to false, prevents lines from being wrapped within PHP code. |
| Break before <br> | break-before-br | boolean | Output a line break before each <br> element. |
| Uppercase tags | uppercase-tags | boolean | Causes tag names to be output in upper case. |
| Uppercase attributes | uppercase-attributes | boolean | Causes attribute names to be output in upper case. |
| Spaces/tab on input | tab-size | integer | Used to map tabs to spaces when reading files. (Tidy never outputs files with tabs.) |
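
The tab-to-space mapping controlled by tab-size can be pictured with Python's built-in `str.expandtabs`, which replaces each tab with enough spaces to reach the next tab stop. This is only an illustration, assuming Tidy treats tab-size as the distance between tab stops. `map_tabs` and its default of 4 are hypothetical and are not taken from Tidy's code.

```python
def map_tabs(line, tab_size=4):
    """Replace tabs with spaces, aligning to tab stops every tab_size columns."""
    return line.expandtabs(tab_size)


# A tab after three characters needs one space to reach column 4.
print(repr(map_tabs("<p>\ttext", 4)))
# A leading tab becomes a full tab_size worth of spaces.
print(repr(map_tabs("\t<li>item</li>", 4)))
```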

| TidyGUI option label | HTML Tidy option name | Values | Description |
|---|---|---|---|
| Suppress tidied document output | markup | boolean | Determines whether Tidy generates a pretty printed version of the markup. Note that Tidy won't generate a pretty printed version if it finds unknown tags, or missing trailing quotes on attribute values, or missing trailing '>' on tags. |
| Quiet (no 'Parsing X', guessed DTD or error summary) | quiet | boolean | Don't output the welcome message or the summary of the numbers of errors and warnings. |
| Show warnings | show-warnings | boolean | If set to false, warnings are suppressed. This can be useful when a few errors are hidden in a flurry of warnings. |
| Create a sequence of slides | split | boolean | Use the input file to create a sequence of slides, splitting the markup prior to each successive <h2>. |
| Write tidied document back to source file | write-back | boolean | Not used by TidyGUI, but can be saved to a configuration file. |
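| Keep time of source file | keep-time | boolean | Not used by TidyGUI, but can be saved to a configuration file. |
| Format error output for GNU Emacs | gnu-emacs | boolean | Not used by TidyGUI, but can be saved to a configuration file. |
| Error file | error-file | string | Not used by TidyGUI, but can be saved to a configuration file. |
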
TidyGUI's output window is a simple text window showing the tidied content of the input file. This in-memory content can be edited if necessary (uncheck the 'Read-only' box) and saved to a file (click the 'Save as...' button). You can also change the text font.