Using TidyGUI

This help page shows how to use TidyGUI and gives brief explanations about Tidy's options. To get further details about HTML Tidy, please visit Dave Raggett's HTML Tidy page.

TidyGUI's main window

In a typical TidyGUI session, you first select the file you want to clean up in field (1). You then use the action buttons to tune option settings (2), tidy the file (3), and view the pretty-printed result (4). Finally, read the general comments and advice Tidy gives in pane (7), and deal with the warnings and errors shown in pane (8).

File input

To specify the input file, type its name in field (1), drag and drop it onto this field from a file explorer, or use the 'Browse' button.

'Configuration' button

Click the configuration button (2) to open a dialog box allowing you to set the many options Tidy offers. (See the Configuration section for option details.)

'Tidy!' button

This button (3) starts the cleaning of the input file. TidyGUI never overwrites the input file and never writes to an error file either. Tidy's results are available in the user interface (output window, see below, and error pane (8)).

Also note that the original Tidy program is embedded in TidyGUI: it does not launch a separate process to do its job.

'Show Output' button

Click this button (4) to open a window displaying the pretty-printed result of the tidying process. This output can be saved to a file (see the Output Window section).

Configuration

Configuration options are distributed into six groups. Each group is accessible by selecting a tab at the top of the dialog box (1).

General operations are available in the configuration dialog. You can load a configuration from a file or save it to a file (2). The file format used by TidyGUI is the same text format as the one used by HTML Tidy.
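As a rough illustration of that text format, the sketch below reads and writes configuration text in Python. It assumes the usual HTML Tidy layout of one `name: value` pair per line, with lines starting with `#` or `//` treated as comments. The function names and the sample settings are hypothetical, and continuation lines are not handled.

```python
def parse_tidy_config(text):
    """Parse 'name: value' lines into a dict, skipping blanks and comments."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip empty lines and comment lines (assumed comment markers).
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an option line; ignored in this sketch
        options[name.strip()] = value.strip()
    return options


def format_tidy_config(options):
    """Serialize a dict of options back to 'name: value' lines."""
    return "".join(f"{name}: {value}\n" for name, value in options.items())


# Hypothetical sample settings, not taken from a real TidyGUI session.
sample = """\
# hypothetical settings
indent: auto
wrap: 72
output-xhtml: yes
"""

settings = parse_tidy_config(sample)
round_trip = parse_tidy_config(format_tidy_config(settings))
```

Values are kept as strings here; interpreting `yes`/`no` or integers is left to the caller.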

You can also let TidyGUI generate command-line parameters (3), based on the settings made in the configuration dialog, for use by HTML Tidy. The command-line string is exported to the Windows clipboard.
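The idea of turning settings into command-line parameters can be sketched in Python as follows. This is not TidyGUI's own code, and the exact string TidyGUI exports is not shown on this page. The sketch assumes a Tidy build that accepts the `--option value` form, an executable named `tidy`, and a hypothetical input file `page.html`.

```python
import shlex


def build_tidy_command(options, input_file, executable="tidy"):
    """Build an argument list of the form: tidy --name value ... input_file."""
    args = [executable]
    for name, value in options.items():
        # Booleans are written as yes/no (assumed to be accepted by Tidy).
        if isinstance(value, bool):
            value = "yes" if value else "no"
        args += [f"--{name}", str(value)]
    args.append(input_file)
    return args


# Hypothetical settings and file name, for illustration only.
command = build_tidy_command(
    {"indent": "auto", "wrap": 72, "output-xhtml": True},
    "page.html",
)

# shlex.quote applies POSIX shell quoting; Windows shells quote differently.
command_line = " ".join(shlex.quote(arg) for arg in command)
```

Keeping the arguments as a list until the last step avoids quoting mistakes when a value contains spaces.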

Whenever you change the configuration you should click the 'Apply' button (4) before tidying a file. To reset all properties to their default values, click the 'Reset' button (5).

The following tables describe configuration options.

Markup options

| TidyGUI option label | HTML Tidy option name | Values | Description |
| --- | --- | --- | --- |
| Doctype | doctype | omit, auto, strict, loose, <fpi> | Controls the doctype declaration generated by Tidy. |
| Add Tidy meta element | tidy-mark | boolean | Add a <meta> element indicating that the document has been tidied. |
| Suppress optional end tags | hide-endtags | boolean | Omit optional end-tags when generating the pretty printed markup. Ignored when outputting to XML. |
| Enclose text in BODY within <P>'s | enclose-text | boolean | Enclose any text found in the <BODY> element within a <P> element. |
| Enclose text in blocks within <P>'s | enclose-block-text | boolean | Enclose any text, found in any element that allows mixed content for HTML transitional but not HTML strict, within a <P> element. |
| New empty tags | new-empty-tags | string (space or comma separated list of tags) | Declare new empty inline tags. Useful to let Tidy tolerate unknown (i.e. non-HTML) tags. |
| New inline tags | new-inline-tags | string (space or comma separated list of tags) | Declare new non-empty inline tags. Useful to let Tidy tolerate unknown (i.e. non-HTML) tags. |
| New block-level tags | new-blocklevel-tags | string (space or comma separated list of tags) | Declare new block-level tags. Useful to let Tidy tolerate unknown (i.e. non-HTML) tags. |
| New pre tags | new-pre-tags | string (space or comma separated list of tags) | Declare new tags that are to be processed in exactly the same way as HTML's <PRE> element. Useful to let Tidy tolerate unknown (i.e. non-HTML) tags. |

Cleanup options

| TidyGUI option label | HTML Tidy option name | Values | Description |
| --- | --- | --- | --- |
| Replace presentational tags and attrs by style rules | clean | boolean | Surplus presentational tags and attributes are replaced by style rules and structural markup as appropriate. |
| Discard <font> and <center> tags | drop-font-tags | boolean | Together with the previous option, discards <font> and <center> tags rather than creating the corresponding style rules. |
| Replace <i> by <em> and <b> by <strong> | logical-emphasis | boolean | The <i> and <b> presentational tags are replaced by the <em> and <strong> semantic ones. |
| Discard empty paragraphs | drop-empty-paras | boolean | If set to true, empty paragraphs are discarded. If set to false, empty paragraphs are replaced by a pair of <br> elements. |
| Source document is from MS Word 2000 | word-2000 | boolean | Deal with the surplus markup Microsoft Word 2000 inserts when you save Word documents as "Web pages". |
| Fix bad comments | fix-bad-comments | boolean | When Tidy comes across adjacent hyphens in a comment, replace the unexpected hyphens with "=" characters. |
| Replace '\' in URLs by '/' | fix-backslash | boolean | Backslash characters "\" in URLs are replaced by forward slashes "/". |
| Default text for ALT attributes | alt-text | string | Sets the default alt text for <img> elements. This feature is dangerous as it suppresses further accessibility warnings. |

XML options

| TidyGUI option label | HTML Tidy option name | Values | Description |
| --- | --- | --- | --- |
| Input is XML | input-xml | boolean | Use the XML parser rather than the error correcting HTML parser. |
| Output as XML | output-xml | boolean | Pretty-printed output is written as well-formed XML. |
| Output as XHTML | output-xhtml | boolean | Pretty-printed output is written as eXtensible HTML. |
| Add XML declaration | add-xml-decl | boolean | Add the XML declaration (<?xml version=...?>) when outputting XML or XHTML. |
| Assume XML processing instructions ('?>' PI terminator) | assume-xml-procins | boolean | Changes the parsing of processing instructions to require '?>' as the terminator rather than '>'. Automatically set if the input is in XML. |
| Add xml:space attribute as needed | add-xml-space | boolean | Causes Tidy to add xml:space="preserve" to elements such as <pre>, <style> and <script> when generating XML. |

Encoding options

| TidyGUI option label | HTML Tidy option name | Values | Description |
| --- | --- | --- | --- |
| Character encoding | char-encoding | Raw, ASCII, Latin 1, UTF-8, ISO-2022, MacRoman | Determines how Tidy interprets character streams. Each value is described below the table. |
| Output numeric character entities | numeric-entities | boolean | Entities other than the basic XML 1.0 named entities are written in the numeric rather than the named entity form. |
| Output " character as &quot; | quote-marks | boolean | Causes " characters to be written out as &quot; as is preferred by some editing environments. The apostrophe character ' is written out as &#39; since many web browsers don't yet support &apos;. |
| Output non-breaking spaces as entities | quote-nbsp | boolean | Causes non-breaking space characters to be written out as entities, rather than as the Unicode character U+00A0. |
| Output unadorned & characters as &amp; | quote-ampersand | boolean | Causes unadorned & characters to be written out as &amp;. |

Character encoding values:
  • Raw: output values > 127 without translating them into entities
  • ASCII: accept Latin 1 character values, but output entities for all characters whose value > 127
  • Latin 1: characters > 255 are written as entities
  • UTF-8: Tidy assumes that both input and output are encoded as UTF-8
  • ISO-2022: to be used for files encoded using the ISO 2022 family of encodings, e.g. ISO 2022-JP
  • MacRoman: use the Apple MacRoman character set
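The difference between the ASCII and Latin 1 encoding settings comes down to the threshold above which a character is written as an entity. The Python sketch below illustrates that rule only; it is not Tidy's implementation. The function name is hypothetical, and the sketch always uses the numeric entity form, which corresponds to having numeric-entities enabled.

```python
def write_with_entities(text, max_code):
    """Write every character whose code point exceeds max_code as a numeric entity."""
    return "".join(
        ch if ord(ch) <= max_code else f"&#{ord(ch)};"
        for ch in text
    )


# Sample text: e-acute is code point 233, the euro sign is code point 8364.
sample = "caf\u00e9 \u20ac5"

ascii_style = write_with_entities(sample, 127)   # entities for values > 127
latin1_style = write_with_entities(sample, 255)  # entities for values > 255
```

With the ASCII threshold both non-ASCII characters become entities, while the Latin 1 threshold keeps the e-acute as a literal character and only converts the euro sign.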

Layout options

| TidyGUI option label | HTML Tidy option name | Values | Description |
| --- | --- | --- | --- |
| Indent block-level tags | indent | no, yes, auto | If set to yes, Tidy indents block-level tags. If set to auto, indentation depends on context. |
| Indent attributes | indent-attributes | boolean | If set to true, each attribute will begin on a new line. |
| Indent spaces | indent-spaces | integer | Sets the number of spaces to indent content when indentation is enabled. |
| Wrap attribute values | wrap-attributes | boolean | If set to true, attribute values may be wrapped across lines for easier editing. |
| Wrap margin | wrap | integer | Sets the right margin for line wrapping. |
| Wrap string literals in script attributes | wrap-script-literals | boolean | Allows lines to be wrapped within string literals that appear in script attributes. |
| Literal attributes | literal-attributes | boolean | Ensures that whitespace characters within attribute values are passed through unchanged. |
| Wrap lines in ASP pseudo-elements (<%...%>) | wrap-asp | boolean | If set to false, prevents lines from being wrapped within ASP pseudo elements. |
| Wrap lines in JSTE pseudo-elements (<#...#>) | wrap-jste | boolean | If set to false, prevents lines from being wrapped within JSTE pseudo elements. |
| Wrap lines in PHP processing instructions | wrap-php | boolean | If set to false, prevents lines from being wrapped within PHP code. |
| Break before <br> | break-before-br | boolean | Output a line break before each <br> element. |
| Uppercase tags | uppercase-tags | boolean | Causes tag names to be output in upper case. |
| Uppercase attributes | uppercase-attributes | boolean | Causes attribute names to be output in upper case. |
| Spaces/tab on input | tab-size | integer | Used to map tabs to spaces when reading files. (Tidy never outputs files with tabs.) |
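To picture what the tab-size setting does to input text, here is a small Python sketch. It assumes tabs are expanded to the next tab stop, which is what Python's built-in `str.expandtabs` does. The function name is hypothetical and the sketch is not taken from Tidy's source.

```python
def map_tabs_to_spaces(line, tab_size):
    """Expand each tab to spaces up to the next multiple of tab_size columns."""
    return line.expandtabs(tab_size)


# A tab after one character advances to column 4 when tab_size is 4.
short_stop = map_tabs_to_spaces("a\tb", 4)

# A leading tab becomes a full tab_size run of spaces.
leading = map_tabs_to_spaces("\tx", 8)
```

The number of spaces inserted depends on the column where the tab occurs, so the same tab character can expand to different widths on the same line.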

Operation options

| TidyGUI option label | HTML Tidy option name | Values | Description |
| --- | --- | --- | --- |
| Suppress tidied document output | markup | boolean | Determines whether Tidy generates a pretty printed version of the markup. Note that Tidy won't generate a pretty printed version if it finds unknown tags, or missing trailing quotes on attribute values, or missing trailing '>' on tags. |
| Quiet (no 'Parsing X', guessed DTD or error summary) | quiet | boolean | Don't output the welcome message or the summary of the numbers of errors and warnings. |
| Show warnings | show-warnings | boolean | If set to false, warnings are suppressed. This can be useful when a few errors are hidden in a flurry of warnings. |
| Create a sequence of slides | split | boolean | Use the input file to create a sequence of slides, splitting the markup prior to each successive <h2>. |
| Write tidied document back to source file | write-back | boolean | Not used by TidyGUI, but can be saved to a configuration file. |
| Keep time of source file | keep-time | boolean | Not used by TidyGUI, but can be saved to a configuration file. |
| Format error output for GNU Emacs | gnu-emacs | boolean | Not used by TidyGUI, but can be saved to a configuration file. |
| Error file | error-file | string | Not used by TidyGUI, but can be saved to a configuration file. |

Output window

TidyGUI's output window is a simple text window showing the tidied content of the input file. This in-memory content can be edited if necessary (uncheck the 'Read-only' box) and saved to a file (click the 'Save as...' button). You can also change the text font.