Package org.w3c.tidy
Class Configuration
- java.lang.Object
-
- org.w3c.tidy.Configuration
-
- All Implemented Interfaces:
java.io.Serializable
public class Configuration extends java.lang.Object implements java.io.Serializable
Read configuration file and manage configuration properties. Configuration files associate a property name with a value. The format is that of a Java .properties file.
- Version:
- $Revision: 807 $ ($Author: fgiust $)
- Author:
- Dave Raggett <dsr@w3.org>, Andy Quick <ac.quick@sympatico.ca> (translation to Java), Fabrizio Giustina
- See Also:
- Serialized Form
-
-
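Since the configuration format is that of a Java .properties file, the following minimal sketch shows how such text parses with the standard java.util.Properties class. The class name ConfigFormatSketch and the option names (indent-spaces, wrap, alt-text) are assumptions made for illustration; they are not taken from this page.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ConfigFormatSketch {

    /** Parses configuration text written in Java .properties syntax. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Option names below are assumed examples, not a list from this class.
        String text = String.join("\n",
                "# lines starting with '#' are comments",
                "indent-spaces: 4",
                "wrap = 72",
                "alt-text=image");
        Properties props = load(text);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " -> " + props.getProperty(name));
        }
    }
}
```

Both ':' and '=' are accepted as separators between a property name and its value, and whitespace around the separator is ignored.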
Field Summary
Fields
Modifier and Type | Field | Description
protected java.lang.String | altText | default text for alt attribute.
static int | ASCII | Deprecated.
protected boolean | asciiChars | convert quotes and dashes to nearest ASCII char.
static int | BIG5 | Deprecated.
protected boolean | bodyOnly | output BODY content only.
protected boolean | breakBeforeBR | output newline before br or not?
protected boolean | burstSlides | create slides on each h2 element.
protected java.lang.String | cssPrefix | CSS class naming for -clean option.
protected int | definedTags | track what types of tags user has defined to eliminate unnecessary searches.
static int | DOCTYPE_AUTO | treatment of doctype: auto.
static int | DOCTYPE_LOOSE | treatment of doctype: loose.
static int | DOCTYPE_OMIT | treatment of doctype: omit.
static int | DOCTYPE_STRICT | treatment of doctype: strict.
static int | DOCTYPE_USER | treatment of doctype: user.
protected int | docTypeMode | see doctype property.
protected java.lang.String | docTypeStr | user specified doctype.
protected boolean | dropEmptyParas | discard empty p elements.
protected boolean | dropFontTags | discard presentation tags.
protected boolean | dropProprietaryAttributes | discard proprietary attributes.
protected int | duplicateAttrs | Keep first or last duplicate attribute.
protected boolean | emacs | if true format error output for GNU Emacs.
protected boolean | encloseBlockText | if yes text in blocks is wrapped in p's.
protected boolean | encloseBodyText | if yes text at body is wrapped in p's.
protected java.lang.String | errfile | file name to write errors to.
protected boolean | escapeCdata | replace CDATA sections with escaped text.
protected boolean | fixBackslash | fix URLs by replacing \ with /.
protected boolean | fixComments | fix comments with adjacent hyphens.
protected boolean | fixUri | properly escape URLs.
protected boolean | forceOutput | output document even if errors were found.
protected boolean | hideComments | hides all (real) comments in output.
protected boolean | hideEndTags | suppress optional end tags.
protected boolean | htmlOut | output plain-old HTML, even for XHTML input.
protected boolean | indentAttributes | newline+indent before each attribute.
protected boolean | indentCdata | indent CDATA sections.
protected boolean | indentContent | indent content of appropriate tags.
static int | ISO2022 | Deprecated.
protected boolean | joinClasses | join multiple class attributes.
protected boolean | joinStyles | join multiple style attributes.
static int | KEEP_FIRST | Keep first duplicate attribute.
static int | KEEP_LAST | Keep last duplicate attribute.
protected boolean | keepFileTimes | if yes last modified time is preserved.
protected java.lang.String | language | RJ language property.
static int | LATIN1 | Deprecated.
protected boolean | literalAttribs | if true attributes may use newlines.
protected boolean | logicalEmphasis | replace i by em and b by strong.
protected boolean | lowerLiterals | folds known attribute values to lower case.
static int | MACROMAN | Deprecated.
protected boolean | makeBare | Make bare HTML: remove Microsoft cruft.
protected boolean | makeClean | remove presentational clutter.
protected boolean | ncr | allow numeric character references.
protected char[] | newline | bytes for the newline marker.
protected boolean | numEntities | use numeric entities.
protected boolean | onlyErrors | if true normal output is suppressed.
protected boolean | quiet | no 'Parsing X', guessed DTD or summary.
protected boolean | quoteAmpersand | output naked ampersand as &amp;.
protected boolean | quoteMarks | output " marks as &quot;.
protected boolean | quoteNbsp | output non-breaking space as entity.
static int | RAW | Deprecated. Use Tidy.setRawOut(true) for raw output.
protected boolean | rawOut | Avoid mapping values > 127 to entities.
protected boolean | replaceColor | replace hex color attribute values with names.
protected java.lang.String | replacementCharEncoding | char encoding used when replacing illegal SGML chars, regardless of specified encoding.
protected Report | report | Report instance.
static int | SHIFTJIS | Deprecated.
protected int | showErrors | number of errors to put out.
protected boolean | showWarnings | however errors are always shown.
protected java.lang.String | slidestyle | Deprecated. Does nothing.
protected boolean | smartIndent | does text/block level content affect indentation.
protected int | spaces | default indentation.
protected int | tabsize | default tab size (8).
protected boolean | tidyMark | add meta element indicating tidied doc.
protected boolean | trimEmpty | trim empty elements.
protected TagTable | tt | TagTable associated with this Configuration.
protected boolean | upperCaseAttrs | output attributes in upper not lower case.
protected boolean | upperCaseTags | output tags in upper not lower case.
static int | UTF16 | Deprecated.
static int | UTF16BE | Deprecated.
static int | UTF16LE | Deprecated.
static int | UTF8 | Deprecated.
static int | WIN1252 | Deprecated.
protected boolean | word2000 | draconian cleaning for Word2000.
protected boolean | wrapAsp | wrap within ASP pseudo elements.
protected boolean | wrapAttVals | wrap within attribute values.
protected boolean | wrapJste | wrap within JSTE pseudo elements.
protected int | wraplen | default wrap margin (68).
protected boolean | wrapPhp | wrap within PHP pseudo elements.
protected boolean | wrapScriptlets | wrap within JavaScript string literals.
protected boolean | wrapSection | wrap within CDATA section tags.
protected boolean | writeback | if true then output tidied markup.
protected boolean | xHTML | output extensible HTML.
protected boolean | xmlOut | create output as XML.
protected boolean | xmlPi | add <?xml?> for XML docs.
protected boolean | xmlPIs | If set to yes PIs must end with ?>.
protected boolean | xmlSpace | if set to yes adds xml:space attr as needed.
protected boolean | xmlTags | treat input as XML.
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | Configuration(Report report) | Instantiates a new Configuration.
-
Method Summary
Modifier and Type | Method | Description
void | addProps(java.util.Properties p) | adds configuration Properties.
void | adjust() | Ensure that config is self consistent.
protected java.lang.String | convertCharEncoding(int code) | Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
protected java.lang.String | getInCharEncodingName() | Getter for inCharEncodingName.
protected java.lang.String | getOutCharEncodingName() | Getter for outCharEncodingName.
static boolean | isKnownOption(java.lang.String name) | Is the given String a valid configuration flag?
void | parseFile(java.lang.String filename) | Parses a property file.
protected void | setInCharEncoding(int encoding) | Deprecated. Use setInCharEncodingName(String).
protected void | setInCharEncodingName(java.lang.String encoding) | Setter for inCharEncodingName.
protected void | setInOutEncodingName(java.lang.String encoding) | Setter for inOutCharEncodingName.
protected void | setOutCharEncoding(int encoding) | Deprecated. Use setOutCharEncodingName(String).
protected void | setOutCharEncodingName(java.lang.String encoding) | Setter for outCharEncodingName.
-
-
-
Field Detail
-
RAW
public static final int RAW
Deprecated. Use Tidy.setRawOut(true) for raw output.
character encoding = RAW.
- See Also:
- Constant Field Values
-
ASCII
public static final int ASCII
Deprecated.
character encoding = ASCII.
- See Also:
- Constant Field Values
-
LATIN1
public static final int LATIN1
Deprecated.
character encoding = LATIN1.
- See Also:
- Constant Field Values
-
UTF8
public static final int UTF8
Deprecated.
character encoding = UTF8.
- See Also:
- Constant Field Values
-
ISO2022
public static final int ISO2022
Deprecated.
character encoding = ISO2022.
- See Also:
- Constant Field Values
-
MACROMAN
public static final int MACROMAN
Deprecated.
character encoding = MACROMAN.
- See Also:
- Constant Field Values
-
UTF16LE
public static final int UTF16LE
Deprecated.
character encoding = UTF16LE.
- See Also:
- Constant Field Values
-
UTF16BE
public static final int UTF16BE
Deprecated.
character encoding = UTF16BE.
- See Also:
- Constant Field Values
-
UTF16
public static final int UTF16
Deprecated.
character encoding = UTF16.
- See Also:
- Constant Field Values
-
WIN1252
public static final int WIN1252
Deprecated.
character encoding = WIN1252.
- See Also:
- Constant Field Values
-
BIG5
public static final int BIG5
Deprecated.
character encoding = BIG5.
- See Also:
- Constant Field Values
-
SHIFTJIS
public static final int SHIFTJIS
Deprecated.
character encoding = SHIFTJIS.
- See Also:
- Constant Field Values
-
DOCTYPE_OMIT
public static final int DOCTYPE_OMIT
treatment of doctype: omit.
- See Also:
- Constant Field Values
-
DOCTYPE_AUTO
public static final int DOCTYPE_AUTO
treatment of doctype: auto.
- See Also:
- Constant Field Values
-
DOCTYPE_STRICT
public static final int DOCTYPE_STRICT
treatment of doctype: strict.
- See Also:
- Constant Field Values
-
DOCTYPE_LOOSE
public static final int DOCTYPE_LOOSE
treatment of doctype: loose.
- See Also:
- Constant Field Values
-
DOCTYPE_USER
public static final int DOCTYPE_USER
treatment of doctype: user.
- See Also:
- Constant Field Values
-
KEEP_LAST
public static final int KEEP_LAST
Keep last duplicate attribute.
- See Also:
- Constant Field Values
-
KEEP_FIRST
public static final int KEEP_FIRST
Keep first duplicate attribute.
- See Also:
- Constant Field Values
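The KEEP_FIRST and KEEP_LAST policies can be pictured with a small sketch that collapses repeated attribute names. This is not the library's implementation: the class DuplicateAttrSketch, the resolve helper, and the integer values given to the two constants are hypothetical and serve only to show the difference between the policies.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateAttrSketch {

    // Illustrative values only; the real values are under "Constant Field Values".
    static final int KEEP_LAST = 0;
    static final int KEEP_FIRST = 1;

    /** Collapses repeated attribute names according to the chosen policy. */
    static Map<String, String> resolve(List<String[]> attrs, int policy) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String[] attr : attrs) {
            String name = attr[0];
            String value = attr[1];
            if (policy == KEEP_FIRST) {
                // The first value seen for a name wins; later ones are ignored.
                result.putIfAbsent(name, value);
            } else {
                // A later value for the same name overwrites the earlier one.
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> attrs = Arrays.asList(
                new String[] {"id", "a"},
                new String[] {"class", "x"},
                new String[] {"id", "b"});
        System.out.println(resolve(attrs, KEEP_FIRST)); // id keeps the value "a"
        System.out.println(resolve(attrs, KEEP_LAST));  // id keeps the value "b"
    }
}
```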
-
spaces
protected int spaces
default indentation.
-
wraplen
protected int wraplen
default wrap margin (68).
-
tabsize
protected int tabsize
default tab size (8).
-
docTypeMode
protected int docTypeMode
see doctype property.
-
duplicateAttrs
protected int duplicateAttrs
Keep first or last duplicate attribute.
-
altText
protected java.lang.String altText
default text for alt attribute.
-
slidestyle
protected java.lang.String slidestyle
Deprecated. Does nothing.
style sheet for slides.
-
language
protected java.lang.String language
RJ language property.
-
docTypeStr
protected java.lang.String docTypeStr
user specified doctype.
-
errfile
protected java.lang.String errfile
file name to write errors to.
-
writeback
protected boolean writeback
if true then output tidied markup.
-
onlyErrors
protected boolean onlyErrors
if true normal output is suppressed.
-
showWarnings
protected boolean showWarnings
however errors are always shown.
-
quiet
protected boolean quiet
no 'Parsing X', guessed DTD or summary.
-
indentContent
protected boolean indentContent
indent content of appropriate tags.
-
smartIndent
protected boolean smartIndent
does text/block level content affect indentation.
-
hideEndTags
protected boolean hideEndTags
suppress optional end tags.
-
xmlTags
protected boolean xmlTags
treat input as XML.
-
xmlOut
protected boolean xmlOut
create output as XML.
-
xHTML
protected boolean xHTML
output extensible HTML.
-
htmlOut
protected boolean htmlOut
output plain-old HTML, even for XHTML input. Yes means set explicitly.
-
xmlPi
protected boolean xmlPi
add <?xml?> for XML docs.
-
upperCaseTags
protected boolean upperCaseTags
output tags in upper not lower case.
-
upperCaseAttrs
protected boolean upperCaseAttrs
output attributes in upper not lower case.
-
makeClean
protected boolean makeClean
remove presentational clutter.
-
makeBare
protected boolean makeBare
Make bare HTML: remove Microsoft cruft.
-
logicalEmphasis
protected boolean logicalEmphasis
replace i by em and b by strong.
-
dropFontTags
protected boolean dropFontTags
discard presentation tags.
-
dropProprietaryAttributes
protected boolean dropProprietaryAttributes
discard proprietary attributes.
-
dropEmptyParas
protected boolean dropEmptyParas
discard empty p elements.
-
fixComments
protected boolean fixComments
fix comments with adjacent hyphens.
-
trimEmpty
protected boolean trimEmpty
trim empty elements.
-
breakBeforeBR
protected boolean breakBeforeBR
output newline before br or not?
-
burstSlides
protected boolean burstSlides
create slides on each h2 element.
-
numEntities
protected boolean numEntities
use numeric entities.
-
quoteMarks
protected boolean quoteMarks
output " marks as &quot;.
-
quoteNbsp
protected boolean quoteNbsp
output non-breaking space as entity.
-
quoteAmpersand
protected boolean quoteAmpersand
output naked ampersand as &amp;.
-
wrapAttVals
protected boolean wrapAttVals
wrap within attribute values.
-
wrapScriptlets
protected boolean wrapScriptlets
wrap within JavaScript string literals.
-
wrapSection
protected boolean wrapSection
wrap within CDATA section tags.
-
wrapAsp
protected boolean wrapAsp
wrap within ASP pseudo elements.
-
wrapJste
protected boolean wrapJste
wrap within JSTE pseudo elements.
-
wrapPhp
protected boolean wrapPhp
wrap within PHP pseudo elements.
-
fixBackslash
protected boolean fixBackslash
fix URLs by replacing \ with /.
-
indentAttributes
protected boolean indentAttributes
newline+indent before each attribute.
-
xmlPIs
protected boolean xmlPIs
If set to yes PIs must end with ?>.
-
xmlSpace
protected boolean xmlSpace
if set to yes adds xml:space attr as needed.
-
encloseBodyText
protected boolean encloseBodyText
if yes text at body is wrapped in p's.
-
encloseBlockText
protected boolean encloseBlockText
if yes text in blocks is wrapped in p's.
-
keepFileTimes
protected boolean keepFileTimes
if yes last modified time is preserved.
-
word2000
protected boolean word2000
draconian cleaning for Word2000.
-
tidyMark
protected boolean tidyMark
add meta element indicating tidied doc.
-
emacs
protected boolean emacs
if true format error output for GNU Emacs.
-
literalAttribs
protected boolean literalAttribs
if true attributes may use newlines.
-
bodyOnly
protected boolean bodyOnly
output BODY content only.
-
fixUri
protected boolean fixUri
properly escape URLs.
-
lowerLiterals
protected boolean lowerLiterals
folds known attribute values to lower case.
-
replaceColor
protected boolean replaceColor
replace hex color attribute values with names.
-
hideComments
protected boolean hideComments
hides all (real) comments in output.
-
indentCdata
protected boolean indentCdata
indent CDATA sections.
-
forceOutput
protected boolean forceOutput
output document even if errors were found.
-
showErrors
protected int showErrors
number of errors to put out.
-
asciiChars
protected boolean asciiChars
convert quotes and dashes to nearest ASCII char.
-
joinClasses
protected boolean joinClasses
join multiple class attributes.
-
joinStyles
protected boolean joinStyles
join multiple style attributes.
-
escapeCdata
protected boolean escapeCdata
replace CDATA sections with escaped text.
-
ncr
protected boolean ncr
allow numeric character references.
-
cssPrefix
protected java.lang.String cssPrefix
CSS class naming for -clean option.
-
replacementCharEncoding
protected java.lang.String replacementCharEncoding
char encoding used when replacing illegal SGML chars, regardless of specified encoding.
-
tt
protected TagTable tt
TagTable associated with this Configuration.
-
report
protected Report report
Report instance. Used for messages.
-
definedTags
protected int definedTags
track what types of tags user has defined to eliminate unnecessary searches.
-
newline
protected char[] newline
bytes for the newline marker.
-
rawOut
protected boolean rawOut
Avoid mapping values > 127 to entities.
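As a rough picture of what "mapping values > 127 to entities" means, the sketch below replaces every code point above 127 with a numeric character reference unless a rawOut-style flag is set. The class NumericEntitySketch and its escape helper are hypothetical; the library's own output code may differ, for example by preferring named entities.

```java
class NumericEntitySketch {

    /**
     * Returns the text unchanged when rawOut is true; otherwise rewrites each
     * code point above 127 as a decimal numeric character reference.
     */
    static String escape(String text, boolean rawOut) {
        if (rawOut) {
            return text;
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int codePoint = text.codePointAt(i);
            if (codePoint > 127) {
                out.append("&#").append(codePoint).append(';');
            } else {
                out.appendCodePoint(codePoint);
            }
            // Advance by one or two chars so surrogate pairs stay together.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("caf\u00e9", false));
        System.out.println(escape("caf\u00e9", true));
    }
}
```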
-
-
Constructor Detail
-
Configuration
protected Configuration(Report report)
Instantiates a new Configuration. This method should be called by Tidy only.
- Parameters:
- report - Report instance
-
-
Method Detail
-
addProps
public void addProps(java.util.Properties p)
adds configuration Properties.
- Parameters:
- p - Properties
-
parseFile
public void parseFile(java.lang.String filename)
Parses a property file.
- Parameters:
- filename - file name
-
isKnownOption
public static boolean isKnownOption(java.lang.String name)
Is the given String a valid configuration flag?
- Parameters:
- name - configuration parameter name
- Returns:
- true if the given String is a valid config option
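A check of this kind is typically used to reject misspelled option names before they are applied. The sketch below imitates that use with a local, hypothetical set of names; OptionCheckSketch, its KNOWN set, and unknownOptions are assumptions for illustration and do not reflect the actual option table behind Configuration.isKnownOption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

class OptionCheckSketch {

    // Hypothetical subset of option names, used only for this example.
    static final Set<String> KNOWN = new HashSet<>(
            Arrays.asList("indent-spaces", "wrap", "tab-size", "alt-text"));

    /** Local stand-in for a "known option" lookup. */
    static boolean isKnownOption(String name) {
        return name != null && KNOWN.contains(name);
    }

    /** Returns the property names that the lookup does not recognise, sorted. */
    static Set<String> unknownOptions(Properties props) {
        Set<String> unknown = new TreeSet<>();
        for (String name : props.stringPropertyNames()) {
            if (!isKnownOption(name)) {
                unknown.add(name);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("wrap", "72");
        props.setProperty("wrapp", "72"); // deliberate misspelling
        System.out.println(unknownOptions(props));
    }
}
```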
-
adjust
public void adjust()
Ensure that config is self consistent.
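To make "self consistent" concrete, the sketch below shows the general shape such a step could take: flags that depend on one another are forced into agreement. The page does not list the rules, so every rule in AdjustSketch is an assumption chosen for illustration; only the field names are borrowed from this class.

```java
class AdjustSketch {

    boolean encloseBlockText;
    boolean encloseBodyText;
    boolean xmlTags;
    boolean xmlOut;
    boolean quoteAmpersand;
    boolean hideEndTags;

    /** Forces related flags to agree with each other. */
    void adjust() {
        // Assumed rule: wrapping text in blocks implies wrapping text at body level.
        if (encloseBlockText) {
            encloseBodyText = true;
        }
        // Assumed rule: treating input as XML implies producing XML output.
        if (xmlTags) {
            xmlOut = true;
        }
        // Assumed rule: XML output needs escaped ampersands and explicit end tags.
        if (xmlOut) {
            quoteAmpersand = true;
            hideEndTags = false;
        }
    }

    public static void main(String[] args) {
        AdjustSketch config = new AdjustSketch();
        config.xmlTags = true;
        config.hideEndTags = true;
        config.adjust();
        System.out.println("xmlOut=" + config.xmlOut + " hideEndTags=" + config.hideEndTags);
    }
}
```

Running the dependency rules once, after all properties are loaded, keeps the individual setters simple and avoids order-dependent results.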
-
getInCharEncodingName
protected java.lang.String getInCharEncodingName()
Getter for inCharEncodingName.
- Returns:
- Returns the inCharEncodingName.
-
setInCharEncodingName
protected void setInCharEncodingName(java.lang.String encoding)
Setter for inCharEncodingName.
- Parameters:
- encoding - The inCharEncodingName to set.
-
getOutCharEncodingName
protected java.lang.String getOutCharEncodingName()
Getter for outCharEncodingName.
- Returns:
- Returns the outCharEncodingName.
-
setOutCharEncodingName
protected void setOutCharEncodingName(java.lang.String encoding)
Setter for outCharEncodingName.
- Parameters:
- encoding - The outCharEncodingName to set.
-
setInOutEncodingName
protected void setInOutEncodingName(java.lang.String encoding)
Setter for inOutCharEncodingName.
- Parameters:
- encoding - The CharEncodingName to set.
-
setOutCharEncoding
protected void setOutCharEncoding(int encoding)
Deprecated. Use setOutCharEncodingName(String).
Setter for outCharEncoding.
- Parameters:
- encoding - The outCharEncoding to set.
-
setInCharEncoding
protected void setInCharEncoding(int encoding)
Deprecated. Use setInCharEncodingName(String).
Setter for inCharEncoding.
- Parameters:
- encoding - The inCharEncoding to set.
-
convertCharEncoding
protected java.lang.String convertCharEncoding(int code)
Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
- Parameters:
- code - encoding code
- Returns:
- encoding name
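The conversion from a deprecated integer constant to an encoding name amounts to a lookup. The sketch below shows one way to write such a lookup; it is not the library's mapping. The integer values, the choice of canonical java.nio charset names, and the class EncodingNameSketch are all assumptions, and only three of the constants are covered.

```java
import java.nio.charset.Charset;

class EncodingNameSketch {

    // Illustrative values only; the real values are under "Constant Field Values".
    static final int ASCII = 1;
    static final int LATIN1 = 2;
    static final int UTF8 = 3;

    /** Maps an integer encoding code to a charset name, or null if unknown. */
    static String convert(int code) {
        switch (code) {
            case ASCII:
                return "US-ASCII";
            case LATIN1:
                return "ISO-8859-1";
            case UTF8:
                return "UTF-8";
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        String name = convert(UTF8);
        // The returned name can be resolved to a Charset by the standard library.
        System.out.println(Charset.forName(name));
    }
}
```

Working with names rather than integer codes lets callers use any encoding the Java runtime supports, which is presumably why the integer setters are deprecated in favour of the name-based ones.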
-
-