Change Log

Version 0.1.6 – Released 20 January, 2005

Bug fixed: false detection of US-ASCII, which caused spurious encoding
conflict reports.
Bug fixed: warnings were being written to the report without the URL that caused them.
Bug fixed: modified packaging script so that script ‘wurml’ has the executable bit set.

Feature: added -b, --brief to produce a short report on stdout instead of the verbose one.

Enhancement: removed the port number from the URL in the report when it’s the default port.
Enhancement: modified archive package so that the directory contains the full,
version-qualified name.
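
The default-port enhancement above could be sketched roughly as follows. This is an illustration in Python, not WurML's own code; the function name strip_default_port and the DEFAULT_PORTS table are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports per scheme (assumption: only http and https matter here).
DEFAULT_PORTS = {"http": 80, "https": 443}


def strip_default_port(url: str) -> str:
    """Return url without its explicit port when that port is the scheme's default."""
    parts = urlsplit(url)
    if parts.port is None or parts.port != DEFAULT_PORTS.get(parts.scheme):
        return url  # no port given, or a non-default one: leave the URL untouched
    # Rebuild the network location without the port.
    # Note: .hostname is returned in lower case by urlsplit.
    host = parts.hostname or ""
    if ":" in host:  # an IPv6 literal needs its brackets back
        host = f"[{host}]"
    userinfo = parts.netloc.rpartition("@")[0]  # empty when there is no user info
    netloc = f"{userinfo}@{host}" if userinfo else host
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

A URL such as http://example.com:8080/index.wml is returned unchanged, since 8080 is not the default port for http.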

Version 0.1.5 – Released 14 January, 2005

Bug fixed: when the DTD is specified with a relative URL, the path is now calculated
relative to the document itself.
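
As an illustration of the behaviour described in this fix, a relative DTD reference can be resolved against the URL of the document that declares it. The sketch below uses Python's standard urljoin; the function name resolve_dtd and the example URLs are hypothetical.

```python
from urllib.parse import urljoin


def resolve_dtd(document_url: str, system_id: str) -> str:
    """Resolve a DTD system identifier against the URL of the declaring document."""
    # urljoin leaves absolute identifiers alone and resolves relative ones
    # against the directory of document_url.
    return urljoin(document_url, system_id)
```

An absolute system identifier passes through unchanged, so the same call covers both cases.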

Feature: intelligent character encoding detection using the HTTP Content-Type header, the encoding attribute of the XML declaration, and brute force.
Feature: optional support for RFC 3023 (XML transmitted via HTTP).
Feature: added -k, --keeplocal to always ignore external sites.
Feature: multi-session cookie support. At the end of a session, cookies are always saved to the file ‘wurml-cookies’. When the next session is started with -C, --saved-cookies, those cookies are loaded into the new session.
Feature: if an <INPUT> has attribute ‘emptyok’, the empty string is added to list of values used for traversal.
Feature: added support in the config file to ignore specific attributes otherwise disallowed by the document DTD.
Feature: added ‘image/allow’ to the config file to configure acceptable MIME types for images embedded within documents.
Enhancement: after research, settled on UTF-8 as the built-in default encoding.
Enhancement: wurml-conf.xml now uses wurml-conf.0.2.dtd (backwards compatible with wurml-conf.0.1.dtd).
Enhancement: insert an empty line between document fetches (can be disabled via -S).
Enhancement: rewrote Win/DOS batch file to accept an unlimited number of arguments.
Enhancement: elapsed time now only reflects request and response time. WurML parse time is ignored.
Enhancement: added support for formal variable suffixes, i.e. $(var:e), $(var:noesc).
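
The encoding detection feature listed above names three sources: the HTTP Content-Type header, the XML declaration, and brute force. A minimal Python sketch of one plausible ordering follows. The precedence shown, the CANDIDATES list and all function names are assumptions for illustration, not WurML's actual implementation.

```python
import re
from typing import Optional

# Candidate encodings for the brute-force step (an assumption, not WurML's list).
CANDIDATES = ("utf-8", "iso-8859-1")


def charset_from_content_type(header: Optional[str]) -> Optional[str]:
    """Pull the charset parameter out of a Content-Type header value."""
    if not header:
        return None
    match = re.search(r'charset\s*=\s*"?([A-Za-z0-9._:-]+)"?', header, re.IGNORECASE)
    return match.group(1) if match else None


def charset_from_xml_declaration(body: bytes) -> Optional[str]:
    """Read the encoding from an XML declaration at the very start of the body."""
    match = re.match(rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', body)
    return match.group(1).decode("ascii") if match else None


def detect_encoding(content_type: Optional[str], body: bytes) -> str:
    """Try the header, then the XML declaration, then trial decoding."""
    declared = charset_from_content_type(content_type) or charset_from_xml_declaration(body)
    if declared:
        return declared
    for name in CANDIDATES:
        try:
            body.decode(name)  # brute force: accept the first candidate that decodes
            return name
        except UnicodeDecodeError:
            continue
    return "utf-8"  # built-in default named in this release's notes
```

Because ISO-8859-1 accepts any byte sequence, the trial-decoding loop always succeeds with this candidate list; the final return only matters if the list is changed.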

Version 0.1.4 – Released 02 December, 2004

Bug fixed: when run with -m, --dump, the contents of the retrieved document are now displayed even in the event of a parse error.

Bug fixed: when loading links from GO with method="GET", nested values from POSTFIELDs are now included, the same as for "POST".

Bug fixed: links with query arguments from POSTFIELD were being recognized as previously visited even if the value of the arguments had changed.
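
One way to avoid the problem described in this fix is to key the set of visited links on the argument values as well as the URL. The Python sketch below is a hypothetical illustration; visit_key, should_visit and the module-level visited set are names invented for the example.

```python
from typing import Dict, Set, Tuple

VisitKey = Tuple[str, str, Tuple[Tuple[str, str], ...]]

# Requests seen so far in this session.
visited: Set[VisitKey] = set()


def visit_key(method: str, url: str, fields: Dict[str, str]) -> VisitKey:
    """Build a key that changes whenever any field value changes."""
    # Sorting makes the key independent of the order the fields were declared in.
    return (method.upper(), url, tuple(sorted(fields.items())))


def should_visit(method: str, url: str, fields: Dict[str, str]) -> bool:
    """Return True the first time a given request is seen, False afterwards."""
    key = visit_key(method, url, fields)
    if key in visited:
        return False
    visited.add(key)
    return True
```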

Bug fixed: traversal with POSTFIELD now includes all possible combinations of
variable substitution.
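
Enumerating every combination of candidate values amounts to a Cartesian product. The sketch below shows the idea in Python; the function name field_combinations and the shape of its input are assumptions, not taken from WurML.

```python
from itertools import product
from typing import Dict, Iterator, List


def field_combinations(candidates: Dict[str, List[str]]) -> Iterator[Dict[str, str]]:
    """Yield one name-to-value mapping per combination of candidate values."""
    names = list(candidates)
    # product() walks the Cartesian product of the per-field value lists.
    for values in product(*(candidates[name] for name in names)):
        yield dict(zip(names, values))
```

For two fields with two and three candidate values respectively, this yields six mappings, one per request to try.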

Feature: -y, --dry-run now shows fully qualified links.

Feature: added switch -x, --trust-config to allow a non-validating parse of the
configuration if the DTD has been removed.

Version 0.1.3 – Released 29 November, 2004

Bug fixed: POST traversal was causing an infinite loop that manifested as an OutOfMemoryError.

Version 0.1.2 – Released 25 November, 2004

Bug fixed: When a redirect went to another site, links were created relative to the first URL.

Bug fixed: Added previously overlooked support for formal variable notation.

Bug fixed: Added previously overlooked support for setvar.

Feature: support for POST and postfield added.

Feature: added switch -n, --no-validation: the document is still parsed for well-formedness but not validated against the DTD.

Feature: -y, --dry-run: load only the first document and display discovered links.

Enhancement: When a document has leading whitespace, it is stripped off and a warning printed before it is handed to the parser.
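
An XML declaration is only valid at the very start of a document, which is why leading whitespace can trip up a parser. A rough Python sketch of the clean-up step follows; the function name and the warning text are invented for illustration and do not reflect WurML's actual output.

```python
import sys


def strip_leading_whitespace(body: bytes, url: str) -> bytes:
    """Drop leading XML whitespace from body, warning on stderr if any was removed."""
    stripped = body.lstrip(b" \t\r\n")
    removed = len(body) - len(stripped)
    if removed:
        # Hypothetical warning format; includes the URL so the report stays traceable.
        print(f"warning: {url}: removed {removed} leading whitespace byte(s)", file=sys.stderr)
    return stripped
```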

Enhancement: adjusted error reporting so that a link to the document containing the offending link is also displayed (simplifies debugging).