DENATURE(1)           User Contributed Perl Documentation          DENATURE(1)

NAME

denature -- A script that converts an HTML file to an XSL-FO file and then, using FOP, to a PDF.

SYNOPSIS

denature [-i dir] [-c dir] [-l] [-s] [-f] [-e] [-w] [-d] [-p] [-t] [-h] <input file> <output file>

DESCRIPTION

denature parses the given HTML file and produces a roughly equivalent PDF file. denature should be able to recognize most HTML files, but if the file is significantly ill-formed the parser may have problems.

denature can add page breaks to the output PDF file. The placement of the page breaks is controlled by comments inside the HTML file. Where a page break is desired, place a <!-- PAGEBREAK --> comment, and a new page will be started before the next block of text. This can have odd consequences if the page break is placed inside a table data cell, so be careful where the break is placed.
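As an illustration of the page-break comment, the following sketch writes a small HTML file with one marker between two paragraphs (outside any table cell) and counts the markers before converting. The file names sample.html and sample.pdf are hypothetical, and the conversion step only runs if a denature command is found on the PATH.

```shell
#!/bin/sh
# Write a small HTML file with one page-break marker between two blocks.
# sample.html is a hypothetical file name chosen for this sketch.
cat > sample.html <<'EOF'
<html>
<head><title>Page break demo</title></head>
<body>
<p>This paragraph ends the first page.</p>
<!-- PAGEBREAK -->
<p>This paragraph starts the second page.</p>
</body>
</html>
EOF

# Count the markers so the number of requested breaks is known up front.
breaks=$(grep -c '<!-- PAGEBREAK -->' sample.html)
echo "page break markers: $breaks"

# Convert only if denature is actually installed on this machine.
if command -v denature >/dev/null 2>&1; then
    denature sample.html sample.pdf
fi
```

Keeping the marker between top-level blocks, as here, avoids the table-cell problem described above.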

OPTIONS

-i dir -- image directory. Directory where images used in the HTML document are located.
-c dir -- CSS file directory. Directory where the CSS file used is located, if needed.
-l -- landscape mode. Will attempt to print the PDF as a landscape page (EXPERIMENTAL).
-s -- SVG graphics. Enables the use of SVG graphics to draw some portions of the input fields. This requires a valid DISPLAY variable and a valid X server, because the java.awt packages are used to draw the images.
-f -- footer. Will print a footer on the bottom of each page containing p. <page num>.
-e -- header. Will print a header on the top of each page containing the <title></title> contents.
-p -- padding. Turns on use of the padding property in CSS. This can cause some odd results in the output at the moment.
-w -- warning. Will print out warning information.
-d -- debug. Will print out debug information (this includes warnings). Only needed for development.
-t -- test only. Used to test denature; does not run FOP and leaves the temporary XML file in place.
-h -- help. Prints out a brief help message.
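A typical invocation combining several of these options might look like the sketch below. The directory and file names are hypothetical placeholders. The command is assembled into a variable so it can be inspected first, and it only runs if denature is on the PATH.

```shell
#!/bin/sh
# Hypothetical paths for this sketch; substitute real ones.
img_dir=./images
css_dir=./css
input=report.html
output=report.pdf

# -i and -c each take a directory; -e adds the <title> header,
# -f adds the page-number footer, -w prints warnings.
cmd="denature -i $img_dir -c $css_dir -e -f -w $input $output"
echo "$cmd"

# Run it only when denature is available. The unquoted expansion relies
# on word splitting, so this form assumes paths without spaces.
if command -v denature >/dev/null 2>&1; then
    $cmd
fi
```

Adding -t to the same command line would stop before FOP is run and keep the intermediate XML file for inspection.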

ENVIRONMENT

There are several external dependencies which denature requires to execute. They are:

HTML-Tagset (perl module)
HTML-Parser (perl module)
HTML-Tree (perl module)
CSS::Tiny (perl module)
FOP
JDK

The perl modules should be available through CPAN http://www.cpan.org, while FOP is available from http://xml.apache.org/fop/index.html.

Once all the requirements are installed, two variables at the top of the denature script need to be set to the FOP and JDK install directories. The variables are:

$fop_home  The directory that contains the fop.sh script.
$java_home  The directory that contains the JDK (this is the same as the JAVA_HOME environment variable).

The JDK version required should be specified on the FOP website. Once these variables are set, denature should work correctly.
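Before editing the script variables, it can help to confirm that the perl modules listed above are loadable. The following is a minimal sketch, not part of denature itself. It assumes perl is on the PATH and uses HTML::TreeBuilder as the module to load from the HTML-Tree distribution.

```shell
#!/bin/sh
# Perl modules named in this section. HTML::TreeBuilder is the main
# module shipped in the HTML-Tree distribution.
modules="HTML::Tagset HTML::Parser HTML::TreeBuilder CSS::Tiny"

checked=0
missing=""
for m in $modules; do
    checked=$((checked + 1))
    # "perl -MName -e 1" exits non-zero when the module cannot be loaded
    # (or when perl itself is not installed).
    if perl -M"$m" -e 1 >/dev/null 2>&1; then
        echo "ok       $m"
    else
        echo "missing  $m"
        missing="$missing $m"
    fi
done

# Report whether a java executable is reachable; FOP needs a JDK.
if command -v java >/dev/null 2>&1; then
    echo "ok       java"
else
    echo "missing  java"
fi
```

Any module reported as missing can be installed from CPAN before setting $fop_home and $java_home in the script.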

LIMITATIONS

denature does not currently handle all of HTML. Support for unknown tags is added as they are encountered. Not all <form> elements are handled, and some are handled only if the SVG flag (-s) is passed. The others could be added; I just have not had time. Currently, text fields just have their values printed; boxes could be drawn around them in the future. The CSS support in denature covers only simple rules of the form TD.style or .style.

VERSION

This is denature 0.6.5.

AUTHOR

dan sinclair <dan.sinclair@treklogic.com>

perl v5.20.3                      2016-02-19                       DENATURE(1)
