HTML_FMT(1)           User Contributed Perl Documentation          HTML_FMT(1)

NAME
       "html_fmt" - Reformat HTML, indented according to structure

SYNOPSIS
           html_fmt [uri|file]

EXAMPLE
           html_fmt http://perl.org

DESCRIPTION
       Given a URI or the name of a file, html_fmt writes the document to
       "STDOUT", reformatted and indented according to its HTML structure.
       Missing start and end tags are supplied, and comments are added to mark
       each one.  Text inside "<pre>" elements is not altered.

       html_fmt tries to parse everything that is actually out there on the
       Web.  In fact, html_fmt will assume any file fed to it was intended as
       HTML, and will produce its best guess of the author's intent.

       html_fmt supplies missing start and end tags, and its parser is
       extremely liberal in what it accepts.  When even this liberal reading
       of the standards is not sufficient to make a document into valid HTML,
       html_fmt picks characters to treat as noise, or "cruft".  The parser
       ignores cruft when determining the structure of the document.

       When html_fmt adds a missing start tag, it precedes the new start tag
       with a comment.  When html_fmt adds a missing end tag, it follows the
       new end tag with a comment.  When html_fmt classifies characters as
       "cruft", it adds a comment to that effect before the "cruft".

       "pre" elements receive special treatment.  The contents of "pre"
       elements are not reformatted.  When missing tags or cruft occur inside
       a "pre" element, the comments to that effect are placed before the
       "<pre>" start tag.

       The argument to html_fmt can be either a URI or a file name.  If it
       starts with alphanumerics followed by a colon, it is treated as a URI.
       Otherwise it is treated as a file name.
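
       The URI-versus-file rule can be illustrated with a short Python
       sketch.  This is not html_fmt's own code: the function name
       classify_argument and the exact pattern are assumptions, chosen as
       one plausible reading of "alphanumerics followed by a colon".

```python
import re

# Assumed reading of the rule: one or more ASCII letters or digits at the
# very start of the argument, immediately followed by a colon.
# The pattern html_fmt itself uses may differ.
_URI_PREFIX = re.compile(r"[A-Za-z0-9]+:")


def classify_argument(arg: str) -> str:
    """Return "uri" if arg looks like a URI under the rule above, else "file"."""
    return "uri" if _URI_PREFIX.match(arg) else "file"


if __name__ == "__main__":
    for arg in ("http://perl.org", "index.html", "./notes:draft.html", "C:page.html"):
        print(f"{arg!r} -> {classify_argument(arg)}")
```

       Under this reading, a file whose name begins with alphanumerics and
       a colon (such as "C:page.html") would be classified as a URI, while
       prefixing the name with "./" would keep it a file name.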

SAMPLE OUTPUT
       Given this input:

           <title>Test page<tr>x<head attr="I am cruft"><p>Final graf

       html_fmt writes:

           <!-- Following start tag is replacement for a missing one -->
           <html>
             <!-- Following start tag is replacement for a missing one -->
             <head>
               <title>
                 Test page
               </title>
               <!-- Preceding end tag is replacement for a missing one -->
             </head>
             <!-- Preceding end tag is replacement for a missing one -->
             <!-- Following start tag is replacement for a missing one -->
             <body>
               <!-- Following start tag is replacement for a missing one -->
               <table>
                 <!-- Following start tag is replacement for a missing one -->
                 <tbody>
                   <tr>
                     <!-- Following start tag is replacement for a missing one -->
                     <td>
                       x
                       <!-- Next line is cruft -->
                       <head attr="I am cruft">
                       <p>
                         Final graf
                       </p>
                       <!-- Preceding end tag is replacement for a missing one -->
                     </td>
                     <!-- Preceding end tag is replacement for a missing one -->
                   </tr>
                   <!-- Preceding end tag is replacement for a missing one -->
                 </tbody>
                 <!-- Preceding end tag is replacement for a missing one -->
               </table>
               <!-- Preceding end tag is replacement for a missing one -->
             </body>
             <!-- Preceding end tag is replacement for a missing one -->
           </html>
           <!-- Preceding end tag is replacement for a missing one -->
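
       Because every repair is marked with a comment, the output can be
       post-processed to summarize what was fixed.  The following Python
       sketch tallies the three comment texts that appear in the sample
       above.  It assumes those exact wordings, which are taken only from
       this sample, and the names MARKERS and tally_repairs are
       hypothetical.  A comment in the original document that happens to
       use the same wording would also be counted.

```python
import sys
from collections import Counter

# Comment texts copied from the sample output above; assumed to be stable.
MARKERS = {
    "Following start tag is replacement for a missing one": "start_tag_added",
    "Preceding end tag is replacement for a missing one": "end_tag_added",
    "Next line is cruft": "cruft",
}


def tally_repairs(formatted: str) -> Counter:
    """Count repair comments in html_fmt-style output, one comment per line."""
    counts = Counter()
    for line in formatted.splitlines():
        text = line.strip()
        if text.startswith("<!--") and text.endswith("-->"):
            label = MARKERS.get(text[4:-3].strip())
            if label is not None:
                counts[label] += 1
    return counts


if __name__ == "__main__":
    # Read formatted HTML from standard input and print one line per category.
    for label, count in sorted(tally_repairs(sys.stdin.read()).items()):
        print(f"{label}: {count}")
```

       Saved under a name of your choosing (tally.py here is only an
       example), it could be run as:

           html_fmt page.html | python tally.py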

PURPOSE
       This program is a demo of a demo.  Its purpose is to show how easy it is
       to write applications which look at the structure of web pages using
       Marpa::HTML.  And the purpose of Marpa::HTML is to demonstrate the
       power of its parse engine, Marpa.  Marpa::HTML was written in a few
       days, and its logic is a straightforward, natural expression of the
       structure of HTML.

ACKNOWLEDGMENTS
       The starting template for this code was HTML::TokeParser, by Gisle Aas.
       See also the acknowledgments for Marpa as a whole.

LICENSE AND COPYRIGHT
       Copyright 2007-2010 Jeffrey Kegler, all rights reserved.  Marpa is free
       software under the Perl license.  For details see the LICENSE file in
       the Marpa distribution.

perl v5.20.2                      2015-09-16                       HTML_FMT(1)