HTMLDOC converts HTML to PDF. These notes cover the Linux command line; MS Windows is not covered, but works similarly.
Documentation: http://www.htmldoc.org/documentation.php
Source 1)
To convert a single web page type:
<code> htmldoc --webpage -f output.pdf filename.html ENTER</code>
Try the following exercise: You want to convert the file myhtml.html into a PDF file. The new file will be called mypdf.pdf. How would you do this? (Don't worry, it's answered for you on the next line. But try first.) To accomplish this, type:
<code> htmldoc --webpage -f mypdf.pdf myhtml.html ENTER</code>
To convert more than one web page with page breaks between each HTML file, type:
<code> htmldoc --webpage -f output.pdf file1.html file2.html ENTER</code>
All we are doing is adding another file. In this example we are converting two files: file1.html and file2.html. Try this example: Convert one.html and two.html into a PDF file named 12pdf.pdf. Again, the answer is on the next line.
Your command line should look like this:
<code> htmldoc --webpage -f 12pdf.pdf one.html two.html ENTER</code>
We've been using HTML files, but you can also use URLs. For example:
<code> htmldoc --webpage -f output.pdf http://slashdot.org/ ENTER</code>
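The single-file, multi-file, and URL forms above differ only in the list of inputs, so they can be wrapped in one small helper. The following is a minimal sketch, assuming a POSIX shell. The function name `html_to_pdf` and the `HTMLDOC` variable are hypothetical conveniences of this sketch, not part of HTMLDOC itself.

```shell
#!/bin/sh
# Hypothetical helper: convert one or more HTML files or URLs into one PDF.
# Usage: html_to_pdf OUTPUT.pdf INPUT [INPUT ...]
# HTMLDOC names the program to run (default: htmldoc). It is an assumption
# of this sketch and allows a dry run with HTMLDOC=echo.
html_to_pdf() {
    if [ "$#" -lt 2 ]; then
        echo "usage: html_to_pdf OUTPUT.pdf INPUT [INPUT ...]" >&2
        return 2
    fi
    pdf_out=$1
    shift
    # Same options as the examples above: --webpage mode, -f names the output.
    ${HTMLDOC:-htmldoc} --webpage -f "$pdf_out" "$@"
}
```

For example, `html_to_pdf 12pdf.pdf one.html two.html` runs the same command as the exercise above. Setting `HTMLDOC=echo` beforehand prints the arguments instead of running the conversion.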
Type one of the following commands to generate a book from one or more HTML files:
<code>
htmldoc --book -f output.html file1.html file2.html ENTER
htmldoc --book -f output.pdf file1.html file2.html ENTER
htmldoc --book -f output.ps file1.html file2.html ENTER
</code>
HTMLDOC will build a table of contents for the book using the heading elements (H1, H2, etc.) in your HTML files. It will also add a title page using the document TITLE text (you're going to learn about title files shortly) and other META information you supply in your HTML files. See Chapter 6 - HTML Reference of the HTMLDOC documentation for more information on the META variables that are supported.

Note: When using book mode, HTMLDOC starts rendering with the first H1 element. Any text, images, tables, and other viewable elements that precede the first H1 element are silently ignored. Because of this, make sure you have an H1 element in your HTML file; otherwise HTMLDOC will not convert anything!
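Because book mode silently drops everything before the first H1, it can help to check the input files before converting. The following is a minimal sketch, assuming a POSIX shell with `grep`. The function name `check_h1` is hypothetical, and the check is a plain text search rather than real HTML parsing.

```shell
#!/bin/sh
# Hypothetical pre-flight check for book mode: report files without an H1.
# Returns 0 if every file contains at least one <h1> or <h1 ...> tag,
# and 1 otherwise. This is a text search, not an HTML parser, so an <h1>
# inside an HTML comment would still count as a match.
check_h1() {
    missing=0
    for f in "$@"; do
        if ! grep -qi '<h1[ >]' "$f"; then
            echo "no H1 element found: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A possible use is `check_h1 file1.html file2.html && htmldoc --book -f output.pdf file1.html file2.html`, which runs the conversion only when every input contains a heading.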
The --titlefile option sets the HTML file or image to use on the title page:
<code>
htmldoc --titlefile filename.bmp ... ENTER
htmldoc --titlefile filename.gif ... ENTER
htmldoc --titlefile filename.jpg ... ENTER
htmldoc --titlefile filename.png ... ENTER
htmldoc --titlefile filename.html ... ENTER
</code>
HTMLDOC supports BMP, GIF, JPEG, and PNG images, as well as generic HTML text you supply for the title page(s).
<code> htmldoc --book -f 12book.pdf 1book.html 2book.html --titlefile bookcover.jpg ENTER</code>
Take a look at the entire command line and dissect the information. Can you see what the new filename is? What are the names of the files being converted? Do you see the title page file? What kind of file is your title file? Did you figure it out? The new file is 12book.pdf. The files converted were 1book.html and 2book.html. A title page was created using the JPEG image file bookcover.jpg.
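Besides an image, the title page can be an HTML file. The following is a minimal sketch, assuming a POSIX shell, of a helper that writes such a file. The function name `write_title_page`, the file name, and the markup are hypothetical, and HTMLDOC does not require this particular layout.

```shell
#!/bin/sh
# Hypothetical helper: write a very small HTML title page.
# Usage: write_title_page FILE TITLE AUTHOR
# TITLE and AUTHOR are inserted as-is (no HTML escaping), so they should
# not contain characters such as < or &.
write_title_page() {
    cat > "$1" <<EOF
<html>
<head><title>$2</title></head>
<body>
<p align="center"><b>$2</b></p>
<p align="center">$3</p>
</body>
</html>
EOF
}
```

After `write_title_page booktitle.html "My Book" "A. N. Author"`, the file can be passed to HTMLDOC in the same way as the JPEG above, for example `htmldoc --book --titlefile booktitle.html -f 12book.pdf 1book.html 2book.html`.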
Source 2)
HTMLDOC supports many special HTML comments to initiate page breaks, set the header and footer text, and control the current media options:
<!-- FOOTER LEFT "foo" -->
Sets the left footer text; the text is applied to the current page if that page is still empty, or to the next page otherwise.
<!-- FOOTER CENTER "foo" -->
Sets the center footer text; the text is applied to the current page if that page is still empty, or to the next page otherwise.
<!-- FOOTER RIGHT "foo" -->
Sets the right footer text; the text is applied to the current page if that page is still empty, or to the next page otherwise.
<!-- HALF PAGE -->
Break to the next half page.
<!-- HEADER LEFT "foo" -->
Sets the left header text; the text is applied to the current page if that page is still empty, or to the next page otherwise.
<!-- HEADER CENTER "foo" -->
Sets the center header text; the text is applied to the current page if that page is still empty, or to the next page otherwise.
<!-- HEADER RIGHT "foo" -->
Sets the right header text; the text is applied to the current page if that page is still empty, or to the next page otherwise.
<!-- MEDIA BOTTOM nnn -->
Sets the bottom margin of the page. The "nnn" string can be any standard measurement value, e.g. 0.5in, 36, 12mm, etc. Breaks to a new page if the current page is already marked.
<!-- MEDIA COLOR "foo" -->
Sets the media color attribute for the page. The "foo" string is any color name that is supported by the printer, e.g. "Blue", "White", etc. Breaks to a new page or sheet if the current page is already marked.
<!-- MEDIA DUPLEX NO -->
Chooses single-sided printing for the page; breaks to a new page or sheet if the current page is already marked.
<!-- MEDIA DUPLEX YES -->
Chooses double-sided printing for the page; breaks to a new sheet if the current page is already marked.
<!-- MEDIA LANDSCAPE NO -->
Chooses portrait orientation for the page; breaks to a new page if the current page is already marked.
<!-- MEDIA LANDSCAPE YES -->
Chooses landscape orientation for the page; breaks to a new page if the current page is already marked.
<!-- MEDIA LEFT nnn -->
Sets the left margin of the page. The "nnn" string can be any standard measurement value, e.g. 0.5in, 36, 12mm, etc. Breaks to a new page if the current page is already marked.
<!-- MEDIA POSITION nnn -->
Sets the media position attribute (input tray) for the page. The "nnn" string is an integer that usually specifies the tray number. Breaks to a new page or sheet if the current page is already marked.
<!-- MEDIA RIGHT nnn -->
Sets the right margin of the page. The "nnn" string can be any standard measurement value, e.g. 0.5in, 36, 12mm, etc. Breaks to a new page if the current page is already marked.
<!-- MEDIA SIZE foo -->
Sets the media size to the specified size. The "foo" string can be "Letter", "Legal", "Universal", or "A4" for standard sizes or "WIDTHxHEIGHTunits" for custom sizes, e.g. "8.5x11in". Breaks to a new page or sheet if the current page is already marked.
<!-- MEDIA TOP nnn -->
Sets the top margin of the page. The "nnn" string can be any standard measurement value, e.g. 0.5in, 36, 12mm, etc. Breaks to a new page if the current page is already marked.
<!-- MEDIA TYPE "foo" -->
Sets the media type attribute for the page. The "foo" string is any type name that is supported by the printer, e.g. "Plain", "Glossy", etc. Breaks to a new page or sheet if the current page is already marked.
<!-- NEED length -->
Break if there is less than "length" units left on the current page. The length value defaults to lines of text but can be suffixed by in, mm, or cm to give the length in those units instead.
<!-- NEW PAGE -->
Break to the next page.
<!-- NEW SHEET -->
Break to the next sheet.
<!-- NUMBER-UP nn -->
Sets the number of pages that are placed on each output page. Valid values are 1, 2, 4, 6, 9, and 16.
<!-- PAGE BREAK -->
Break to the next page.
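To illustrate how several of the comments listed above can be combined in one file, the following sketch writes a small sample page from the shell. The file name `comments-demo.html` and all text in it are hypothetical; only the comment syntax is taken from the list above.

```shell
#!/bin/sh
# Write a sample page (hypothetical file name and text) that uses a few of
# the HTMLDOC comments: media size and margin, header and footer text,
# an explicit page break, a conditional break, and a landscape page.
cat > comments-demo.html <<'EOF'
<html>
<head><title>Comment demo</title></head>
<body>
<!-- MEDIA SIZE A4 -->
<!-- MEDIA LEFT 12mm -->
<!-- HEADER LEFT "Draft" -->
<!-- FOOTER CENTER "Internal use only" -->
<h1>First section</h1>
<p>Text on the first page.</p>
<!-- NEW PAGE -->
<h1>Second section</h1>
<!-- NEED 3in -->
<p>This paragraph moves to a new page if less than 3 inches remain.</p>
<!-- MEDIA LANDSCAPE YES -->
<h1>Wide content</h1>
<p>This page is laid out in landscape orientation.</p>
</body>
</html>
EOF
```

The resulting file can then be converted like any other page, for example with `htmldoc --webpage -f comments-demo.pdf comments-demo.html`.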