TibiCAM
Converting HTML to PDF has always been a big topic among web developers. There have been numerous tools over the years, and they all come and go. One that still stands is PrinceXML, but their license pricing is absurd: roughly $3800 USD for one license. Absolute lunacy.
A lot of people resort to Puppeteer or headless Chrome. But that's resource-heavy and terrible at cold starts, especially if you're going to print a few thousand PDFs per day.
I need a way to dynamically generate PDFs in my web app, mainly invoices but also some other stuff, and I've been thinking of going the HTML-to-PDF route. So far I've tried Puppeteer and didn't like how many resources it uses. I've also tried wkhtmltopdf, which is deprecated and has some security issues. It works, but like I said, it's insecure.
So I did some research, and all signs led to PrinceXML. But you "may not" use their tool for commercial purposes (e.g. invoices), and their free version adds a watermark (and probably metadata too). I wonder if anyone here knows how to get it for free (i.e. remove the watermark and any additional metadata)? Because let's be honest, nobody is paying $3800 as a small web dev to generate crisp PDFs from HTML.
I've also been thinking about LaTeX to PDF. My web app is in Node.js and I have all the data there, so I could easily build a LaTeX file programmatically. I've never used LaTeX myself, ever, but from my understanding it's quite straightforward to generate LaTeX files. And I've heard there's some CLI tool (unofficial or official?) to convert them to PDF.
So I wonder: if HTML to PDF isn't the best option, how would you dynamically generate PDFs from data you have in your web app/database?
Has anyone here ever used PrinceXML? I can't find any videos showing what it looks like, as it seems their only customers are large businesses. And has anyone here used LaTeX to create PDFs? I'd need to convert my data into LaTeX and then into a PDF. All of that can be done from Node.js by shelling out to a bash script.
Note: I don't need any fancy CSS. Just text, some tables, and my logo (an image) is all I need.
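For the LaTeX route, a minimal sketch of building the `.tex` source from app data in Node.js could look like the following. The `receipt` object shape and the function names `escapeLatex` and `buildReceiptTex` are hypothetical, made up for illustration. The main thing to get right is escaping: characters such as `&`, `%`, `_` and `#` have special meaning in LaTeX and will break compilation if they appear unescaped in product names or e-mail addresses.

```javascript
// Characters with special meaning in LaTeX, mapped to their escaped forms.
const LATEX_ESCAPES = {
  "\\": "\\textbackslash{}",
  "&": "\\&",
  "%": "\\%",
  "$": "\\$",
  "#": "\\#",
  "_": "\\_",
  "{": "\\{",
  "}": "\\}",
  "~": "\\textasciitilde{}",
  "^": "\\textasciicircum{}",
};

// Escape untrusted text in a single pass, so the braces and backslashes
// introduced by a replacement are never escaped a second time.
function escapeLatex(text) {
  return String(text).replace(/[\\&%$#_{}~^]/g, (ch) => LATEX_ESCAPES[ch]);
}

// Build a minimal receipt document. The `receipt` shape is an assumption.
function buildReceiptTex(receipt) {
  const rows = receipt.items
    .map(
      (item) =>
        `${escapeLatex(item.name)} & ${item.quantity} & ${escapeLatex(item.amount)} \\\\`
    )
    .join("\n");

  return [
    "\\documentclass[a4paper,11pt]{article}",
    "\\begin{document}",
    `\\section*{Receipt from ${escapeLatex(receipt.company)}}`,
    `Order number: ${escapeLatex(receipt.orderNumber)}`,
    "",
    "\\begin{tabular}{l r r}",
    "Product & Quantity & Amount \\\\",
    "\\hline",
    rows,
    "\\end{tabular}",
    "\\end{document}",
    "",
  ].join("\n");
}

const tex = buildReceiptTex({
  company: "Example, LLC.",
  orderNumber: "987654",
  items: [
    { name: "Jacket (Green)", quantity: 2, amount: "20.00 EUR" },
    { name: "Jacket (White)", quantity: 1, amount: "15.00 EUR" },
  ],
});
console.log(tex);
```

The resulting string could then be written to a file such as `receipt.tex` and compiled with a TeX engine, for example `pdflatex receipt.tex`, which requires a TeX distribution to be installed on the server.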
Consider this receipt I made in HTML:
HTML:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="author" content="">
<meta name="title" content="">
<meta name="description" content="">
<meta name="keywords" content="">
<title>Receipt</title>
<style>
*, *::after, *::before {
box-sizing: border-box;
margin: 0;
padding: 0;
}
html, body {
font-family: "Arial", Helvetica, sans-serif;
line-height: 1.6;
color: #333;
background-color: #fff;
}
body {
padding: 16px;
max-width: 800px;
margin: 0 auto;
height: 100%;
width: 100%;
overflow: hidden;
}
/* Header */
h1 {
font-size: 24px;
font-weight: 600;
color: #222;
margin-bottom: 10px;
}
.receipt-meta {
color: #292929;
margin-bottom: 30px;
}
.receipt-meta span {
display: block;
font-size: 14px;
margin: 4px 0;
}
/* Table Styles */
.table {
width: 100%;
border-collapse: collapse;
margin-bottom: 20px;
}
th, td {
padding: 5px 20px;
text-align: left;
font-size: 14px;
border-bottom: 1px solid #ddd;
}
th {
padding: 12px 20px;
background-color: #f8f8f8;
font-weight: 600;
color: #444;
}
td {
color: #3b3b3b;
}
td span {
font-size: 12px;
color: #888;
}
.total {
font-weight: 700;
}
.total td {
font-size: 14px;
padding: 10px 20px;
}
/* Footer */
.company-info {
margin-top: 40px;
font-size: 14px;
line-height: 1.6;
color: #777;
}
.company-info p {
margin-bottom: 10px;
}
.company-info p:last-child {
margin-bottom: 0;
}
/* Utility Classes */
.align-right {
text-align: right;
}
.align-center {
text-align: center;
}
.text-muted {
color: #999;
}
.bold {
font-weight: 700;
}
.shipping,
.vat-included {
border: none;
}
.shipping {
border-top: 2px solid #333;
}
.receipt-footer {
margin-top: 32px;
color: #292929;
font-size: 12px;
}
</style>
</head>
<body>
<h1>Receipt from Example, LLC.</h1>
<div class="receipt-meta">
<span>Date: <strong>2025-07-25</strong></span>
<span>Receipt number: <strong>1234567890</strong></span>
<span>Order number: <strong>987654</strong></span>
<span>Customer reference: <strong>[email protected]</strong></span>
<span>Payment method: <strong>VISA ************1234</strong></span>
</div>
<table class="table">
<thead>
<tr>
<th>Product</th>
<th>Quantity</th>
<th class="align-right">Amount</th>
</tr>
</thead>
<tbody>
<tr>
<td>Jacket (Green)<br><span>Prod.No: 12345</span></td>
<td>2</td>
<td class="align-right">20.00 EUR</td>
</tr>
<tr>
<td>Jacket (White)<br><span>Prod.No: 22300</span></td>
<td>1</td>
<td class="align-right">15.00 EUR</td>
</tr>
<tr class="shipping">
<td colspan="2" class="align-right">Shipping:</td>
<td class="align-right">8.00 EUR</td>
</tr>
<tr class="vat-included">
<td colspan="2" class="align-right">Amount excl. VAT:</td>
<td class="align-right">34.40 EUR</td>
</tr>
<tr class="vat-included">
<td colspan="2" class="align-right">VAT amount (25.00%):</td>
<td class="align-right">8.60 EUR</td>
</tr>
<tr class="total">
<td colspan="2" class="align-right bold">Total incl. VAT:</td>
<td class="align-right bold">43.00 EUR</td>
</tr>
</tbody>
</table>
<div class="company-info">
<p>
<span class="bold">VAT number:</span> 10001-0001<br>
<span class="bold">Org. number:</span> 1234-5678<br>
<span class="bold">Website:</span> www.example.com<br>
<span class="bold">Contact:</span> [email protected]
</p>
<div class="company-address">
<p>
<span class="bold">Example, LLC.</span><br>
123 Maple Street<br>
90210 Beverly Hills<br>
United States
</p>
</div>
</div>
<p class="receipt-footer">Your satisfaction is important to us. If you need to return your purchase, you have 14 calendar days from the date of receipt. Items must be unused, with all safety tags attached, and in their original packaging. Return shipping is at your expense. For full terms, visit www.example.com/terms. We recommend that you save or print this receipt. This receipt may be requested for returns or claims.</p>
</body>
</html>
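When a receipt like this is generated from data, the VAT lines are better derived from the total than typed by hand. A minimal sketch, assuming the listed amounts are line totals that already include 25% VAT (an assumption about this sample, as are the function names `vatBreakdown` and `formatEur`):

```javascript
// Work in integer cents to avoid floating-point drift in money arithmetic.
function vatBreakdown(totalInclVatCents, vatRatePercent) {
  // Net = gross / (1 + rate), rounded to the nearest cent.
  const exclVatCents = Math.round(
    (totalInclVatCents * 100) / (100 + vatRatePercent)
  );
  // Derive VAT by subtraction so net + VAT always equals the gross total.
  const vatCents = totalInclVatCents - exclVatCents;
  return { exclVatCents, vatCents };
}

// Format integer cents as a display string such as "34.40 EUR".
function formatEur(cents) {
  return `${(cents / 100).toFixed(2)} EUR`;
}

// Line totals from the receipt: 20.00 + 15.00, plus 8.00 shipping.
const totalCents = 2000 + 1500 + 800;
const { exclVatCents, vatCents } = vatBreakdown(totalCents, 25);

console.log("Total incl. VAT:", formatEur(totalCents));
console.log("Amount excl. VAT:", formatEur(exclVatCents));
console.log("VAT amount:", formatEur(vatCents));
```

For a 43.00 EUR total at 25% this gives 34.40 EUR excluding VAT and 8.60 EUR of VAT, and the two parts always add back up to the total.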
I'm running this through the PrinceXML CLI tool on Debian:
Bash:
cat index.html | prince --input=html --no-network --output="Receipt-987654.pdf" -
If I check the file on this site, it tells me it includes a lot of metadata from PrinceXML: Check files for metadata info (https://www.metadata2go.com)
So I installed "exiftool" to delete any metadata, like this:
Bash:
sudo apt install libimage-exiftool-perl
And then I run it on my generated PDF:
Bash:
exiftool -all= -overwrite_original Receipt-987654.pdf
If I check again on that EXIF website, it now shows no more metadata from PrinceXML. That's fine.
However... they still have their watermark in the top-right corner. And if I check with grep I can still find some info:
Bash:
$ grep -ai "prince" Receipt-987654.pdf
<</Producer (Prince 16.1 \(www.princexml.com\))
So I need a way to:
1) Delete any extra information inside the file about "PrinceXML"
2) Delete their watermark on the PDF.
Any idea how to do this?
I downloaded PrinceXML from their website using the deb package to Debian 12:
Prince - Download Prince 16 (www.princexml.com)
I refuse to pay thousands of dollars to turn HTML into PDF.
I'm attaching the PDF here with removed metadata using exiftool.
Attachments
- Receipt-987654.pdf (34.2 KB)