Applying higher compression to PDF files

chrispitude
Posts: 584
Joined: Thu May 02, 2019 2:32 pm

Post by chrispitude »

Hi everyone,

This is a follow-up to

"Using a more recent PDF version for PDF production"
topic21147.html

I have a 1200-page book with many figures. When I compare -Dpdf.version=1.4 to -Dpdf.version=1.5, I see a definite size reduction (from about 33.7 MB down to about 22.0 MB):

Code:

33036 -rw-rw-rw- 1 doc src 33688601 Jan 13 07:19 o14/ptug.pdf
21612 -rw-rw-rw- 1 doc src 22033988 Jan 13 07:23 o15/ptug.pdf

However, when I compress the PDF with Ghostscript (or any of the many available PDF compression utilities):

Code:

/depot/Ghostscript/ghostscript-9.53.3/bin/gs -sDEVICE=pdfwrite \
 -dPDFSETTINGS=/printer \
 -dPrinted=false \
 -dCompatibilityLevel=1.5 \
 -dEmbedAllFonts=true \
 -dSubsetFonts=true \
 -dFastWebView=true \
 -dNOPAUSE -dQUIET -dBATCH \
 -sOutputFile='FILE_opt.pdf' \
  'FILE.pdf'

I see that considerably more reduction is still available:

Code:

33036 -rw-rw-rw- 1 doc src 33688601 Jan 13 07:19 o14/ptug.pdf
14640 -rw-rw-rw- 1 doc src 14925393 Jan 13 07:30 o14/ptug_opt.pdf

21612 -rw-rw-rw- 1 doc src 22033988 Jan 13 07:23 o15/ptug.pdf
14640 -rw-rw-rw- 1 doc src 14925393 Jan 13 07:30 o15/ptug_opt.pdf

Even starting from the PDF v1.5 file, I can still reduce the file size from 22 MB to 15 MB.
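
To put numbers on the listings above, here is a small sketch that computes the percentage reduction between two byte counts. The function name pct_reduction is made up for illustration; the byte counts are copied from the ls output above.

```shell
# pct_reduction BEFORE AFTER
# Prints how much smaller AFTER is than BEFORE, as a percentage with one decimal.
# (Hypothetical helper name; not part of any tool mentioned in this thread.)
pct_reduction() {
  # LC_ALL=C keeps the decimal separator a period regardless of locale.
  LC_ALL=C awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) * 100 / before }'
}

pct_reduction 33688601 22033988   # PDF 1.4 -> PDF 1.5
pct_reduction 22033988 14925393   # PDF 1.5 -> Ghostscript output
pct_reduction 33688601 14925393   # PDF 1.4 -> Ghostscript output
```

With these byte counts the three calls print 34.6, 32.3, and 55.7, so Ghostscript still removes roughly a third of the PDF 1.5 file.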

In an Adobe FrameMaker flow, Adobe Acrobat Distiller (the actual PDF generation engine) provides fine control over PDF compression options. Even with PDF version 1.5 (a big step forward!), PDF Chemistry PDFs are larger than the corresponding FrameMaker PDFs.

In our production flow (Linux), we have a Bash script that runs Ghostscript as a post-processing step to get the size reduction.
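
A post-processing step of that kind could be sketched roughly as follows. This is not the actual production script: the function name compress_pdf and the GS_BIN variable are placeholders made up for illustration, and the Ghostscript options are simply the ones shown earlier in this post.

```shell
#!/usr/bin/env bash
# Sketch of a Ghostscript post-processing step (not the actual production script).
# GS_BIN is a hypothetical override for the Ghostscript executable;
# it defaults to "gs" found on PATH.

# compress_pdf INPUT.pdf
# Writes INPUT_opt.pdf next to the input, using the options from this thread.
compress_pdf() {
  local input="$1"
  local output="${input%.pdf}_opt.pdf"
  "${GS_BIN:-gs}" -sDEVICE=pdfwrite \
    -dPDFSETTINGS=/printer \
    -dPrinted=false \
    -dCompatibilityLevel=1.5 \
    -dEmbedAllFonts=true \
    -dSubsetFonts=true \
    -dFastWebView=true \
    -dNOPAUSE -dQUIET -dBATCH \
    "-sOutputFile=$output" \
    "$input"
}

# Example call (assumes Ghostscript is installed):
#   compress_pdf o15/ptug.pdf    # would write o15/ptug_opt.pdf
```

Keeping the original file and writing a separate _opt.pdf makes it easy to compare the two sizes before deciding which one to publish.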

But when our writers create PDFs on their Windows laptops, there's no easy way of doing that.

There's a neat commercial pure-Java PDF compression utility here:

https://www.pdftron.com/documentation/s ... imizerTest

This would work across all platforms. Or maybe there's an open-source Java PDF compression utility floating around?
chrispitude

Re: Applying higher compression to PDF files

Post by chrispitude »

I also thought about pursuing better compression with the Apache FOP folks, but I wonder if they might want to stay focused on page layout and not get into the intricacies of optimizing PDF data structures. (Or maybe I'm wrong and they'd want to improve in this area too?)
Dan
Posts: 500
Joined: Mon Feb 03, 2003 10:56 am

Re: Applying higher compression to PDF files

Post by Dan »

I recorded an issue about reducing the size of the PDF files. The Apache PDFBox library (https://pdfbox.apache.org/) can load PDF files and resave them with different compression settings, so you may want to try it.
ritus
Posts: 4
Joined: Thu Feb 24, 2022 8:28 am

Re: Applying higher compression to PDF files

Post by ritus »

Hello,

Please support higher PDF compression in Oxygen PDF Chemistry itself so that we do not have to depend on external tools.
Most of our books are longer than 1200 pages, and some are as long as 5000 pages. The current pdf.version default value of 1.5 does not reduce the PDF size much.

Thanks,
Ritu
julien_lacour
Posts: 303
Joined: Wed Oct 16, 2019 3:47 pm

Re: Applying higher compression to PDF files

Post by julien_lacour »

Hello,

I added your vote to the PDF compression feature request. This feature is currently planned for the next major Oxygen release (25.0).

Regards,
Julien
chrispitude

Re: Applying higher compression to PDF files

Post by chrispitude »

Hi Julien,

I am happy to hear that this enhancement will get some consideration in Oxygen 25.0!

There are various aspects of PDF compression that could be considered:
  • Compressing streams
  • Merging identical streams
  • Downscaling/compressing images
  • Reordering primitives for fast web viewing

I shared these Ghostscript settings for PDF compression earlier in this discussion:

Code:

gs \
 -sDEVICE=pdfwrite \
 -dPDFSETTINGS=/printer \
 -dPrinted=false \
 -dCompatibilityLevel=1.6 \
 -dEmbedAllFonts=true \
 -dSubsetFonts=true \
 -dFastWebView=true \
 -dNOPAUSE -dQUIET -dBATCH \
 -sOutputFile='output.pdf' \
  'input.pdf'

These settings have given us good file compression with no noticeable loss in quality. I don't claim that these are the best settings. There are many discussions of PDF compression settings for Ghostscript; this is simply what I arrived at after some experimentation.

They might provide a good reference point for exploring PDF compression in PDF Chemistry.
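
For anyone who wants to repeat that kind of experimentation, here is a rough sketch that runs one PDF through the standard -dPDFSETTINGS presets and lists the resulting sizes. The function names and the GS_BIN variable are made up for illustration; the remaining options are the ones from the command above. Note that /screen and /ebook downsample images much more aggressively than /printer, so some quality loss should be expected with those two.

```shell
#!/usr/bin/env bash
# Sketch: run one PDF through several -dPDFSETTINGS presets and list output sizes.
# GS_BIN and both function names are hypothetical; GS_BIN defaults to "gs" on PATH.

# preset_output INPUT.pdf PRESET  ->  prints INPUT_PRESET.pdf
preset_output() {
  printf '%s_%s.pdf\n' "${1%.pdf}" "$2"
}

# try_presets INPUT.pdf
# Prints "<bytes><TAB><file>" per preset; stops at the first Ghostscript failure.
try_presets() {
  local input="$1" preset output
  for preset in screen ebook printer prepress; do
    output="$(preset_output "$input" "$preset")"
    "${GS_BIN:-gs}" -sDEVICE=pdfwrite \
      "-dPDFSETTINGS=/$preset" \
      -dPrinted=false \
      -dCompatibilityLevel=1.6 \
      -dEmbedAllFonts=true \
      -dSubsetFonts=true \
      -dFastWebView=true \
      -dNOPAUSE -dQUIET -dBATCH \
      "-sOutputFile=$output" \
      "$input" || return 1
    # wc -c gives the size in bytes; tr strips padding some platforms add.
    printf '%s\t%s\n' "$(wc -c < "$output" | tr -d ' ')" "$output"
  done
}

# Example call (assumes Ghostscript is installed):
#   try_presets input.pdf
```

Comparing the byte counts alongside a visual check of a few figure-heavy pages is probably the quickest way to pick a preset for a given book.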
chrispitude

Re: Applying higher compression to PDF files

Post by chrispitude »

Hi Julien,

Is PDF compression still being considered for v25?

Thanks!

- Chris
julien_lacour

Re: Applying higher compression to PDF files

Post by julien_lacour »

Hi Chris,

PDF compression is still on the Oxygen 25.0 roadmap, but at the moment we are still investigating how to reduce PDF file sizes.
If anything changes, I will update this thread with the latest information.

Regards,
Julien