
Applying higher compression to PDF files

Posted: Wed Jan 13, 2021 6:50 pm
by chrispitude
Hi everyone,

This is a follow-up to

"Using a more recent PDF version for PDF production"
topic21147.html

I have a 1200-page book with many figures. When I compare -Dpdf.version=1.4 to -Dpdf.version=1.5, I see a definite size reduction:

Code: Select all

33036 -rw-rw-rw- 1 doc src 33688601 Jan 13 07:19 o14/ptug.pdf
21612 -rw-rw-rw- 1 doc src 22033988 Jan 13 07:23 o15/ptug.pdf
However, when I compress the PDF with Ghostscript (or any one of the many available PDF compression utilities):

Code: Select all

/depot/Ghostscript/ghostscript-9.53.3/bin/gs -sDEVICE=pdfwrite \
 -dPDFSETTINGS=/printer \
 -dPrinted=false \
 -dCompatibilityLevel=1.5 \
 -dEmbedAllFonts=true \
 -dSubsetFonts=true \
 -dFastWebView=true \
 -dNOPAUSE -dQUIET -dBATCH \
 -sOutputFile='FILE_opt.pdf' \
  'FILE.pdf'
I see that much more reduction is still available:

Code: Select all

33036 -rw-rw-rw- 1 doc src 33688601 Jan 13 07:19 o14/ptug.pdf
14640 -rw-rw-rw- 1 doc src 14925393 Jan 13 07:30 o14/ptug_opt.pdf

21612 -rw-rw-rw- 1 doc src 22033988 Jan 13 07:23 o15/ptug.pdf
14640 -rw-rw-rw- 1 doc src 14925393 Jan 13 07:30 o15/ptug_opt.pdf
Even starting from the PDF 1.5 file, I can still reduce the file size from 22 MB to 15 MB.
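
To put the listings above in relative terms, here is a minimal sketch that computes the percentage saved from two byte counts. The helper name `pct_reduction` is made up for illustration; the byte counts in the usage lines are the ones from the listings.

```shell
# Hypothetical helper: percentage size reduction between two byte counts.
# $1 = size before, $2 = size after; prints one decimal place.
pct_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) * 100 / before }'
}

pct_reduction 33688601 22033988   # PDF 1.4 -> PDF 1.5:          34.6
pct_reduction 22033988 14925393   # PDF 1.5 -> Ghostscript pass: 32.3
pct_reduction 33688601 14925393   # PDF 1.4 -> Ghostscript pass: 55.7
```

So the Ghostscript pass removes roughly another third of the file even after switching to PDF 1.5.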

In an Adobe FrameMaker flow, Adobe PDF Distiller (the actual PDF generation engine) provides fine control over PDF compression options. Even with PDF version 1.5 (a big step forward!), PDF Chemistry PDFs are larger than the corresponding FrameMaker PDFs.

In our production flow (Linux), we have a Bash script that runs Ghostscript as a post-processing step to get the size reduction.

But when our writers create PDFs on their Windows laptops, there's no easy way of doing that.
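
For reference, a post-processing step like the Linux one described above might look roughly like the following. This is only a sketch, not the actual production script: the function names, the `GS_BIN` override variable, and the `_opt.pdf` output suffix are assumptions made for illustration. The Ghostscript options are the ones from the command shown earlier.

```shell
#!/usr/bin/env bash
# Sketch only: names, GS_BIN, and the _opt suffix are illustrative assumptions.

# Run Ghostscript's pdfwrite device on one file.
# GS_BIN can point at a specific Ghostscript binary; it defaults to "gs" on PATH.
optimize_pdf() {
  local input="$1" output="$2"
  "${GS_BIN:-gs}" \
    -sDEVICE=pdfwrite \
    -dPDFSETTINGS=/printer \
    -dPrinted=false \
    -dCompatibilityLevel=1.5 \
    -dEmbedAllFonts=true \
    -dSubsetFonts=true \
    -dFastWebView=true \
    -dNOPAUSE -dQUIET -dBATCH \
    -sOutputFile="$output" \
    "$input"
}

# Optimize every PDF in a directory, writing NAME_opt.pdf next to NAME.pdf.
optimize_dir() {
  local dir="$1" pdf
  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue                    # glob matched nothing
    case "$pdf" in *_opt.pdf) continue ;; esac   # skip outputs of earlier runs
    optimize_pdf "$pdf" "${pdf%.pdf}_opt.pdf"
  done
}
```

Calling `optimize_dir path/to/pdfs` after the transformation finishes would process the whole output directory in one go. On Windows this still requires a Bash environment and a Ghostscript install, which is the gap described above.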

There's a neat commercial pure-Java PDF compression utility here:

https://www.pdftron.com/documentation/s ... imizerTest

This would work across all platforms. Or maybe there's an open-source Java PDF compression utility floating around?

Re: Applying higher compression to PDF files

Posted: Thu Jan 14, 2021 3:29 pm
by chrispitude
I also thought about pursuing better compression with the Apache FOP folks, but I wonder if they might want to stay focused on page layout and not get into the intricacies of optimizing PDF data structures. (Or maybe I'm wrong and they'd want to improve in this area too?)

Re: Applying higher compression to PDF files

Posted: Fri Jan 15, 2021 11:24 am
by Dan
I recorded an issue about reducing the size of the PDF files. In the meantime, the Apache PDFBox library (https://pdfbox.apache.org/) can load PDF files and resave them with other compression settings, so you may want to try it.

Re: Applying higher compression to PDF files

Posted: Thu Feb 24, 2022 8:59 am
by ritus
Hello,

Please support higher PDF compression in Oxygen PDF Chemistry itself so that we do not have to depend on external tools.
Most of our books are over 1200 pages, and some are as big as 5000 pages. The current pdf.version default value of 1.5 does not reduce the PDF size much.

Thanks,
Ritu

Re: Applying higher compression to PDF files

Posted: Thu Feb 24, 2022 12:03 pm
by julien_lacour
Hello,

I added your vote to the PDF compression feature request. This feature is currently planned for the next major Oxygen release (25.0).

Regards,
Julien

Re: Applying higher compression to PDF files

Posted: Fri Mar 04, 2022 3:23 pm
by chrispitude
Hi Julien,

I am happy to hear that this enhancement will get some consideration in Oxygen 25.0!

There are various aspects of PDF compression that could be considered:
  • Compressing streams
  • Merging identical streams
  • Downscaling/compressing images
  • Reordering primitives for fast web viewing
I shared these Ghostscript settings for PDF compression earlier in this discussion:

Code: Select all

gs \
 -sDEVICE=pdfwrite \
 -dPDFSETTINGS=/printer \
 -dPrinted=false \
 -dCompatibilityLevel=1.6 \
 -dEmbedAllFonts=true \
 -dSubsetFonts=true \
 -dFastWebView=true \
 -dNOPAUSE -dQUIET -dBATCH \
 -sOutputFile='output.pdf' \
  'input.pdf'
These settings have given us good file compression with no noticeable loss in quality. I don't claim that these are the best settings. There are many discussions of PDF compression settings for Ghostscript, and this is simply what I arrived at after some experimentation.

They might provide a good reference point for exploring PDF compression in PDF Chemistry.
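
As one way to explore the "downscaling/compressing images" item from the list above, the sketch below sets Ghostscript's image downsampling parameters explicitly in place of the `/printer` preset. The function name, the `GS_BIN` override, and the 150 dpi default are assumptions for illustration, not settings anyone in this thread has validated. The `-dDownsample...` and `-d...ImageResolution` options are standard pdfwrite distiller parameters.

```shell
# Sketch only: downsample_pdf, GS_BIN, and the 150 dpi default are assumptions.
# $1 = input PDF, $2 = output PDF, $3 = target image resolution in dpi (optional).
downsample_pdf() {
  local input="$1" output="$2" dpi="${3:-150}"
  "${GS_BIN:-gs}" \
    -sDEVICE=pdfwrite \
    -dCompatibilityLevel=1.6 \
    -dDownsampleColorImages=true \
    -dColorImageResolution="$dpi" \
    -dDownsampleGrayImages=true \
    -dGrayImageResolution="$dpi" \
    -dEmbedAllFonts=true \
    -dSubsetFonts=true \
    -dFastWebView=true \
    -dNOPAUSE -dQUIET -dBATCH \
    -sOutputFile="$output" \
    "$input"
}
```

For example, `downsample_pdf input.pdf output.pdf 200` would try a 200 dpi target. Whether a given resolution is acceptable depends on the figures, so the output needs a visual check.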

Re: Applying higher compression to PDF files

Posted: Tue Jul 05, 2022 1:36 pm
by chrispitude
Hi Julien,

Is PDF compression still being considered for v25?

Thanks!

- Chris

Re: Applying higher compression to PDF files

Posted: Thu Jul 07, 2022 10:15 am
by julien_lacour
Hi Chris,

PDF compression is still on the Oxygen 25.0 roadmap, but at the moment we are still investigating how to reduce PDF sizes.
If anything changes, I will update this thread with the latest information.

Regards,
Julien

Re: Applying higher compression to PDF files

Posted: Wed Oct 19, 2022 6:39 pm
by harry43
Open a PDF in Acrobat.
Choose File > Reduce File Size or Compress PDF Online.
Choose the location to save the file and click Save. Acrobat displays a message confirming the reduction in PDF size.

Re: Applying higher compression to PDF files

Posted: Fri Oct 28, 2022 4:16 pm
by chrispitude
Hi Harry,

Thanks for the suggestion! Writers sometimes do this for larger books when publishing review builds from Oxygen on their Windows laptops. But some of our help collections have dozens of PDFs, and it is not practical to do this for so many files.

Re: Applying higher compression to PDF files

Posted: Fri Nov 11, 2022 1:55 am
by chrispitude
Hi Julien,

Can you let us know the issue ID for PDF compression? Thanks!

Re: Applying higher compression to PDF files

Posted: Fri Nov 11, 2022 9:48 am
by julien_lacour
Hi Chris,

Sure, it is CH-661.

Regards,
Julien