    LEAD Support Forum
  Resource to find answers and post technical questions about LEAD products.

Merging PDFs loses pages
Started by keithway at 11-17-2008 12:44. Topic has 5 replies.

11-17-2008, 12:44
keithway

Attachment: LT PDF Merge Demo.zip
Hello, I am using Leadtools for .NET (Document Imaging, PDF Read/Save) version 15.0.1.0

I am having a problem loading multiple PDF files and saving (merging) them into a single .tif file.

I have built a demo application to let you guys see what is going on and it is attached to this post.

Here is how to reproduce...


1) Unzip the attached .zip file. You will find the following files in the first directory...

40pages.pdf - this is the test file I have been using to reproduce this issue. Note: It is not a big file.. only 90K.

Bin Directory.JPG - This is just a picture of the Leadtools files I had in my Bin directory. Note: The exact version numbers are visible in this picture. You will need to have Leadtools.dll and Leadtools.Codecs.dll referenced in your project. The codecs and the PDF folder just need to be placed in your bin directory.

LT PDF Merge Demo.sln - VB.Net solution file for the demo application

LT PDF Merge Demo (Folder) - Project files for the demo application.

2) Re-add the LT unlock codes. You will find the sub that unlocks the licenses in Common.vb under the namespace 'Initialization' and the Sub 'Initialize_Licenses'

3) Re-add the missing LT references in your project. Take a look at Bin Directory.jpg as described above for exact version numbers.

4) Re-add your codec files as well as the PDF directory.

5) Build and run the demo application.

6) Use the 'source files' list box to select the 40pages.pdf file that is included in this demo. NOTE: Depending on how much RAM you have on your machine you may need to take additional steps as outlined below in 'Additional Info'

7) You may leave the destination file name alone or select a new name.

8) Click on the merge button. All controls will be disabled until the merge is complete. Once all controls are enabled again, you can then go check your destination file.

9) Check the number of pages in your destination file. You can do this either by 'right-clicking' on the image file, choosing properties, selecting the summary tab and then clicking the 'advanced' button, or you can open the image up in a viewer that supports multiple pages.

If you used the 40pages.pdf file included in this demo and the resulting file still has 40 pages, please read 'Additional Info' below to see how to reproduce this. If you have fewer than 40 pages, you have already reproduced the issue.

ADDITIONAL INFO:

While troubleshooting this problem, I have uncovered a few things...

1) The number of source files needed to reproduce this issue seems to vary depending on the amount of RAM you have on your machine. For example, I have run this on a machine that only has half a gig of RAM and I was able to reproduce this using only the 40 page pdf provided. However, on a machine with a full gig of RAM I had to make a copy of the 40 page pdf and then input '40pages.pdf' and 'Copy of 40pages.pdf' as my source files to reproduce. If you are having trouble reproducing this, I suggest making 3 or 4 additional copies and inputting them all as the source files.

2) The problem here seems to be with the RasterCodecs object.

Note: This example assumes I am on a machine with 1 gig of RAM and I have two input files... 40pages.pdf and Copy of 40pages.pdf

On line 57 of Form1.vb in the MergeFiles function I load each individual source image like this....

' Load the image
RasImage = MergeCodecs.Load(LocalPath)

Well for the FIRST image, if I check the pagecount of the RasImage object the codecs returned, the pagecount is correct and is 40. However, when the loop returns to load the SECOND image, the merge codecs does load the file (or part of it) and returns a RasImage object, but the pagecount is off. In my testing it would repeatedly return a pagecount of 12 even though I knew the source image was 40 pages.

To me it seems the RasterCodecs object may be running out of memory.

3) If you do not close the demo application before running another test, it seems that the RasterCodecs object will eventually run out of memory no matter how many source files you started with.

Note: This may be due to me not calling the RasterCodecs.Shutdown method, however, I have been able to reproduce losing pages on the first run of the demo repeatedly.

4) This only seems to happen when using PDFs as my source files. I have been able to run successful tests using this same code with .tif files as my source files with no problems.

11-17-2008, 17:47
jigar
When dealing with large (in terms of DPI and page count) PDFs, you should load a certain number of pages at a time. PDFs are rasterized when they are loaded. The pages in this PDF are 24" x 32.32" and you are loading them at 100 dpi, 24 bits/pixel, so each page needs ~23.3 MB of memory (2400 x 3232 pixels x 3 bytes). With 40 pages, that is ~930.8 MB. So the solution is to load maybe 5 pages at a time.
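
For anyone who wants to check the numbers above, here is a rough sketch of the arithmetic in Python. It does not call LEADTOOLS at all; page_bytes and page_batches are hypothetical helper names made up for this illustration. It assumes MB means 10^6 bytes, counts only the raw pixel data (no per-image overhead), and uses the 5-page batch size suggested above purely as an example.

```python
def page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw pixel data size in bytes for one rasterized page (overhead ignored)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


def page_batches(total_pages, batch_size):
    """1-based, inclusive (first, last) page ranges that cover every page once."""
    return [
        (first, min(first + batch_size - 1, total_pages))
        for first in range(1, total_pages + 1, batch_size)
    ]


# Figures from the post: 24" x 32.32" pages at 100 dpi, 24 bits/pixel.
per_page = page_bytes(24, 32.32, 100, 24)
print(f"{per_page / 1e6:.1f} MB per page")             # 23.3 MB per page
print(f"{40 * per_page / 1e6:.1f} MB for 40 pages")    # 930.8 MB for 40 pages

# Loading 5 pages at a time keeps roughly 5 pages of pixel data in memory per batch.
print(f"{5 * per_page / 1e6:.1f} MB per 5-page batch")  # 116.4 MB per 5-page batch
print(page_batches(40, 5))
```

Each (first, last) pair would be one load/save pass over the source file, so the peak footprint depends on the batch size rather than on the total page count, provided each batch is released before the next one is loaded.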

LEADTOOLS Technical Support

11-18-2008, 14:43
keithway

Attachment: 5 pages.zip
I'm sorry, but that solution just will not work.

It does not solve the fact that the RasterCodecs object is not freeing the memory that it is using. To prove this I am attaching a 5 page pdf test file to this post. If you make 10-15 copies of the file and use all of them as input files in the demo that I originally posted, you will see that pages are still being lost. With 10 copies of the file, the resulting image should be 50 pages. While testing here on a machine that has 1 gig of RAM I am only getting 37 pages.

Furthermore, if you use multiple 40 page pdf test files in the demo app, the RasterCodecs object is able to load the first one correctly. The problems do not start until the second (or in some cases the third) file is loaded. This should further prove that the RasterCodecs object is not freeing up memory correctly. If it can load and save one 40 page file, it should be able to load and save a second copy of the same file.

11-19-2008, 17:23
jigar
Our RasterCodecs object is releasing the handle to the memory it was using, but the garbage collector is not freeing it.  At the end of the foreach loop, call GC.Collect().  This will force the garbage collector to free the memory.  I tested it out and it seems to be working fine.

LEADTOOLS Technical Support

01-05-2009, 17:03
keithway
Forcing the garbage collector to run seems to slow the problem down but not fix it... I am still able to lose pages even with this solution.

Please test again on your end using more source pages and you should see similar results.

Yesterday, 13:43
jigar
Ok, I re-tested with the same 40-page PDF and I lost some pages when I merged 6 copies of it. I only got 198 pages instead of the expected 240. Now, if I call RasImage.Dispose() after I'm done with the RasImage object, then it doesn't lose any pages. I tested this on a 1GB machine. The RasterImage object holds the image data in unmanaged memory, so you should call Dispose() on it to free the memory it uses. Try it out on your end and let me know how it goes.

LEADTOOLS Technical Support