

Extracting Elevation Data from PDF files (Contours, Spot Levels etc...)

(This post was last modified: 01-30-2017, 10:51 AM by Ted Woods. Edit Reason: Spelling correction )

I have been looking into the process of extracting elevation data from PDF files into CAD data for use in Kubla's takeoff module.

A number of issues come up immediately, because PDF files are not designed to store technical metadata, so complications often arise during the process. It should not be attempted with the expectation of perfect results every time.

However, it is potentially a huge time saver. I have identified the following as items we could extract:

  • Continuous Lines (Outlines, Break lines, Contour lines): Perhaps the easiest element to extract from a PDF is a continuous line (i.e. one that is not dashed or dotted). These can be loaded into Kubla Cubed as contour lines, break lines or outlines. It is important to realise that the PDF lines contain no elevation information, so the elevations will need to be set by adjusting the polyline's Z property in a CAD program or by adjusting the elevation feature's level in Kubla Cubed.

  • Dashed Lines (Outlines, Break lines, Contour lines): Dashed lines often cause problems, as they usually get converted to a separate polyline for each dash. Rejoining these dash segments into polylines by hand in CAD is so time consuming that it makes the whole process counterproductive. The more advanced conversion tools can have special functionality for handling this and can often extract a dashed line as a single polyline.

  • Crossed lines (Points): In PDF files point levels are often marked with crosses, because there is no concept of a CAD 'point' in a PDF file. Kubla Cubed can extract points from crosses in CAD files, so the positions of point levels can often be extracted. Again, the points extracted from crossed lines will contain no elevation. However, Kubla Cubed has the ability to extract elevations from nearby text, so if text is also successfully extracted from the PDF then both the position and the elevation of points can be extracted.


  • Text (Point Elevations): The CAD importer in Kubla Cubed has the ability to match points with nearby text that may contain elevation information. This means that if we can extract both the text and the point crosses from the PDF file, we can import points with their elevations into Kubla Cubed. A frequent problem is that some programs save text into PDF files as collections of lines rather than as a text entity (this is usually when a non-TrueType font has been used). Again, some of the more advanced conversion software can try to handle this and convert the polylines representing text back into a CAD text element.
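
To make the cross and text matching idea above a bit more concrete, here is a rough Python sketch. It is not how Kubla Cubed's importer actually works (I don't know its internals); it just assumes the PDF conversion has already given us a list of short line segments and a list of text labels with positions. The function names, the tolerances and the input layout are all made up for illustration.

```python
import math
import re

def segment_midpoint(seg):
    """Midpoint of a segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def find_crosses(segments, max_len=2.0, tol=0.05):
    """Pair up short segments whose midpoints coincide (within tol).

    Each pair is treated as one cross marker and its centre is returned.
    max_len and tol are assumed values in drawing units.
    """
    short = [s for s in segments if math.dist(s[0], s[1]) <= max_len]
    used = set()
    centres = []
    for i, a in enumerate(short):
        if i in used:
            continue
        ma = segment_midpoint(a)
        for j in range(i + 1, len(short)):
            if j in used:
                continue
            mb = segment_midpoint(short[j])
            if math.dist(ma, mb) <= tol:
                centres.append(((ma[0] + mb[0]) / 2.0, (ma[1] + mb[1]) / 2.0))
                used.update((i, j))
                break
    return centres

# First number found in a label, e.g. "FFL 9.80" gives 9.80
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")

def attach_elevations(centres, texts, max_dist=5.0):
    """Give each cross centre the elevation of the nearest numeric label.

    texts is a list of ((x, y), label). Crosses with no numeric label
    within max_dist are dropped. Returns a list of (x, y, z) tuples.
    """
    points = []
    for cx, cy in centres:
        best = None  # (distance, elevation)
        for (tx, ty), label in texts:
            match = NUMBER.search(label)
            if match is None:
                continue
            d = math.dist((cx, cy), (tx, ty))
            if d <= max_dist and (best is None or d < best[0]):
                best = (d, float(match.group()))
        if best is not None:
            points.append((cx, cy, best[1]))
    return points
```

Typical use would be `attach_elevations(find_crosses(segments), texts)`. Real drawings would need more care, for example labels that contain more than one number, or crosses drawn as a single four-vertex polyline.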

So far I have only experimented with extracting contour lines (that are not dashed), and have had mixed results using the freeware program Inkscape. It seems that if the PDF is too complex the DXF export often fails. However, I have heard of better software for this, such as:

Able2Extract Standard
www.investintech.com

Able2Extract Pro
www.investintech.com

PDF2DWG
www.dotsoft.com

Print2CAD
http://www.backtocad.com/


Has anyone else got experience with these? To me it seems Print2CAD, or Back2CAD as it is also known, is the market leader in this area. They run regular seminars, and the software can do things like convert dashed lines into CAD polylines and extract text using Optical Character Recognition (OCR).
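
For anyone curious what converting dashed lines into polylines might involve, below is a minimal Python sketch of one possible approach: greedily chaining dash segments whose ends are close together and roughly in line. This is only my own illustration, not the method used by any of the programs listed, and `join_dashes`, `max_gap` and `max_angle_deg` are hypothetical names and values that would need tuning per drawing.

```python
import math

def _extend_tail(line, remaining, max_gap, max_angle_deg):
    """Keep appending the closest roughly-aligned dash to the end of line."""
    while True:
        tail, prev = line[-1], line[-2]
        heading = math.atan2(tail[1] - prev[1], tail[0] - prev[0])
        best = None  # (gap, index, near_end, far_end)
        for idx, (a, b) in enumerate(remaining):
            # Try the dash in both directions, since PDFs do not
            # guarantee a consistent drawing direction.
            for near, far in ((a, b), (b, a)):
                gap = math.dist(tail, near)
                if gap > max_gap:
                    continue
                direction = math.atan2(far[1] - near[1], far[0] - near[0])
                diff = direction - heading
                # Wrap the angle difference into the range [0, pi]
                turn = abs(math.atan2(math.sin(diff), math.cos(diff)))
                if math.degrees(turn) <= max_angle_deg and (
                    best is None or gap < best[0]
                ):
                    best = (gap, idx, near, far)
        if best is None:
            return
        _, idx, near, far = best
        remaining.pop(idx)
        line.extend([near, far])

def join_dashes(segments, max_gap=1.0, max_angle_deg=20.0):
    """Chain dash segments ((x1, y1), (x2, y2)) into polylines.

    Returns a list of polylines, each a list of (x, y) points.
    """
    remaining = list(segments)
    polylines = []
    while remaining:
        start, end = remaining.pop(0)
        line = [start, end]
        _extend_tail(line, remaining, max_gap, max_angle_deg)
        line.reverse()  # now grow from the other end
        _extend_tail(line, remaining, max_gap, max_angle_deg)
        line.reverse()  # restore the seed dash's orientation
        polylines.append(line)
    return polylines
```

A greedy join like this can go wrong where dashed contours run close together or cross, which is presumably where the dedicated tools earn their money.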


Messages In This Thread
Extracting Elevation Data from PDF files (Contours, Spot Levels etc...) - by Ted Woods - 01-27-2017, 02:50 PM


