Milestone release of Open Source Java tool for working with PDF documents features dozens of improvements and enhancements
Forest Hill, MD —21 March 2016— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® PDFBox™ v2.0, the Open Source Java tool for working with Portable Document Format (PDF) documents.
PDF was first released by Adobe Systems in 1993, and became an ISO International Standard (ISO 32000-1) in 2008. Apache PDFBox allows users to create new PDF documents; manipulate, render, and sign existing documents; and extract content from documents. In addition, PDFBox includes several command line utilities. In February 2015, the project became the first Open Source Partner Organization of the PDF Association.
"PDF is a very popular and easy to use format for document exchange. It is used by millions of people every day, however the format itself is quite complicated and a real challenge to write a piece of software to work with it," said Andreas Lehmkühler, Vice President of Apache PDFBox. "This new major release of PDFBox includes a lot of improvements, fixes and new features which should make the life easier for our users."
Under The Hood
The Apache PDFBox library enables users to create new PDF documents, manipulate existing documents, extract content, digitally sign, print, and validate files against the PDF/A-1b standard. Its command line utilities include encrypt, decrypt, overlay, debugger, merger, PDFToImage, and TextToPDF.
PDFBox v2.0 reflects 1,167 resolved issues, 418 of which were back-ported to v1.8, as well as dozens of improvements and enhancements. Highlights include:
- improved rendering and text extraction
- Unicode support for PDF creation
- overhauled interactive forms support
- extended signing and encryption support
- overhauled parser, including a self-healing mechanism for malformed or corrupted PDFs
- reduced memory and resource footprint, including fine-grained control of memory usage
- enhanced preflight module for PDF/A-1b conformance checking
- rearranged package structure to allow smaller runtime environments
A guide to migrating to v2.0 is available at http://pdfbox.apache.org/2.0/migration.html, with community support at http://pdfbox.apache.org/mailinglists.html
"We thank all the people from our small but fine community for their support," explained Lehmkühler. "Special thanks also goes to our fellow colleagues from the Apache Tika project for their cooperation in stress-testing with a corpus of 250,000 PDF files."
"We are grateful for the Google Summer of Code program," said PDFBox committer Tilman Hausherr. "The project allowed us to hire students to improve 3D rendering and the PDFDebugger stand-alone application, which also sped up our own bug finding."
"Apache PDFBox v2.0 is a significant milestone as it took us several years to complete," added Lehmkühler. "This long-awaited release is the collective achievement of more than 150 individuals who have contributed code to date. Without their frequent contributions it wouldn’t be possible to drive a project like PDFBox."
Availability and Oversight
Apache PDFBox software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project’s day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache PDFBox, visit http://pdfbox.apache.org/
About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server, the world’s most popular Web server software. Through the ASF’s meritocratic process known as "The Apache Way," more than 550 individual Members and 5,300 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation’s official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "Apache PDFBox", "PDFBox", "ApacheCon", and their logos are registered trademarks or trademarks of The Apache Software Foundation in the U.S. and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #