Using Apache POI With Protected Excel Files

Jonny Hackett Java, Problem Solving, Spring Batch, Technology Snapshot 1 Comment

While working on a recent project at a client, we had the opportunity to refactor some data extracts that were using a commercial Excel writing library, which we then converted to using the Apache POI Library for Excel. These data extracts were reports that included some calculated values, and depending on the client, were required to be password protected. When completed, the reports would be emailed to the recipients configured for each client.

In this post, we discuss the challenge of delivering protected Microsoft documents via email. We introduce a Java code solution for emailing password-protected Excel files when using the Apache POI Library.

Some of the required calculations we chose to implement using Excel formulas. Implementing formulas wasn’t a hard task and worked for what was needed.

Challenge

During testing, we found an issue with the Excel documents that were delivered via email. When the end user received these Excel-based reports via email and saved them locally, they were initially viewed in a “protected” mode. When opened, the following message was displayed at the top of the Excel file:

Unfortunately, the formulas weren’t executed or available for view until the end user clicked the button to “Enable Editing.” Those values for the resulting formulas all displayed as zero or empty cells, which made it appear as though the data was missing or incorrect. The end user didn’t know that they need to Enable Editing before the report would display the data correctly.

There was a potential way to work around the issue, but required the end user to modify their security settings in order to disable the protected view that Microsoft has enabled.

See Also:  The Wonderful Wide World of webpack: Unpacking Awesomeness

This error wasn’t noticed until later in testing because if you open the file locally where it was created, you aren’t opening the Excel spreadsheet in protected view mode. Since the Excel report is being delivered to clients via Email, when you save that email attachment Microsoft by default protects the end user from unintentionally running a virus or harmful code. You can read more about it here.

Password Protecting & Encrypting

The other requirement was fairly straightforward: password protecting and encrypting the Excel spreadsheet.

Because we might use this same feature in future projects and these reports were being generated using Spring Batch, we created a simple Spring Batch Tasklet to handle this.

Here’s the code listing:

public class ExcelPoiPasswordProtectTasklet implements Tasklet
{

    public static final Logger LOGGER = LoggerFactory.getLogger(ExcelPoiPasswordProtectTasklet.class);

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception
    {
        ReportParams jobParams = (ReportParams) ExecutionContextUtils.getJobParams(chunkContext.getStepContext().getStepExecution());

        if (StringUtils.equals("Y", jobParams.getClientHoldingsExtractConfig().getPasswordProtected()))
        {
            String fileName = jobParams.getOutboundDirectory() + jobParams.getOutboundFileName();
            File file = new File(fileName);
            if (file.exists())
            {
                FileOutputStream fileOut = null;

		//sets the password that will be used for protecting the workbook
                Biff8EncryptionKey.setCurrentUserPassword(jobParams.getOutputFilePassword());
                HSSFWorkbook wb = null;

		// Read the workbook using POI's NPOIFSFileSystem
                try (NPOIFSFileSystem fs = new NPOIFSFileSystem(file, true))
                {
                    wb = new HSSFWorkbook(fs.getRoot(), true);
	            
		    //create the fileout stream 
                    fileOut = new FileOutputStream(file.getAbsoluteFile());

		    //call the function that write protects the workbook
                    wb.writeProtectWorkbook(Biff8EncryptionKey.getCurrentUserPassword(), "");

		    //finish writing out the workbook
                    wb.write(fileOut);
                }
                catch (Exception e)
                {
                    LOGGER.error("Error protecting and encrypting Excel workbook");
                    e.printStackTrace(System.out);
                }
                finally
                {
	            //make sure the workbook and fileout stream is closed
                    if (wb != null)
                    {
                        wb.close();
                    }
                    if (fileOut != null)
                    {
                        fileOut.close();
                    }

                }

		// VERY IMPORTANT, RESET THE ENCRYPTION KEY WHEN DONE
                Biff8EncryptionKey.setCurrentUserPassword(null);
            }
            else
            {
                LOGGER.error("Extract file {0} not found, unable to perform password protection.", file.getAbsolutePath());
                throw new IllegalStateException("Extract file {0} not found, unable to perform password protection.");
            }
        }
        else
        {
            LOGGER.info("Extract {0} not configured for password encryption.", jobParams.getReportId());
        }

        return null;
    }

}

The first several lines of code in the tasklet deal with getting the job parameters from the Spring Batch ExecutionContext, determining if this report should be password protected, and making sure the specified Excel file exists.

See Also:  Go Forth and AppSync!

I’ve added comments for the actual code that handles reading the workbook, setting the password, writing out the encrypted workbook, and tidying things up. It’s very important that you reset the Biff8EncryptionKey to null when done.

Now you’ve got a simple tasklet that you can plug into any Excel writing Spring Batch job to password protect an Excel file. Just wire it into a step after the step that produces the Excel file.

Final Thoughts

For simple Excel documents, I would personally prefer to use an open source library such as the JExcel API. However, due to the requirements of this project (having Excel documents that were password protected and the desire to utilize an open source library), the only available option at this time is the Apache POI library.

In the end, it wasn’t technically difficult to enable password protection. Other than not being able to use formulas within the spreadsheet for email-delivered documents, I can honestly say that POI is a widely used and well-maintained library for working with Excel documents.

Comments 1

What Do You Think?