Encrypting Working Files Locally in Spring Batch

Rik Scarborough

It seems that quite often we read stories in the news about computer systems being cracked and data being compromised. It’s a growing concern that should be a consideration for everyone in Information Technology. There is probably not just one solution that will keep all data safe, but hopefully small efforts in many areas will provide us with the best possible solution.

In this post, I show a solution for encrypting sensitive files for local use with Java’s built-in cryptography library and tying it directly into Spring Batch readers and writers.

Recently we began writing a Spring Batch application that would handle sensitive data. The application servers were set up with some very good, basic security, but we felt the data could use some extra protection.

The data would be delivered to the company on a well-protected and secure FTP server. Mark Fricke did an excellent post recently on Spring Integration and Spring Batch in which he discusses downloading an encrypted file from an FTP server and decrypting it; see it here.

Unfortunately, this was not exactly the problem we had. We needed to download an unencrypted file but never write it to the application server unencrypted, and we still needed to be able to read that file and process it in Spring Batch.

Using Java’s built-in cryptography, we are able to extend Spring Batch to encrypt the file on the disk and then read that file in a Spring Batch Reader. In addition, we can write the results out as an encrypted file, then transfer that file back to the secure FTP server as clear text.

Wow, that sounds like a lot and will be a really complex solution. Actually the code turned out to not be all that complex. This solution relies partly on the Delegate Pattern I wrote about before, so I will be using the same code I developed for that and just showing the changes here. Please refer back to the original post here.

Transfer & Encrypt

In order to transfer and encrypt the data, let’s add a couple of Steps that each contain a Tasklet.

Step #1

Here is the first Step:

    @Bean
    public Step encTransferFirstStep(StepBuilderFactory stepBuilderFactory) {
        return stepBuilderFactory.get("encTransferFirstStep").tasklet(new Tasklet() {
            @Override
            public RepeatStatus execute(StepContribution sc, ChunkContext cc) throws Exception {
                String filename = "testfile.txt";
                String path = "./";
                File localFile = new File(path, filename);

                // generate key
                KeyGenerator kgen;
                kgen = KeyGenerator.getInstance("AES");
                kgen.init(128);
                SecretKey aesKey = kgen.generateKey();
                cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().put("inKey", aesKey);

                // Encrypt cipher
                Cipher encryptCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
                encryptCipher.init(Cipher.ENCRYPT_MODE, aesKey);
                cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().put("inIv", encryptCipher.getIV());

                // setup encrypted output
                FileOutputStream fos = new FileOutputStream(localFile.getAbsoluteFile() + ".enc");
                CipherOutputStream cipherOutputStream = new CipherOutputStream(fos, encryptCipher);
                BufferedOutputStream bos = new BufferedOutputStream(cipherOutputStream);

                // ftp the file to encrypted file
                FTPClient client = new FTPClient();
                client.connect(FTP_URL);
                client.login(FTPUSER, FTP_PW);
                client.enterLocalPassiveMode();
                //client.changeWorkingDirectory("/");
                //client.setFileType(FTP.BINARY_FILE_TYPE);
                boolean retVal = client.retrieveFile(filename, bos);
                logger.info("ftp returned " + retVal);

                bos.flush();
                bos.close();
                cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().put("encFileName", localFile.getAbsoluteFile() + ".enc");

                return RepeatStatus.FINISHED;
            }
        }).build();
    }

For the first Tasklet, I created it right in the Step. In my opinion, this is a fine technique for simple or at least short tasks that we know will never be reused. This puts all the code in the same class as the configuration, so it is easy to find if you need to maintain it. However, if you will be using this type of process across multiple Spring Batch Jobs, or maybe across multiple projects, you should make it more generic and move it to its own class.

So what’s going on here?

First we have to figure out what the name of our file is. I hard-code it in this example, but it could come from the job parameters, the context, or even a lookup on our FTP server. We create a File object so we can use File’s methods later to get the path and other information.

Encryption

Next we are going to use Java’s cryptography library to generate a key. While no security is going to be absolutely foolproof, I think this one is going to be pretty strong. We generate the key in the application instead of passing it in, because we will never try to decrypt the file outside of the program, or even outside of this job. This gives us the additional security of having a key that no one else knows.


We then place that key in the Job Context to use later in the application. Now, on some systems that may create a weak point in my security. Spring Batch may be writing the context back to a database somewhere. If your database might be compromised at the same time your application server is compromised, you may want to look for another way to store this. Remember, a cracker would have to have this value, the value of the initialization vector we will discuss in a moment, and the name of the file, for any of them to be useful.
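To make that concern concrete, here is a minimal JDK-only sketch. A plain HashMap stands in for the job ExecutionContext, and standard Java serialization stands in for whatever serializer your Spring Batch version and job repository actually use; both are assumptions for illustration, and the class and method names are hypothetical. The point is simply that the key survives a round trip through the stored bytes, so anyone who can read those bytes can recover it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper: a HashMap plays the role of the job context.
final class ContextKeySketch {

    /** Generates a 128-bit AES key, as the first Tasklet does. */
    static SecretKey newAesKey() throws NoSuchAlgorithmException {
        KeyGenerator kgen = KeyGenerator.getInstance("AES");
        kgen.init(128);
        return kgen.generateKey();
    }

    /** Serializes the context-like map to bytes, as a persistent store might. */
    static byte[] persist(HashMap<String, Object> context) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(context);
        }
        return bytes.toByteArray();
    }

    /** Restores the map from its stored form; the key material comes back intact. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> restore(byte[] stored) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return (Map<String, Object>) in.readObject();
        }
    }
}
```

If that exposure is unacceptable in your environment, an in-memory job repository or a holder bean that never gets persisted are options worth evaluating.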

Once we have the key, we create a Cipher that we’ll use to encrypt our file. At this point, we also grab the initialization vector (IV) that the cipher generated to randomize the encryption. It will be required later to read the file.
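The key, cipher, and IV handling described above can be sketched in isolation with only the JDK. This is a minimal illustration rather than the job’s actual code: the class and method names are hypothetical, and it encrypts a small byte array in memory instead of a file.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper mirroring the key/cipher/IV steps of the first Tasklet.
final class AesCbcSketch {

    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator kgen = KeyGenerator.getInstance("AES");
        kgen.init(128);
        return kgen.generateKey();
    }

    /** Encrypt-mode cipher; since no IV is supplied, the provider picks a random one. */
    static Cipher encryptCipher(SecretKey key) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return cipher;
    }

    /** Decrypt-mode cipher; it must be given the same key and the same IV. */
    static Cipher decryptCipher(SecretKey key, byte[] iv) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey key = newKey();
        Cipher enc = encryptCipher(key);
        byte[] iv = enc.getIV(); // save this alongside the key for later reads
        byte[] cipherText = enc.doFinal("sensitive record".getBytes(StandardCharsets.UTF_8));
        byte[] plain = decryptCipher(key, iv).doFinal(cipherText);
        System.out.println(new String(plain, StandardCharsets.UTF_8));
    }
}
```

The IV is not secret in the way the key is, but without it the first block of the file cannot be decrypted correctly, which is why both values travel together through the context.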

Transferring

Now that our encryption is in place, we open a FileOutputStream for the encrypted file. We wrap it in a CipherOutputStream, provided for us by Java, and wrap that in turn in a BufferedOutputStream.

To do the FTP, we are using Apache’s FTP client so that we can set the Output stream we want to use. We can swap out a regular OutputStream with our new encrypted one, meaning we don’t have to download the file and then encrypt it.

Once the file is transmitted, we flush the stream and close it. We specifically call the flush before closing because I was having issues with the stream closing before all the data was written. This caused the encrypted file to be corrupted, but there was no way to just open the file and see that. I spent a great deal of time trying to figure out why my file was corrupted.
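One way an encrypted file can end up silently truncated is when the CipherOutputStream is never closed: with a padded block cipher, the final block is only produced on close(), and flush() alone does not emit it. The sketch below illustrates that behavior with an in-memory sink. Whether this is what caused the corruption described above is an assumption; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;

// Hypothetical demo: compares what reaches the sink after flush() and after close().
final class FlushVersusCloseSketch {

    /** Returns {bytes in the sink after flush(), bytes in the sink after close()}. */
    static int[] sizesAfterFlushAndClose(byte[] plain) throws GeneralSecurityException, IOException {
        KeyGenerator kgen = KeyGenerator.getInstance("AES");
        kgen.init(128);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, kgen.generateKey());

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        CipherOutputStream out = new CipherOutputStream(sink, cipher);
        out.write(plain);
        out.flush();                 // bytes still held by the cipher are not written
        int afterFlush = sink.size();
        out.close();                 // finalizes the cipher and writes the padded last block
        int afterClose = sink.size();
        return new int[] {afterFlush, afterClose};
    }

    public static void main(String[] args) throws Exception {
        int[] sizes = sizesAfterFlushAndClose(new byte[10]);
        System.out.println("after flush: " + sizes[0] + ", after close: " + sizes[1]);
    }
}
```

Wrapping the stream in try-with-resources is a simple way to make sure close() runs even when the transfer throws.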

Finally, we let the rest of the application know our filename. We added a .enc to the end so we can find it easily later during testing.

Step #2

Ok, let’s skip to the next Step which will take our result file and write it back to the FTP server.

    @Bean
    public Step encTransferLastStep(StepBuilderFactory stepBuilderFactory) {
        return stepBuilderFactory.get("encTransferLastStep").tasklet(new Tasklet() {
            @Override
            public RepeatStatus execute(StepContribution sc, ChunkContext cc) throws Exception {
                SecretKey aesKey = (SecretKey) cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().get("inKey");
                byte[] iv = (byte[]) cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().get("inIv");
                Cipher encryptCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
                encryptCipher.init(Cipher.DECRYPT_MODE, aesKey, new IvParameterSpec(iv));

                File f = new File("outfile.txt.enc");
                FileInputStream fis = new FileInputStream(f);
                CipherInputStream cis = new CipherInputStream(fis, encryptCipher);
                BufferedInputStream bis = new BufferedInputStream(cis);
                FTPClient client = new FTPClient();
                client.connect(FTP_URL);
                client.login(FTPUSER, FTP_PW);
                client.changeWorkingDirectory("/");
                client.setFileType(FTP.BINARY_FILE_TYPE);
                client.storeFile("outfile.txt", bis);

                f.delete();
                String encfilename = (String) cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().get("encFileName");
                File inputFile = new File(encfilename);
                inputFile.delete();
                return RepeatStatus.FINISHED;
            }
        }).build();
    }

We get the key and the initialization vector (IV) back from the job context and create a new Cipher for reading the encrypted result file. We are using the same key and IV we created for the input file. You could create new ones in the Writer later on if you desire.
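If you do want a separate IV for the output file, a minimal sketch could look like the following. The reason to consider it: in CBC mode, encrypting with the same key and the same IV turns identical leading plaintext into identical leading ciphertext, so two files can be compared without being decrypted. The class and method names here are hypothetical, as is the idea of storing the result under its own context entry such as "outIv".

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper: generates a fresh IV per file while reusing the key.
final class FreshIvSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator kgen = KeyGenerator.getInstance("AES");
        kgen.init(128);
        return kgen.generateKey();
    }

    /** A new random 16-byte IV (the AES block size). */
    static byte[] newIv() {
        byte[] iv = new byte[16];
        RANDOM.nextBytes(iv);
        return iv;
    }

    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plain) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(plain);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey key = newKey();
        byte[] plain = "bookName,author".getBytes(StandardCharsets.UTF_8);
        byte[] inIv = newIv();
        byte[] outIv = newIv(); // hypothetical second IV for the output file
        boolean sameIvMatches = Arrays.equals(encrypt(key, inIv, plain), encrypt(key, inIv, plain));
        boolean freshIvMatches = Arrays.equals(encrypt(key, inIv, plain), encrypt(key, outIv, plain));
        System.out.println("same IV matches: " + sameIvMatches + ", fresh IV matches: " + freshIvMatches);
    }
}
```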

This time we use CipherInputStream to read the file. Apache’s FTP client will again allow us to provide an input stream, so we hand it our new decrypting input stream. The data is decrypted as it is read, so the file is written to the secured FTP server as plain text.
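The decrypt-while-streaming idea can be tried without an FTP server. In this minimal sketch an arbitrary OutputStream stands in for the upload (an assumption for illustration; the real job passes the stream to the FTP client instead), and the class and method names are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper: streams an encrypted file out as plain bytes.
final class DecryptingUploadSketch {

    /** Writes plain to encryptedFile through a CipherOutputStream and returns the IV used. */
    static byte[] encryptToFile(Path encryptedFile, SecretKey key, byte[] plain)
            throws GeneralSecurityException, IOException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] iv = cipher.getIV();
        try (OutputStream out = new CipherOutputStream(Files.newOutputStream(encryptedFile), cipher)) {
            out.write(plain);
        }
        return iv;
    }

    /** Copies the decrypted content of encryptedFile into sink (the stand-in for the upload). */
    static void decryptTo(Path encryptedFile, SecretKey key, byte[] iv, OutputStream sink)
            throws GeneralSecurityException, IOException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        try (InputStream in = new BufferedInputStream(
                new CipherInputStream(Files.newInputStream(encryptedFile), cipher))) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                sink.write(buffer, 0, read); // only decrypted bytes ever reach the sink
            }
        }
    }
}
```

The plain text exists only in memory buffers on its way to the sink; nothing unencrypted is written to the local disk.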

Reading & Writing

Now that we have a way to transfer the files around, how do we read and write them? First, we’ll need two new classes. We’ve extended UrlResource and FileSystemResource to provide access to the CipherInputStream and CipherOutputStream, and we can use these Resources in our Reader and Writer.

public class CyrptUrlResource extends UrlResource {

    private final Cipher encryptCipher;

    public CyrptUrlResource(URI uri, Cipher encryptCipher) throws MalformedURLException {
        super(uri);
        this.encryptCipher = encryptCipher;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return new PushbackInputStream(new CipherInputStream(super.getInputStream(), encryptCipher), (2048 * 2048));
    }

}

And

public class CyrptFileSystemResource extends FileSystemResource {

    private Cipher encryptCipher;

    public CyrptFileSystemResource(String path, Cipher encryptCipher) {
        super(path);
        this.encryptCipher = encryptCipher;
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        return new CipherOutputStream(super.getOutputStream(), encryptCipher);
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return new CipherInputStream(super.getInputStream(), encryptCipher);
    }

}

We created both because we found that in some situations, especially with binary files, the UrlResource works better than the FileSystemResource. Try both and see which one fits your situation better. The CyrptUrlResource contains a PushbackInputStream. We implemented that because we were reading Excel files (a future post) and the reader we were using required it. This can be removed if not needed, or a flag can be added to signal whether to use it. I left it here in case someone else was having a similar issue.
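For readers who have not used PushbackInputStream, here is a minimal JDK-only sketch of what it provides: a consumer can read a few leading bytes to inspect them and then push them back so the next read starts from the beginning again. The class and method names are hypothetical, and the 8-byte pushback buffer in the usage below is chosen only for the example. Note that the constructor allocates the whole pushback buffer up front, so the 2048 * 2048 used in CyrptUrlResource reserves 4 MiB per stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.util.Arrays;

// Hypothetical helper showing the read-then-unread pattern.
final class PushbackSketch {

    /** Reads up to n bytes for inspection, then pushes them back onto the stream. */
    static byte[] peek(PushbackInputStream in, int n) throws IOException {
        byte[] header = new byte[n];
        int read = in.read(header, 0, n);
        if (read <= 0) {
            return new byte[0];
        }
        in.unread(header, 0, read); // requires a pushback buffer of at least 'read' bytes
        return Arrays.copyOf(header, read);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {'P', 'K', 3, 4, 'r', 'e', 's', 't'};
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data), 8)) {
            byte[] header = peek(in, 4);
            System.out.println("peeked " + header.length + " bytes, next byte is " + (char) in.read());
        }
    }
}
```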


To use them, I’m going back to the Reader and Writer in the Delegate post I mentioned earlier. For the Reader, the change is as simple as adding the following code to the beforeStep:

            final SecretKey aesKey = (SecretKey) stepExecution.getJobExecution().getExecutionContext().get("inKey");
            byte[] iv = (byte[]) stepExecution.getJobExecution().getExecutionContext().get("inIv");
            String encfilename = (String) stepExecution.getJobExecution().getExecutionContext().get("encFileName");
            File f = new File(encfilename);
            Cipher encryptCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            encryptCipher.init(Cipher.DECRYPT_MODE, aesKey, new IvParameterSpec(iv));
            CyrptUrlResource cur = new CyrptUrlResource(f.toURI(), encryptCipher);
            delegate.setResource(cur);

We replace the Resource on the delegate with the CyrptUrlResource. The Resource is created using the key and IV we generated earlier in the first Tasklet.

Our writer is much more of a rewrite.

@Configuration
@StepScope
public class BookListWriter implements ItemStreamWriter<List<BookList>> {

    private static final Logger logger = LoggerFactory.getLogger(BookListWriter.class);

    private StepExecution stepExecution;
    private DelimitedLineAggregator<BookList> dla;
    private BufferedOutputStream bos;

    @BeforeStep
    public void beforeStep(StepExecution stepExecution) {
        logger.debug("beforeStep");
        this.stepExecution = stepExecution;
        dla = new DelimitedLineAggregator<>();
        dla.setDelimiter(",");
        BeanWrapperFieldExtractor<BookList> fieldExtractor = new BeanWrapperFieldExtractor<>();
        fieldExtractor.setNames(new String[]{"bookName", "author"});
        dla.setFieldExtractor(fieldExtractor);
    }

    @Override
    public void close() throws ItemStreamException {
        try {
            bos.flush();
            bos.close();
        } catch (IOException ex) {
            logger.error(ex.getMessage(), ex);
            throw new ItemStreamException(ex);
        }

    }

    @Override
    public void open(ExecutionContext ec) throws ItemStreamException {
        try {
            SecretKey aesKey = (SecretKey) stepExecution.getJobExecution().getExecutionContext().get("inKey");
            byte[] iv = (byte[]) stepExecution.getJobExecution().getExecutionContext().get("inIv");
            String encfilename = "outfile.txt.enc";
            File f = new File(encfilename);
            Cipher encryptCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            encryptCipher.init(Cipher.ENCRYPT_MODE, aesKey, new IvParameterSpec(iv));
            CyrptFileSystemResource cur = new CyrptFileSystemResource(f.getAbsolutePath(), encryptCipher);
            bos = new BufferedOutputStream(cur.getOutputStream());
        } catch (NoSuchAlgorithmException | NoSuchPaddingException | InvalidKeyException | InvalidAlgorithmParameterException | IOException ex) {
            logger.error(ex.getMessage(), ex);
            throw new ItemStreamException(ex);
        }

    }

    @Override
    public void update(ExecutionContext ec) throws ItemStreamException {
    }

    @Override
    public void write(List<? extends List<BookList>> list) throws Exception {
        logger.info("write");
        for (List<BookList> bookList : list) {
            for (BookList book : bookList) {
                String line = dla.aggregate(book) + "\n";
                bos.write(line.getBytes());
            }
        }
    }
}

We move the Aggregator to the class level because we are going to use it later. Also, we capture the StepExecution for later use. A new BufferedOutputStream is created that will actually do the writing. We can no longer use the flat file writer because we are now dealing with a binary file.

The Open Method

So let’s jump over close and talk about the open method. Here we once again get our key and IV from the context and use them to create our Cipher. Our Cipher is then used to create a File Resource which is in turn used to create our BufferedOutputStream. Since we are not using a built-in Writer as a Delegate anymore, we could have skipped the Resource and created our OutputStream for BufferedOutputStream to wrap right here, but we have it and might as well use it.

We now have a file ready to write our encrypted data. The update method no longer has anything to do, so we leave it blank and move on to the write. As our Processor created a list of BookLists, and Spring Batch hands the output of the Reader/Processor to the Writer as a list, we have a list of lists to loop through. We use our aggregator, which is unchanged from the original version, to put together our output string, but don’t forget the end of line character.

Once all our data has been written out, we come back to our close method. Again, we call the flush method before closing to ensure that all the data is written before moving on.
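The write path above can be exercised outside Spring Batch with a minimal JDK-only sketch. It assumes the lines have already been aggregated into strings, uses an explicit UTF-8 charset (an assumption; the Writer above relies on the platform default), and reads the lines back to confirm the round trip. The class and method names are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;

// Hypothetical helper: writes text lines through a cipher and reads them back.
final class EncryptedLinesSketch {

    /** Writes each line plus a newline through the cipher; closing finalizes the last block. */
    static void writeLines(OutputStream rawSink, Cipher encryptCipher, List<String> lines)
            throws IOException {
        try (OutputStream out = new BufferedOutputStream(new CipherOutputStream(rawSink, encryptCipher))) {
            for (String line : lines) {
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    /** Decrypts the stream and splits it back into lines. */
    static List<String> readLines(InputStream rawSource, Cipher decryptCipher) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new CipherInputStream(rawSource, decryptCipher), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Closing the outer BufferedOutputStream flushes it and closes the wrapped CipherOutputStream, so the padded final block always reaches the file.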

Final Thoughts

That is the complete change we made to provide a little more security to the data we are passing around. While no single security measure can provide a 100% guarantee of safety, the more layers you add, the more secure you become. This solution provides an easy-to-maintain layer.

Believe in good code.
