It seems that quite often we read stories in the news about computer systems being cracked and data being compromised. It’s a growing concern that should be a consideration for everyone in Information Technology. There is probably not just one solution that will keep all data safe, but hopefully small efforts in many areas will provide us with the best possible solution.
In this post, I show a solution for encrypting sensitive files for local use with Java’s built-in cryptography library and tying it directly into Spring Batch readers and writers.
Recently we began writing a Spring Batch application that would handle sensitive data. The application servers were set up with some very good, basic security, but we felt the data could use some extra protection.
The data would be delivered to the company on a well-protected and secure FTP server. Mark Fricke did an excellent post recently on Spring Integration and Spring Batch in which he discusses downloading an encrypted file from an FTP server and decrypting it; see it here.
Unfortunately, this was not exactly the problem we had. We needed to download an unencrypted file, but never write it to the application server unencrypted. At the same time, we needed to be able to read that file and process it in Spring Batch.
Using Java’s built-in cryptography, we are able to extend Spring Batch to encrypt the file on the disk as it arrives and then read that file in a Spring Batch Reader. In addition, we can write the results out as an encrypted file and then transfer that file back to the secure FTP server as clear text.
Wow, that sounds like a lot and like it will be a really complex solution. Actually, the code turned out not to be all that complex. This solution relies partly on the Delegate Pattern I wrote about before, so I will be using the same code I developed for that and just showing the changes here. Please refer back to the original post here.
Transfer & Encrypt
In order to transfer and encrypt the data, let’s add a couple of Steps that each contain a Tasklet.
Step #1
Here is the first Step:
@Bean
public Step encTransferFirstStep(StepBuilderFactory stepBuilderFactory) {
    return stepBuilderFactory.get("encTransferFirstStep").tasklet(new Tasklet() {
        @Override
        public RepeatStatus execute(StepContribution sc, ChunkContext cc) throws Exception {
            String filename = "testfile.txt";
            String path = "./";
            File localFile = new File(path, filename);

            // generate key
            KeyGenerator kgen;
            kgen = KeyGenerator.getInstance("AES");
            kgen.init(128);
            SecretKey aesKey = kgen.generateKey();
            cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().put("inKey", aesKey);

            // Encrypt cipher
            Cipher encryptCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            encryptCipher.init(Cipher.ENCRYPT_MODE, aesKey);
            cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().put("inIv", encryptCipher.getIV());

            // setup encrypted output
            FileOutputStream fos = new FileOutputStream(localFile.getAbsoluteFile() + ".enc");
            CipherOutputStream cipherOutputStream = new CipherOutputStream(fos, encryptCipher);
            BufferedOutputStream bos = new BufferedOutputStream(cipherOutputStream);

            // ftp the file to encrypted file
            FTPClient client = new FTPClient();
            client.connect(FTP_URL);
            client.login(FTPUSER, FTP_PW);
            client.enterLocalPassiveMode();
            //client.changeWorkingDirectory("/");
            //client.setFileType(FTP.BINARY_FILE_TYPE);
            boolean retVal = client.retrieveFile(filename, bos);
            logger.info("ftp returned " + retVal);
            bos.flush();
            bos.close();

            // tell later steps where the encrypted file lives
            cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().put("encFileName", localFile.getAbsoluteFile() + ".enc");
            return RepeatStatus.FINISHED;
        }
    }).build();
}
For the first Tasklet, I created it right in the Step. In my opinion, this is a fine technique for simple or at least short tasks that we know will never be reused. This puts all the code in the same class as the configuration, so it is easy to find if you need to maintain it. However, if you will be using this type of process across multiple Spring Batch Jobs, or maybe across multiple projects, you should make it more generic and move it to its own class.
So what’s going on here?
First we have to figure out what the name of our file is. I hard-code it in this example, but it could come from the job parameters, the context, or even a lookup on our FTP server. We create a File object so we can use File’s methods later to get the path and other information.
Encryption
Next we are going to use Java’s cryptography library to generate a key. While no security is going to be absolutely foolproof, I think this one is going to be pretty strong. We generate the key in the application instead of passing it in, because we will never try to decrypt the file outside of the program, or even outside of this job. This gives us the additional security of having a key that no one else knows.
We then place that key in the Job Context to use later in the application. Now, on some systems that may create a weak point in my security. Spring Batch may be writing the context back to a database somewhere. If your database might be compromised at the same time your application server is compromised, you may want to look for another way to store this. Remember, a cracker would have to have this value, the value of the initialization vector we will discuss in a moment, and the name of the file, for any of them to be useful.
Once we have the key, we create a Cipher that we’ll use to encrypt our file. At this time, we grab the initialization vector (IV) that was used to randomize the encryption. This will be required later to read the file.
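The key and IV handling described above can be sketched with the JDK alone. This is a minimal illustration, not the job's actual code: the class name `AesIvRoundTrip` and its method are made up for the example, and it assumes the default JDK provider, which picks a random IV when a CBC cipher is initialized for encryption without one.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

// Hypothetical demo class; not part of the original post.
public class AesIvRoundTrip {

    /** Encrypts the text with a fresh key, then decrypts it with the same key and IV. */
    public static String roundTrip(String text) {
        try {
            // Generate a 128-bit AES key inside the application.
            KeyGenerator kgen = KeyGenerator.getInstance("AES");
            kgen.init(128);
            SecretKey key = kgen.generateKey();

            // Initializing for encryption without an IV lets the provider choose one.
            Cipher enc = Cipher.getInstance("AES/CBC/PKCS5Padding");
            enc.init(Cipher.ENCRYPT_MODE, key);
            byte[] iv = enc.getIV(); // must be kept; decryption needs it
            byte[] cipherText = enc.doFinal(text.getBytes(StandardCharsets.UTF_8));

            // Decryption needs the same key and the same IV.
            Cipher dec = Cipher.getInstance("AES/CBC/PKCS5Padding");
            dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
            return new String(dec.doFinal(cipherText), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("sensitive record"));
    }
}
```

Without the saved IV, the decrypting side could not recover the first block of the data, which is why both values go into the context.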
Transferring
Now that our encryption is in place, we create a file to write the data to. We wrap a BufferedOutputStream around a CipherOutputStream provided for us by Java.
To do the FTP, we are using Apache’s FTP client so that we can set the output stream we want to use. We can swap out a regular OutputStream with our new encrypted one, meaning we don’t have to download the file and then encrypt it.
Once the file is transmitted, we flush the stream and close it. We specifically call the flush before closing because I was having issues with the stream closing before all the data was written. This caused the encrypted file to be corrupted, but there was no way to just open the file and see that. I spent a great deal of time trying to figure out why my file was corrupted.
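One thing worth knowing when chasing this kind of corruption: with a padded block cipher, the last block is only written when the CipherOutputStream is closed, so a stream that is flushed but never closed leaves a truncated file. The sketch below illustrates that behavior; it is not a diagnosis of the original problem. The class name `CipherStreamClose` is made up, and a ByteArrayOutputStream stands in for the FileOutputStream.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.security.GeneralSecurityException;

// Hypothetical demo class; not part of the original post.
public class CipherStreamClose {

    /** Returns {bytes in the sink after flush(), bytes in the sink after close()}. */
    public static int[] sizesAfterFlushAndClose(byte[] plain) {
        try {
            KeyGenerator kgen = KeyGenerator.getInstance("AES");
            kgen.init(128);
            SecretKey key = kgen.generateKey();
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);

            ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the file
            int afterFlush;
            // try-with-resources guarantees close(), which emits the final padded block.
            try (OutputStream out = new BufferedOutputStream(new CipherOutputStream(sink, cipher))) {
                out.write(plain);
                out.flush();
                afterFlush = sink.size(); // the final block is still held inside the cipher
            }
            return new int[] {afterFlush, sink.size()};
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterFlushAndClose(new byte[20]);
        System.out.println("after flush: " + sizes[0] + ", after close: " + sizes[1]);
    }
}
```

For 20 bytes of input, the sink ends up holding 32 bytes (two padded AES blocks) only after close; fewer bytes are visible after the flush alone.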
Finally, we let the rest of the application know our filename. We added .enc to the end so we can find it easily later during testing.
Step #2
Ok, let’s skip to the next Step, which will take our result file and write it back to the FTP server.
@Bean
public Step encTransferLastStep(StepBuilderFactory stepBuilderFactory) {
    return stepBuilderFactory.get("encTransferLastStep").tasklet(new Tasklet() {
        @Override
        public RepeatStatus execute(StepContribution sc, ChunkContext cc) throws Exception {
            // fetch the key and IV stored by the first step
            SecretKey aesKey = (SecretKey) cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().get("inKey");
            byte[] iv = (byte[]) cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().get("inIv");

            // this cipher decrypts, despite the variable name
            Cipher encryptCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            encryptCipher.init(Cipher.DECRYPT_MODE, aesKey, new IvParameterSpec(iv));

            // read the encrypted result file through the cipher
            File f = new File("outfile.txt.enc");
            FileInputStream fis = new FileInputStream(f);
            CipherInputStream cis = new CipherInputStream(fis, encryptCipher);
            BufferedInputStream bis = new BufferedInputStream(cis);

            // upload the decrypted stream
            FTPClient client = new FTPClient();
            client.connect(FTP_URL);
            client.login(FTPUSER, FTP_PW);
            client.changeWorkingDirectory("/");
            client.setFileType(FTP.BINARY_FILE_TYPE);
            client.storeFile("outfile.txt", bis);
            // storeFile does not close the stream; release the file before deleting it
            bis.close();
            f.delete();

            // remove the encrypted input file as well
            String encfilename = (String) cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().get("encFileName");
            File inputFile = new File(encfilename);
            inputFile.delete();
            return RepeatStatus.FINISHED;
        }
    }).build();
}
We get the key and the initialization vector (IV) back from the context and create a new Cipher for reading the encrypted result file. We are using the same key and IV we created on the input file. You could create new ones in the Writer later on if you desire.
This time we use CipherInputStream to read the file. Apache’s FTP client will again allow us to provide an input stream, so we hand it our new decrypting input stream. This will write the file to the secured FTP server as plain text.
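To see the decrypt-while-reading idea without an FTP server, here is a minimal sketch under stated assumptions: the class `DecryptingCopy` and its `copy` helper are made up for the example, a temporary file plays the encrypted result file, and a ByteArrayOutputStream stands in for the remote server that consumes the stream.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;

// Hypothetical demo class; not part of the original post.
public class DecryptingCopy {

    /** Plain stream copy; stands in for what an uploader does with the stream it is given. */
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    /** Writes the bytes to an encrypted temp file, then streams them back decrypted. */
    public static byte[] encryptToFileThenStreamBack(byte[] plain) {
        Path encFile = null;
        try {
            KeyGenerator kgen = KeyGenerator.getInstance("AES");
            kgen.init(128);
            SecretKey key = kgen.generateKey();
            Cipher enc = Cipher.getInstance("AES/CBC/PKCS5Padding");
            enc.init(Cipher.ENCRYPT_MODE, key);
            byte[] iv = enc.getIV();

            // Only ciphertext ever reaches the disk.
            encFile = Files.createTempFile("demo", ".enc");
            try (OutputStream out = new CipherOutputStream(Files.newOutputStream(encFile), enc)) {
                out.write(plain);
            }

            Cipher dec = Cipher.getInstance("AES/CBC/PKCS5Padding");
            dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
            ByteArrayOutputStream remote = new ByteArrayOutputStream(); // stand-in for the FTP server
            try (InputStream in = new BufferedInputStream(
                    new CipherInputStream(Files.newInputStream(encFile), dec))) {
                copy(in, remote); // the consumer sees plain bytes
            }
            return remote.toByteArray();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (encFile != null) {
                try {
                    Files.deleteIfExists(encFile);
                } catch (IOException ignored) {
                    // best-effort cleanup in a demo
                }
            }
        }
    }

    public static void main(String[] args) {
        byte[] result = encryptToFileThenStreamBack("book,author\n".getBytes());
        System.out.print(new String(result));
    }
}
```

The consumer of the stream never needs to know about the cipher, which is the same reason the FTP client can upload the file unchanged.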
Reading & Writing
Now that we have a way to transfer the files around, how do we read and write them? First, we’ll need two new classes. We’ve extended UrlResource and FileSystemResource to provide access to the CipherInputStream and CipherOutputStream, and we can use these Resources in our Reader and Writer.
public class CyrptUrlResource extends UrlResource {

    private final Cipher encryptCipher;

    public CyrptUrlResource(URI uri, Cipher encryptCipher) throws MalformedURLException {
        super(uri);
        this.encryptCipher = encryptCipher;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        // wrap the raw stream in the cipher, then in a pushback stream with a 4 MB buffer
        return new PushbackInputStream(new CipherInputStream(super.getInputStream(), encryptCipher), (2048 * 2048));
    }
}
And
public class CyrptFileSystemResource extends FileSystemResource {

    private Cipher encryptCipher;

    public CyrptFileSystemResource(String path, Cipher encryptCipher) {
        super(path);
        this.encryptCipher = encryptCipher;
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        // the cipher must have been initialized in ENCRYPT_MODE for writing
        return new CipherOutputStream(super.getOutputStream(), encryptCipher);
    }

    @Override
    public InputStream getInputStream() throws IOException {
        // the cipher must have been initialized in DECRYPT_MODE for reading
        return new CipherInputStream(super.getInputStream(), encryptCipher);
    }
}
We created both because we found that in some situations, especially with binary files, the UrlResource works better than the FileSystemResource. Try both and see which one fits your situation better. The CyrptUrlResource contains a PushbackInputStream. We implemented that because we were reading Excel files (a future post) and the reader we were using required it. This can be removed if not needed, or a flag can be added to signal whether to use it or not. I left it here in case someone else was having a similar issue.
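For readers who have not met PushbackInputStream before, the sketch below shows what it offers: a caller can read a few header bytes to sniff the format and then push them back so the next reader sees the stream from the start. The class `PushbackPeek` and its methods are made up for this example and are not taken from the Excel reader mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; not part of the original post.
public class PushbackPeek {

    /** Reads up to n bytes, pushes them back, and returns a copy of what was seen. */
    static byte[] peek(PushbackInputStream in, int n) throws IOException {
        byte[] header = new byte[n];
        int read = in.read(header, 0, n); // may return fewer than n bytes
        if (read <= 0) {
            return new byte[0];
        }
        in.unread(header, 0, read); // needs a pushback buffer of at least `read` bytes
        return Arrays.copyOf(header, read);
    }

    /** Peeks at the first n bytes, then reads the whole stream; returns {peeked, everything}. */
    public static byte[][] peekThenReadAll(byte[] data, int n) {
        // The second constructor argument is the pushback buffer size.
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data), n)) {
            byte[] peeked = peek(in, n);
            ByteArrayOutputStream all = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int count;
            while ((count = in.read(buf)) != -1) {
                all.write(buf, 0, count);
            }
            return new byte[][] {peeked, all.toByteArray()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[][] result = peekThenReadAll(new byte[] {1, 2, 3, 4, 5, 6}, 4);
        System.out.println(Arrays.toString(result[0]) + " / " + Arrays.toString(result[1]));
    }
}
```

The buffer size passed to the constructor caps how many bytes can be pushed back, which is why the resource above allocates a generous one.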
To use them, I’m going back to the Reader and Writer in the Delegate post I mentioned earlier. For the Reader, the change is as simple as adding the following code to the beforeStep method:
// fetch the key, IV, and encrypted file name stored by the first Tasklet
final SecretKey aesKey = (SecretKey) stepExecution.getJobExecution().getExecutionContext().get("inKey");
byte[] iv = (byte[]) stepExecution.getJobExecution().getExecutionContext().get("inIv");
String encfilename = (String) stepExecution.getJobExecution().getExecutionContext().get("encFileName");
File f = new File(encfilename);

// build a decrypting cipher and hand the wrapped resource to the delegate
Cipher encryptCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
encryptCipher.init(Cipher.DECRYPT_MODE, aesKey, new IvParameterSpec(iv));
CyrptUrlResource cur = new CyrptUrlResource(f.toURI(), encryptCipher);
delegate.setResource(cur);
We replace the Resource on the delegate with the CyrptUrlResource. The Resource is created using the key and IV we generated earlier in the first Tasklet.
Our writer is much more of a rewrite.
@Configuration
@StepScope
public class BookListWriter implements ItemStreamWriter<List<BookList>> {

    private static final Logger logger = LoggerFactory.getLogger(BookListWriter.class);
    private StepExecution stepExecution;
    private DelimitedLineAggregator<BookList> dla;
    private BufferedOutputStream bos;

    @BeforeStep
    public void beforeStep(StepExecution stepExecution) {
        logger.debug("beforeStep");
        this.stepExecution = stepExecution;
        dla = new DelimitedLineAggregator<>();
        dla.setDelimiter(",");
        BeanWrapperFieldExtractor<BookList> fieldExtractor = new BeanWrapperFieldExtractor<>();
        fieldExtractor.setNames(new String[]{"bookName", "author"});
        dla.setFieldExtractor(fieldExtractor);
    }

    @Override
    public void close() throws ItemStreamException {
        try {
            bos.flush();
            bos.close();
        } catch (IOException ex) {
            logger.error(ex.getMessage(), ex);
            throw new ItemStreamException(ex);
        }
    }

    @Override
    public void open(ExecutionContext ec) throws ItemStreamException {
        try {
            SecretKey aesKey = (SecretKey) stepExecution.getJobExecution().getExecutionContext().get("inKey");
            byte[] iv = (byte[]) stepExecution.getJobExecution().getExecutionContext().get("inIv");
            String encfilename = "outfile.txt.enc";
            File f = new File(encfilename);
            Cipher encryptCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            encryptCipher.init(Cipher.ENCRYPT_MODE, aesKey, new IvParameterSpec(iv));
            CyrptFileSystemResource cur = new CyrptFileSystemResource(f.getAbsolutePath(), encryptCipher);
            bos = new BufferedOutputStream(cur.getOutputStream());
        } catch (NoSuchAlgorithmException | NoSuchPaddingException | InvalidKeyException | InvalidAlgorithmParameterException | IOException ex) {
            logger.error(ex.getMessage(), ex);
            throw new ItemStreamException(ex);
        }
    }

    @Override
    public void update(ExecutionContext ec) throws ItemStreamException {
    }

    @Override
    public void write(List<? extends List<BookList>> list) throws Exception {
        logger.info("write");
        for (List<BookList> bookList : list) {
            for (BookList book : bookList) {
                String line = dla.aggregate(book) + "\n";
                bos.write(line.getBytes());
            }
        }
    }
}
We move the Aggregator to the class level because we are going to use it later. Also, we capture the StepExecution for later use. A new BufferedOutputStream is created that will actually do the writing. We can no longer use the flat file writer because we are now dealing with a binary file.
The Open Method
So let’s jump over close and talk about the open method. Here we once again get our key and IV from the context and use them to create our Cipher. Our Cipher is then used to create a File Resource, which is in turn used to create our BufferedOutputStream. Since we are not using a built-in Writer as a Delegate anymore, we could have skipped the Resource and created our OutputStream for BufferedOutputStream to wrap right here, but we have it and might as well use it.
We now have a file ready to write our encrypted data. The update method no longer has anything to do, so we leave it blank and move on to the write method. As our Processor created a list of BookLists, and Spring Batch hands the output of the Reader/Processor to the Writer as a list, we have a list of lists to loop through. We use our aggregator, which is unchanged from the original version, to put together our output string, but don’t forget the end-of-line character.
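The nested loop is easier to see in isolation. The following sketch leaves Spring Batch out entirely: `NestedListWriter` and its `Book` class are hypothetical stand-ins for the writer and the BookList item, and plain string concatenation replaces the aggregator. It also passes an explicit charset to getBytes, which is an assumption on my part; the writer above relies on the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the original post.
public class NestedListWriter {

    /** Stand-in for the BookList item, with the two fields the aggregator extracts. */
    public static final class Book {
        final String bookName;
        final String author;

        public Book(String bookName, String author) {
            this.bookName = bookName;
            this.author = author;
        }
    }

    /** Writes one comma-delimited line per book, walking the chunk of lists. */
    static void write(List<? extends List<Book>> chunk, OutputStream out) throws IOException {
        for (List<Book> books : chunk) {
            for (Book book : books) {
                // The newline has to be added by hand; nothing else supplies it.
                String line = book.bookName + "," + book.author + "\n";
                out.write(line.getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    /** Convenience wrapper that renders a chunk to a String. */
    public static String render(List<? extends List<Book>> chunk) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            write(chunk, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<List<Book>> chunk = Arrays.asList(
                Arrays.asList(new Book("Dune", "Herbert"), new Book("Emma", "Austen")),
                Arrays.asList(new Book("Ulysses", "Joyce")));
        System.out.print(render(chunk));
    }
}
```

In the real writer the OutputStream is the buffered, encrypting stream opened in the open method, so these same bytes land on disk as ciphertext.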
Once all our data has been written out, we come back to our close method. Again, we call the flush method before closing to ensure that all the data is written before moving on.
Final Thoughts
That is the complete change we made to provide a little more security to the data we are passing around. While no single security measure can provide a 100% guarantee of safety, the more layers you add, the more secure you become. This solution provides an easy-to-maintain layer.
Believe in good code.