Getting Started With Spring Batch, Part Two

By Jonny Hackett | Java, Spring, Spring Batch, Technology Snapshot, Tutorial

Now that we’ve had a high-level overview of the basic features of Spring Batch, let’s dive into what it takes to get up and running. The main purpose of this quick tutorial is to give you a starting point for exploring Spring Batch, so you can decide whether you’d like to use it on one of your projects.

Since this tutorial is based on SpringSource Tool Suite (STS), if you haven’t already, the first thing you’ll need to do is download and install STS from the SpringSource website. If you’re going to be doing any Spring-based development I highly recommend you use STS, which is based on Eclipse with the focus on Spring development.

Next, start STS and open it with a new workspace. Once STS is up and running:

  1. Right click in the Project Explorer and select New -> Spring Template Project.
  2. Select Simple Spring Batch Project and click Next.
  3. Fill out the project name and top level package entry fields and click Finish.

Once STS has finished downloading the dependencies and setting up the project, you should see the project laid out in the default Maven structure. The process is pretty straightforward and you shouldn’t have any compile errors, but if you do, the first thing to check for is missing Maven dependencies.

In the src/test/java directory under the base package name that you provided in the setup, you should see three JUnit tests named ExampleItemReaderTests, ExampleItemWriterTests and ExampleJobConfigurationTests. You should be able to run all three tests successfully, which verifies that the new batch template project was set up correctly.

There are two important configuration files that were created for the Spring Batch template project. The first is the launch-context.xml which can be found under the src/main/resources/ directory and contains the Spring context configuration. The other configuration file is the module-context.xml and can be found under the src/main/resources/META-INF/spring/ directory. The module-context.xml configuration file should contain an example job configuration that looks like this:

<batch:job id="job1">
    <batch:step id="step1"  >
        <batch:tasklet transaction-manager="transactionManager" start-limit="100" >
            <batch:chunk reader="reader" writer="writer" commit-interval="1" />
        </batch:tasklet>
    </batch:step>
</batch:job>

The example job configuration contains one Job named “job1” with a single Step. The Step uses a chunk-oriented tasklet that pairs an ItemReader with an ItemWriter and, because the commit-interval is 1, commits after every item. The reader and writer attributes in the Step’s configuration refer to the ExampleItemReader and ExampleItemWriter beans, whose classes can be found in the base package you specified under the Java source directory.
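To make the commit-interval idea concrete, here is a minimal plain-Java sketch of a chunk loop. It is not Spring Batch code: the class ChunkLoopSketch and the method runStep are hypothetical names invented for this illustration, and the real framework also handles transactions, restarts and skips that are left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not part of Spring Batch.
public class ChunkLoopSketch {

    // Reads items one at a time, buffers them, and hands the buffer to the
    // "writer" (here: a list of committed chunks) once it holds
    // commitInterval items. A final partial chunk is written as well.
    public static <T> List<List<T>> runStep(Iterator<T> reader, int commitInterval) {
        List<List<T>> committedChunks = new ArrayList<>();
        List<T> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(reader.next());
            if (chunk.size() == commitInterval) {
                committedChunks.add(chunk);   // one write and one commit per chunk
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            committedChunks.add(chunk);       // leftover items at end of input
        }
        return committedChunks;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("XOM", "IBM", "JNJ", "MSFT");
        // commit-interval="1": every item is its own chunk
        System.out.println(runStep(items.iterator(), 1)); // [[XOM], [IBM], [JNJ], [MSFT]]
        // commit-interval="3": one full chunk, then the remainder
        System.out.println(runStep(items.iterator(), 3)); // [[XOM, IBM, JNJ], [MSFT]]
    }
}
```

A larger commit-interval means fewer writes and commits for the same input, which is usually what you want once you move beyond a toy example.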

Since we’re just using the simple Spring Batch template project for this tutorial, there are a couple of different ways to execute the example batch job. If you ran the ExampleJobConfigurationTests mentioned earlier, you’ve already used the first method: executing the job from a JUnit test. The other method uses the CommandLineJobRunner, which is provided by Spring Batch. To run it via the CommandLineJobRunner within STS, you can create a Debug Configuration with the arguments “launch-context.xml job1”, which specify the Spring context and the name of the job to execute.

Now that we’ve taken a quick look at the example and have executed it successfully, we’re going to replace the example job configuration with a new one that contains only one Step and will show an example usage of the Spring Batch supplied FlatFileItemReader, a simple ItemProcessor and an ItemWriter that logs the item out to the console.

The first bean we need to define is the FlatFileItemReader. Spring Batch’s implementation of the FlatFileItemReader is quite configurable and can be used for a wide range of file formats. Most commonly it’s used to read CSV files, other delimited files and fixed-length files, but it can also be configured to read files containing multiple record types. For example, a file with multiple record types might contain a header record, multiple detail item records and a summary record, each of which could be either delimited or fixed length. For this example, however, we’re going to use the Yahoo Finance stock data download service. The service allows you to call a Yahoo Finance lookup URL with parameters that specify the tickers to look up and the data fields to be supplied. In return, the data you requested is returned in CSV format. Details about the service can be found at: http://www.gummy-stuff.org/Yahoo-data.htm

Here’s the configuration for the FlatFileItemReader:

<bean name="tickerReader"
    class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="http://finance.yahoo.com/d/quotes.csv?s=XOM+IBM+JNJ+MSFT&amp;f=snd1ol1p2" />
    <property name="lineMapper" ref="tickerLineMapper" />
</bean>

The FlatFileItemReader needs a Resource and a LineMapper in order to parse the file. For this particular case, the Resource we’re going to be using is a URL instead of a static file. We also need to define a LineMapper, and for this particular example we’re going to use the DefaultLineMapper. It’s the LineMapper’s responsibility to be fed a line of data from the file and map the line to a data object based on its configuration. The DefaultLineMapper needs to have a FieldSetMapper and LineTokenizer configured, which handle the parsing of the line and mapping it to the data object.

First we need to create the data object that the line will be mapped to and for the Yahoo data format we’ve specified the class will look like this:

package com.keyhole.example;

import java.io.Serializable;
import java.math.BigDecimal;
import java.util.Date;

public class TickerData implements Serializable {

	private static final long serialVersionUID = 6463492770982487812L;
	private String symbol;
	private String name;
	private Date lastTradeDate;
	private BigDecimal open;
	private BigDecimal lastTrade;
	private String changePct;
	private BigDecimal openGBP;
	private BigDecimal lastTradeGBP;

	@Override
	public String toString() {
		return "TickerData [symbol=" + symbol + ", name=" + name
			+ ", lastTradeDate=" + lastTradeDate + ", open=" + open
			+ ", lastTrade=" + lastTrade + ", changePct=" + changePct
			+ ", openGBP=" + openGBP + ", lastTradeGBP=" + lastTradeGBP
			+ "]";
	}
	// Getters and Setters removed for brevity
}

Now that we’ve defined the data object that the CSV file line will be mapped to, we need to create the implementation of the FieldSetMapper. The LineMapper will tokenize the line according to the specified tokenizer provided in the configuration and parse it into a FieldSet. The FieldSetMapper is then responsible for mapping the data from the FieldSet into the data object and returning that object to the reader. Here’s the code for mapping the Yahoo ticker data:


package com.keyhole.example;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.stereotype.Component;
import org.springframework.validation.BindException;

@Component("tickerMapper")
public class TickerFieldSetMapper implements FieldSetMapper<TickerData> {

	public TickerData mapFieldSet(FieldSet fieldSet) throws BindException {
		TickerData data = new TickerData();
		data.setSymbol(fieldSet.readString(0));
		data.setName(fieldSet.readString(1));
		// Pattern letters are case-sensitive: MM = month, dd = day of month
		data.setLastTradeDate(fieldSet.readDate(2, "MM/dd/yyyy"));
		data.setOpen(fieldSet.readBigDecimal(3));
		data.setLastTrade(fieldSet.readBigDecimal(4));
		data.setChangePct(fieldSet.readString(5));
		return data;
	}
}
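The date pattern handed to readDate deserves a second look, because pattern letters are case-sensitive. The sketch below uses only the JDK and assumes that readDate interprets its pattern the way java.text.SimpleDateFormat does with lenient parsing turned off. The class name DatePatternSketch and the sample value "3/29/2014" are made up for this illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical illustration only; uses the JDK, not Spring Batch.
public class DatePatternSketch {

    // Parses the value with the given pattern, then prints it back in ISO
    // form so the effect of the pattern letters is easy to see.
    public static String parseAndReformat(String value, String pattern) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
        parser.setLenient(false);
        try {
            Date parsed = parser.parse(value);
            return new SimpleDateFormat("yyyy-MM-dd", Locale.US).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse " + value, e);
        }
    }

    public static void main(String[] args) {
        // MM = month, dd = day of month: the value round-trips as expected.
        System.out.println(parseAndReformat("3/29/2014", "MM/dd/yyyy")); // 2014-03-29
        // mm = minutes, DD = day of year: the month is never read from the
        // input, so the result is a different date.
        System.out.println(parseAndReformat("3/29/2014", "mm/DD/yyyy"));
    }
}
```

The practical takeaway is to double-check the case of every letter in a date pattern, since a wrong pattern can parse without any error and quietly produce the wrong date.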

The corresponding configuration snippet for the DefaultLineMapper is:

<bean name="tickerLineMapper"
    class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
        <property name="fieldSetMapper" ref="tickerMapper" />
        <property name="lineTokenizer" ref="tickerLineTokenizer" />
</bean>

Since we’re parsing a CSV format, the LineTokenizer we’ll be using is the DelimitedLineTokenizer and doesn’t require any properties to be set since the defaults are for parsing CSV files. Other common implementations provided by Spring Batch are the FixedLengthTokenizer and PatternMatchingCompositeLineTokenizer. Here’s the bean definition for the LineTokenizer we’ll be using in the example:

<bean name="tickerLineTokenizer"
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer" />
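If you’re curious what tokenizing a CSV line involves, the following plain-Java sketch roughly approximates it, assuming a comma delimiter and a double-quote quote character. It is not the DelimitedLineTokenizer source: the class CsvTokenizerSketch is a hypothetical name, the sample line is invented rather than real Yahoo output, and escaped quotes inside a field are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Spring Batch implementation.
public class CsvTokenizerSketch {

    // Splits a line on commas, except for commas that appear between a pair
    // of double quotes. The quote characters themselves are dropped.
    public static List<String> tokenize(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;            // entering or leaving a quoted field
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());  // delimiter outside quotes ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());          // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Invented sample line in the symbol, name, date, open, last, change layout.
        String line = "\"XOM\",\"Exxon Mobil, Corp\",\"3/29/2014\",97.00,97.70,\"+0.73%\"";
        List<String> fields = tokenize(line);
        System.out.println(fields.size()); // 6
        System.out.println(fields.get(1)); // Exxon Mobil, Corp
    }
}
```

The field positions in the resulting list are what the FieldSetMapper refers to when it calls readString(0), readString(1) and so on.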

Putting it all together, the complete configuration for the Yahoo ticker data reader looks like this:

<bean name="tickerReader"
    class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="http://finance.yahoo.com/d/quotes.csv?s=XOM+IBM+JNJ+MSFT&amp;f=snd1ol1p2" />
    <property name="lineMapper" ref="tickerLineMapper" />
</bean>

<bean name="tickerLineMapper"
    class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
        <property name="fieldSetMapper" ref="tickerMapper" />
        <property name="lineTokenizer" ref="tickerLineTokenizer" />
</bean>

<bean name="tickerLineTokenizer"
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer" />

Now that we’ve defined the ItemReader for our Job, it’s time to create the ItemProcessor and ItemWriter. Our ItemProcessor is a simple one: it takes the TickerData object as input and calls a CurrencyConversionService to convert the opening and last-trade prices from USD into GBP at the current conversion rate. The code looks like this:

package com.keyhole.example;

import java.math.BigDecimal;

import org.springframework.batch.item.ItemProcessor;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component("tickerPriceProcessor")
public class TickerPriceProcessor implements ItemProcessor<TickerData, TickerData> {

    @Autowired
    private CurrencyConversionService conversionService;

    @Override
    public TickerData process(TickerData ticker) throws Exception {

        BigDecimal openGBP = conversionService.convertCurrency(ticker.getOpen(), Currency.USD, Currency.GBP);
        BigDecimal lastTradeGBP = conversionService.convertCurrency(ticker.getLastTrade(), Currency.USD, Currency.GBP);

        ticker.setOpenGBP(openGBP);
        ticker.setLastTradeGBP(lastTradeGBP);

        return ticker;
    }

}
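The processor depends on a CurrencyConversionService and a Currency type that aren’t listed above. As described in the comments below, the service does nothing more than multiply the amount by a static rate. Here is a self-contained sketch of that idea in plain Java, without the Spring @Service annotation so it runs on its own. The wrapper class CurrencyConversionSketch and the two-value Currency enum are assumptions made for this illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical stand-in for the service the processor calls.
public class CurrencyConversionSketch {

    // Assumed shape of the Currency type used by the processor.
    public enum Currency { USD, GBP }

    // Static USD-to-GBP rate; a real service would look the rate up.
    private static final BigDecimal GBP_RATE = new BigDecimal("0.638365784");

    // Multiplies by the fixed rate and rounds to two decimal places.
    // The from and to arguments are accepted but ignored in this sketch.
    public static BigDecimal convertCurrency(BigDecimal amt, Currency from, Currency to) {
        return amt.multiply(GBP_RATE).setScale(2, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        BigDecimal open = new BigDecimal("100.00");
        System.out.println(convertCurrency(open, Currency.USD, Currency.GBP)); // 63.84
    }
}
```

Because the conversion is just a multiplication, the processor itself stays trivial, which keeps the focus of the example on how a service gets wired into a step.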

For the ItemWriter, we’re going to create a very simple writer that logs each processed item to the console. The code looks like this:

package com.keyhole.example;

import java.util.List;

import org.apache.log4j.Logger;
import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;

@Component("tickerWriter")
public class LogItemWriter implements ItemWriter<TickerData> {

	private static final Logger LOG = Logger.getLogger(LogItemWriter.class);

	public void write(List<? extends TickerData> items) throws Exception {
		for (TickerData ticker: items) {
			LOG.info(ticker.toString());
		}
	}

}

(Code snippet updated 3/29/2014)

Now that we’ve created all of the components, all we have left is to configure the Job itself and wire the components together. I’ve replaced the existing job configuration in the module-context.xml config file, and the final configuration, including bean definitions, looks like this:

<batch:job id="TickerPriceConversion">
    <batch:step id="convertPrice">
        <batch:tasklet transaction-manager="transactionManager">
            <batch:chunk reader="tickerReader"
                processor="tickerPriceProcessor"
                writer="tickerWriter" commit-interval="10" />
        </batch:tasklet>
    </batch:step>
</batch:job>

<bean name="tickerReader"
    class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource"
        value="http://finance.yahoo.com/d/quotes.csv?s=XOM+IBM+JNJ+MSFT&amp;f=snd1ol1p2" />
    <property name="lineMapper" ref="tickerLineMapper" />
</bean>

<bean name="tickerLineMapper"
    class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    <property name="fieldSetMapper" ref="tickerMapper" />
    <property name="lineTokenizer" ref="tickerLineTokenizer" />
</bean>

<bean name="tickerLineTokenizer"
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer" />

Finally, here’s a quick JUnit test for executing the Job:

package com.keyhole.example;

import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = "classpath:/launch-context.xml")
public class TickerPriceConversionTest {

	@Autowired
	private JobLauncher jobLauncher;

	@Autowired
	@Qualifier(value = "TickerPriceConversion")
	private Job job;

	@Test
	public void testJob() throws Exception {
		JobParametersBuilder builder = new JobParametersBuilder();
		JobExecution jobExecution = jobLauncher.run(job,
				builder.toJobParameters());
		assertEquals("COMPLETED", jobExecution.getExitStatus().getExitCode());
	}
}

Now that you’ve had a quick primer on how to get up and running with Spring Batch, you’re ready to explore the additional features and spend more time solving business problems related to enterprise batch processing, instead of solving the technical challenges surrounding them.

— Jonny Hackett, asktheteam@keyholesoftware.com

Spring Batch Blog Series

Part One: Introducing Spring Batch

Part Two:  Getting Started With Spring Batch

Part Three: Generating Large Excel Files Using Spring Batch

Scaling Spring Batch – Step Partitioning

Spring Batch Unit Testing and Mockito

Spring Batch – Replacing XML Job Configuration With JavaConfig


About the Author
Jonny Hackett


Jonny is a Senior Software Engineer and Mentor with 15+ years of experience in IT. A Java Developer, avid SportingKC fan, and photographer (check him out at www.Facebook.com/no9photography), Jonny is also our resident Spring Batch expert.


Comments 17

  1. Pingback: Introducing Spring Batch, Part One « Keyhole Software

  2. It is so hard to find good examples using Spring Batch. This is not only one of the few examples, it is also an excellent example. Thanks for taking the time to provide the details and great explanation.

  3. Pingback: Generating Large Excel Files Using Spring Batch, Part Three « Keyhole Software

    1. Hey Joe, apologies for leaving that code out. There wasn’t anything special in the service that added much value to the main purpose of the example. All it did was multiply the amount provided by a static value. Here’s the code:

      @Service
      public class CurrencyConversionService {

          private static final BigDecimal GBP_RATE = new BigDecimal("0.638365784");

          // Applies the fixed USD-to-GBP rate; the from/to arguments are ignored.
          public BigDecimal convertCurrency(BigDecimal amt, Currency from, Currency to) {
              return amt.multiply(GBP_RATE).setScale(2, RoundingMode.HALF_EVEN);
          }

      }

  4. Thanks for putting this together;
    however this is not a complete article; not seeing CurrencyConversionService; (I made it to return some static value); Also getting => No bean named ‘tickerMapper’ is defined 🙁
    could you please help?

    1. Hello John, thanks for checking out the article. In my reply to Joe above you can see the code for the conversion service, however there wasn’t anything interesting about the service other than it was just used as a simple example of wiring a service into the process. You weren’t too far off the mark in just returning a static value.

      You should have a class somewhere in your code named TickerFieldSetMapper, and it should be annotated with @Component("tickerMapper") so that it is scanned into the app context rather than defined in the XML configuration. I’m not very fond of a lot of XML hand coding, so I prefer to use the Spring stereotype annotations whenever possible. If you can’t find it in your project, it can be found as the second class in the code listings of the article.

  5. I am getting the error “No bean named ‘transactionManager’ is defined”. If I remove the transactionManager from the “tasklet” line, I get the error “No bean named ‘jobRepository’ is defined”.

    Are there additional schemaLocations I need to add, or additional jars. Any help would be greatly appreciated

    1. Hi Steve, the transactionManager bean should have been created as part of the Spring Template project setup. You should be able to find a class in your project named ExampleConfiguration, which should have been generated during the Spring Template project setup. The ExampleConfiguration class is marked by some spring annotations and should be within a package that will be scanned by spring’s component scanner as defined in spring’s application context config file.

      By default, the Spring Template project setup uses an in-memory job repository using hsqldb, which the transactionManager uses to control commits during job processing and where the job meta-data resides. There should also be a batch.properties file containing the hsqldb configuration settings for that in-memory job repository that is used to create the data source for the transaction manager.

      More information can be found here:
      http://static.springsource.org/spring-batch/reference/html-single/index.html#txConfigForJobRepository

  6. Pingback: Scaling Spring Batch – Step Partitioning Tutorial | Keyhole Software

  7. Hello, receiving error Name clash: the method write(List) of type LogItemWriter has the same erasure as write(List) of type ItemWriter but does not override it. Here is the line of code in error:
    public void write(List items) throws Exception {

    Can you please explain the resolution to this issue?

    Thanks,

    John

    1. I too get the same error in the following code. Can anyone give me the resolution please.

      Name clash: The method write(List) of type LogItemWriter has the same erasure as write(List) of type ItemWriter but does not override it

      @Component(“tickerWriter”)
      public class LogItemWriter implements ItemWriter {
      private static final Logger LOG = Logger.getLogger(LogItemWriter.class);
      public void write(List items) throws Exception {
      for (TickerData ticker: items) {
      LOG.info(ticker.toString());
      }
      }
      }

      1. Sorry for the long delay on answering this guys, I missed the original comment and became aware of it with the second comment posted.

        It turned out that this was a compile issue with Java generics. Whatever version I happened to be on at the time didn’t enforce the generics, and the error only started cropping up with newer versions. Since the Spring Batch interface ItemWriter specifies the type T expected to be used, the compiler was generating that error when I didn’t specify a type on the ItemWriter implementation.

        Thanks for the catch!

  8. Pingback: Generating Large Excel Files Using Spring Batch, Part Three | Keyhole Software

  9. This is probably the most comprehensive yet simple Spring Batch intro I have come across on the web. Great tut Jonny.
