Getting Started With Spring Batch, Part Two

Jonny Hackett · Development Technologies, Intro to Spring Batch Series, Java, Spring, Spring Batch, Tutorial

Attention: The following article was published over 12 years ago, and the information provided may be aged or outdated. Please keep that in mind as you read the post.

Now that we’ve had a high level overview of some of the simple and basic features of Spring Batch, let’s dive into what it takes to get up and running. The main purpose of this quick and simple tutorial is to give you a starting point for exploring Spring Batch to see if you’d like to implement it for one of your projects.

Since this tutorial is based on SpringSource Tool Suite (STS), if you haven’t already, the first thing you’ll need to do is download and install STS from the SpringSource website. If you’re going to be doing any Spring-based development, I highly recommend STS, which is built on Eclipse with a focus on Spring development.

Next, start STS and open it with a new workspace. Once STS is up and running:

  1. Right click in the Project Explorer and select New -> Spring Template Project.
  2. Select Simple Spring Batch Project and click Next.
  3. Fill out the project name and top level package entry fields and click Finish.

Once it’s done downloading the dependencies and setting up the project, you should see the default Maven project structure. The process is pretty straightforward and you shouldn’t have any compile errors, but if you do, the first place to look is for missing Maven dependencies.

In the src/test/java directory under the base package name that you provided in the setup, you should see three JUnit tests named ExampleItemReaderTests, ExampleItemWriterTests and ExampleJobConfigurationTests. You should be able to run all of these tests successfully to verify that the newly created batch template project was set up successfully.

There are two important configuration files created for the Spring Batch template project. The first is launch-context.xml, which can be found under the src/main/resources/ directory and contains the Spring context configuration. The other is module-context.xml, found under the src/main/resources/META-INF/spring/ directory. It should contain an example job configuration that looks like this:

<batch:job id="job1">
    <batch:step id="step1">
        <batch:tasklet transaction-manager="transactionManager" start-limit="100">
            <batch:chunk reader="reader" writer="writer" commit-interval="1" />
        </batch:tasklet>
    </batch:step>
</batch:job>

The example job configuration contains one Job named “job1” consisting of a single Step. That Step uses a chunk-oriented tasklet with an ItemReader and an ItemWriter, committing after each item as indicated by the commit-interval of 1. The reader and writer referenced in the Step’s configuration are the beans ExampleItemReader and ExampleItemWriter, which can be found under the base package you specified in the Java source directory.
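To make the commit-interval semantics concrete, here is a plain-Java sketch (not Spring Batch code; the class and method names are made up for illustration) of the read/accumulate/write cycle a chunk-oriented step performs. The real framework additionally wraps each chunk in a transaction and adds restartability and skip/retry handling:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ChunkLoopSketch {

    // Reads items one at a time until the commit interval is reached,
    // then hands the whole chunk to the "writer" in one go.
    public static <T> List<List<T>> process(Iterator<T> reader, int commitInterval) {
        List<List<T>> writtenChunks = new ArrayList<>();
        List<T> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(reader.next());         // reader: one item per call
            if (chunk.size() == commitInterval) {
                writtenChunks.add(chunk);     // writer: whole chunk at once
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            writtenChunks.add(chunk);         // final partial chunk
        }
        return writtenChunks;
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c", "d", "e");
        System.out.println(process(items.iterator(), 2)); // [[a, b], [c, d], [e]]
    }
}
```

With commit-interval="1", as in the template project, every chunk contains exactly one item.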

Since we’re just using the simple Spring Batch template project for this tutorial, there are a couple of ways to execute the example batch job. If you ran the ExampleJobConfigurationTests mentioned earlier, you’ve already executed a job using a JUnit test, which is the first method. The other method uses the CommandLineJobRunner provided by Spring Batch. To run it via the CommandLineJobRunner within STS, you can create a Debug Configuration with the program arguments “launch-context.xml job1”, which specify the Spring context file and the name of the job to execute.

Now that we’ve taken a quick look at the example and have executed it successfully, we’re going to replace the example job configuration with a new one that contains only one Step and will show an example usage of the Spring Batch supplied FlatFileItemReader, a simple ItemProcessor and an ItemWriter that logs the item out to the console.

The first bean we need to define is the FlatFileItemReader. Spring Batch’s implementation of the FlatFileItemReader is quite configurable and can be used for a wide range of file formats. Most commonly it’s used to read CSV files, other delimited files, and fixed-length files, but it can also be configured to read files containing multiple record types. For example, a file format that uses multiple record types might contain a header record, multiple detail item records, and a summary record, each of which could be either delimited or fixed length. For this example, however, we’re going to use the Yahoo Finance stock data download service, which lets you call a Yahoo Finance lookup URL with parameters specifying the tickers to look up and the data fields to return. The requested data is returned in CSV format. Details about the service can be found at: http://www.gummy-stuff.org/Yahoo-data.htm

Here’s the configuration for the FlatFileItemReader:

<bean name="tickerReader"
    class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource"
        value="http://finance.yahoo.com/d/quotes.csv?s=XOM+IBM+JNJ+MSFT&amp;f=snd1ol1p2" />
    <property name="lineMapper" ref="tickerLineMapper" />
</bean>

The FlatFileItemReader needs a Resource and a LineMapper in order to parse the file. In this particular case, the Resource is a URL rather than a static file; note that because the value lives in an XML configuration file, the ampersand in the URL must be escaped as &amp;amp;. For the LineMapper we’re going to use the DefaultLineMapper, which is fed one line of the file at a time and maps it to a data object based on its configuration. The DefaultLineMapper needs a LineTokenizer and a FieldSetMapper configured, which handle tokenizing the line and mapping the resulting fields to the data object.
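To illustrate the tokenize-then-map pipeline, here is a standalone sketch using no Spring classes; the class name and sample quote values are made up. Spring’s DelimitedLineTokenizer additionally handles quoted fields that contain the delimiter, which this naive split does not:

```java
import java.util.Arrays;
import java.util.List;

public class LineMappingSketch {

    // Plain-Java illustration of what DefaultLineMapper does: a tokenizer
    // splits the raw line into fields, and a field-set mapper would then
    // turn those fields into a data object (e.g. TickerData).
    static List<String> tokenize(String line) {
        // naive comma split with quote stripping; ignores quoted commas
        return Arrays.asList(line.replace("\"", "").split(","));
    }

    public static void main(String[] args) {
        // shaped like a Yahoo quote line: symbol, name, date, open, last, change
        String line = "\"XOM\",\"Exxon Mobil\",\"10/05/2012\",91.45,91.80,\"+0.38%\"";
        List<String> fields = tokenize(line);
        System.out.println(fields.get(0)); // XOM
        System.out.println(fields.get(3)); // 91.45
    }
}
```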

First, we need to create the data object that each line will be mapped to. For the Yahoo data format we’ve specified, the class looks like this:

package com.keyhole.example;

import java.io.Serializable;
import java.math.BigDecimal;
import java.util.Date;

public class TickerData implements Serializable {

	private static final long serialVersionUID = 6463492770982487812L;
	private String symbol;
	private String name;
	private Date lastTradeDate;
	private BigDecimal open;
	private BigDecimal lastTrade;
	private String changePct;
	private BigDecimal openGBP;
	private BigDecimal lastTradeGBP;

	@Override
	public String toString() {
		return "TickerData [symbol=" + symbol + ", name=" + name
			+ ", lastTradeDate=" + lastTradeDate + ", open=" + open
			+ ", lastTrade=" + lastTrade + ", changePct=" + changePct
			+ ", openGBP=" + openGBP + ", lastTradeGBP=" + lastTradeGBP
			+ "]";
	}
	// Getters and Setters removed for brevity
}

Now that we’ve defined the data object that the CSV file line will be mapped to, we need to create the implementation of the FieldSetMapper. The LineMapper will tokenize the line according to the specified tokenizer provided in the configuration and parse it into a FieldSet. The FieldSetMapper is then responsible for mapping the data from the FieldSet into the data object and returning that object to the reader. Here’s the code for mapping the Yahoo ticker data:


package com.keyhole.example;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.stereotype.Component;
import org.springframework.validation.BindException;

@Component("tickerMapper")
public class TickerFieldSetMapper implements FieldSetMapper<TickerData> {

	@Override
	public TickerData mapFieldSet(FieldSet fieldSet) throws BindException {
		TickerData data = new TickerData();
		data.setSymbol(fieldSet.readString(0));
		data.setName(fieldSet.readString(1));
		data.setLastTradeDate(fieldSet.readDate(2, "MM/dd/yyyy"));
		data.setOpen(fieldSet.readBigDecimal(3));
		data.setLastTrade(fieldSet.readBigDecimal(4));
		data.setChangePct(fieldSet.readString(5));
		return data;
	}
}
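One detail worth double-checking in the mapper is the date pattern passed to readDate: the FieldSet readers parse with SimpleDateFormat-style patterns, where MM means month and mm means minutes, so a pattern like mm/DD/yyyy would silently parse the wrong values. A quick standalone check of the correct pattern (the class and method here are just for illustration):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

public class DatePatternCheck {

    // Parses a trade date the way the field-set mapper should:
    // MM = month, dd = day of month.
    static Date parseTradeDate(String value) throws ParseException {
        return new SimpleDateFormat("MM/dd/yyyy").parse(value);
    }

    public static void main(String[] args) throws Exception {
        Calendar cal = Calendar.getInstance();
        cal.setTime(parseTradeDate("10/05/2012"));
        System.out.println(cal.get(Calendar.MONTH));        // 9 (October, zero-based)
        System.out.println(cal.get(Calendar.DAY_OF_MONTH)); // 5
    }
}
```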

The corresponding configuration snippet for the DefaultLineMapper is:

<bean name="tickerLineMapper"
    class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    <property name="fieldSetMapper" ref="tickerMapper" />
    <property name="lineTokenizer" ref="tickerLineTokenizer" />
</bean>

Since we’re parsing a CSV file, the LineTokenizer we’ll use is the DelimitedLineTokenizer, which doesn’t require any properties to be set because its defaults are suited to CSV parsing. Other common implementations provided by Spring Batch are the FixedLengthTokenizer and the PatternMatchingCompositeLineTokenizer. Here’s the bean definition for the LineTokenizer we’ll be using in the example:

<bean name="tickerLineTokenizer"
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer" />

Putting it all together, the complete configuration for the Yahoo ticker data reader looks like this:

<bean name="tickerReader"
    class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource"
        value="http://finance.yahoo.com/d/quotes.csv?s=XOM+IBM+JNJ+MSFT&amp;f=snd1ol1p2" />
    <property name="lineMapper" ref="tickerLineMapper" />
</bean>

<bean name="tickerLineMapper"
    class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    <property name="fieldSetMapper" ref="tickerMapper" />
    <property name="lineTokenizer" ref="tickerLineTokenizer" />
</bean>

<bean name="tickerLineTokenizer"
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer" />

Now that we’ve defined the ItemReader for our Job, it’s time to create the ItemProcessor and ItemWriter. For our ItemProcessor, we’re going to create a simple processor that takes the TickerData object as input and calls a CurrencyConversionService to convert the opening and last trade prices from USD to GBP at the current conversion rate. The code looks like this:

package com.keyhole.example;

import java.math.BigDecimal;

import org.springframework.batch.item.ItemProcessor;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component("tickerPriceProcessor")
public class TickerPriceProcessor implements ItemProcessor<TickerData, TickerData> {

	// CurrencyConversionService and Currency are application-supplied
	// classes in the same package, not part of Spring Batch.
	@Autowired
	private CurrencyConversionService conversionService;

	@Override
	public TickerData process(TickerData ticker) throws Exception {

		BigDecimal openGBP = conversionService.convertCurrency(
				ticker.getOpen(), Currency.USD, Currency.GBP);
		BigDecimal lastTradeGBP = conversionService.convertCurrency(
				ticker.getLastTrade(), Currency.USD, Currency.GBP);

		ticker.setOpenGBP(openGBP);
		ticker.setLastTradeGBP(lastTradeGBP);

		return ticker;
	}
}
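The CurrencyConversionService and Currency enum the processor depends on aren’t shown in this article, so here is one minimal, hypothetical way they might look, using a fixed placeholder exchange rate. A real implementation would look up current rates, and the service would be registered as a Spring bean (e.g. annotated with @Component) so it can be autowired:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical supporting types -- names match the processor's usage,
// but the implementation details are assumptions for illustration.
enum Currency { USD, GBP }

public class CurrencyConversionService {

    // Fixed placeholder rate; a real service would fetch current rates.
    private static final BigDecimal USD_TO_GBP = new BigDecimal("0.62");

    public BigDecimal convertCurrency(BigDecimal amount, Currency from, Currency to) {
        if (from == Currency.USD && to == Currency.GBP) {
            return amount.multiply(USD_TO_GBP).setScale(2, RoundingMode.HALF_UP);
        }
        throw new IllegalArgumentException("Unsupported conversion: " + from + " -> " + to);
    }

    public static void main(String[] args) {
        BigDecimal gbp = new CurrencyConversionService()
                .convertCurrency(new BigDecimal("100.00"), Currency.USD, Currency.GBP);
        System.out.println(gbp); // 62.00
    }
}
```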

For the ItemWriter, we’re going to create a simple writer that logs each item to the console. The code looks like this:

package com.keyhole.example;

import java.util.List;

import org.apache.log4j.Logger;
import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;

@Component("tickerWriter")
public class LogItemWriter implements ItemWriter<TickerData> {

	private static final Logger LOG = Logger.getLogger(LogItemWriter.class);

	public void write(List<? extends TickerData> items) throws Exception {
		for (TickerData ticker: items) {
			LOG.info(ticker.toString());
		}
	}

}

(Code snippet updated 3/29/2014)

Now that we’ve created all of the components, all that’s left is to configure the Job itself and wire the components together. I’ve replaced the existing job configuration in the module-context.xml config file, and the final configuration, including bean definitions, looks like this:

<batch:job id="TickerPriceConversion">
    <batch:step id="convertPrice">
        <batch:tasklet transaction-manager="transactionManager">
            <batch:chunk reader="tickerReader"
                processor="tickerPriceProcessor"
                writer="tickerWriter" commit-interval="10" />
        </batch:tasklet>
    </batch:step>
</batch:job>

<bean name="tickerReader"
    class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource"
        value="http://finance.yahoo.com/d/quotes.csv?s=XOM+IBM+JNJ+MSFT&amp;f=snd1ol1p2" />
    <property name="lineMapper" ref="tickerLineMapper" />
</bean>

<bean name="tickerLineMapper"
    class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    <property name="fieldSetMapper" ref="tickerMapper" />
    <property name="lineTokenizer" ref="tickerLineTokenizer" />
</bean>

<bean name="tickerLineTokenizer"
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer" />

Finally, a quick JUnit test for executing the Job:

package com.keyhole.example;

import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = "classpath:/launch-context.xml")
public class TickerPriceConversionTest {

	@Autowired
	private JobLauncher jobLauncher;

	@Autowired
	@Qualifier(value = "TickerPriceConversion")
	private Job job;

	@Test
	public void testJob() throws Exception {
		JobParametersBuilder builder = new JobParametersBuilder();
		JobExecution jobExecution = jobLauncher.run(job,
				builder.toJobParameters());
		assertEquals("COMPLETED", jobExecution.getExitStatus().getExitCode());
	}
}

Now that you’ve had a quick primer on how to get up and running with Spring Batch, you’re ready to explore the additional features and spend more time solving business problems related to enterprise batch processing, instead of solving the technical challenges surrounding them.

— Jonny Hackett, [email protected]

Spring Batch Blog Series

Part One: Introducing Spring Batch

Part Two: Getting Started With Spring Batch

Part Three: Generating Large Excel Files Using Spring Batch

Scaling Spring Batch – Step Partitioning

Spring Batch Unit Testing and Mockito

Spring Batch – Replacing XML Job Configuration With JavaConfig

Spring Batch Testing & Mocking Revisited with Spring Boot

