Using JAXB And StaxEventItemReader To Read XML Data

Jonny Hackett | Development Technologies, Programming, Spring, Spring Batch

In one of my previous Spring Batch blog articles, I wrote about the need to read a set of data, process the data, and export the transformed data into XML for consumption by another system.

In this post, I’ll be doing the opposite: I’ll show you how to read data from an XML file instead.

Process Overview

For this particular example, we’re going to be reading an XML file that represents some basic employee contact info, parsing the XML, and logging it to the console. Since this example is focused on the reading aspect, I won’t show any specific processing or a specific output method.

However, if you needed to do additional transformation or processing of the data, you could implement a custom ItemProcessor. For the output, you might need to store the data in a database or simply export the records to a pipe-delimited flat file.

Step 1: Creating the XML Mapped Bean and Batch ItemReader

Let’s say we’re given an XML file named contact-data.xml. The file contains some simple data such as first name, last name, email, cell phone, and some info regarding the role of the employee in the company.

Here’s a snippet of what the XML will look like:

    <EmployeeContact team="IT Operations" role="Developer" status="Full Time Employee">
        <FirstName>John</FirstName>
        <LastName>Doe</LastName>
        <EmailAddress>jdoe@gmail.com</EmailAddress>
        <CellPhone>111-543-1234</CellPhone>
    </EmployeeContact>

As you can see, it’s similar to the format we saw in my previous blog article for writing the XML output. The biggest difference is that I’ve added some XML attributes on the EmployeeContact element for providing the person’s role within the company. I could have achieved similar results using additional elements, but I wanted a simple example that would show how XML attributes are parsed.
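
To see what attribute parsing looks like underneath, here’s a minimal sketch that reads the attributes of an element with the JDK’s StAX API directly, without JAXB or Spring Batch. The AttributeDemo class and its readAttributes method are hypothetical names used only for illustration; they aren’t part of the job built in this article.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of the article's job.
class AttributeDemo {

    // Returns the attributes of the first element named elementName.
    static Map<String, String> readAttributes(String xml, String elementName) {
        Map<String, String> attributes = new LinkedHashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // Advance to the next parse event and check for the start tag we want
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        attributes.put(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
                    }
                    break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return attributes;
    }

    public static void main(String[] args) {
        String xml = "<EmployeeContact team=\"IT Operations\" role=\"Developer\" "
                + "status=\"Full Time Employee\"><FirstName>John</FirstName></EmployeeContact>";
        System.out.println(readAttributes(xml, "EmployeeContact"));
    }
}
```

With JAXB, none of this loop is written by hand. The binding annotations describe which attribute goes into which field, and the unmarshaller does the equivalent work.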

Now that we know the format of the XML file, we’ll need to create a class that uses JAXB 2.0 binding annotations. These tell the marshalling engine how to map the XML onto the class.

Here’s the code for this new EmployeeContactXml class that will be used to map the XML file. Getters and setters have been removed for brevity but can be found in the full code listing at the end of the article.

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(propOrder = { "firstName", "lastName", "emailAddress", "cellPhone" })
@XmlRootElement(name = "EmployeeContact")
public class EmployeeContactXml {
	
	@XmlAttribute(required = true)
	protected String team;

	@XmlAttribute(required = true)
	protected String role;

	@XmlAttribute(required = true)
	protected String status;

	@XmlElement(name = "FirstName", required = true)
	protected String firstName;
	
	@XmlElement(name = "LastName", required = true)
	protected String lastName;

	@XmlElement(name = "EmailAddress", required = true)
	protected String emailAddress;
	
	@XmlElement(name = "CellPhone", required = true)
	protected String cellPhone;
}

For reference, here’s an explanation of the different annotations used in this example.

  • @XmlAccessorType: This defines whether the fields and properties of a class will be serialized. In this example, I’ve set the value to XmlAccessType.FIELD, which means that every non-static, non-transient field will be automatically bound to XML unless annotated by @XmlTransient. The names for the XML elements will be derived by default from the field names.
  • @XmlType: This allows us to define additional properties such as mapping the class to a specific schema, namespace, and specific order of children. In this specific case, we’re only using it to define the particular order of elements.
  • @XmlRootElement: This is used to map the class to a specific root element. By default, it derives the root element tag from the class name. For this example, we’re specifying a different name.
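  • @XmlElement: This maps a field or JavaBean property to an XML element. By default, the element name is derived from the field or property name; in this example, I’m overriding it with the name attribute.
  • @XmlAttribute: This maps a field or JavaBean property to an XML attribute. By default, the attribute name is derived from the field or property name.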

Alright, at this point our class is annotated so that a JAXB unmarshaller can turn each XML fragment into an EmployeeContactXml object. With that set up, we can now define the ItemReader bean in the job configuration.

For this, we’ll use the StaxEventItemReader that is provided with Spring Batch. Here’s how we define that in the Spring Batch job configuration. I’ve added some comments to note what’s going on.

	@Bean
	@StepScope
	public StaxEventItemReader<EmployeeContactXml> employeeContactsReader() {
		
		//define the resource that the reader will be consuming
		Resource resource = new FileSystemResource("/c:/dev/data/contact-data.xml");
		//instantiate a new StaxEventItemReader binding the EmployeeContactXml class
		StaxEventItemReader<EmployeeContactXml> xmlFileReader = 
			new StaxEventItemReader<>();
		//set the resource on the xmlFileReader
		xmlFileReader.setResource(resource);
		//define the root element of the xml fragment
		xmlFileReader.setFragmentRootElementName("EmployeeContact");
		
		//instantiate a new Jaxb2Marshaller
		Jaxb2Marshaller xmlMarshaller = new Jaxb2Marshaller();
		//define the Jaxb annotated classes to be recognized in the JAXBContext
		xmlMarshaller.setClassesToBeBound(EmployeeContactXml.class);
		//define the marshaller that maps xml fragments to objects
		xmlFileReader.setUnmarshaller(xmlMarshaller);
		return xmlFileReader;
	}
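
Roughly speaking, the fragment root element name tells the reader where each record starts and ends, and everything between those tags is handed to the unmarshaller as one item. The sketch below is a simplified model of that idea using the JDK’s XMLEventReader. It is not Spring Batch’s actual implementation, and the FragmentDemo class and firstNames method are hypothetical names for illustration. Instead of unmarshalling, it just pulls the FirstName text out of each fragment.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Simplified, hypothetical model of fragment-based reading; not Spring Batch code.
class FragmentDemo {

    // Collects the <FirstName> text from every fragment rooted at fragmentRoot.
    static List<String> firstNames(String xml, String fragmentRoot) {
        List<String> names = new ArrayList<>();
        try {
            XMLEventReader events = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            boolean insideFragment = false;
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent();
                if (event.isStartElement()) {
                    String name = event.asStartElement().getName().getLocalPart();
                    if (name.equals(fragmentRoot)) {
                        // A new record begins here
                        insideFragment = true;
                    } else if (insideFragment && name.equals("FirstName")) {
                        // Reads the text content and moves past the closing tag
                        names.add(events.getElementText());
                    }
                } else if (event.isEndElement()
                        && event.asEndElement().getName().getLocalPart().equals(fragmentRoot)) {
                    // The record is complete
                    insideFragment = false;
                }
            }
            events.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<EmployeeContacts>"
                + "<EmployeeContact team=\"IT Operations\"><FirstName>John</FirstName></EmployeeContact>"
                + "<EmployeeContact team=\"Finance\"><FirstName>Jimmy</FirstName></EmployeeContact>"
                + "</EmployeeContacts>";
        System.out.println(firstNames(xml, "EmployeeContact"));
    }
}
```

Notice that the wrapping EmployeeContacts element never needs a mapped class of its own. Only the elements matching the fragment root name are treated as records.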

Step 2: Creating an ItemWriter To Display the Results

Since the focus of this article is reading XML files, there isn’t an ItemProcessor to plug into the job’s step. To display the resulting output, I’ve created a simple ItemWriter that logs each object sent to the writer, which implicitly calls its toString method.

Here’s the code for that:

public class EmployeeContactWriter implements ItemWriter<EmployeeContactXml> {
	
	private static final Logger LOGGER = LoggerFactory.getLogger(EmployeeContactWriter.class);

	@Override
	public void write(List<? extends EmployeeContactXml> items) throws Exception {

		for (EmployeeContactXml contact : items) {
			LOGGER.info("Writing contact: {}", contact);
		}
		
	}

}

Here’s the resulting output from the writer:

[SimpleJob: [name=Exmployee-Contact-Processing-Job]] launched with the following parameters: [{}]
Executing step: [processEmployeeContactsFile]
Writing contact: ContactXml [team=IT Operations, role=Developer, status=Contractor, firstName=John, lastName=Smith, emailAddress=jsmith@gmail.com, cellPhone=111-333-4444]
Writing contact: ContactXml [team=IT Operations, role=Developer, status=Full Time Employee, firstName=John, lastName=Doe, emailAddress=jdoe@gmail.com, cellPhone=111-543-1234]
Writing contact: ContactXml [team=Human Resources, role=Manager, status=Full Time Employee, firstName=Jane, lastName=Doe, emailAddress=janed@gmail.com, cellPhone=111-463-8583]
Writing contact: ContactXml [team=Finance, role=Director, status=Full Time Employee, firstName=Jimmy, lastName=Lovine, emailAddress=jimmy.lovine@gmail.com, cellPhone=111-234-9367]
Step: [processEmployeeContactsFile] executed in 75ms
Job: [SimpleJob: [name=Exmployee-Contact-Processing-Job]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 83ms
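
The writer receives a List rather than a single item because the step is chunk-oriented: the step definition in the complete listing uses chunk(100), so items are read one at a time, buffered, and written together. Here’s a plain-Java sketch of that loop. It is a simplified model under my own assumptions, not Spring Batch’s implementation (it ignores transactions, restarts, and error handling), and ChunkDemo and processInChunks are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Simplified, hypothetical model of chunk-oriented processing.
class ChunkDemo {

    // Reads items one by one, hands them to the writer in groups of at most
    // chunkSize, and returns how many times the writer was called.
    static <T> int processInChunks(Iterator<T> reader, int chunkSize, Consumer<List<T>> writer) {
        int writes = 0;
        List<T> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(reader.next());
            if (chunk.size() == chunkSize) {
                writer.accept(chunk);
                writes++;
                chunk = new ArrayList<>();
            }
        }
        // Flush the final, partially filled chunk
        if (!chunk.isEmpty()) {
            writer.accept(chunk);
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        List<String> contacts = Arrays.asList("John Smith", "John Doe", "Jane Doe", "Jimmy Lovine");
        int writes = processInChunks(contacts.iterator(), 100,
                chunk -> chunk.forEach(c -> System.out.println("Writing contact: " + c)));
        // Four items fit in one chunk of 100, so the writer is called once
        System.out.println("write() calls: " + writes);
    }
}
```

With only four records and a chunk size of 100, everything fits in a single chunk, so the writer is invoked once with all four contacts.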

In Summary

As you can see, it’s fairly straightforward to consume XML-based data using the provided StaxEventItemReader in Spring Batch. Although this was a pretty simple XML document, it’s entirely possible to use the same pattern to read much more complex XML data.

In a real-world example, this job would likely have implemented a custom ItemProcessor to further transform or enrich the data that was consumed from the XML. Because Spring Batch steps are assembled from interchangeable readers, processors, and writers, any of the several out-of-the-box ItemWriters could be plugged into this job configuration.
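
As a rough illustration of that kind of follow-up work, here’s a plain-Java sketch of a transformation step followed by pipe-delimited output. Everything in it is an assumption for illustration: Contact is a cut-down stand-in for EmployeeContactXml, and normalize and toPipeDelimited are hypothetical methods. In an actual job, the first would live in an ItemProcessor and the second would more likely be handled by one of Spring Batch’s flat-file writers.

```java
import java.util.Locale;

// Hypothetical sketch; in a real job this logic would sit in an ItemProcessor
// and an ItemWriter rather than in static methods.
class ProcessorDemo {

    // Cut-down stand-in for the article's EmployeeContactXml class.
    static final class Contact {
        final String firstName;
        final String lastName;
        final String emailAddress;

        Contact(String firstName, String lastName, String emailAddress) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.emailAddress = emailAddress;
        }
    }

    // Processing step: trim and lower-case the email address.
    static Contact normalize(Contact in) {
        return new Contact(in.firstName, in.lastName,
                in.emailAddress.trim().toLowerCase(Locale.ROOT));
    }

    // Output step: render one record as a pipe-delimited line.
    static String toPipeDelimited(Contact c) {
        return String.join("|", c.firstName, c.lastName, c.emailAddress);
    }

    public static void main(String[] args) {
        Contact raw = new Contact("John", "Doe", " JDoe@Gmail.com ");
        // prints John|Doe|jdoe@gmail.com
        System.out.println(toPipeDelimited(normalize(raw)));
    }
}
```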

Thank you for reading, and please let me know if you have any questions in the comments below!

Complete Code Listing

Just so you have it all in one place for convenient access, here is the complete code listing for the example. I hope you find it useful!

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<EmployeeContacts>
    <EmployeeContact team="IT Operations" role="Developer" status="Contractor">
        <FirstName>John</FirstName>
        <LastName>Smith</LastName>
        <EmailAddress>jsmith@gmail.com</EmailAddress>
        <CellPhone>111-333-4444</CellPhone>
    </EmployeeContact>
    <EmployeeContact team="IT Operations" role="Developer" status="Full Time Employee">
        <FirstName>John</FirstName>
        <LastName>Doe</LastName>
        <EmailAddress>jdoe@gmail.com</EmailAddress>
        <CellPhone>111-543-1234</CellPhone>
    </EmployeeContact>
    <EmployeeContact team="Human Resources" role="Manager" status="Full Time Employee">
        <FirstName>Jane</FirstName>
        <LastName>Doe</LastName>
        <EmailAddress>janed@gmail.com</EmailAddress>
        <CellPhone>111-463-8583</CellPhone>
    </EmployeeContact>
    <EmployeeContact team="Finance" role="Director" status="Full Time Employee">
        <FirstName>Jimmy</FirstName>
        <LastName>Lovine</LastName>
        <EmailAddress>jimmy.lovine@gmail.com</EmailAddress>
        <CellPhone>111-234-9367</CellPhone>
    </EmployeeContact>
</EmployeeContacts>

Job Configuration

package com.example.demo.batch.xml.read;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.xml.StaxEventItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.io.Resource;
import org.springframework.oxm.jaxb.Jaxb2Marshaller;

@Configuration
@EnableBatchProcessing
public class EmployeeContactProcessingJobConfig {

	public static final String JOB_NAME = "Exmployee-Contact-Processing-Job";
	
	@Autowired
	private JobBuilderFactory jobBuilderFactory;
	
	@Autowired
	private StepBuilderFactory stepBuilderFactory;

	@Bean
	public Step processEmployeeContactsFile() {
		return this.stepBuilderFactory.get("processEmployeeContactsFile")
				.<EmployeeContactXml, EmployeeContactXml>chunk(100)
				.reader(employeeContactsReader())
				.writer(employeeContactsWriter())
				.build();
	}

	@Bean
	public Job processEmployeeContactsFileJob() {
		return this.jobBuilderFactory.get(JOB_NAME).start(processEmployeeContactsFile()).build();
	}

	@Bean
	@StepScope
	public StaxEventItemReader<EmployeeContactXml> employeeContactsReader() {
		
		//define the resource that the reader will be consuming
		Resource resource = new FileSystemResource("/c:/dev/data/contact-data.xml");
		//instantiate a new StaxEventItemReader binding the EmployeeContactXml class
		StaxEventItemReader<EmployeeContactXml> xmlFileReader = new StaxEventItemReader<>();
		//set the resource on the xmlFileReader
		xmlFileReader.setResource(resource);
		//define the root element of the xml fragment
		xmlFileReader.setFragmentRootElementName("EmployeeContact");
		
		//instantiate a new Jaxb2Marshaller
		Jaxb2Marshaller xmlMarshaller = new Jaxb2Marshaller();
		//define the Jaxb annotated classes to be recognized in the JAXBContext
		xmlMarshaller.setClassesToBeBound(EmployeeContactXml.class);
		//define the marshaller that maps xml fragments to objects
		xmlFileReader.setUnmarshaller(xmlMarshaller);
		return xmlFileReader;
	}

	@Bean
	@StepScope
	public EmployeeContactWriter employeeContactsWriter() {
		return new EmployeeContactWriter();
	}
}

EmployeeContactXml

package com.example.demo.batch.xml.read;

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(propOrder = { "firstName", "lastName", "emailAddress", "cellPhone" })
@XmlRootElement(name = "EmployeeContact")
public class EmployeeContactXml {
	
	@XmlAttribute(required = true)
	protected String team;

	@XmlAttribute(required = true)
	protected String role;

	@XmlAttribute(required = true)
	protected String status;

	@XmlElement(name = "FirstName", required = true)
	protected String firstName;
	
	@XmlElement(name = "LastName", required = true)
	protected String lastName;

	@XmlElement(name = "EmailAddress", required = true)
	protected String emailAddress;
	
	@XmlElement(name = "CellPhone", required = true)
	protected String cellPhone;

	public String getCellPhone() {
		return cellPhone;
	}
	
	public String getEmailAddress() {
		return emailAddress;
	}

	public String getFirstName() {
		return firstName;
	}

	public String getLastName() {
		return lastName;
	}

	public String getRole() {
		return role;
	}

	public String getStatus() {
		return status;
	}

	public String getTeam() {
		return team;
	}

	public void setCellPhone(String cellPhone) {
		this.cellPhone = cellPhone;
	}

	public void setEmailAddress(String emailAddress) {
		this.emailAddress = emailAddress;
	}

	public void setFirstName(String firstName) {
		this.firstName = firstName;
	}

	public void setLastName(String lastName) {
		this.lastName = lastName;
	}

	public void setRole(String role) {
		this.role = role;
	}

	public void setStatus(String status) {
		this.status = status;
	}

	public void setTeam(String team) {
		this.team = team;
	}

	@Override
	public String toString() {
		return "ContactXml [team=" + team + ", role=" + role + ", status=" + status + ", firstName=" + firstName
				+ ", lastName=" + lastName + ", emailAddress=" + emailAddress + ", cellPhone=" + cellPhone + "]";
	}
}

EmployeeContactWriter

package com.example.demo.batch.xml.read;

import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.item.ItemWriter;

public class EmployeeContactWriter implements ItemWriter<EmployeeContactXml> {
	
	private static final Logger LOGGER = LoggerFactory.getLogger(EmployeeContactWriter.class);

	@Override
	public void write(List<? extends EmployeeContactXml> items) throws Exception {

		for (EmployeeContactXml contact : items) {
			LOGGER.info("Writing contact: {}", contact);
		}
		
	}

}