
Spring Batch

Published Dec 18, 2018

Learn about Spring Batch in this article by René Enríquez, a technical leader in a multinational company headquartered in Silicon Valley, and Alberto Salazar, an entrepreneur and passionate Java consultant.

https://github.com/PacktPublishing/Software-Architecture-with-Spring-5.0/tree/master/Chapter07

Spring Batch is a complete framework for creating robust batch applications (https://projects.spring.io/spring-batch/). You can create reusable functions to process large volumes of data or tasks, commonly known as bulk processing.
Spring Batch provides many useful features, such as the following:

• Logging and tracing
• Transaction management
• Job statistics
• Managing the process; for example, through restarting jobs, skipping steps, and resource management
• Administration Web Console

This framework is designed to manage a high volume of data and achieve high-performance batch processes using partition features. This article will look at a simple project explaining each principal component of Spring Batch.
As mentioned in the Spring Batch documentation (https://docs.spring.io/spring-batch/trunk/reference/html/spring-batch-intro.html), the most common scenarios for using the framework are as follows:

• Committing batch processes periodically
• Concurrent batch processing: parallel processing of a job
• Staged, enterprise message-driven processing
• Large parallel batch processing
• Manual or scheduled restart after failures
• Sequential processing of dependent steps (with extensions to workflow-driven batches)
• Partial processing: Skip records (for example, on rollback)
• Whole-batch transaction: For cases with a small batch size or existing stored procedures/scripts
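
The partial-processing scenario in the list above can be sketched in plain Java. This is only a simplified model of the idea (skip bad records, but fail once too many have been skipped); SkipSketch, Result, and parseAll are hypothetical names invented for this example and are not part of Spring Batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of "partial processing": bad records are skipped up to a limit.
// SkipSketch and its members are hypothetical names, not Spring Batch classes.
final class SkipSketch {

    /** Result of a run: the parsed values plus the raw records that were skipped. */
    static final class Result {
        final List<Integer> parsed = new ArrayList<>();
        final List<String> skipped = new ArrayList<>();
    }

    /** Parses each record as an integer; fails once more than skipLimit records are skipped. */
    static Result parseAll(List<String> records, int skipLimit) {
        Result result = new Result();
        for (String record : records) {
            try {
                result.parsed.add(Integer.parseInt(record.trim()));
            } catch (NumberFormatException e) {
                result.skipped.add(record);
                if (result.skipped.size() > skipLimit) {
                    throw new IllegalStateException(
                            "Skip limit of " + skipLimit + " exceeded at record: " + record, e);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result result = parseAll(Arrays.asList("10", "oops", " 30 ", "n/a", "50"), 2);
        System.out.println("parsed=" + result.parsed + " skipped=" + result.skipped);
    }
}
```

With a skip limit of 2, the two malformed records are set aside and the run completes; a third malformed record would abort it.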

In enterprise applications, the need to process millions of records (data) or read from a source is very common. This source may contain large files with several records (such as CSV or TXT files) or database tables. On each of these records, it is common to apply some business logic, execute validations or transformations, and finish the task, writing the result to another output format (for example, the database or file).

Spring Batch provides a complete framework to implement this kind of requirement, minimizing human interaction.

First, review the basic concepts of Spring Batch:

• A job encapsulates the batch process and must consist of one or more steps. Each step can run in sequence, run in parallel, or be partitioned.
• A step is a sequential phase of a job.
• JobLauncher is in charge of launching a job with a given set of parameters; it returns the JobExecution that represents that run.
• JobRepository is the metadata repository; it stores the state of each JobExecution.
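
To make the relationship between these four concepts concrete, here is a toy model in plain Java. Every type in it (SketchJob, SketchStep, SketchLauncher, SketchRepository, SketchExecution) is a hypothetical stand-in written for this sketch; the real Spring Batch classes are considerably richer, and this only mirrors their roles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of job, step, launcher, and repository.
// All names are hypothetical stand-ins, not the Spring Batch classes.
final class BatchConceptsSketch {

    /** The work performed by a step. */
    interface StepBody {
        void run() throws Exception;
    }

    /** A step is one sequential phase of a job. */
    static final class SketchStep {
        final String name;
        final StepBody body;
        SketchStep(String name, StepBody body) { this.name = name; this.body = body; }
    }

    /** A job is a named, ordered list of steps. */
    static final class SketchJob {
        final String name;
        final List<SketchStep> steps;
        SketchJob(String name, List<SketchStep> steps) { this.name = name; this.steps = steps; }
    }

    /** Metadata describing one run of a job. */
    static final class SketchExecution {
        final String jobName;
        final List<String> completedSteps = new ArrayList<>();
        String status = "STARTED";
        SketchExecution(String jobName) { this.jobName = jobName; }
    }

    /** The repository keeps the metadata of every execution. */
    static final class SketchRepository {
        final List<SketchExecution> executions = new ArrayList<>();
        SketchExecution create(String jobName) {
            SketchExecution execution = new SketchExecution(jobName);
            executions.add(execution);
            return execution;
        }
    }

    /** The launcher runs a job and returns the execution that describes the run. */
    static final class SketchLauncher {
        private final SketchRepository repository;
        SketchLauncher(SketchRepository repository) { this.repository = repository; }

        SketchExecution run(SketchJob job) {
            SketchExecution execution = repository.create(job.name);
            try {
                for (SketchStep step : job.steps) {
                    step.body.run();
                    execution.completedSteps.add(step.name);
                }
                execution.status = "COMPLETED";
            } catch (Exception e) {
                execution.status = "FAILED"; // the remaining steps are not run
            }
            return execution;
        }
    }

    public static void main(String[] args) {
        SketchRepository repository = new SketchRepository();
        SketchLauncher launcher = new SketchLauncher(repository);
        SketchJob job = new SketchJob("importJob", Arrays.asList(
                new SketchStep("load", () -> System.out.println("loading")),
                new SketchStep("report", () -> System.out.println("reporting"))));
        SketchExecution execution = launcher.run(job);
        System.out.println(execution.jobName + " " + execution.status + " " + execution.completedSteps);
    }
}
```

The point of the model is the division of labor: the job only describes what to run, the launcher runs it, and the repository remembers what happened.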

Create a simple example of a job using Spring Batch, in order to understand how it works. First, create a simple Java project and include the spring-batch dependency. For this, create a Spring Boot application using its initializer (https://start.spring.io), as shown in the following screenshot:
1.PNG

Add the dependency for Spring Batch. You can do this by typing Spring Batch into the search bar within the dependencies box and pressing Enter. A green box with the word Batch in it will appear in the selected dependencies section. When this has been done, click on the Generate Project button.

The structure of the project will be as follows:
2.PNG

If you look at the dependencies section that was added by the initializer, you’ll see the spring-batch starter in the pom.xml file, as follows:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-test</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-test</artifactId>
    <scope>test</scope>
</dependency>

If you are not using Spring Boot, you can add spring-batch-core explicitly as a project dependency. The following shows how it looks using Maven:

<dependencies>
    <dependency>
        <groupId>org.springframework.batch</groupId>
        <artifactId>spring-batch-core</artifactId>
        <version>4.0.1.RELEASE</version>
    </dependency>
</dependencies>

Alternatively, you can do this using Gradle:

dependencies {
    compile 'org.springframework.batch:spring-batch-core:4.0.1.RELEASE'
}

The project will need a data source; if you try to run the application without one, you’ll get a message in the console showing an error:
3.PNG

To fix this issue, add a dependency as a part of the pom.xml file, to configure an embedded data source. For testing purposes, use HSQL (http://hsqldb.org/):

<dependency>
    <groupId>org.hsqldb</groupId>
    <artifactId>hsqldb</artifactId>
    <scope>runtime</scope>
</dependency>

Now, you need to add the @EnableBatchProcessing and @Configuration annotations to the application:

@SpringBootApplication
@EnableBatchProcessing
@Configuration
public class SimpleBatchApplication {

Next, set up your first job using the JobBuilderFactory class, with a single task-based step built using the StepBuilderFactory class:

@Autowired
private JobBuilderFactory jobBuilderFactory;

@Autowired
private StepBuilderFactory stepBuilderFactory;

The job method then defines the job and the step that it starts with:

@Bean
public Job job(Step ourBatchStep) throws Exception {
    return jobBuilderFactory.get("jobPackPub1")
            .incrementer(new RunIdIncrementer())
            .start(ourBatchStep)
            .build();
}

Once the Job has been created, add a new task (Step) to the Job:

@Bean
public Step ourBatchStep() {
    return stepBuilderFactory.get("stepPackPub1")
            .tasklet(new Tasklet() {
                @Override
                public RepeatStatus execute(StepContribution contribution,
                                            ChunkContext chunkContext) {
                    // No work to do yet; FINISHED tells the step not to call the tasklet again
                    return RepeatStatus.FINISHED;
                }
            })
            .build();
}
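
The value returned by a tasklet tells the step whether to call it again: a step keeps invoking the tasklet while it reports that more work remains, and a null return is treated as finished. The plain-Java sketch below models only that contract. Status, SimpleTasklet, and runToCompletion are hypothetical names for this example, standing in for the framework's own types.

```java
// Sketch of the repeat contract between a step and its tasklet.
// Status and SimpleTasklet are hypothetical stand-ins, not Spring Batch types.
final class TaskletLoopSketch {

    enum Status { CONTINUABLE, FINISHED }

    interface SimpleTasklet {
        Status execute();
    }

    /** Calls the tasklet until it stops asking for another pass; returns the number of calls. */
    static int runToCompletion(SimpleTasklet tasklet) {
        int calls = 0;
        Status status;
        do {
            status = tasklet.execute();
            calls++;
        } while (status == Status.CONTINUABLE); // null ends the loop, just like FINISHED
        return calls;
    }

    public static void main(String[] args) {
        int[] remaining = {3}; // units of work left
        int calls = runToCompletion(
                () -> --remaining[0] > 0 ? Status.CONTINUABLE : Status.FINISHED);
        System.out.println("calls=" + calls);
    }
}
```

A tasklet that does all of its work in one call, like the one in this article, simply reports that it is finished the first time it is invoked.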

The following code shows what the application class looks like:

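import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@EnableBatchProcessing
@SpringBootApplication
@Configuration
public class SimpleBatchApplication {

    public static void main(String[] args) {
        SpringApplication.run(SimpleBatchApplication.class, args);
    }

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    // A single tasklet-based step that does no work and finishes immediately
    @Bean
    public Step ourBatchStep() {
        return stepBuilderFactory.get("stepPackPub1")
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution contribution,
                                                ChunkContext chunkContext) {
                        return RepeatStatus.FINISHED;
                    }
                })
                .build();
    }

    // The job starts with ourBatchStep; RunIdIncrementer increments the run.id parameter on each launch
    @Bean
    public Job job(Step ourBatchStep) throws Exception {
        return jobBuilderFactory.get("jobPackPub1")
                .incrementer(new RunIdIncrementer())
                .start(ourBatchStep)
                .build();
    }
}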

In order to check that everything is okay, run the application. To do this, execute the following on the command line:

$ mvn spring-boot:run

Alternatively, you could build the application by running Maven:

$ mvn install

Next, run your recently built jar on the Terminal:

$ java -jar target/simple-batch-0.0.1-SNAPSHOT.jar

Finally, you’ll see the following output in the console:
4.PNG

Pay attention to the console output: it shows that the job named jobPackPub1 ran and executed the step stepPackPub1.

Now, look at the components behind a step in more detail:
• ItemReader represents the retrieval of the input for a step
• ItemProcessor represents the business processing of an item
• ItemWriter represents the output of a step
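
How these three components cooperate inside a step can be sketched in plain Java. The Reader, Processor, and Writer interfaces below are hypothetical, simplified stand-ins for the roles listed above, not the framework's interfaces, and the loop leaves out transactions and error handling. It assumes the usual conventions that a reader returns null when the input is exhausted and that a processor returns null to filter an item out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Simplified read-process-write loop. Reader, Processor, and Writer are
// hypothetical stand-ins for the three roles, not Spring Batch interfaces.
final class ChunkLoopSketch {

    interface Reader<I> { I read(); }                 // null means "no more input"
    interface Processor<I, O> { O process(I item); }  // null means "drop this item"
    interface Writer<O> { void write(List<O> chunk); }

    /** Reads up to chunkSize items, processes them, writes the survivors; returns chunks written. */
    static <I, O> int run(Reader<I> reader, Processor<I, O> processor,
                          Writer<O> writer, int chunkSize) {
        int chunksWritten = 0;
        boolean exhausted = false;
        while (!exhausted) {
            // 1. Read one chunk of input.
            List<I> inputs = new ArrayList<>();
            while (inputs.size() < chunkSize) {
                I item = reader.read();
                if (item == null) {
                    exhausted = true;
                    break;
                }
                inputs.add(item);
            }
            // 2. Process each item, dropping the filtered ones.
            List<O> outputs = new ArrayList<>();
            for (I input : inputs) {
                O output = processor.process(input);
                if (output != null) {
                    outputs.add(output);
                }
            }
            // 3. Write the whole chunk at once.
            if (!outputs.isEmpty()) {
                writer.write(outputs);
                chunksWritten++;
            }
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        Iterator<String> names = Arrays.asList("alice", "", "bob", "carol", " ", "dave").iterator();
        Reader<String> reader = () -> names.hasNext() ? names.next() : null;
        Processor<String, String> processor =
                name -> name.trim().isEmpty() ? null : name.trim().toUpperCase();
        Writer<String> writer = chunk -> System.out.println("writing " + chunk);
        int chunks = run(reader, processor, writer, 2);
        System.out.println("chunks written: " + chunks);
    }
}
```

Writing a chunk at a time rather than item by item is what makes periodic commits possible: each written chunk is a natural commit point.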

The following diagram shows the big picture of Spring Batch's main elements:
5.PNG

Now, you can complete your example using an ItemReader, an ItemProcessor, and an ItemWriter. With these components, Pipe-and-Filter architectures can be implemented using Spring Batch.
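
As a rough illustration of the Pipe-and-Filter idea itself, independent of Spring Batch, the plain-Java sketch below chains small transformations so that the output of each filter becomes the input of the next. PipeAndFilterSketch and pipeline are hypothetical names used only for this example.

```java
import java.util.function.Function;

// Minimal Pipe-and-Filter sketch: each filter is a function, and the pipe is their composition.
// PipeAndFilterSketch and pipeline are hypothetical names for this example.
final class PipeAndFilterSketch {

    /** Chains the filters in order, feeding each one's output into the next. */
    @SafeVarargs
    static <T> Function<T, T> pipeline(Function<T, T>... filters) {
        Function<T, T> pipe = Function.identity();
        for (Function<T, T> filter : filters) {
            pipe = pipe.andThen(filter);
        }
        return pipe;
    }

    public static void main(String[] args) {
        Function<String, String> normalize = pipeline(
                s -> s.trim(),              // filter 1: remove surrounding whitespace
                s -> s.toUpperCase(),       // filter 2: normalize case
                s -> s.replace(' ', '_'));  // filter 3: replace spaces
        System.out.println(normalize.apply("  spring batch "));
    }
}
```

In a batch step, the reader would feed such a pipeline, each processor would act as one filter, and the writer would receive the final output.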

If you found this article interesting, you can explore Software Architecture with Spring 5.0 for insights into the most common architectural models and when and where they can be used. Software Architecture with Spring 5.0 explains in detail how to choose the right architecture and apply best practices during your software development cycle to avoid technical debt and support every business requirement.
