spring, spring batch

Spring-batch useful job-repository queries

One confusing thing when starting with spring-batch project is to understand the ability to query the Job repository database and understand what is happening with the job. The awesome datamodel is best modeled to suit well for the spring-batch to keep track of the jobs its managing, but when it comes for someone to query on a job or view the status of jobs, you might have to jump to multiple tables or write up a query joining tables to figure out what is going on. So I decided to use this space to document all the frequent queries that I use when I am viewing the job repository. I will keep updating this space if I find something interesting or useful.

The queries are pretty straight forward and can be easily derived if you properly understand the job repository model described in the web-page here.

PS: My queries may not be the best, but they work for me and I am happy to receive feedback on improving them. So lets get started with the queries now.

Query #1: Query to list all the jobs with job name and parameters of the job.

Often we would want to get the job name along with the parameters with which it was started, when it started etc. So this query below is handy for me when I want to find out what are all the jobs that are in the job repository starting from the most recent one first.

SELECT
je.JOB_EXECUTION_ID, je.JOB_INSTANCE_ID, ji.JOB_NAME, je.START_TIME, je.END_TIME, je.STATUS, bjep.*,je.EXIT_MESSAGE
FROM
BATCH_JOB_EXECUTION je inner join BATCH_JOB_INSTANCE ji inner join BATCH_JOB_EXECUTION_PARAMS bjep
where je.JOB_INSTANCE_ID=ji.JOB_INSTANCE_ID and je.JOB_EXECUTION_ID = bjep.JOB_EXECUTION_ID order by je.JOB_EXECUTION_ID desc;

 

Query#2: List all the steps and its status for a job with a job name

Next comes the step information. Finding out which steps are executing what is the status of each step, to what job is this associated to and to what run is it bound to are questions that are often difficult to comprehend and would require jumps to multiple tables. So this query always comes handy to me when I am having such questions.


select
bse.STEP_EXECUTION_ID,bse.JOB_EXECUTION_ID,ji.JOB_NAME, bse.STEP_NAME, bse.START_TIME, bse.END_TIME,bse.COMMIT_COUNT, bse.READ_COUNT, bse.WRITE_COUNT, bse.STATUS, bse.EXIT_MESSAGE, bse.LAST_UPDATED
from
BATCH_STEP_EXECUTION bse inner join BATCH_JOB_INSTANCE ji inner join BATCH_JOB_EXECUTION je
where ji.JOB_INSTANCE_ID = je.JOB_INSTANCE_ID and bse.JOB_EXECUTION_ID = je.JOB_EXECUTION_ID order by bse.job_execution_id desc;

So far these are the two queries that I use very often to find out more information about the jobs. There are a few more I will update once I feel they serve a very common problem. But if there is something that you might want to add, feel free to leave a comment and I will update the post with those.

 

 

Standard
spring

Scheduling a Spring batch job using Java config

I recently ran upon a scenario where I had to schedule a spring batch job. The way I was currently running was using  a CommandLineRunner interface. I did not want to run the project manually. Having a scheduler that runs the job automatically after a certain interval would save me lot of manual work.

Let me show a snippet on how my job configuration looked like.

@Autowired
List jobsToRun;

@Bean
public CommandLineRunner commandLineRunner(JobLauncher jobLauncher,
    JobRepository jobRepository, JobOperator jobOperator, JobExplorer jobExplorer,
    JobRegistry jobRegistry) {
  return args -> {
    SimpleCommandLinePropertySource cli = new SimpleCommandLinePropertySource(args);
    jobsToRun.forEach(job -> {
      try {
        if (!cli.containsProperty("restart")) {
          JobParameters parameters = new JobParametersBuilder()
              .addLong("timestamp", System.currentTimeMillis())
              .toJobParameters();
          jobLauncher.run(job, parameters);
        } else {
          long jobInstanceId = jobOperator.getJobInstances(job.getName(), 0, 1).get(0);
          long lastExecutionId = jobOperator.getExecutions(jobInstanceId).get(0);
          if (jobExplorer.getJobExecution(lastExecutionId).getStatus() == BatchStatus.FAILED) {
            JobExecution execution = jobExplorer.getJobExecution(lastExecutionId);
            JobParameters restartParameters = execution.getJobParameters();
            jobLauncher.run(job, restartParameters);
            log.info("Restarting the job " + job.getName());
          } else {
            log.warn("Cannot restart the job. Job not in Failed state.");
          }
        }
      } catch (JobExecutionAlreadyRunningException | JobRestartException | JobInstanceAlreadyCompleteException | JobParametersInvalidException e) {
        log.error(
            "Error occured while running job " + job.getName() + " Reason: " + e.getMessage());
        throw new RuntimeException(e);
      } catch (NoSuchJobException | NoSuchJobInstanceException e) {
        throw new RuntimeException(e);
      }
    });

  };
}

Note that jobsToRun is an @Autowired List.
The snippet above is just creating a CommandLineRunner bean. Which essentially means that when spring starts up and finds a command line runner bean, then the run(String... args) code will get execute. But still, this has be done manually everytime I want to run the jobs in the context.

Now, lets see how we can schedule the job to run for lets say every 1 minute.

@Scheduled(cron = "0 */1 * ? * *")
public void cronJob() {
  if (deltaMode) {
    jobsToRun.forEach(job -> {
          JobParameters parameters = new JobParametersBuilder()
              .addString("processId", PROCESS_ID + "-" + jobIncrementer
                  .get())
              .toJobParameters();
          try {
            jobLauncher.run(job, parameters);
          } catch (JobExecutionAlreadyRunningException jae) {
            log.info("Job already running. Skipping this cron scheduler pass at {}",
                LocalDateTime.now());
          } catch (JobInstanceAlreadyCompleteException jobAlreadyRunningException) {
            log.info(
                "Starting a new instance of the job on the next schedule. Resetting the jobParameter");
            jobIncrementer.incrementAndGet();
          } catch (JobRestartException | JobParametersInvalidException e) {
            log.error(e.toString());
            throw new RuntimeException(e);
          }
        }
    );
  }
}

It is very similar to what CommandLineRunner was doing, but its not a bean anymore. All you have to do is to annotate a void method with @Scheduled and @EnableScheduling to your Application class. The magic actually happens at the value you provide to the @Scheduled annotation. The string “0 */1 * ? * *” is a cron expression. The cron expression translates to run whatever has to run every minute. More documentation on cron can be found here.

@SpringBootApplication
@EnableBatchProcessing
@EnableScheduling
public class Application {
  public static void main(String[] args) {
    SpringApplication app = new SpringApplication(Application.class);
    app.run(args);
  }
}

That is it! Once you run your job now, spring will try to start it up next minute and try to start it every minute from then.

If your job is in RUNNING state, when cron is trying to start, it will wait until next pass to see if it can start the job.

Standard