JVM concurrency: Java and Scala concurrency

By Alvin Olson, 2015-03-10 17:41

    Processor speeds improved rapidly and steadily for decades, but that trend came to an end around the turn of the century. Since then, processor manufacturers have improved chip performance mainly by adding cores rather than by raising clock rates. Multicore systems are now standard in everything from mobile phones to enterprise servers, and this trend is likely to continue and accelerate. Developers increasingly need to support multiple cores in their application code in order to meet performance requirements.

    In this series, you'll learn about some new approaches to concurrent programming for the Java and Scala languages, including how Java is incorporating ideas already explored in Scala and other JVM-based languages. This first article gives some background to help you understand the overall picture of concurrent programming on the JVM, introducing some of the latest techniques in Java 7 and Scala. You'll learn how to use the Java ExecutorService and ForkJoinPool classes to simplify concurrent programming. You'll also learn about some basic Scala features that extend the concurrent programming options beyond what is available in plain Java. Along the way, you'll see how different approaches affect the performance of concurrent programs. Later installments will cover the concurrency improvements in Java 8 and extensions, including the Akka toolkit for scalable Java and Scala programming.

    Java concurrency support

    Concurrency support has been a feature of the Java platform since its earliest days, and its implementation of threads and synchronization gave the language an edge over competing languages. Scala is based on Java and runs on the JVM, so it has direct access to the entire Java runtime (including all of the concurrency support). So before analyzing the Scala features, I'll first give a quick review of what the Java language already provides.

    Java thread basics

    Threads are easy to create and use in Java programming. They are represented by the java.lang.Thread class, and the code to be executed by a thread takes the form of a java.lang.Runnable instance. You can create a large number of threads in an application if necessary, even thousands. When multiple cores are available, the JVM uses them to execute multiple threads concurrently; threads in excess of the number of cores share the cores.
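
    As a minimal sketch of the Thread and Runnable pairing just described, the following example starts a few threads, has each one write its own slot of a shared array, and waits for all of them with join(). The class name ThreadBasics and the method squaresInParallel are hypothetical names chosen for illustration, and anonymous classes are used so that the code compiles on Java 7.

```java
import java.util.Arrays;

// Hypothetical demo class (not part of the article's sample code).
public class ThreadBasics {

    /** Starts one thread per slot, lets each thread fill its own slot, then waits for all. */
    static int[] squaresInParallel(int count) {
        final int[] results = new int[count];
        Thread[] threads = new Thread[count];
        for (int i = 0; i < count; i++) {
            final int index = i;
            Runnable work = new Runnable() {
                @Override
                public void run() {
                    results[index] = index * index;   // each thread writes only its own slot
                }
            };
            threads[i] = new Thread(work, "worker-" + i);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();   // after join() returns, the thread's writes are visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squaresInParallel(4)));   // prints [0, 1, 4, 9]
    }
}
```

    Giving each thread its own array slot avoids any shared mutable state, so no explicit synchronization is needed beyond join().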

    Java 5: A turning point for concurrency

    Java has included support for threads and synchronization from the start. But the initial specification for sharing data between threads was flawed, which led to a major change in the Java 5 update of the Java language (JSR-133). The Java Language Specification for Java 5 corrected and formalized the synchronized and volatile operations. The specification also spelled out how immutable objects work with multithreading. (Basically, immutable objects are always thread-safe as long as references are not allowed to "escape" while the constructor is executing.) Previously, interactions between threads usually required blocking synchronized operations. These changes made it possible to coordinate threads without blocking by using volatile. As a result, new concurrent collection classes were added in Java 5 to support non-blocking operations, a major improvement over the earlier approach that offered only blocking thread safety.

    Coordinating the actions of threads is hard to reason about. The Java compiler and JVM are free to reorder operations in your code as long as everything remains consistent from the perspective of the program, which complicates matters. For example, if two addition operations use different variables, the compiler or JVM can execute them in the reverse of the order specified, as long as the program does not use the total of either operation before both operations complete. This flexibility to reorder operations helps improve Java performance, but consistency is guaranteed only within a single thread. Hardware can also create threading problems. Modern systems use multiple levels of cache memory, and generally not all cores in the system see these caches identically. When one core modifies a value in memory, other cores might not see the change immediately.

    Because of these problems, you must explicitly control how threads interact when one thread uses data modified by another thread. Java uses special operations to provide this control, establishing orderings in the views of data seen by different threads. The basic operation is for a thread to use the synchronized keyword to access an object. When a thread synchronizes on an object, it gains exclusive access to a lock that is unique to that object. If another thread already holds the lock, the thread that wants the lock must wait, or block, until the lock is released. When the thread resumes execution inside the synchronized block of code, Java guarantees that the thread can "see" all data written by other threads that previously held the same lock, but only the data those threads wrote before they released the lock by leaving their own synchronized blocks. This guarantee applies both to reordering of operations by the compiler or JVM and to hardware memory caches. The inside of a synchronized block is an island of stability in your code, where threads can take turns to safely execute, interact, and share information.
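
    To illustrate the locking and visibility guarantee described above, here is a small sketch in which several threads increment a shared counter inside synchronized blocks on the same lock object. The class SynchronizedCounter and its methods are hypothetical names used only for this illustration.

```java
// Hypothetical demo class (not part of the article's sample code).
public class SynchronizedCounter {
    private final Object lock = new Object();
    private int count;   // only read or written while holding 'lock'

    void increment() {
        synchronized (lock) {   // exclusive access; sees writes made under the same lock
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    /** Runs threadCount threads that each increment the counter incrementsPerThread times. */
    static int runDemo(int threadCount, final int incrementsPerThread) {
        final SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int n = 0; n < incrementsPerThread; n++) {
                        counter.increment();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10000));   // prints 40000
    }
}
```

    Without the synchronized blocks, the count++ updates from different threads could be lost, and the final total would be unpredictable.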

    Using the volatile keyword on a variable provides a slightly weaker form of safe interaction between threads. The synchronized keyword guarantees that you see other threads' stores when you obtain the lock, and that other threads that obtain the lock after you will see your stores. The volatile keyword breaks this guarantee down into two separate parts. If a thread writes to a volatile variable, all of its writes up to that point are flushed first. If a thread reads the variable, it sees not only the value written to that variable but also all the other values written by the writing thread. So reading a volatile variable gives the same memory guarantee as entering a synchronized block, and writing a volatile variable gives the same memory guarantee as leaving a synchronized block. But there's a big difference between the two: reads and writes of a volatile variable never block.

    Abstracting Java concurrency

    Synchronization is useful, and many multithreaded applications have been developed in Java using only basic synchronized blocks. But coordinating threads can be messy, especially when you're dealing with many threads and many locks. Ensuring that threads interact only in safe ways while avoiding potential deadlocks (two or more threads each waiting for the other to release a lock before continuing) is very difficult. Abstractions that support concurrency without dealing directly with threads and locks give developers a better way of handling common use cases.

    The java.util.concurrent class hierarchy includes collection variations that support concurrent access, wrapper classes for atomic operations, and synchronization primitives. Many of these classes are designed to support non-blocking access, which avoids deadlock problems and enables more efficient threading. These classes make it easier to define and control the interaction between threads, but they still suffer from some of the complexities of the basic threading model.
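
    As a rough sketch of how these classes are used, the example below counts words from several threads with a ConcurrentHashMap and AtomicInteger values, without any synchronized block in the application code. The class NonBlockingCounts and the method countWords are hypothetical names; note that the collection may still use internal locking, so this only shows the programming style, not a claim about how the JDK implements it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class (not part of the article's sample code).
public class NonBlockingCounts {

    /** Every thread counts every word once, so each total is occurrences * threadCount. */
    static ConcurrentMap<String, AtomicInteger> countWords(final String[] words, int threadCount) {
        final ConcurrentMap<String, AtomicInteger> counts =
            new ConcurrentHashMap<String, AtomicInteger>();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (String word : words) {
                        AtomicInteger counter = counts.get(word);
                        if (counter == null) {
                            AtomicInteger fresh = new AtomicInteger();
                            // putIfAbsent returns the existing value if another thread won the race
                            counter = counts.putIfAbsent(word, fresh);
                            if (counter == null) {
                                counter = fresh;
                            }
                        }
                        counter.incrementAndGet();   // atomic read-modify-write
                    }
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, AtomicInteger> counts = countWords(new String[] {"a", "b", "a"}, 4);
        System.out.println(counts.get("a") + " " + counts.get("b"));   // prints 8 4
    }
}
```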

    A pair of abstractions in the java.util.concurrent package supports a more decoupled approach to handling concurrency: the Future<T> interface, and the Executor and ExecutorService interfaces. These related interfaces are in turn the basis for many of the Scala and Akka extensions to Java concurrency support, so it's worth looking at the interfaces and their implementations in more detail.

    A Future<T> is a holder for a value of type T, with the twist that the value is generally not available until some time after the Future is created. The value becomes available only once the operation that produces it, which may run asynchronously, has finished executing. A thread that receives a Future can call methods to:

    - Check whether the value is available
    - Wait for the value to become available
    - Retrieve the value when it is available
    - Cancel the operation if the value is no longer needed

    Specific implementations of Future are structured to support different ways of handling asynchronous operations.
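
    The four operations in the list above map onto the isDone(), get(), and cancel() methods of the Future interface. The sketch below exercises them with FutureTask, the basic Future implementation in the JDK, run on a plain thread. The class FutureBasics and its method names are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical demo class (not part of the article's sample code).
public class FutureBasics {

    private static FutureTask<Integer> newTask() {
        return new FutureTask<Integer>(new Callable<Integer>() {
            @Override
            public Integer call() {
                return 6 * 7;
            }
        });
    }

    /** Runs the task on another thread, then waits for and retrieves its value. */
    static int computeOnOtherThread() {
        FutureTask<Integer> future = newTask();
        if (future.isDone()) {                 // check availability: nothing has run yet
            throw new IllegalStateException("task cannot be done before it starts");
        }
        new Thread(future).start();            // FutureTask is also a Runnable
        try {
            return future.get();               // blocks until the value is available
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    /** Cancels a task that never started; returns true if the cancellation took effect. */
    static boolean cancelUnstarted() {
        FutureTask<Integer> future = newTask();
        boolean cancelled = future.cancel(false);
        return cancelled && future.isCancelled() && future.isDone();
    }

    public static void main(String[] args) {
        System.out.println(computeOnOtherThread());   // prints 42
        System.out.println(cancelUnstarted());        // prints true
    }
}
```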

    An Executor is an abstraction around something that executes tasks. This "something" will ultimately be a thread, but the interface hides the details of how the thread handles the execution. Executor on its own is of limited usefulness; the ExecutorService subinterface adds methods for managing termination and for producing Futures for the results of tasks. All the standard implementations of Executor also implement ExecutorService, so in practice you can ignore the root interface.

    Threads are relatively heavyweight resources, so it makes sense to reuse them rather than to allocate and discard them. ExecutorService simplifies sharing work among threads and also enables automatic reuse of threads, which makes for easier programming and better performance. The ThreadPoolExecutor implementation of ExecutorService manages a pool of threads that execute tasks.
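
    Thread reuse can be observed directly. The following sketch submits many small tasks to a fixed-size pool created with Executors.newFixedThreadPool (which returns a ThreadPoolExecutor) and collects the names of the threads that ran them. The class PoolReuse and the method threadNamesUsed are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class (not part of the article's sample code).
public class PoolReuse {

    /** Submits taskCount tasks to a pool of poolSize threads and returns the thread names used. */
    static Set<String> threadNamesUsed(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<Future<String>>();
            for (int i = 0; i < taskCount; i++) {
                futures.add(pool.submit(new Callable<String>() {
                    @Override
                    public String call() {
                        return Thread.currentThread().getName();
                    }
                }));
            }
            Set<String> names = new TreeSet<String>();
            for (Future<String> future : futures) {
                names.add(future.get());   // wait for each task's result
            }
            return names;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();   // let the pool threads exit once the work is done
        }
    }

    public static void main(String[] args) {
        // 20 tasks, but never more than 2 distinct thread names
        System.out.println(threadNamesUsed(2, 20));
    }
}
```

    Calling shutdown() matters in a standalone program: the pool's worker threads are not daemon threads, so the JVM would otherwise stay alive after main() returns.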

    Applying Java concurrency

    Practical applications of concurrency often involve tasks that require external interactions (with users, storage, or other systems) independent of your main processing logic. Such applications are difficult to condense into a simple example, so for demonstrations of concurrency people usually use simple compute-intensive tasks such as mathematical calculations or sorting. I'll use a similar example.

    The task is to find the nearest known word to an unknown input, where nearest is defined in terms of Levenshtein distance: the minimum number of character additions, deletions, or changes needed to convert the input to the known word. I use code based on an example in the Wikipedia article on Levenshtein distance to calculate the Levenshtein distance for each known word and return the best match (or an indeterminate result if multiple known words have the same distance).

    Listing 1 shows the Java code for calculating the Levenshtein distance. The calculation generates a matrix with rows and columns matching the sizes of the two texts being compared, plus one in each dimension. For efficiency, this implementation uses a pair of arrays sized to the target text to represent successive rows of the matrix, swapping the arrays on each pass, because only the values from the prior row are needed to calculate the next row.

    Listing 1. Levenshtein distance calculation in Java


    /**
     * Calculate edit distance from targetText to known word.
     *
     * @param word known word
     * @param v0 int array of length targetText.length() + 1
     * @param v1 int array of length targetText.length() + 1
     * @return distance
     */
    private int editDistance(String word, int[] v0, int[] v1) {

        // initialize v0 (prior row of distances) as edit distance for empty 'word'
        for (int i = 0; i < v0.length; i++) {
            v0[i] = i;
        }

        // calculate updated v1 (current row distances) from the previous row v0
        for (int i = 0; i < word.length(); i++) {

            // first element of v1 = delete (i+1) chars from target to match empty 'word'
            v1[0] = i + 1;

            // use formula to fill in the rest of the row
            for (int j = 0; j < targetText.length(); j++) {
                int cost = (word.charAt(i) == targetText.charAt(j)) ? 0 : 1;
                v1[j + 1] = minimum(v1[j] + 1, v0[j + 1] + 1, v0[j] + cost);
            }

            // swap v1 (current row) and v0 (previous row) for next iteration
            int[] hold = v0;
            v0 = v1;
            v1 = hold;
        }

        // return final value representing best edit distance
        return v0[targetText.length()];
    }

    If you have a large number of known words to compare with the unknown input, and you're running on a multicore system, you can use concurrency to speed up the processing: break the set of known words into multiple blocks and process each block as a separate task. By changing the number of words in each block, you can easily change the granularity of the task decomposition and see how it affects overall performance. Listing 2 shows the Java code for the block-based calculation, taken from the ThreadPoolDistance class in the sample code. Listing 2 uses a standard ExecutorService, with the thread count set to the number of available processors.

    Listing 2. Block distance calculation in Java with multiple threads

    private final ExecutorService threadPool;
    private final String[] knownWords;
    private final int blockSize;

    public ThreadPoolDistance(String[] words, int block) {
        // the right-hand side is missing in this copy; reconstructed from the text above
        // (a standard ExecutorService with one thread per available processor)
        threadPool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        knownWords = words;
        blockSize = block;
    }

    public DistancePair bestMatch(String target) {

        // build a list of tasks for matching to ranges of known words
        List<DistanceTask> tasks = new ArrayList<DistanceTask>();
        int size = 0;
        for (int base = 0; base < knownWords.length; base += size) {
            size = Math.min(blockSize, knownWords.length - base);
            tasks.add(new DistanceTask(target, base, size));
        }
        DistancePair best;
        try {

            // pass the list of tasks to the executor, getting back list of futures
            List<Future<DistancePair>> results = threadPool.invokeAll(tasks);

            // find the best result, waiting for each future to complete
            best = DistancePair.WORST_CASE;
            for (Future<DistancePair> future: results) {
                DistancePair result = future.get();
                // the method name is missing in this copy; DistancePair.best is an assumed
                // helper that returns the closer of the two results
                best = DistancePair.best(best, result);
            }

        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        }
        return best;
    }

    /**
     * Shortest distance task implementation using Callable.
     */
    public class DistanceTask implements Callable<DistancePair> {
        private final String targetText;
        private final int startOffset;
        private final int compareCount;

        public DistanceTask(String target, int offset, int count) {
            targetText = target;
            startOffset = offset;
            compareCount = count;
        }

        private int editDistance(String word, int[] v0, int[] v1) {
            // body as in Listing 1
            ...
        }

        /* (non-Javadoc)
         * @see java.util.concurrent.Callable#call()
         */
        @Override
        public DistancePair call() throws Exception {

            // directly compare distances for comparison words in range
            int[] v0 = new int[targetText.length() + 1];
            int[] v1 = new int[targetText.length() + 1];
            int bestIndex = -1;
            int bestDistance = Integer.MAX_VALUE;
            boolean single = false;
            for (int i = 0; i < compareCount; i++) {
                int distance = editDistance(knownWords[i + startOffset], v0, v1);
                if (bestDistance > distance) {
                    bestDistance = distance;
                    bestIndex = i + startOffset;
                    single = true;
                } else if (bestDistance == distance) {
                    single = false;
                }
            }
            return single ? new DistancePair(bestDistance, knownWords[bestIndex]) :
                new DistancePair(bestDistance);
        }
    }



    The bestMatch() method in Listing 2 constructs a list of DistanceTask instances, then passes the list to the ExecutorService. This form of call to the ExecutorService takes a parameter of type Collection<? extends Callable<T>> representing the tasks to be executed. The call returns a list of Future<T>s representing the results of the executions. The ExecutorService fills in these results asynchronously with the values returned by invoking the call() method on each task. In this case, T is DistancePair: a simple value object holding the distance and the matched word, or just the distance when no unique match was found.

    The original thread that executes the bestMatch() method waits for each Future to complete in turn, accumulating the best result and returning it when finished. With multiple threads handling the execution of the DistanceTasks, the original thread waits only for a fraction of the results. The remaining results complete concurrently with the results the original thread is waiting on.
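
    The same pattern (split the input into blocks, wrap each block in a Callable, hand the whole list to invokeAll(), then combine the Futures in order) can be tried in isolation. The sketch below applies it to summing an array instead of matching words, so that it is self-contained. The class BlockSum and the method sumInBlocks are hypothetical names, and blockSize is assumed to be positive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class (not part of the article's sample code).
public class BlockSum {

    /** Sums values by splitting them into blocks of at most blockSize elements (blockSize > 0). */
    static long sumInBlocks(final int[] values, int blockSize) {
        // one task per block, mirroring the task list built in bestMatch()
        List<Callable<Long>> tasks = new ArrayList<Callable<Long>>();
        for (int base = 0; base < values.length; base += blockSize) {
            final int start = base;
            final int end = Math.min(base + blockSize, values.length);
            tasks.add(new Callable<Long>() {
                @Override
                public Long call() {
                    long sum = 0;
                    for (int i = start; i < end; i++) {
                        sum += values[i];
                    }
                    return sum;
                }
            });
        }

        ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            long total = 0;
            // invokeAll returns one Future per task, in the same order as the task list
            for (Future<Long> future : pool.invokeAll(tasks)) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] values = new int[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        System.out.println(sumInBlocks(values, 64));   // prints 500500
    }
}
```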

    Concurrency performance

    To take full advantage of the processors available on a system, you must configure the ExecutorService with at least as many threads as there are processors. You must also pass at least as many tasks as processors to the ExecutorService for execution. In practice, you'll probably want many more tasks than processors for best performance. That way, the processors are kept busy with one task after another, going idle only at the end. But because of the overhead involved (in creating the tasks and futures, in switching threads between tasks, and finally in returning the task results), you must keep the tasks large enough that the overhead is proportionally small.

    Figure 1 shows how the measured performance varies with the number of tasks when I run the test code on my quad-core AMD system using Oracle's Java 7 for 64-bit Linux. Each input word is compared in turn with 12,564 known words, and each task finds the best match within a range of the known words. The full set of 933 misspelled input words is run repeatedly, with pauses between passes for the JVM to settle, and the best time after 10 passes is used in the graph. As you can see from Figure 1, the performance in input words per second looks stable over a reasonable range of block sizes (basically, from 256 upward), dropping off sharply only at the extremes where the tasks become either very small or very large. The final value, for a block size of 16,384, creates only one task, so it shows single-threaded performance.

    Figure 1. ThreadPoolDistance performance
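
    The link between block size and task count in the measurements above is a ceiling division over the 12,564 known words. The short sketch below prints the task count for a few block sizes; the class TaskCounts and the particular block sizes (powers of four from 1 to 16,384) are assumptions for illustration, not the exact values used for the figure.

```java
// Hypothetical helper for illustration (not part of the article's sample code).
public class TaskCounts {

    /** Number of tasks created when wordCount words are split into blocks of blockSize. */
    static int taskCount(int wordCount, int blockSize) {
        return (wordCount + blockSize - 1) / blockSize;   // integer ceiling division
    }

    public static void main(String[] args) {
        int knownWords = 12564;   // word count quoted in the article
        for (int blockSize = 1; blockSize <= 16384; blockSize *= 4) {
            System.out.println(blockSize + " words per block -> "
                + taskCount(knownWords, blockSize) + " tasks");
        }
    }
}
```

    With a block size of 256 this yields 50 tasks, plenty to keep four cores busy, while 16,384 yields a single task, which is why that data point reflects single-threaded performance.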


    Java 7 introduced another implementation of ExecutorService: the ForkJoinPool class. ForkJoinPool is designed for efficiently handling tasks that can be repeatedly broken down into subtasks
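
    As a minimal sketch of that kind of repeated decomposition, the example below sums an array with a RecursiveTask that keeps splitting its range in half until the pieces are small enough to process directly. The class ForkJoinSum, the nested SumTask, and the threshold of 1,000 elements are all assumptions chosen for illustration; this is not the article's own ForkJoinPool code.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo class (not part of the article's sample code).
public class ForkJoinSum {

    /** Sums values[start, end), splitting the range until it is at most THRESHOLD long. */
    static class SumTask extends RecursiveTask<Long> {
        private static final long serialVersionUID = 1L;
        private static final int THRESHOLD = 1000;   // assumed cutoff for direct computation
        private final int[] values;
        private final int start;
        private final int end;

        SumTask(int[] values, int start, int end) {
            this.values = values;
            this.start = start;
            this.end = end;
        }

        @Override
        protected Long compute() {
            if (end - start <= THRESHOLD) {
                long sum = 0;
                for (int i = start; i < end; i++) {
                    sum += values[i];
                }
                return sum;
            }
            int mid = start + (end - start) / 2;
            SumTask left = new SumTask(values, start, mid);
            SumTask right = new SumTask(values, mid, end);
            left.fork();                        // schedule the left half asynchronously
            long rightSum = right.compute();    // work on the right half in this thread
            return left.join() + rightSum;      // wait for the left half and combine
        }
    }

    static long sum(int[] values) {
        ForkJoinPool pool = new ForkJoinPool();   // defaults to one thread per processor
        try {
            return pool.invoke(new SumTask(values, 0, values.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] values = new int[100000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        System.out.println(sum(values));   // prints 5000050000
    }
}
```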
