The @ThreadSafe annotation

When you’re writing a library (or any code, really), you should make sure not only that the code covers your own use case, but also that it’s written in a way others can benefit from. By this I don’t mean premature optimisation, or functionality which could easily be labelled as YAGNI, let me be clear on that. But effective code should be self-explanatory, not only in its naming and modularity, but also in the way it can be used. I found a great example of this in the AWS SDK for Java, which I had to use while trying out AWS Lambda. I was trying to determine whether I could safely reuse one of their clients when I stumbled upon this comment on StackOverflow about the @ThreadSafe annotation.

Place this annotation on methods that can safely be called from more than one thread concurrently. The method implementer must ensure thread safety using a variety of possible techniques including immutable data, synchronized shared data, or not using any shared data at all.

That’s exactly what I needed: official confirmation from the author of the library (in this case AWS) about its use in a multi-threaded context. It seems that the original idea for this annotation, and others, came from the de facto Java concurrency bible, Java Concurrency in Practice.

AWS have their own implementation of the @ThreadSafe annotation.

import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
 * Documenting annotation to indicate a class is thread-safe and may be shared among multiple threads.
 *
 * @see NotThreadSafe
 */
@Documented
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.CLASS)
public @interface ThreadSafe {
}

And yes, the client I needed is thread-safe.

Stream – intermediate operations are lazy

More formally, from the JDK 8 Stream javadocs:

Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.

Which also means that if you don’t have a terminal operation, none of the intermediate operations’ code will be executed. This may not be obvious at first sight, though:

In this little example, we’re just creating a List of Strings, and then a Stream of Strings from that very same list. Let’s say that we also want to print all the words in our stream before every step of our Stream processing, which in this case involves:

  1. Filtering the Stream to keep only the words which contain the letter “a” as we start processing them.
  2. Making the remaining Strings uppercase.
  3. Counting them.
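
The example itself was originally embedded as a gist. A minimal sketch that reproduces the behaviour (the word list and the messages are mine, reconstructed to match the output below) could look like this:

import java.util.Arrays;
import java.util.List;

public class LazyStreams {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("yo", "quiero", "ser", "llorando", "el", "hortelano");
        words.stream()
             .peek(w -> System.out.println("while filtering (intermediate operation)… " + w))
             .filter(w -> w.contains("a"))
             .peek(w -> System.out.println("while uppercasing (intermediate operation)… " + w))
             .map(String::toUpperCase)
             .peek(w -> System.out.println("peeking before counting (terminal operation)… " + w));
             // .count(); // step 3, the terminal operation, commented out for now
    }
}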

Step 3 is commented out in the sketch; run this main method and check how nothing gets printed. Now uncomment the count() call (moving the semicolon from the last peek accordingly) and run it again, and you’ll see an output like this:

while filtering (intermediate operation)… yo
while filtering (intermediate operation)… quiero
while filtering (intermediate operation)… ser
while filtering (intermediate operation)… llorando
while uppercasing (intermediate operation)… llorando
peeking before counting (terminal operation)… LLORANDO
while filtering (intermediate operation)… el
while filtering (intermediate operation)… hortelano
while uppercasing (intermediate operation)… hortelano
peeking before counting (terminal operation)… HORTELANO

What makes everything execute is not the peek operation, which is also an intermediate one, but counting the remaining elements in the Stream, which is a terminal operation and triggers every previous intermediate operation. This may be easy to understand conceptually, but it’s also easy to forget the terminal operation and wonder why that peek operation isn’t printing anything to your console.

Accessing an element of a List is NOT a constant operation

At least not always... this should be a pretty straightforward fact for any software developer, but I must confess it bit me as I was implementing the Week 3 Oracle MOOC homework in Scala. Maybe I got too used to a particular implementation of a List, the ArrayList, which accesses any of its elements by index in constant time. But it can only do this thanks to its particular implementation; we have to be careful when choosing our data structures.

Basically, I realised that there was a problem with my Scala implementation, but I couldn’t see what it was. This is what was going on…

In these two code snippets, we can see the Java version of the initialisation of sourceWords, which will hold a list of words read from a file, and the creation of another list of words of a given size, where all the words are randomly picked from sourceWords.
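
The snippets were embedded as gists; a minimal Java sketch of both steps, assuming a words file with one word per line (the file name and variable names are mine), might look like this:

// Initialisation of sourceWords: read every line of the file with a stream.
List<String> sourceWords;
try (BufferedReader reader = new BufferedReader(new FileReader("words"))) {
    sourceWords = reader.lines().collect(Collectors.toList());
}

// createList: pick listSize words at random from sourceWords.
List<String> randomWords = new Random()
        .ints(listSize, 0, sourceWords.size()) // IntStream of listSize random indexes
        .mapToObj(sourceWords::get)            // index -> word (an index-based access!)
        .collect(Collectors.toList());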

The same had to be implemented in Scala, and I did it like this:

The Java and Scala implementations read pretty much the same, but there was a problem when building the random list of 2000 elements from sourceWords in Scala, comparing the time it takes with its Java counterpart:

Scala -> Creation of the list of words took 4146 ms
Java -> Creation of the list of words took 16 ms

Obviously, this didn’t make me very happy. It was Chris Loy who pointed out the possible cause; I just couldn’t see it. At this point, it’s probably useful to recommend having a look at this website (from which I stole the next image), which summarises, using Big O notation, the complexity of some of the classic operations we can perform on different data structures. If you don’t know what Big O notation is, apart from reading this Wikipedia link you should probably do one of these two algorithms courses.

[Image: table of Big O complexities for common data structure operations]

As we can see in the previous image, lists have a linear access time, rather than a constant one. Retrieving 2000 elements from a list by referencing 2000 random indexes slows things down quite a bit. If that were the whole story, though, then both the Scala and the Java implementations should be slow. But it turns out that (even if “There are no guarantees on the type, mutability, serializability, or thread-safety of the List returned”, as stated in the Java 8 Collectors#toList javadoc) my JDK implementation is actually using an ArrayList when collecting the elements of the Stream. And, checking the Javadoc for ArrayList, it shouldn’t come as a surprise to read that “The size, isEmpty, get, set, iterator, and listIterator operations run in constant time.” This explains why the Java implementation is so quick. In order to fix the Scala one, I just changed the data structure in which we keep the sample words from a List to an Array.
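
The effect is easy to reproduce in plain Java by comparing index-based access on a LinkedList (linear time) against an ArrayList (constant time); this self-contained example is mine, not part of the homework:

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;

public class AccessTimes {
    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>();
        List<Integer> linkedList = new LinkedList<>();
        for (int i = 0; i < 100_000; i++) {
            arrayList.add(i);
            linkedList.add(i);
        }
        time("ArrayList", arrayList);   // get(i) runs in constant time
        time("LinkedList", linkedList); // get(i) runs in linear time: it walks the nodes
    }

    private static void time(String name, List<Integer> list) {
        Random random = new Random();
        long start = System.currentTimeMillis();
        for (int i = 0; i < 2_000; i++) {
            list.get(random.nextInt(list.size())); // 2000 random index-based accesses
        }
        System.out.println(name + " took " + (System.currentTimeMillis() - start) + " ms");
    }
}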

The client only cares about getting a List when calling createList, and that method still returns a List, so nothing changes from the caller’s point of view.

The results of this little tweak:

Scala -> Creation of the list of words took 1 ms
Java -> Creation of the list of words took 8 ms

It’s a bit unsettling to think that I may have made plenty of mistakes like this in the past, but I guess they’re only mistakes when they lead to performance issues.

Know your Big O, gentlemen, know your Big O.

Oracle MOOC on Lambdas and Streams: Week 3 homework in Java and Scala

Part 1: A file called “words” is provided containing a large collection of words. A template file, RandomWords.java, is also provided. In the constructor you need to read all the words (which are one per line) from the source file into a list (remember to use a stream to do this). You also need to write the body of the createList() method. This generates a list of the size specified as a parameter selecting words at random from the list read in the constructor. HINT: You can use the ints() method of the Random class, which returns a stream of random integers. You can specify the size of the stream using a parameter.

Nothing really new in the constructor; we’ve seen this in previous posts.

We’re making use of one of the Random#ints variants, which allows us to generate a specialized type of Stream, an IntStream, with listSize elements whose random values lie between 0 (inclusive) and the size of the sourceWords list (exclusive). These will be the indexes of the elements of the sourceWords list which will serve as the random words requested for this exercise. When we map from an IntStream into objects, though, we can’t use the map method, but mapToObj instead. Having extracted the random words, we collect them into a List as usual.

Nothing new in the Scala version of the constructor, which will generate the list of words.

Through the GenTraversableFactory#fill method, we generate a sequence of listSize elements, with values generated by Random.nextInt(sourceWords.size), which will provide an integer between 0 (inclusive) and the size of the sourceWords list (exclusive) every time it’s invoked. fill will invoke it listSize times to create the sequence of random integers that serve as indexes of the random words from the sourceWords list. Mapping and retrieving a list is done as we saw before.

Part 2: In order to provide a relatively compute-intensive task we will calculate the Levenshtein distance between two strings… a source file, Levenshtein.java containing a lev() function is provided that will calculate the distance for you. A second template file, Lesson3.java, is provided. This contains the code necessary to measure the time taken to execute the code of the get() method of a Supplier (as you will see in the main() method this is simple to do with a lambda expression). Your task is to write the necessary code in the computeLevenshtein method to calculate the distances between each pair of strings in the wordList using the streams API. You will need to process this sequentially or in parallel based on the flag passed as a parameter. Try modifying the size of the wordList to see what impact this has on the performance, and in particular, how the difference between sequential and parallel performance is affected by the input size.

There are many possible ways to solve this exercise. This is just one of them, inspired by one of those proposed on the @SVQJUG mailing list. Firstly, we create a Supplier which will generate a Stream of Strings every time we invoke its get method. Supplier being a functional interface, a lambda expression can be used to fill its body, which we do, although wrapped in a ternary operator depending on the flag that marks whether the stream to be generated should be a parallel one or not.

We generate two streams around the same wordList list, meaning we will iterate over the elements of the list twice. Since we’re nesting these iterations, we make sure we process every element against every other element of the list, including itself. For every pair of words, then, we get their Levenshtein distance and store it in the distances array position referencing the indexes of the elements in the original list. The second map, the one transforming the inner Stream, needs to use mapToInt, since we’ll be generating an IntStream, which allows us to obtain an array of integers directly when collecting its elements through the terminal operation toArray(). The outer Stream, since it contains arrays of integers, has to go through the typical collection of its elements into a List before that list can be converted into an array (bidimensional, since it contains arrays), which turns out to be what we wanted in the first place, ready to be assigned to our already declared distances variable to capture the final result.
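
The code lived in a gist; a sketch of the approach just described, assuming the lev() function from the provided Levenshtein.java and a signature matching the course template, could be:

import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

static int[][] computeLevenshtein(List<String> wordList, boolean parallel) {
    Supplier<Stream<String>> streamSupplier =
            parallel ? wordList::parallelStream : wordList::stream;
    int[][] distances = streamSupplier.get()
            .map(word1 -> streamSupplier.get()
                    .mapToInt(word2 -> Levenshtein.lev(word1, word2)) // distances for word1
                    .toArray())                                       // one int[] row per word
            .collect(Collectors.toList())                             // List<int[]>
            .toArray(new int[0][]);                                   // List<int[]> -> int[][]
    return distances;
}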

Levenshtein for 1000 elements:

Java Sequential Levenshtein
Sequential took 1585ms
Sequential took 814ms
Sequential took 781ms
Sequential took 652ms
Sequential took 638ms
Java Parallel Levenshtein
Parallel took 347ms
Parallel took 206ms
Parallel took 186ms
Parallel took 167ms
Parallel took 173ms

We can clearly see an increase in performance when using parallel streams.

We make our List be processed in parallel through the invocation of the Parallelizable#par method, which belongs to the Parallelizable trait, inherited by the List class. We’ve extracted this operation into a different method.

The rest of the process is exactly like the Java version, although there is a lot less noise, caused by the simpler Scala APIs. We do away with the collectors and the mapToInt Stream-related operations, since Streams are not needed, nor do they exist, in Scala.

Scala for 1000 elements:

Scala Sequential Levenshtein
Sequential took 621ms
Sequential took 650ms
Sequential took 602ms
Sequential took 525ms
Sequential took 682ms
Scala Parallel Levenshtein
Parallel took 189ms
Parallel took 183ms
Parallel took 221ms
Parallel took 144ms
Parallel took 152ms

In Scala, parallel collections prove to be way quicker as well. Times are very similar to the Java version.

Part 3: As another demonstration of the differences in sequential and parallel stream processing there is a second method, processWords() for you to implement. Take the list of strings passed and process these using a sequential or parallel stream as required to generate a new list. Start by simply sorting the strings then experiment with adding things like mapping to lower or upper case, filtering out certain words (such as those beginning with a certain letter). See what impact adding distinct() to the stream has. For each of these vary the size of the input list to see how this affects the performance. Find the threshold below which sequential will give a faster answer than parallel.

In my case, I decided to perform the following operations on the words in the Stream: sort the stream first, by natural order; then make all its elements lower case, then upper case; and then remove all the elements which start with A, M or Z, or even the ones that would start with a, m or z (none, since we made the elements upper case).
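
A sketch of such a pipeline (the signature is assumed to mirror computeLevenshtein; the details are my reconstruction):

// assumes: import java.util.List; import java.util.stream.Collectors;
static List<String> processWords(List<String> wordList, boolean parallel) {
    return (parallel ? wordList.parallelStream() : wordList.stream())
            .sorted()                      // natural order first, as requested
            .map(String::toLowerCase)
            .map(String::toUpperCase)
            .filter(w -> !w.startsWith("A") && !w.startsWith("M") && !w.startsWith("Z"))
            .filter(w -> !w.startsWith("a") && !w.startsWith("m") && !w.startsWith("z")) // filters nothing: everything is upper case by now
            // .distinct()                 // toggle to compare the timings below
            .collect(Collectors.toList());
}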
The difference between having distinct() or not is very well explained in this API note:

Preserving stability for distinct() in parallel pipelines is relatively expensive (requires that the operation act as a full barrier, with substantial buffering overhead), and stability is often not needed. Using an unordered stream source (such as generate(Supplier)) or removing the ordering constraint with BaseStream.unordered() may result in significantly more efficient execution for distinct() in parallel pipelines, if the semantics of your situation permit. If consistency with encounter order is required, and you are experiencing poor performance or memory utilization with distinct() in parallel pipelines, switching to sequential execution with BaseStream.sequential() may improve performance.

In our case, since we’re asked to start by ordering the strings, we can infer that using a parallel Stream won’t give us much if distinct() is present. Let’s see numbers for a 30000-element wordList:

Java Sequential process words with Distinct
Sequential took 115ms
Sequential took 70ms
Sequential took 70ms
Sequential took 49ms
Sequential took 72ms
Java Parallel process words with distinct
Parallel took 141ms
Parallel took 22ms
Parallel took 18ms
Parallel took 15ms
Parallel took 97ms
Java Sequential process words without Distinct
Sequential took 28ms
Sequential took 15ms
Sequential took 15ms
Sequential took 16ms
Sequential took 14ms
Java Parallel process words without distinct
Parallel took 8ms
Parallel took 6ms
Parallel took 6ms
Parallel took 6ms
Parallel took 8ms

Clearly, a parallel Stream is quicker than a sequential one, but the difference is bigger when we’re not using distinct().

By trying different sizes for the wordList, I found the threshold to be around 3400 elements. Below that number, the sequential implementation of the Stream will almost always be quicker than the parallel one, so it’s not worth making the Stream parallel for this particular processing on collections with fewer than 3400 elements. At least on my hardware.

The Scala version is straightforward, really. Results for a 30000-element wordList, with and without distinct():

Scala Sequential process words with distinct
Sequential took 196ms
Sequential took 31ms
Sequential took 17ms
Sequential took 17ms
Sequential took 65ms
Scala Parallel process words with distinct
Parallel took 44ms
Parallel took 29ms
Parallel took 34ms
Parallel took 33ms
Parallel took 25ms
Scala Sequential process words without distinct
Sequential took 14ms
Sequential took 14ms
Sequential took 55ms
Sequential took 15ms
Sequential took 16ms
Scala Parallel process words without distinct
Parallel took 22ms
Parallel took 20ms
Parallel took 26ms
Parallel took 53ms
Parallel took 80ms

As in the Java implementation, the Scala parallel version is quicker than the sequential one when distinct() is involved; without it, the timings above are close enough to call it a draw.

Oracle MOOC on Lambdas and Streams: Week 2 homework in Java and Scala

Exercise 1: Create a new list with all the strings from original list converted to lower case and print them out.

For this week’s homework we’ll finally start using the infamous Java 8 Streams, after the restrictions of last week. The Collection interface gained a new default method in JDK 8 called stream(), which, as its Javadoc states, “Returns a sequential Stream with this collection as its source”. The Stream interface is where the fun begins, since it allows us to invoke the higher-order functions which lie at the core of all the buzz, and the fun, of functional programming. One of these functions, or methods, is map. We already talked about it a bit in our previous post, when reviewing the Scala solution for exercise 4 of the Week 1 homework. map receives a function as a parameter and, according to its Javadoc entry, “Returns a stream consisting of the results of applying the given function to the elements of this stream”. Looking at the code: after we create the stream from our list, we map this stream with a function passed as a parameter which will ultimately make every element of the stream lower case (since the function passed is the method reference String::toLowerCase). Mapping a stream, though, is an intermediate operation, so we still need a terminal operation to extract or do something with the stream values.

Our terminal operation here is collect, which “Performs a mutable reduction operation on the elements of this stream using a Collector”. In this case, the collector used (Collectors.toList()) will collect all the elements of the stream into a new List. Collectors are among the most useful companions of terminal operations, and you should probably have a look at them.

We end up iterating through the elements of the new list, printing them out through the method reference System.out::println.
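
A minimal sketch (the sample words are placeholders; the course provides its own list):

List<String> list = Arrays.asList("Alpha", "BRAVO", "Charlie", "Delta");
List<String> lowerCased = list.stream()    // build the stream from the collection
        .map(String::toLowerCase)          // intermediate operation
        .collect(Collectors.toList());     // terminal operation: gather into a new List
lowerCased.forEach(System.out::println);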

The Scala implementation of this exercise is pretty straightforward. As we said last week, Scala collections implement the higher-order functions directly, so you can map directly over a List object. We also know by now that we can pass a function directly as a parameter to the map method, and how the underscore placeholder shortens our syntax.

It’s important to note, though, that there is no such concept as terminal or intermediate operations in Scala. If you map over a List, you’ll get another List. Whether the elements contained within the List retain their type or become something else depends on the function you pass to the map method. In this case, the function we’re passing works directly on a String, transforming it into another String, but nothing would prevent us from returning the length of every word of the List, for example, which would have given us a List[Int] rather than the initial List[String].

As I was saying, map is not an intermediate operation in Scala, at least not necessarily. You can stop after the first map and capture its result in a new List. In Java 8, since the Collection interface doesn’t have default implementations for these higher-order functions, you need to promote your collection to a Stream, apply the higher-order functions, and then terminate the stream with any of the available terminal operations.

Exercise 2: Modify exercise 1 so that the new list only contains strings that have an odd length.

This exercise builds on top of the previous one, so we’ll only explain the addition of another intermediate operation, filter, which “Returns a stream consisting of the elements of this stream that match the given predicate”. The Predicate passed is, of course, a lambda expression which holds only for the Strings with an odd length, as required.
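
Building on the previous sketch:

List<String> oddLengthLowerCased = list.stream()
        .filter(word -> word.length() % 2 != 0) // keep only the odd-length strings
        .map(String::toLowerCase)
        .collect(Collectors.toList());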

The Scala implementation is pretty straightforward, given that Scala Lists also have the filter method. Again, these are not intermediate operations, so in this case the order of execution does matter: after filtering the List, we only map (turning elements into lower case) over the elements with odd length, printing them out at the end.

Exercise 3: Join the second, third and fourth strings of the list into a single string, where each word is separated by a hyphen (-). Print the resulting string.

Once we have our stream, it’s easy to select a particular slice of the cake, if you want, as long as the cake is big enough. The skip and limit methods allow us to discard the first n elements of the stream and truncate its length, respectively. These are both intermediate operations, at the end of which we will have, best case scenario, a stream with 3 elements in it. Before examining our terminal operation, let me say this:

  1. If you try to skip a number of elements bigger than the number of elements of the stream you’re operating on, you will get an empty stream.
  2. If you try to limit to a number of elements bigger than the number of elements of the stream you’re operating on, you will get the same stream.
  3. If you try to skip or limit a negative number of elements, you will get an IllegalArgumentException.

After manipulating our stream according to our purposes, we use the joining collector, which appends all the elements of the stream with the given delimiter, producing a single String.
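
A sketch, reusing the sample list from exercise 1:

String joined = list.stream()
        .skip(1)                           // discard the first string
        .limit(3)                          // keep the second, third and fourth
        .collect(Collectors.joining("-")); // join them with hyphens
System.out.println(joined);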

In Scala, there is a method on the List class which is semantically equivalent to skipping n1 elements of the list and then limiting the result to the elements that follow; this method is slice. It takes two integers as parameters: the first is the index (inclusive) of the first element you want in your slice, and the second is the index (exclusive) just past the last element you want in your slice. After slicing, mkString will render all the elements of the sliced List in a string using the given separator.

Exercise 4: Count the number of lines in the file using the BufferedReader provided.

In this exercise, there is not much to say about streams and lambdas, other than what’s going on inside the print statement. The lines() method on the BufferedReader is another way of building a stream, in this case containing all the lines of the file from which the reader was opened. After obtaining this stream, all we do is call the terminal operation count() on it, which returns the number of elements of the stream, which is precisely the number of lines of the file.
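
A sketch, with the file name being a placeholder for the one provided by the course:

try (BufferedReader reader = new BufferedReader(new FileReader("words.txt"))) {
    System.out.println(reader.lines().count()); // lines() builds the Stream; count() terminates it
}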

When it comes to the Scala version, we can use the Scala singleton object Source to read the content of a file into a BufferedSource object, which, like its Java BufferedReader counterpart, exposes a method to extract the lines of the file, in this case called getLines(). The return type, though, is not a stream (streams don’t exist in Scala) but an Iterator, whose length method will, again, coincide with the number of lines of the file.

Exercise 5: Using the BufferedReader to access the file, create a list of words with no duplicates contained in the file. Print the words. HINT: A regular expression, WORD_REGEXP, is already defined for your use.

If you already know the difference between map and flatMap, you can skip to the Scala solution.

We’ve seen before how useful map can be, but it comes up short in a few situations; this is one of them, and I’ll try to explain why before introducing a possible solution. In this exercise’s solution, we start by creating a stream containing all the lines in the file. We’re not asked to retrieve the list of lines in the file, but the words in it. So, somehow, we need to transform the stream containing the lines of the file into a stream containing the words in the file (avoiding repetition). We know, also, that there is a static method in the Stream interface, Stream.of, to create a stream from an array. At least I knew, and if you didn’t, well, you know now. And we also know how to create an array of words from a line of the file (using the provided regular expression and the classic String#split() method). Well, if we map over every line in the stream, obtaining an array of words per line and creating a new stream with it, we’ll end up with a stream of streams. I’m not sure about you, but I wouldn’t be too sure about how to collect all the words in the file after that. We need to, somehow, be able to place all the words in the file in a single stream, evolved from the primitive stream holding the lines of the file. This is when flatMap comes to the rescue. Its Javadoc is actually quite self-explanatory, if read slowly: “Returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element. Each mapped stream is closed after its contents have been placed into this stream”. In our example, this is what will happen:

  1. We create a stream from all the lines of the file.
  2. For every entry of the stream, or line of the file, we split it into a list of words.
  3. That list of words will be grouped in a Stream, or mapped stream, as referred to by the Javadoc.
  4. Once that mapped stream is created, its content is transferred, or placed, into the original, primitive or parent stream.
  5. The mapped stream is closed.
  6. The resulting stream is an aggregation of all the words that were once held by the mapped streams.

Why this operation is called flatMap and not mapFlat, I can’t comprehend, since all that’s happening is a classic map, transforming something into an aggregation (or stream) of some things, followed by a flattening process in which the aggregation of the results of the map is lost and its members brought into the original stream where the operation began. Trying to explain this with words is actually way more painful than understanding it. Once you do, you’ll wonder why it took you so long.

After flatMap successfully aggregates all the words in the file, we just need to call the intermediate operation distinct() to remove duplicates and collect the results in a list, as we already know.
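
Putting it all together, a sketch (reader and WORD_REGEXP are the BufferedReader and the regular expression provided by the course):

List<String> words = reader.lines()
        .flatMap(line -> Stream.of(line.split(WORD_REGEXP))) // one mapped stream per line, flattened
        .distinct()                                          // remove duplicates
        .collect(Collectors.toList());
words.forEach(System.out::println);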

The Scala version also uses the flatMap method, which, like map, is present natively on the List class (again, streams don’t exist in Scala). The only difference is the fact that, as we said, Scala’s Source#getLines() returns an Iterator, and that obviously doesn’t have any method to remove duplicates. Iterators are not meant to know about the content they refer to; they only allow programmatic, sequential access to it. We need to transform the Iterator into some other data structure which allows us to remove duplicates. Fortunately, we can invoke the Iterator#toSeq method, which gives us a Seq (the base trait for sequences in Scala) on which it’s possible to call Seq#distinct.

Exercise 6: Using the BufferedReader to access the file create a list of words from the file, converted to lower-case and with duplicates removed, which is sorted by natural order. Print the contents of the list.

This post is already quite long, so I won’t say much about this, other than that we transform the words to lower case by mapping over the result of the flatMap (which was already a stream containing the words of the file), and we sort by natural order calling the Stream#sorted() method (from its Javadoc: “Returns a stream consisting of the elements of this stream, sorted according to natural order”).
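
A sketch, building on the previous exercise:

List<String> sortedWords = reader.lines()
        .flatMap(line -> Stream.of(line.split(WORD_REGEXP)))
        .map(String::toLowerCase) // lower case before removing duplicates
        .distinct()
        .sorted()                 // natural order
        .collect(Collectors.toList());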

The Scala solution does exactly the same, although to sort we’re calling Seq#sorted.

Exercise 7: Modify exercise 6 so that the words are sorted by length.

I won’t say much about this either; it’s pretty much like the previous case, except that we invoke the Stream#sorted(Comparator) method, the parameterised version. Since Comparator is a functional interface, we just use a lambda expression to implement its only method, which will order the Strings in the Stream by their length. Nothing new here apart from this.
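
The only change from the exercise 6 sketch is the comparator passed to sorted:

.sorted((word1, word2) -> Integer.compare(word1.length(), word2.length())) // order by length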

The Scala solution does exactly the same, although to sort specifying a custom ordering, we’re calling a different method this time: Seq#sortBy.

Oracle MOOC on Lambdas and Streams: Week 1 homework in Java and Scala

This post is the first of a series covering the implementation of the homework for this MOOC by Oracle. The course is about lambdas and streams in Java 8, but I felt compelled to implement the exercises in Scala as well, and compare both implementations. In this post and the next two, I’ll try to explain the main differences between these implementations and the way both languages empower you to use higher-order functions as part of a functional way of solving problems.


Exercise 1: Create a string that consists of the first letter of each word in the list of Strings provided. HINT: Use a StringBuilder to construct the result.

Through Arrays.asList(alpha, bravo, charlie, delta, echo, foxtrot), we create a List object with the elements passed. In this case, we’re interested in the first letter of every word on the list, so for every element of the list, we extract its first character, appending it to a previously declared and initialised StringBuilder object. This object is used to capture the result, as advised.
It’s important to talk about the forEach method here, which the List interface inherits from Iterable. It’s not quite the same as the Stream#forEach method… in both cases, though, they receive an object implementing the Consumer interface as a parameter. As you may have guessed, Consumer is a functional interface, making it possible to replace it with a lambda expression. In our case, our lambda will be word -> result.append(word.charAt(0)), with word being the lambda parameter and result.append(word.charAt(0)) its body. This body is far from ideal, though, since it’s being used to alter an object external to the lambda. In this case we can do it without losing determinism in the result, since the order of execution is guaranteed when calling Iterable#forEach, but the same can’t be said about Stream#forEach. When possible, we should aim to avoid side effects in our lambdas.

The problem begins sooner than the lambda, though. The semantics of the forEach operation come with SIDE EFFECTS, since a Consumer, by definition, takes something and returns nothing: whatever it does while consuming its input will have a side effect; whether it’s on the input itself or on other entities is another discussion. But I digress…
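
A sketch of this first exercise (the sample values are placeholders; the course template provides the actual ones):

List<String> list = Arrays.asList("alpha", "bravo", "charlie", "delta", "echo", "foxtrot");
StringBuilder result = new StringBuilder();
list.forEach(word -> result.append(word.charAt(0))); // side effect: mutates result
System.out.println(result); // abcdef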
Let’s see a possible implementation of this first exercise in Scala:

Let me say this first: this is terrible Scala code. You wouldn’t do this in Scala normally, but I am trying to keep it close to the way we solved the exercise in Java (as we were requested to use a StringBuilder to accumulate the result). Again, you wouldn’t do this unless you have a very powerful reason to do so.
About the code… in Scala, you can create and initialise collections on the fly, and that’s what we do to create the list. You also don’t need to specify which implementation of the list you want, as you have to in Java (i.e. ArrayList). By default, Scala lists are immutable, although you can create a mutable one, as we’ll see in the next exercise. As in the Java example, for every element of the list, we extract the first character of the word (Scala strings don’t exist per se; Java ones are used directly by Scala) and we append it to the StringBuilder to construct the string we’ve been asked for. Let’s move on, nothing special here.

Exercise 2: Remove the words that have odd lengths from the list. HINT: Use one of the new methods from JDK 8.

The creation of the list is pretty straightforward, as in the example before, the only difference being that we capture the list object in a variable, to illustrate the fact that we will be removing the elements from that very same list, rather than building a new one. This obeys a strict (if you like) interpretation of the problem wording. Any Java collection contains a default implementation of the method removeIf. It takes a Predicate, and it will remove the elements of the collection for which the Predicate holds. And yes, Predicate is another functional interface; that’s why we can pass a lambda there. About the lambda itself, not much to say, it should be self-explanatory: it holds for words with odd length.
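
A sketch (note the list must be mutable: Arrays.asList on its own returns a fixed-size list, so we wrap it in an ArrayList):

List<String> list = new ArrayList<>(Arrays.asList("alpha", "bravo", "charlie", "delta", "echo", "foxtrot"));
list.removeIf(word -> word.length() % 2 != 0); // remove the odd-length words in place
System.out.println(list); // [echo]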

I must confess I spent some time looking for a method which would allow me to remove elements from a mutable list in Scala, but I couldn’t find one. I won’t be using the same approach as in the Java solution here, relaxing the interpretation of the problem statement instead. From our list of words, we keep the elements whose length is even. The fact that we’re doing it with filterNot is just to annoy people, since it’s a bit counterintuitive. But in fact it’s quite trivial, really: just a negation of the condition passed to filter.

And, by the way, in case you’re puzzled by the function that we’re passing to the filterNot method, more specifically by the underscore, don’t worry too much. You’ll get a hold of it as soon as you relax and sit back. It’s just a placeholder. This…

_.length % 2 != 0

…is equivalent to…

word => word.length % 2 != 0

There are a lot more places where the underscore may appear, but I wouldn’t worry too much about it for now, if you’re not familiar with Scala.

Exercise 3: Replace every word in the list with its upper case equivalent. HINT: Again, use one of the new methods from JDK 8.

Any Java List contains a default implementation of the method replaceAll. It takes a UnaryOperator, which operates on an operand and produces a result, operand and result being of the same type. And yes, UnaryOperator is also a functional interface, so we can pass a lambda there too. The lambda will make every string in the list upper case, through a call to the appropriate method of the String class. When a lambda just calls an existing method, it can be written as a method reference.
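
A sketch:

List<String> list = new ArrayList<>(Arrays.asList("alpha", "bravo", "charlie"));
list.replaceAll(String::toUpperCase); // a UnaryOperator<String>, written as a method reference
System.out.println(list); // [ALPHA, BRAVO, CHARLIE]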

In this case, though, I found a method on the MutableList class (inherited from SeqLike) which allows us to replace all the elements of a list with different versions of themselves, in the very same (mutable) list, rather than generating a new one on the fly. This method is transform. We don’t need to drop elements from the mutable list this time, which seemed to be the deal-breaker for the Scala mutable lists (unless I’m missing something, which may very well be the case). Through transform, every word in the list is replaced, or transformed, by its upper-case self. If we have a look at its prototype:

def transform(f: (A) ⇒ A)

In Scala, functions are first-class citizens. I never really understood this sentence until I saw how functions have been introduced in Java. In the case of transform, it receives a function from A to A as a parameter. This is pretty much equivalent to the Java UnaryOperator functional interface. After every word is transformed to its upper-case equivalent, the list position it used to occupy is updated with the new version of the word.

In Scala, when a method does not receive any parameters, you can just omit the parentheses when calling it.

Exercise 4: Convert every key-value pair of the map into a string and append them all into a single string, in iteration order. HINT: Again, use a StringBuilder to construct the result String. Use one of the new JDK 8 methods for Map.

Not a lot to say about the setup: just create a map and initialise it with a few values. As in exercise 1, we’re asked to accumulate results in a StringBuilder, and we do so. Map.entrySet returns a set (unsurprisingly) of Map.Entry instances. Being a set, it allows us to do our already seen Iterable#forEach, where we can specify what we want to do on every element of the iterable through a lambda. This lambda just extracts every entry’s key and value, appending them to our StringBuilder accumulator.
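
A sketch, with sample entries of my own (a TreeMap, like the Scala version below, keeps a predictable iteration order):

Map<String, String> map = new TreeMap<>();
map.put("a", "alpha");
map.put("b", "bravo");
map.put("c", "charlie");
StringBuilder result = new StringBuilder();
map.entrySet().forEach(entry -> result.append(entry.getKey()).append(entry.getValue()));
System.out.println(result); // aalphabbravoccharlie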

On the Scala side, we create and initialise our TreeMap, and then we map every element of the map. Don’t get confused by the unfortunate use of the word map in two different places: the map object holds a TreeMap, while the map method has nothing to do with it. map is a method which can be applied to any Scala collection; it receives a function and applies it to the elements of the collection, transforming them. It’s the same concept as in the previous exercise, with the subtle difference of allowing us to transform every element of the collection into anything, really, rather than into something of the same type. In this case, we’re concatenating the key and the value of every entry, and the resulting collection contains, rather than entries, strings formed by the key and the value of every entry. On that resulting collection we invoke foreach, which takes every string and appends it to our accumulator.

Map elements in Scala are tuples, which allow access to their elements through the tuple._1 and tuple._2 syntax. More on Scala tuples, in case you’re interested.

Exercise 5: Create a new thread that prints the numbers from the list. HINT: This is a straightforward Lambda expression.

Creating a new Thread object in Java requires you to pass a Runnable. An interface in Java doesn’t get more functional than this one: its single method enables us to use a lambda wherever a Runnable is expected. In this case, we’re using a lambda with an empty parameter section; everything happens in its body.

This body just iterates through the elements of a list and prints them, in no special fashion. We’ve seen all of this before in the above exercises.
After creating the Thread object, we start it and we’re done here.
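
A sketch (the list of numbers is a placeholder):

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
Thread thread = new Thread(() -> numbers.forEach(System.out::println)); // Runnable as a lambda
thread.start();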

The Scala implementation of this exercise is pretty much the same as in Java, but we’re limited by interoperability issues: we can’t use Java lambdas when writing Scala. That forces us to create an anonymous inner class on the fly to define our Runnable. Inside its run method, though, we use the Scala List foreach method to print the numbers. As stated before, functions being first-class citizens in Scala allows us to pass them as parameters, so we can pass the print function to the foreach method, resulting in every element of the collection being printed. Effectively, it’s pretty much the same as the method reference lambdas.