Java has fantastic language features, e.g. generics, annotations and now lambda expressions. These are complemented very well by several core capabilities packaged within the JDK, e.g. collections (and the newer concurrency package) and multi-threading.
“Streams”, a feature that Java introduced in version 8, can be extremely expressive in combination with lambda expressions, almost akin to functional programming (they build on functional interfaces). Streams also make it much simpler to write code that leverages multiple processors for parallel execution.
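As a small illustration of that expressiveness, here is a minimal sketch of a stream pipeline driven by lambdas, together with its parallel twin. The class and method names (StreamBasics, evenSquares, evenSquaresParallel) are made up for this example; only the java.util.stream calls are standard.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamBasics {
    // Keeps the even numbers and squares them, on a sequential stream.
    static List<Integer> evenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // lambda implementing Predicate<Integer>
                .map(n -> n * n)           // lambda implementing Function<Integer, Integer>
                .collect(Collectors.toList());
    }

    // Same pipeline; only the stream source changes to a parallel one.
    static List<Integer> evenSquaresParallel(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(evenSquares(numbers));          // prints [4, 16, 36]
        System.out.println(evenSquaresParallel(numbers));  // prints [4, 16, 36]
    }
}
```

Switching from stream() to parallelStream() is the only change needed to request parallel execution; whether it actually runs faster depends on the data size and the operations involved.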
Purely for illustrative purposes I thought of putting together a scenario to understand these features better.
Below is the description of the scenario:
- 100,000 products each have monthly sales figures over a one-year period.
- The requirement is to sort the products by their sales in a specific month.
- If we were to do this on a single processor, the filter, sort and collect operations would happen sequentially, with sort being the slowest step.
- If we were to do this on a multi-processor system, we would have a choice to run them either in parallel or sequentially.
(Caveat: not all operations benefit equally from parallel execution. For example, sorting needs specialized techniques to run faster in parallel, while filtering should, in theory, speed up straight away. Concurrency and the choice of List implementation are also non-trivial considerations, especially with ArrayList, which is not thread-safe. Streams add yet another dimension. NOTE: the stability of a parallel sort should not be taken for granted. The Stream.sorted() documentation guarantees a stable sort only for ordered streams and makes no guarantee for unordered ones. Stable means that two entries with equal values appear in their original relative order after sorting.)
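To make the notion of stability concrete, here is a minimal sketch using List.sort, which is documented as stable. The class and method names (StableSortDemo, sortByLength) are invented for the illustration, and the example says nothing about how any particular parallel sort behaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class StableSortDemo {
    // Sorts a copy of the input by string length only.
    // Strings of equal length compare as equal under this comparator.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        // "bb", "aa" and "dd" all have length 2; a stable sort keeps them in input order.
        System.out.println(sortByLength(Arrays.asList("bb", "aa", "c", "dd")));
        // prints [c, bb, aa, dd]
    }
}
```

An unstable sort would be free to return the three two-letter strings in any relative order, which matters when products with equal sales must keep their original ranking.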
To simplify things we could break this into 4 smaller test cases:
- Case 1: In a stream, execute filter and collect, and then sort the list.
- Case 2: In a parallel stream, execute filter and collect, and then sort the list.
- Case 3: In a stream, execute filter, sort and collect the list.
- Case 4: In a parallel stream, execute the filter, sort and collect operations. (NOTE: don’t try this, as noted above.)
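The four cases above could be sketched roughly as follows. This is not the original code: the Product class, the generated random data and the filter condition (keep products with non-zero sales in the chosen month) are all assumptions made for illustration, as are the method names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

class SalesSortCases {
    // Hypothetical data model: an id plus 12 monthly sales figures.
    static final class Product {
        final int id;
        final int[] monthlySales;

        Product(int id, int[] monthlySales) {
            this.id = id;
            this.monthlySales = monthlySales;
        }

        int salesIn(int month) {
            return monthlySales[month];
        }
    }

    // Generates reproducible random data; the value range 0..999 is arbitrary.
    static List<Product> generate(int count, long seed) {
        Random random = new Random(seed);
        List<Product> products = new ArrayList<>(count);
        for (int id = 0; id < count; id++) {
            products.add(new Product(id, random.ints(12, 0, 1000).toArray()));
        }
        return products;
    }

    static Comparator<Product> bySales(int month) {
        return Comparator.comparingInt(p -> p.salesIn(month));
    }

    // Case 1: sequential filter and collect, then sort the resulting list.
    static List<Product> sequentialThenSort(List<Product> products, int month) {
        List<Product> result = products.stream()
                .filter(p -> p.salesIn(month) > 0) // assumed filter condition
                .collect(Collectors.toCollection(ArrayList::new));
        result.sort(bySales(month));
        return result;
    }

    // Case 2: parallel filter and collect, then sort the resulting list.
    static List<Product> parallelThenSort(List<Product> products, int month) {
        List<Product> result = products.parallelStream()
                .filter(p -> p.salesIn(month) > 0)
                .collect(Collectors.toCollection(ArrayList::new));
        result.sort(bySales(month));
        return result;
    }

    // Case 3: sequential filter, sort and collect inside the stream.
    static List<Product> sequentialSortInStream(List<Product> products, int month) {
        return products.stream()
                .filter(p -> p.salesIn(month) > 0)
                .sorted(bySales(month))
                .collect(Collectors.toCollection(ArrayList::new));
    }

    // Case 4: parallel filter, sort and collect inside the stream.
    static List<Product> parallelSortInStream(List<Product> products, int month) {
        return products.parallelStream()
                .filter(p -> p.salesIn(month) > 0)
                .sorted(bySales(month))
                .collect(Collectors.toCollection(ArrayList::new));
    }

    // Extracts the sales figures so results can be compared independently of tie order.
    static List<Integer> salesOf(List<Product> sorted, int month) {
        return sorted.stream().map(p -> p.salesIn(month)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Product> products = generate(100_000, 42L);
        int month = 5; // zero-based index, an arbitrary choice
        List<Integer> expected = salesOf(sequentialThenSort(products, month), month);
        System.out.println("case 2 agrees: " + expected.equals(salesOf(parallelThenSort(products, month), month)));
        System.out.println("case 3 agrees: " + expected.equals(salesOf(sequentialSortInStream(products, month), month)));
        System.out.println("case 4 agrees: " + expected.equals(salesOf(parallelSortInStream(products, month), month)));
    }
}
```

Collecting into an explicit ArrayList keeps the later in-place sort safe, since Collectors.toList() makes no promise about the mutability of the list it returns.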
Timing 100 runs of each of the above 4 test cases and comparing the observations against Amdahl’s law, which predicts the theoretical speedup from using multiple processors, felt like a fun, illustrative way to understand streams, parallel streams, sorting, lambda expressions and multi-processor runs.
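A timing loop along these lines might look like the sketch below. It is an assumption about how such a measurement could be done, not the harness actually used: the names TimingSketch, averageMillis and speedup are invented, the workload is a stand-in array of random ints, and "speedup" is taken here as the ratio of average run times, which may differ from how the reported averages were computed.

```java
import java.util.Arrays;
import java.util.Random;

class TimingSketch {
    // Written by the tasks so their results are used and not discarded.
    static volatile long sink;

    // Runs the task the given number of times and returns the mean wall-clock time in ms.
    static double averageMillis(Runnable task, int runs) {
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (runs * 1_000_000.0);
    }

    // Observed speedup: how many times faster the parallel variant ran.
    static double speedup(double sequentialMillis, double parallelMillis) {
        return sequentialMillis / parallelMillis;
    }

    public static void main(String[] args) {
        int[] data = new Random(42L).ints(1_000_000).toArray();

        double sequential = averageMillis(
                () -> sink = Arrays.stream(data).filter(x -> x > 0).sorted().count(), 100);
        double parallel = averageMillis(
                () -> sink = Arrays.stream(data).parallel().filter(x -> x > 0).sorted().count(), 100);

        System.out.println("sequential avg ms = " + sequential);
        System.out.println("parallel avg ms   = " + parallel);
        System.out.println("observed speedup  = " + speedup(sequential, parallel));
    }
}
```

A loop like this has no warm-up phase and is sensitive to JIT compilation and garbage collection, so the numbers are indicative only; a benchmarking harness such as JMH is the more rigorous option.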
Here’s the code for it, and below are the results of one complete execution.
- For case 2 vs case 1: average speedup = 1.2291615907722224, Amdahl’s percentage = 0.18190218936593688 and speedup = 1.1000508269148064
- For case 4 vs case 3: average speedup = 2.1987344172707037, Amdahl’s percentage = 0.8987062837645722 and speedup = 1.8160459562383011
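For reference, Amdahl's law gives the theoretical speedup as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors. The sketch below plugs the two reported percentages into that formula. The processor count of 2 is not stated anywhere above; it is inferred from the figures, since n = 2 is the value that reproduces the reported speedups. The class and method names are invented for the example.

```java
class AmdahlCheck {
    // Amdahl's law: theoretical speedup for a given parallel fraction and processor count.
    static double amdahlSpeedup(double parallelFraction, int processors) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / processors);
    }

    public static void main(String[] args) {
        int processors = 2; // assumption: inferred from the reported numbers, not stated

        // Case 2 vs case 1: roughly 1.1000508
        System.out.println(amdahlSpeedup(0.18190218936593688, processors));

        // Case 4 vs case 3: roughly 1.8160460
        System.out.println(amdahlSpeedup(0.8987062837645722, processors));
    }
}
```

Under that assumption the formula caps the speedup at 2 no matter how large p gets, so the measured average of about 2.2 for case 4 vs case 3 sits slightly above the theoretical ceiling, which is a reminder of how noisy simple timings can be.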