Word count using Apache Spark 1.6.1 in Scala

I have been using Apache Spark with Java, and recently I started using Spark with Scala for a new module. As I was new to Scala, I found it quite difficult to start with: the syntax is new and the coding style is altogether different compared to Java. Here is my first experience of Spark with Scala.

The word count example is simple and you can find it everywhere. While learning Spark with Scala, I tried it as well.

Word count in Spark is simple, and even simpler if you use Scala. Here is my sample:
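A minimal sketch of such a word count, assuming Spark 1.6.x is on the classpath. The object name `WordCountSteps`, the function name `countWords`, and the choice of ascending sort order are my own assumptions for illustration:

```scala
import org.apache.spark.rdd.RDD

object WordCountSteps {
  // Hypothetical helper: takes lines of text and returns (count, word)
  // pairs sorted by count in ascending order.
  def countWords(lines: RDD[String]): RDD[(Int, String)] =
    lines
      .flatMap(line => line.split(" "))            // split each line on spaces
      .filter(word => word.nonEmpty)               // drop empty words
      .map(word => (word, 1))                      // pair each word with a count of 1
      .reduceByKey(_ + _)                          // add up the counts per word
      .map { case (word, count) => (count, word) } // swap so the count becomes the key
      .sortByKey()                                 // sort by count (ascending by default)
}
```

In the spark-shell, where `sc` already exists, this could be tried with something like `WordCountSteps.countWords(sc.textFile("input.txt")).collect().foreach(println)`, where `input.txt` is a placeholder path. Passing `false` to `sortByKey` would put the most frequent words first.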

Let's go through what each step of the word count does:

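Split each line of text on spaces with flatMap, which flattens the results into a single collection of words.

Filter out the empty words.

Map each word to a pair of the word and a count of 1.

Add up the counts of repeated words, so each word ends up with its total.

Here is the interesting part: we swap the key and the value, so the count becomes the key and we can sort by how often each word occurs.

Sort the data by key.

Finally, print the values.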

Here is the complete example:
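A sketch of what a complete, standalone program could look like, again assuming Spark 1.6.x. The object name `WordCount`, the `local[*]` master, and the default input path `input.txt` are placeholders I picked for illustration, not fixed requirements:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally; on a cluster the master is usually set by spark-submit.
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Assumption: input path comes from the first argument, else a placeholder file.
      val inputPath = if (args.nonEmpty) args(0) else "input.txt"

      val sortedCounts = sc.textFile(inputPath)
        .flatMap(_.split(" "))                       // split lines into words
        .filter(_.nonEmpty)                          // drop empty words
        .map(word => (word, 1))                      // (word, 1)
        .reduceByKey(_ + _)                          // (word, total)
        .map { case (word, count) => (count, word) } // (total, word)
        .sortByKey()                                 // ascending by total

      // collect() brings the sorted result to the driver, so the printed order is kept.
      sortedCounts.collect().foreach { case (count, word) =>
        println(s"$word: $count")
      }
    } finally {
      sc.stop()
    }
  }
}
```

I use `collect()` before printing because a plain `foreach(println)` on the RDD runs on the executors and does not guarantee that the lines come out in sorted order. For a large input it would be better to save the result with `saveAsTextFile` instead of collecting it.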

 
