Most Spark Streaming jobs are expected to be long-running. However, we faced cases where a streaming job has to be stopped, some action performed, and then the streaming job has to continue its work.
Idea and implementation by Dariusz Szablinski. To do that, we need to manage the Spark Streaming job lifecycle: start, stop, and start again. Below you can find an example of such code.
In the example, the action that has to be triggered periodically is called compaction, and it is run every compactionFrequency seconds.
At startup we start a separate thread (this happens on the driver); it will notify the main thread once the time has elapsed. Meanwhile, the main thread starts the Spark streaming context and waits for the notification (stopAndTriggerCompaction.await()). When the time has passed, ssc.stop(stopSparkContext = false, stopGracefully = true) stops the streaming context. Notice that the Spark context itself is not stopped, so it can be reused by the compaction and by the next streaming context.
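The flow above can be sketched as follows. This is a minimal, hedged sketch, not the original implementation: the batch interval, the compactionFrequency value, the runCompaction helper, and the job class name are assumptions for illustration; only ssc.stop(stopSparkContext = false, stopGracefully = true) and the await() pattern come from the text. A java.util.concurrent.CountDownLatch is used as the notification mechanism between the timer thread and the main thread.

```scala
import java.util.concurrent.CountDownLatch

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RestartableStreamingJob {

  val compactionFrequency = 600L   // seconds between compactions (assumed value)
  val batchInterval       = Seconds(10) // assumed batch interval

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("restartable-streaming-job")
    val sc   = new SparkContext(conf)

    while (true) {
      // Latch the main thread waits on; the timer thread counts it down.
      val stopAndTriggerCompaction = new CountDownLatch(1)

      // Separate thread on the driver: after compactionFrequency seconds,
      // notify the main thread that it is time to stop the streaming context.
      val timer = new Thread {
        override def run(): Unit = {
          Thread.sleep(compactionFrequency * 1000)
          stopAndTriggerCompaction.countDown()
        }
      }
      timer.setDaemon(true)
      timer.start()

      // A fresh StreamingContext on top of the long-lived SparkContext.
      val ssc = new StreamingContext(sc, batchInterval)
      // ... build the streaming pipeline on ssc here ...
      ssc.start()

      // Main thread blocks until the timer thread fires.
      stopAndTriggerCompaction.await()

      // Stop the streaming context gracefully, but keep the SparkContext
      // alive so compaction can use it and a new StreamingContext can start.
      ssc.stop(stopSparkContext = false, stopGracefully = true)

      runCompaction(sc)
    }
  }

  // Hypothetical placeholder for the periodic action.
  def runCompaction(sc: SparkContext): Unit = {
    // actual compaction logic goes here
  }
}
```

The key design point is that StreamingContext.stop accepts separate flags for the underlying SparkContext and for graceful shutdown; passing stopSparkContext = false lets the loop create a new StreamingContext against the same SparkContext after the compaction finishes.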