If you'd like to hear me talk about using probabilistic data structures for easy scaling and real-time processing with Spark Streaming, here is a video of my talk at the London Hadoop User Group:
https://www.youtube.com/watch?v=yXDpPHvJMT0&index=1&list=PL5OOLwV_m9vYluhSk_7XF4nHpI5W75B3N