rustyrazorblade.com
Introduction to Spark & Cassandra — Rustyrazorblade
http://rustyrazorblade.com/2015/01/introduction-to-spark-cassandra
Introduction to Spark and Cassandra. Fri 02 January 2015. I've been messing with Apache Spark quite a bit lately. If you aren't familiar, Spark is a general purpose engine for large scale data processing. Initially it comes across as simply a replacement for Hadoop, but that would be selling it short. Big time. In addition to bulk processing (goodbye MapReduce!). Stream processing via Kafka, Flume, ZeroMQ. Sounds awesome, right? Easy scale out and always up. It's approximately this epic: Let's suppose we...
rustyrazorblade.com
The Myth of Schema-less — Rustyrazorblade
http://rustyrazorblade.com/2014/07/the-myth-of-schema-less
The Myth of Schema-less. Wed 09 July 2014. I have grown increasingly frustrated with the world as people have become more and more convinced that "schema-less" is actually a feature to be proud of (or even exists). For over ten years I've worked with close to a dozen different databases in production and have not once seen "schemaless" truly manifest. What's extremely frustrating is seeing this from. It gets slow to add columns in the most popular implementations as your data set grows. Slow alterations ...
rustyrazorblade.com
On The Bleeding Edge - PySpark, DataFrames, and Cassandra — Rustyrazorblade
http://rustyrazorblade.com/2015/05/on-the-bleeding-edge-pyspark-dataframes-and-cassandra
On The Bleeding Edge - PySpark, DataFrames, and Cassandra. Fri 01 May 2015. A few months ago I wrote a post on Getting Started with Cassandra and Spark. I've worked with Pandas for some small personal projects and found it very useful. The key feature is the data frame, which comes from R. Data Frames are new in Spark 1.3 and were covered in this blog post. We'll be working with open source Cassandra, but this walkthrough will also work with DataStax Enterprise. Step 1: Download Spark. Technically this do...