Monthly Archives: February 2016

A sample ML Pipeline for Clustering in Spark

Knoldus Blogs: Often a machine learning task contains several steps, such as extracting features from raw data, creating learning models to train on those features, and running predictions on the trained models. With the help of the pipeline API provided by … Continue reading

Posted in Scala

Saving Spark DataFrames on Amazon S3 got Easier !!!

Knoldus Blogs: In our previous blog post, Congregating Spark Files on S3, we explained how to upload files saved in a Spark cluster to Amazon S3. Admittedly, the method explained in that post was a little complex and hard to … Continue reading

Posted in Scala