Category Archives: Scala

Structured Streaming: Philosophy behind it

Knoldus Blogs: In our previous blogs, “Structured Streaming: What is it?” and “Structured Streaming: How it works?”, we got to know two major points about Structured Streaming: it is a fast, scalable, fault-tolerant, end-to-end, exactly-once stream processing API that helps users … Continue reading

Posted in Scala | 1 Comment

Structured Streaming: What is it?

With the advent of streaming frameworks like Spark Streaming, Flink, Storm, etc., developers stopped worrying about issues related to a streaming application, such as fault tolerance (i.e., zero data loss) and real-time processing of data, and started focussing only on … Continue reading

KnolX: Understanding Spark Structured Streaming

Hello everyone, Knoldus organized a session on 5th January 2018. The topic was “Understanding Spark Structured Streaming”. Many people attended and enjoyed the session. In this blog post, I am going to share the slides and video of the session. Slides: … Continue reading

A Beginner’s Guide to Deploying a Lagom Service Without ConductR

How to deploy a Lagom Service without ConductR? This question has been asked and answered by many on different forums. For example, take a look at this question on StackOverflow: Lagom without ConductR? Here the user is trying to … Continue reading

Spark Structured Streaming: A Simple Definition

“Structured Streaming”: nowadays we hear this term quite a lot in the Apache Spark ecosystem, as it is being preached as the next big thing in the scalable big data world. Although we all know that Structured Streaming means a … Continue reading

Apache Spark: 3 Reasons Why You Should Not Use RDDs

Apache Spark: whenever we hear these two words, the first thing that comes to mind is RDDs, i.e., Resilient Distributed Datasets. It has now been more than 5 years since Apache Spark came into existence, and after its arrival … Continue reading

Partition-Aware Data Loading in Spark SQL

Data loading, in Spark SQL, means loading data into the memory/cache of Spark worker nodes. For this, we usually write the following code: val connectionProperties = new Properties() connectionProperties.put("user", "username") connectionProperties.put("password", "password") val jdbcDF = spark.read.jdbc("jdbc:postgresql:dbserver", "schema.table", connectionProperties) In … Continue reading
