Chandra Prakash – Medium

Decoding the Error — SparkException: [CANNOT_INVOKE_IN_TRANSFORMATION]

Jan 18

Published in
Dev Genius

Why to avoid multiple chaining of withColumn() function in Spark job.

Are you chaining multiple withColumn() calls in your Spark job? Let’s take a deep dive into the implications and how to avoid them.

Oct 8, 2024

Implications of using synchronized blocks in Scala’s Futures

Sep 14, 2024

Published in
Towards Dev

Apache Spark : How to solve Reverse Mapping Problem in Spark.

Problem statement: Let’s say we have a dataset of PairRDD[K, V], where K represents the student ID and V represents the course names to…

Dec 31, 2021

Apache Spark: aggregateByKey vs combineByKey

In this article, we will first learn about aggregateByKey in Apache Spark, and in the next article (to be published later as both the topics are…

Dec 27, 2021

Java 8: Intersection Types, Lambda Serialization & java.io.NotSerializableException

This blog has been split into 2 series —

Mar 6, 2021

Apache Spark: mapPartitions implementation in Spark in Java

In this blog, we will look at the use case for mapPartitions and its implementation in Spark's Java API. Before going forward, please…

Feb 27, 2021

Published in
Analytics Vidhya

Apache Spark : Secondary Sorting in Spark in Java

We have all probably seen secondary sorting in MapReduce/Hadoop and its key implementation. There is plenty of information and there are many blogs available…

Feb 18, 2021

Apache Spark : Given a list of user’s comments, determine the latest and last time a user commented.

For the above problem statement, we will be using the Stack Overflow dataset and its comments.xml dataset. You can download the sample dataset…

Feb 7, 2021

Apache Spark : DNA Base Count Problem :: How to count each frequencies of A, T, C, G, and N (the…

What is the DNA Base Count Problem, and how can Spark help us compute the counts/frequencies? To solve this problem, I will be writing Spark…

Feb 6, 2021

Chandra Prakash

Cloud Data Engineer : Spark ~ Scala ~ Flink ~ Kafka ~ Concurrency ~ Azure Cloud ~ Design Patterns https://www.linkedin.com/in/chandra-prakash-28932652/
