SQL is an essential skill for data engineers and data scientists. Apache Hive gets the credit for bringing SQL into the big data toolset, and it still exists in …

Dilipan Muthuramalingam
Guest

Hi, can you please also explain how to perform partitioning with multiple columns when referring to Parquet files? (The columns are not part of the DDL.)

Sushma Shamsundar
Guest

The spark-sql command is not working for me.

anantwag19
Guest

Could you please point me to combineByKey and the aggregate functions in Spark? I have not seen any video from you on this. It would help, as you explain the concepts in an easy way.

John Humphreys
Guest

Please go back to the old videos and embed links, etc., to the follow-up videos (either in the video or in the description/comments) :). Tracking down the next one is never fun.

Azhagu Nagappan
Guest

Hello Prasanth, the following link gives more detailed insight into SparkSQL. It's the PhD thesis of Matei Zaharia, founder of Spark. http://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-12.pdf

Vigneshwaran R
Guest

Hello Sir, I have a question.

1) Is there any performance improvement if we load data from an RDBMS into memory as a DataFrame and convert it back to an RDD in order to use "COGROUP", rather than loading the data directly as an RDD?

Himanshu Sekhar Paul
Guest

Sir, please upload all the videos on a topic with a minimal time gap between consecutive ones.

deepak gupta
Guest

One question, sir. Are these Spark SQL queries transformations or actions? That is, what would the DAG for these commands look like?