In this blog I will focus on incremental load: updating existing records, inserting new records, and loading a dynamically partitioned table. In the BI world a delta load (incremental load) is a very common activity, and we can implement the same on a Hive table using the steps below. In a relational database we would update existing records using a unique index/primary key, but in Hive we have to do it a different way. There are many ways to do it; let us see one approach using the steps below. I am assuming that you have data in HDFS which is updated hourly/daily from your Sqoop/Flume/Kafka pipeline.

Design considerations:

- Ensure that adequate temp space is maintained according to your data volume, so that we do not face temp-space issues during the process.
- I am assuming that your ingestion logic already handles both full and incremental loads, and that the data is available in HDFS.
- Partition column value s...
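As a minimal sketch of the idea, the classic Hive "upsert" is a reconciliation query: union the base table with the incremental data, keep only the newest row per key, and overwrite the target with dynamic partitioning enabled. All table and column names below (`customer`, `customer_incr`, `id`, `updated_at`, `load_date`) are hypothetical placeholders, not from the original post:

```sql
-- Sketch only: customer / customer_incr and their columns are assumed names.
-- Enable dynamic partitioning so the INSERT OVERWRITE can write
-- every affected load_date partition in one pass.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE customer PARTITION (load_date)
SELECT id, name, updated_at, load_date
FROM (
  SELECT t.*,
         -- Rank rows per key so the most recent version wins.
         ROW_NUMBER() OVER (PARTITION BY id
                            ORDER BY updated_at DESC) AS rn
  FROM (
    SELECT id, name, updated_at, load_date FROM customer       -- existing data
    UNION ALL
    SELECT id, name, updated_at, load_date FROM customer_incr  -- new/changed rows
  ) t
) ranked
WHERE rn = 1;
```

Rows present only in `customer_incr` are inserted, and rows present in both survive only in their newest version, which gives update-plus-insert semantics without a primary key.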