Get Introduced With Sqoop Oracle by HDFS Import Extract

The Sqoop (1.99.3) can be connected to the Oracle database by retrieving the records through HDFS. For performing this task the Oracle driver needs to be added in the Sqoop lib directory. The database expert masters the art of adding the drivers in the lib directory. You can learn the ways how Sqoop can be connected to database. The driver can be connected with the classpath and then the Yarn and HDFS can be added in the Sqoop. Finally the sqoop job can be created and activated for the long run in the Import/Extract activities. Now, let us go through each step carefully and learn the command and the property related with them.

Insert Oracle Driver In Sqoop Classpath: The JDBC jar file of the Oracle can be copied into the Sqoop lib directory. Sometimes the files may not exist in the database. Then the jar file needs to be created.

Therefore the command can be like

Sqoop mkdir lib
sqoop cp~ /git / boneill/ data-lab / lib/ ojdbc6.jar. / lib

Insert Yarn And HDFS In Sqoop Classpath:

After inserting the driver, you need to add the Yarn and HDFS in the classpath of Sqoop. In the initial stage of the setup the classpath is managed by the property of common.loader within the server/conf/catalina.properties file. For introducing the Yarn cluster properly, you can use the following property of common.loader like this way:

common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/Users/bone/tools/hadoop/share/hadoop/common/*.jar,/Users/bone/tools/hadoop/share/hadoop/yarn/lib/*.jar,/Users/bone/tools/hadoop/share/hadoop/mapreduce/*.jar,/Users/bone/tools/hadoop/share/hadoop/tools/lib/*.jar,/Users/bone/tools/hadoop/share/hadoop/common/lib/*.jar,/Users/bone/tools/hadoop/share/hadoop/hdfs/*.jar,/Users/bone/tools/hadoop/share/hadoop/yarn/*.jar

After adding, the Sqoop server needs a restart function with all the new jar files including the driver file of jar. A JDBC connection can be generated at this time. Now, the connection can be created through the following commands:

bin/sqoop.sh client
…
sqoop> create connection –cid 1
Generating transaction for connector by id 1
Kindly fill up values to create new connection item
Name: my_datasource
Connection configuration
JDBC Driver Class: oracle. jdbc. driver. OracleDriver
JDBC Connection String: jdbc: oracle:thin: @change.me :1521:service.name
Username: your.user
Password: ***********
JDBC Connection Properties:
Now, there are zero values within the map:
entry# HIT RETURN HERE!
Security related configuration options
Max connections: 10
Successful creation of the new connection has been generated with validation status FINE and persistent id 1

Build Sqoop Job : After connecting the Yarn and HDFS in the Sqoop classpath, the new sqoop job can be created in the following method:

sqoop> create job –xid 1 –type import
Creating job for connection with id 1
Please fill following values to create new job object
Name: data_import

Database configuration

Schema name: MY_SCHEMA
Table name: MY_TABLE
Table SQL statement:
Table column names:
Partition column name: UID
Nulls in partition column:
Boundary query:

Output configuration

Storage type:
0 : HDFS
Choose: 0
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 0
Compression format:
0 : NONE
…
Choose: 0
Output directory: /user/boneill/dump/

Throttling resources

Extractors:
Loaders:
New job was successfully created with validation status FINE and persistent id 3

The transparency factor can be maintained in the directory where the final output will be visible.

Activate The Sqoop Job: After the successful creation of the Sqoop job, you are required to run the job or activate it. Here is the method by which it can be done easily.

sqoop> start job –jid 1

Submission details

Job ID: 3

Server URL :http://localhost:12000/sqoop

Created by: bone

Creation date: 2014-10-14 13:27:57 EDT

Lastly updated by: bone

External ID: job_1413298225396_0001

http://your_host:8088/proxy/application_1413298225396_0001/ 2014-10-14 13:27:57 EDT: BOOTING – Progress is not available

Then you can easily watch the created Sqoop job through the Yarn . You may go to HDFS and search for your files in the final output directory. This will only be possible after a bit of churning activities take place. So what are you waiting for? Let’s get started with the Sqoop Oracle.

One thought on “Get Introduced With Sqoop Oracle by HDFS Import Extract”

Alexander Fernandise says:
September 15, 2015 at 1:51 pm
Thanks a lot for discussing and sharing the article. In this article get introduced with sqoop oracle by HDFC Import Extract. This article gives information about how insert oracle driver in sqoop classpath .This article are very much helpful for me. I hope that you will keep sharing such a post in future.If you are interested to know mote then you can go for remote DBA services.I would surely share this with colleagues.

About Speakbindas

One thought on “Get Introduced With Sqoop Oracle by HDFS Import Extract”

Leave a Reply Cancel reply

Receive free updates

Search

About

Latest Posts