The Sqoop (1.99.3) can be connected to the Oracle database by retrieving the records through HDFS. For performing this task the Oracle driver needs to be added in the Sqoop lib directory. The database expert masters the art of adding the drivers in the lib directory. You can learn the ways how Sqoop can be connected to database. The driver can be connected with the classpath and then the Yarn and HDFS can be added in the Sqoop. Finally the sqoop job can be created and activated for the long run in the Import/Extract activities. Now, let us go through each step carefully and learn the command and the property related with them.
Insert Oracle Driver In Sqoop Classpath: The JDBC jar file of the Oracle can be copied into the Sqoop lib directory. Sometimes the files may not exist in the database. Then the jar file needs to be created.
Therefore the command can be like
- Sqoop mkdir lib
- sqoop cp~ /git / boneill/ data-lab / lib/ ojdbc6.jar. / lib
Insert Yarn And HDFS In Sqoop Classpath:
After inserting the driver, you need to add the Yarn and HDFS in the classpath of Sqoop. In the initial stage of the setup the classpath is managed by the property of common.loader within the server/conf/catalina.properties file. For introducing the Yarn cluster properly, you can use the following property of common.loader like this way:
After adding, the Sqoop server needs a restart function with all the new jar files including the driver file of jar. A JDBC connection can be generated at this time. Now, the connection can be created through the following commands:
sqoop> create connection –cid 1
Generating transaction for connector by id 1
Kindly fill up values to create new connection item
JDBC Driver Class: oracle. jdbc. driver. OracleDriver
JDBC Connection String: jdbc: oracle:thin: @change.me :1521:service.name
JDBC Connection Properties:
Now, there are zero values within the map:
entry# HIT RETURN HERE!
Security related configuration options
Max connections: 10
Successful creation of the new connection has been generated with validation status FINE and persistent id 1
Build Sqoop Job : After connecting the Yarn and HDFS in the Sqoop classpath, the new sqoop job can be created in the following method:
sqoop> create job –xid 1 –type import
Creating job for connection with id 1
Please fill following values to create new job object
Schema name: MY_SCHEMA
Table name: MY_TABLE
Table SQL statement:
Table column names:
Partition column name: UID
Nulls in partition column:
0 : HDFS
0 : TEXT_FILE
1 : SEQUENCE_FILE
0 : NONE
Output directory: /user/boneill/dump/
New job was successfully created with validation status FINE and persistent id 3
The transparency factor can be maintained in the directory where the final output will be visible.
Activate The Sqoop Job: After the successful creation of the Sqoop job, you are required to run the job or activate it. Here is the method by which it can be done easily.
sqoop> start job –jid 1
Job ID: 3
Server URL :http://localhost:12000/sqoop
Created by: bone
Creation date: 2014-10-14 13:27:57 EDT
Lastly updated by: bone
External ID: job_1413298225396_0001
http://your_host:8088/proxy/application_1413298225396_0001/ 2014-10-14 13:27:57 EDT: BOOTING – Progress is not available
Then you can easily watch the created Sqoop job through the Yarn . You may go to HDFS and search for your files in the final output directory. This will only be possible after a bit of churning activities take place. So what are you waiting for? Let’s get started with the Sqoop Oracle.
993 total views, 5 views today