SQL Server and Hadoop


The Microsoft SQL Server Connector for Hadoop transfers data between SQL Server and the Hadoop data management system.

Some of the key features available with this connector are:

1. Data Transfer from SQL Server to Hadoop

  • Tables in SQL Server to delimited text files on HDFS
  • Tables in SQL Server to SequenceFiles on HDFS
  • Tables in SQL Server to tables in Hive (Hive is data warehouse infrastructure built on top of Hadoop)
  • Queries executed in SQL Server to delimited text files on HDFS
  • Queries executed in SQL Server to SequenceFiles on HDFS
  • Queries executed in SQL Server to tables in Hive
2. Data Transfer from Hadoop to SQL Server
  • Delimited text files on HDFS to SQL Server
  • SequenceFiles on HDFS to SQL Server
  • Hive tables to tables in SQL Server
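
The connector is Sqoop-based, so transfers like the ones listed above are usually expressed as Sqoop commands. What follows is only a minimal sketch, assuming a standard Sqoop 1 installation. The helper `build_sqoop_import` and the host, database, table, user, and path names are hypothetical and not part of the connector. The code assembles the argument list for the three SQL Server to Hadoop targets (delimited text, SequenceFile, Hive table) and does not run anything.

```python
import shlex


def build_sqoop_import(host, database, table, target, username, fmt="text"):
    """Build a Sqoop table-import argument list (hypothetical helper).

    fmt selects the destination: "text" (delimited text on HDFS),
    "sequence" (SequenceFiles on HDFS) or "hive" (a Hive table).
    """
    # JDBC URL in the form used by the SQL Server JDBC driver; port 1433 is assumed.
    connect = f"jdbc:sqlserver://{host}:1433;databaseName={database}"
    args = [
        "sqoop", "import",
        "--connect", connect,
        "--username", username,
        "-P",  # ask for the password at run time instead of embedding it
        "--table", table,
    ]
    if fmt == "text":
        args += ["--target-dir", target, "--as-textfile",
                 "--fields-terminated-by", ","]
    elif fmt == "sequence":
        args += ["--target-dir", target, "--as-sequencefile"]
    elif fmt == "hive":
        # For Hive, `target` is treated as the Hive table name.
        args += ["--hive-import", "--hive-table", target]
    else:
        raise ValueError(f"unknown format: {fmt}")
    return args


# All names below are placeholders for illustration only.
cmd = build_sqoop_import("sqlhost", "Sales", "Orders", "/data/orders", "etl_user")
print(shlex.join(cmd))
```

Returning a list instead of a single string keeps the JDBC URL, which contains a semicolon, from being split by the shell; `shlex.join` is used only to print a copy-pasteable form.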
Apache Hadoop
So, we now have a connector that transfers data in both directions. Next, let's look at the basics of Hadoop.
What is Hadoop?
Apache Hadoop is a way for enterprises to store and analyze very large volumes of data. It is open-source software for reliable, scalable, distributed computing.
Hadoop was designed to solve a different set of problems from those handled by OLAP and OLTP systems. It provides fast, reliable analysis of both structured and complex data.
Hadoop in detail:
Hadoop consists of two key services: reliable data storage using the Hadoop Distributed File System (HDFS) and high-performance parallel data processing using a technique called MapReduce.
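
To make the MapReduce idea concrete, here is a minimal single-process sketch of the classic word-count pattern. It is an illustration only and does not use Hadoop: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions made for this example, and on a real cluster the map and reduce steps would run in parallel across many nodes over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Each line could be mapped independently, which is what lets a
    # cluster spread the work over many servers.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["hadoop stores data", "hadoop processes data"])
print(counts)
```

The point of the split is that the map step needs no knowledge of other lines and the reduce step needs no knowledge of other keys, so both can be distributed.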
How does it differ from traditional data storage and processing systems?
Hadoop runs on a collection of commodity, shared-nothing servers, known as a Hadoop cluster. A cluster can be a single-node or a multi-node environment.
A tour of the Hadoop environment
We can add or remove servers in a Hadoop cluster as resource availability changes. The system detects and compensates for hardware or system problems on any server, so Hadoop is self-healing when the cluster changes or failures occur. It continues to deliver data and can run large-scale, high-performance processing jobs.
References and further readings
Cloudera is an active contributor to the Hadoop project
  1. Bhavesh
    November 28, 2011 at 5:33 am

    Hi,
    How can I bring data from SQL to HiveQL after creating tables in HiveQL?
    Pls suggest as soon as possible.

    • November 30, 2011 at 9:21 pm

      Microsoft has released a Hadoop connector for SQL Server.

      Steps to transfer data from SQL Server to Hive:

      Step 1: Transfer the SQL Server data to a delimited text file.
      Step 2: Import the delimited text file into Hive.
