Thursday 2 July 2015

Query existing HBase tables with SQL using Apache Phoenix

Spending a bit more time with Apache Phoenix and re-reading my previous post, I realised that you can use it to query existing HBase tables. That is, not tables created through Apache Phoenix, but tables created directly in HBase, the column-oriented NoSQL database in the Hadoop ecosystem. I think this is cool, as it gives you the ability to use SQL on an HBase table.

To test this, let's say you log in to the HBase shell and create an HBase table like this:
> create 'table2', {NAME=>'cf1', VERSIONS => 5}

table2 is a simple HBase table with one column family, cf1. Now let's put some data into it:
> put 'table2', 'row1', 'cf1:column1', 'Hello SQL!'

Then maybe add another row:
> put 'table2', 'row4', 'cf1:column1', 'London'

Now, in Phoenix, all you have to do is create a database view for this table and query it with SQL. The view will be read-only. How cool is that? You don't need to physically create the table, move the data to Phoenix or convert it; a database view is sufficient, and via Phoenix you can query the HBase table with SQL.

In Phoenix you create the view for table2 using the same name. As you can see below, the names in the DDL used to create the view are case sensitive: Phoenix folds unquoted identifiers to upper case, so if you created your HBase table name in lower case you have to put the name between double quotes. The same goes for the column family and column names.

So log in to Phoenix and create the "table2" view like this:

> create view "table2" ( pk VARCHAR PRIMARY KEY, "cf1"."column1" VARCHAR );

And here is how you then query it in Phoenix:
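
A minimal sketch of such a query, assuming the "table2" view was created exactly as above with the quoted lower-case names (the second statement, filtering on the row key, is just an illustration and not taken from the original session):

> select * from "table2";
> select pk, "cf1"."column1" from "table2" where pk = 'row4';

With only the two rows put into the HBase table earlier, the first query should list row1 and row4 with their column1 values, and the second should return just the London row.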

(Screenshot: SQL Query on Phoenix)

There is tremendous potential here: imagine all those existing HBase tables which you can now query with SQL. What's more, you can point your Business Intelligence tools, reporting tools and other tools that work with SQL at HBase and query it as if it were another SQL database.

A solution worth investigating further? It definitely got me blogging in the evenings again.

To find out more about Apache Phoenix, visit the project page: https://phoenix.apache.org/