Saturday 22 November 2008

Data Integration and the Cloud

Recently I read a Special Report from the magazine Economist about the computing "Cloud". The 1 year old, yet still buzzy and exciting new phenomena in the IT world.

The article is quite analytical and it is definitely worth reading. Here is the full link to the article called "Let it rise!":


http://www.economist.com/specialreports/displaystory.cfm?story_id=12411882

I have to warn you the article is quite lengthy but is worth every page of it.

After you read the article you get a feeling like returning back to the past in the supercomputers era of mainframes. That is what The Cloud is. A "thing" which lives on the internet, and where all the IT infrastructure lives and it is remote and you connect to it via the internet.

No more hardware and server rooms in the office, just smart PCs with browsers and the browser actually becoming the operating system, full of internet based applications for you to work with.

The Cloud is getting rid of all the dusty servers in the server room. All your applications are sucked by the Cloud. If you need word-processing to be done, you will just login to the internet and you use Google Notes or something. If you need spreadsheets, you can use Google Docs. Do you need to design and run a database use Oracle Apex. These are already existing and mature Cloud technologies. Do you need a ready-made CRM application without the complexity of installing it and maintaining it yourself on your PC or your servers, use Salesforce or SugarCRM.

All these and many more are available for you on the internet, in the form of SaaS which stands for Software as a Service. All you need to have is an internet browser and little money to pay your subscription to these SaaS services as you would pay for your mobile phone bill. Software is becoming a Pay-As-You-Go thing! There is no more licenses, product installation keys and certificates to worry about. You don't have to download anything and you don't have to install anything. Even better you don't need to upgrade to the new version each time, as this is done for you automatically. Sounds nice doesn't it?

The Cloud seems a heaven of applications, but how about data? What if I want to store gigabytes-terrabytes of data in the Cloud? How about backups?

Yep it is possible. Check Amazon AWS and S3 you pay pennies for gigabytes of storage per month. But how practical is this? Will all companies suddenly decide to part from their valuable and competitive data and leave it to the misty-airy hands of The Cloud? Will it happen quickly? I think it won't happen as quickly as it happened for the applications. You can guess reasons why. Usually we don't sell data but we sell applications, applications come and go but data stays, we don't like to pass the control of our data to someone else or somewhere else, are just a few reasons.

But what is the alternative? The alternative is to connect the cloud to your database or legacy data. Keep the data, but get The Cloud connect to it! Do we have to care where the data is when we want to analyse it? Do we have to know if it is on Oracle, MS SQL or MySQL or in an MS Excel spreadsheet? Do we have to be looking for the right tool to connect to the right database? In tools issue, I like what Oracle did with Oracle SQL Developer, you can connect it to any database.

To carry on, we should just be able to get to the data with no gimmicks. Also would be nice to have the data presented in an application in the Cloud. Right? If these are sort of your requirements then probably you are looking at one of those Data Integration solutions.

An integration solution which is not just buying Crystal Reports or Business Objects or Oracle OBIEE and installing it on one of your new servers and firing up more and more instances and doing lots of more configurations etc. But a solution where you can have all this in the Cloud, done by others at a Pay-As-You-Go cost, and all you get is a Login to start developing your reports and web applications.

Who wouldn't prefer not to own the complexity of maintaining a reporting server infrastructure but would like to own the data?

At the end all you need is a web page which connects to your database and you develop it as easy as writing an email. A web page which is an application, developed on the internet and stays on the internet but connects to your database in your data centre or your PC. Wouldn't that be cool?


Apatar a Data integration tool


Yes, there are independent tools out there for the integration of database systems with the Cloud. There are ways to get the cloud to connect to your whichever database system. I have been looking at one of them recently called Apatar You can find out more on www.apatar.com.

Apatar is an Open Source data integration and ETL tool. It is FREE and it lives in http://sourceforge.net. Available both for Linux and Windows. It works a treat. It is just 22MB of software. You can downoad Apatar from:

http://apatar.com/download.html


Once you installed it you just drag and drop objects to its screens. These objects could be databases and maybe applications you have in the Cloud. You configure the objects by double-clicking on them and providing database login and Cloud login info. At no time, you start pumping and synchronizing data between your database and your Cloud application (i.e. Salesforce). Look at some screen-shots from the Apatar website to understand how it works. You can even schedule this operations to happen at times of your choice with the tool.

It is great to see Cloud enabled great software like Apatar for FREE and this doesn't stop here. I found even more tools like apatar for free here:

Talend

Xaware

Wednesday 12 November 2008

Entity Relationship diagrams using Oracle SQL Developer Data Modeling

Finally, the early adopter release of Oracle SQL Developer Data Modelling tool is out. It is easy to use and I quickly managed to reverse engineer a database schema from an Oracle 10g database.

All you have to do is to connect to a database, specify which schema and which tables you want to include in your ER Model, and then just press the button. It even connects to Microsoft SQL Server 2000 database and extracts the ER Models. Pretty impressive!

I use ubuntu 8.04 and the install was easy, just unzip the downloaded file osdm-1.5.1-518-no-jre.zip to a directory and run the osdm.sh script. See instructions below.

It was about time for Oracle to come with a solution like this and for FREE, well almost. Still not clear! But I downloaded the early adopter version to try it.

You can download Oracle SQL Developer Data Modeling from: http://www.oracle.com/technology/products/database/sql_developer/files/Modeling.html

Here are the instructions to reverse engineer a schema using the tool.

1. Go to directory where you unziped the download (in my case the download osdm-1.5.1-518-no-jre.zip) like this:

cd /opt/osdm1/osdm/bin


2. Run the osdm.sh file like this

sh osdm.sh


3. When the application starts from the menu, go to:

File > Import > DB Catalog


4. The "DB Metadata Extraction Wizard" starts.

Just follow the instructions and provide your login credentials like you would do with any other tool connecting to a database and choose the tables you want to model.












Enjoy modeling in your
universe of discourse!


You can also read my other post on this topic which shows you how to create a database with OSDM here

Tuesday 11 November 2008

Microsoft SQL Server 2000 Data Dictionary view to view all objects in the database

See all objects in your Microsoft SQL database and their types.

This is one of my favorite data dictionary queries in MS SQL Server. It shows me quickly what makes up a database, how many tables, triggers, views etc.

I have been using it on MS SQL Server 2000 and it works!


select owner, cnt, object_type =
CASE xtype
when 'C' then 'CHECK constraint'
when 'D' then 'Default or DEFAULT constraint'
when 'F' then 'FOREIGN KEY constraint'
when 'L' then 'Log'
when 'FN' then 'Scalar function'
when 'IF' then 'Inlined table-function'
when 'P' then 'Stored procedure'
when 'PK' then 'PRIMARY KEY constraint (type is K)'
when 'RF' then 'Replication filter stored procedure'
when 'S' then 'System table'
when 'TF' then 'Table function'
when 'TR' then 'Trigger'
when 'U' then 'User table'
when 'Q' then 'UNIQUE constraint (type is K)'
when 'V' then 'View'
else 'unknown'
end
from
(
select su.name owner, so.uid uid, so.xtype xtype,count(*) cnt from sysobjects so, sysusers su
where so.uid = su.uid
group by su.name, so.xtype, so.uid
) sysobjects_info



Owner Count Object Type
---------- ------- ----------------------------------
dbo 12 CHECK constraint
dbo 363 Default or DEFAULT constraint
dbo 127 FOREIGN KEY constraint
xanthi 34 Stored procedure
dbo 241 Stored procedure
dbo 234 PRIMARY KEY constraint (type is K)
dbo 19 System table
dbo 10 Trigger
dbo 399 User table
dbo 6 unknown
dbo 11 View

Monday 10 November 2008

ORA-01000: maximum open cursors exceeded

When you get the ORA-01000: maximum open cursors exceeded did you ever want to see which SQL is causing this error? That is, try to find where the leak is?

Then try running the following sql statements on the dynamic dictionary views V$SESSION and V_$OPEN_CURSOR. This statement shows which SQL statement, which username and which machine are using up your cursors.


SQL> select s.machine, oc.user_name, count(1) as count, oc.sql_text
from sys.v_$open_cursor oc, v$session s
where s.sid = oc.sid
group by s.machine, user_name, sql_text
order by count desc;


MACHINE USER_NAME COUNT SQL_TEXT
----------------------- --------------- ------- ------------------------------------------------------------
Thessaloniki.com Athena 127 SELECT NAME, EAR_NAME, MASTER_NODE_ONLY, ENABLED, DEFINITION
Rome.com Athena 120 SELECT NAME, EAR_NAME, MASTER_NODE_ONLY, ENABLED, DEFINITION
Kavala.com Athena 43 SELECT NAME, EAR_NAME, MASTER_NODE_ONLY, ENABLED, DEFINITION
Kavala.com Athena 39 select i.obj#,i.ts#,i.file#,i.block#,i.intcols,i.type#,i.fla
Thessaloniki.com Athena 35 INSERT /*+ IDX(0) */ INTO "Athena_OWNER"."MLOG$_USERS" (dmlt
You must run this as SYSDBA. The output is from an Oracle 10g instance.