Thursday, 12 May 2016

Database Migration and Integration using AWS DMS



Amazon Web Services (AWS) recently released a product called AWS Database Migration Service (DMS) to migrate data between databases.

The experiment

I used AWS DMS to try a homogeneous database migration, from a source MySQL database to a target MySQL database.

The DMS service puts a resource in the middle, the Replication Instance (an automatically created EC2 instance), and connects it to source and target Endpoints. Then you move data from the source database to the target database. Simple as that. DMS is also capable of heterogeneous database migrations, for example from MySQL to Oracle, and even of ongoing integrations that keep the target in sync. In addition, AWS provides a client tool called the AWS Schema Conversion Tool, which helps you convert source database objects such as stored procedures to the target database format. All things a cloud data integration project needs!

In my experiment and POC, I was particularly interested in the ability of the tool to move a simple data model with a 1-n relationship between the tables t0 (parent) and t1 (child), as below.

(Pseudo code to quickly create two tables, t0 and t1, with a 1-n relationship. Create the tables on both the source and the target database.)

t0 -> t1 Table DDL (Pseudo code)

-- Parent table (the "1" side of the relationship)
CREATE TABLE `t0` (
  `id` int(11) NOT NULL,
  `txt` varchar(100) CHARACTER SET ucs2 DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Child table (the "n" side); deleting a t0 row cascades to its t1 rows
CREATE TABLE `t1` (
  `id` mediumint(9) NOT NULL AUTO_INCREMENT,
  `t0id` int(9) DEFAULT NULL,
  `txt` char(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `t0id` (`t0id`),
  CONSTRAINT `t1_ibfk_1` FOREIGN KEY (`t0id`) REFERENCES `t0` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


In this experiment, I didn't want to see just a migration, a plain copy of a table from the source database to the target database. I was more interested in how easy it is to migrate a data model, with its Primary Key and Foreign Key relationships in place, from the source database to the target database using the CDC (Change Data Capture), or Ongoing Replication, migration option of AWS DMS. That is, a zero-downtime database migration.

Here are the results of the experiment.

AWS DMS is readily accessible: you can quickly set up an agent (Replication Instance), define source and target endpoints, and start mapping the tables to be migrated from the source database to the target database, all conveniently from the AWS console.

Once you have set up your replication instance and endpoints, create a Migration Task (say Alpha) and do an initial full migration (load) from the source database to the target database. Do this with the foreign keys (FKs) disabled on the target. The AWS DMS Guide recommends this, at least for MySQL targets, so that the data can be loaded very quickly using parallel threads.
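As a rough illustration of the Alpha setup, the sketch below only builds the request parameters one might pass to the boto3 DMS client (`create_endpoint` and `create_replication_task`); it does not call AWS. The identifiers, host name, schema name `testdb` and ARNs are placeholders invented for the example, credentials are left out, and using the `initstmt=SET FOREIGN_KEY_CHECKS=0` extra connection attribute to switch off FK checks on the MySQL target is an assumption about how the experiment was configured.

```python
import json

# Assumed extra connection attribute for a MySQL target with FK checks off.
FK_CHECKS_OFF = "initstmt=SET FOREIGN_KEY_CHECKS=0"


def target_endpoint_params(identifier, host, disable_fk_checks):
    """Keyword arguments for a MySQL target endpoint (credentials omitted)."""
    params = {
        "EndpointIdentifier": identifier,
        "EndpointType": "target",
        "EngineName": "mysql",
        "ServerName": host,
        "Port": 3306,
    }
    if disable_fk_checks:
        params["ExtraConnectionAttributes"] = FK_CHECKS_OFF
    return params


def table_mappings(schema, tables):
    """DMS table-mapping document: one 'include' selection rule per table."""
    rules = [
        {
            "rule-type": "selection",
            "rule-id": str(i),
            "rule-name": str(i),
            "object-locator": {"schema-name": schema, "table-name": table},
            "rule-action": "include",
        }
        for i, table in enumerate(tables, start=1)
    ]
    return json.dumps({"rules": rules})


def full_load_task_params(name, source_arn, target_arn, instance_arn, mappings):
    """Keyword arguments for the initial full-load task (Alpha)."""
    return {
        "ReplicationTaskIdentifier": name,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "full-load",
        "TableMappings": mappings,
    }


if __name__ == "__main__":
    # Placeholder values; with boto3 these dicts would be passed as
    # boto3.client("dms").create_endpoint(**endpoint) and
    # boto3.client("dms").create_replication_task(**task).
    endpoint = target_endpoint_params("alpha-target", "target.example.internal", True)
    task = full_load_task_params(
        "alpha", "arn:source", "arn:target", "arn:instance",
        table_mappings("testdb", ["t0", "t1"]),
    )
    print(endpoint)
    print(task)
```

Keeping the parameters as plain dictionaries makes it easy to review the FK setting and the table selection before anything is created in an AWS account.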

Then you can create a second Migration Task (say Beta) using a different target endpoint, this time with the foreign keys enabled on the target. You can create it even before the full load with Alpha, to avoid waiting. Configure the Beta task to run continuously and let it sync the delta that accumulated during the initial load. You can even start the Beta task from a cut-off timestamp. It uses the source MySQL database's binlogs to propagate the changes. If you don't create the Beta task, that is, if you don't use a separate target endpoint with the parameter that enables the FKs, the DELETE statements that run on the source during the migration will not propagate to the target correctly, and the CASCADEs to the child tables will not happen on the target. CASCADE is a property of the Foreign Key.
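The effect of applying a parent DELETE on a target with and without FK enforcement can be sketched locally. This example uses Python's built-in sqlite3 as a stand-in for MySQL, so it is only an analogy: `PRAGMA foreign_keys` plays the role of MySQL's `FOREIGN_KEY_CHECKS`, and the simplified t0/t1 tables and the function name are made up for the illustration.

```python
import sqlite3

# Simplified SQLite version of the t0 -> t1 model from the DDL above.
DDL = """
CREATE TABLE t0 (id INTEGER PRIMARY KEY, txt TEXT);
CREATE TABLE t1 (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    t0id INTEGER REFERENCES t0(id) ON DELETE CASCADE,
    txt TEXT
);
"""


def children_left_after_parent_delete(fk_enabled):
    """Delete the parent row and return how many child rows remain."""
    conn = sqlite3.connect(":memory:")
    # Must be set outside a transaction, so do it right after connecting.
    conn.execute("PRAGMA foreign_keys = " + ("ON" if fk_enabled else "OFF"))
    conn.executescript(DDL)
    conn.execute("INSERT INTO t0 (id, txt) VALUES (1, 'parent')")
    conn.executemany(
        "INSERT INTO t1 (t0id, txt) VALUES (?, ?)",
        [(1, "child a"), (1, "child b")],
    )
    conn.commit()
    # Only the parent DELETE is applied; any cascade is up to the database.
    conn.execute("DELETE FROM t0 WHERE id = 1")
    conn.commit()
    (remaining,) = conn.execute("SELECT COUNT(*) FROM t1").fetchone()
    conn.close()
    return remaining


if __name__ == "__main__":
    print("FKs enabled: ", children_left_after_parent_delete(True))
    print("FKs disabled:", children_left_after_parent_delete(False))
```

With enforcement on, the child rows disappear together with the parent; with it off, they stay behind as orphans, which is the kind of mismatch described above.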

To reconcile, that is, to find out whether everything had been migrated, I counted the rows in each table on the source and target databases. To do that, I used Pentaho Spoon CE to quickly create a job that counts the rows on both databases and validates the migration/integration interfaces.
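The same row-count reconciliation can be sketched in a few lines of Python instead of a Pentaho job. This is a minimal sketch, not the job used in the experiment: it assumes DB-API style connections (for MySQL these could come from any DB-API driver), and the in-memory SQLite databases and helper names are stand-ins invented for the demonstration.

```python
import sqlite3


def count_rows(conn, tables):
    """Return {table: row count} for the given tables on one connection."""
    counts = {}
    cur = conn.cursor()
    for table in tables:
        # Table names cannot be bound as parameters, so validate them first.
        if not table.isidentifier():
            raise ValueError(f"unexpected table name: {table!r}")
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cur.fetchone()[0]
    cur.close()
    return counts


def find_mismatches(source_counts, target_counts):
    """Return {table: (source count, target count)} where the counts differ."""
    return {
        table: (n, target_counts.get(table))
        for table, n in source_counts.items()
        if target_counts.get(table) != n
    }


def demo_db(n_parents, n_children):
    """Hypothetical in-memory database with t0 and t1 filled with dummy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t0 (id INTEGER PRIMARY KEY, txt TEXT)")
    conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, t0id INTEGER, txt TEXT)")
    conn.executemany(
        "INSERT INTO t0 (id, txt) VALUES (?, ?)",
        [(i, "p") for i in range(n_parents)],
    )
    conn.executemany(
        "INSERT INTO t1 (t0id, txt) VALUES (?, ?)",
        [(0, "c") for _ in range(n_children)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    source, target = demo_db(2, 5), demo_db(2, 3)
    tables = ["t0", "t1"]
    print(find_mismatches(count_rows(source, tables), count_rows(target, tables)))
```

Matching counts do not prove that the row contents are identical, so this is only a first-level check, in the same spirit as the counting job described above.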

Overall, I found AWS DMS very easy to use. It quickly helps you wire up an integration interface in the Cloud and start pumping and syncing data between source and target databases, be they on premises or in the Cloud. It is a kind of middleware setup in AWS style, in the Cloud. No more middleware tools for data migration; AWS now has its own.

Wednesday, 23 December 2015

LEVERAGE GEOGRAPHICALLY-DISTRIBUTED DEVELOPMENT

As technology advances at warp speed, certain methodologies go by the wayside to make room for more advanced and efficient versions, and the way development projects are managed is a case in point.  Companies in every industrialized nation are embracing Geographically-Distributed Development, or GDD, which has established itself as an impressive and proven IT strategy model.  The outdated procedures used to administer virtually any type of development project were limited to one or several building sites.  That was then; this is now.

Let’s take a look at the advantages that GDD offers:
decreased labor expenses
increased availability to skilled resources
reduced time-to-market, with round-the-clock flexible staffing
The beauty of GDD is that it allows enterprises, regardless of location, to respond to changes in business circumstances as they happen.  Any feedback can be presented, instantaneously, within a global framework.

For GDD to achieve its vast potential, the major barriers that might impede an enterprise’s success must be reduced to a minimum or eliminated entirely within the GDD strategy.  It is crucial to uncover and target the increased communication and coordination expenses that arise at an international level within a globally-distributed market.  If these expenses are allowed to grow unchecked, the very benefits of GDD can be sorely compromised.  Various challenges must be reckoned with:  1) cultural variances, 2) language differences and 3) inaccessibility of time-sensitive information.  These can all jeopardize the progress of distributed projects.

GDD is oblivious to location.  It is an IT strategy model without borders.  This allows development team-members to work collectively and cohesively within a city, across state lines or across continents.  One or more sites might be engaged in a particular software-development venture while one or more outsourcing companies work, simultaneously, towards the projected goal.  Outsourcing companies would contribute their efforts and expertise, like a fine-tuned engine, within the software’s project-development cycle.  Optimized efficiency and cost savings, via structured and coordinated local or global team-work, are refreshingly realized.

With GDD, thorough and clear communication has to be established among all team members and those coordinating the project.  Business demands now include global sourcing, service-oriented architecture, new compliance regulations, new development methodologies, reduced release cycles and broadened application lifetimes.  Because of this, highly-effective, unencumbered communication is mission-critical, and the need arises for a solution that has the power to:
Provide management visibility of all change activities among distributed development teams 
Integrate and automate current change processes and best practices within the enterprise
Organize the distribution of dependent change components among platforms and teams
Protect intellectual property
Track and authenticate Service Level Agreements (SLAs)

Enabling an organization to efficiently manage and significantly optimize communication among all stakeholders in the change process is a priceless component of an Application Lifecycle Management (ALM) solution.  Multiple GDD locales present inherent challenges:  language and cultural divides, varying software-development methods, change-management protocols, security practices, adherence to industry mandates and client business requirements.  The good news is that an ALM solution tackles these hurdles with ease!

Provide Management Visibility of all Change Activities among Distributed Development Teams

When a centralized repository allows for the viewing of all the activities, communications and artifacts that could be impacted by the change process, you have beauty in motion; and this is what ALM does.  Via ALM, users have the luxury of effortlessly viewing project endeavors by each developer, development group or project team--irrespective of location, platform and development setting.  This type of amenity becomes especially striking when one begins to compare this model-type with other distributed environments where work-in-progress is not visible across teams due to niche teams employing their own code repositories.

ALM gives development managers the opportunity not only to track but also to validate a project’s standing.  A project’s status can be verified, which helps to guarantee the completion of tasks.  User-friendly dashboards alert management if vital processes show signs of sluggishness or inefficiency.

ALM ensures that the overall development objectives will be met on a consistent basis.  This is accomplished through seamless coordination between remote and local development activities.  The ALM-accumulated data plays a crucial role in boosting project management, status tracking, traceability, and resource distribution.  Development procedures can be continually improved, thanks to generated reports that allow process metrics to be collected and assessed.  ALM also allows regulatory and best-practices compliance to be monitored and evaluated with little effort.  Compliance involves structuring the applicable processes and creating the necessary reports.  ALM executes the compliance strategy and offers visibility into the needed historical information, regardless of users’ geographic locations.

Integrate and Automate Current Change Processes and Best Practices within the Enterprise

In a perfect world, each and every facet of a company’s application development would be easy, and with ALM it is.  By way of ALM, companies can establish defined, repeatable, measurable and traceable processes based on best practices.  User-friendly point-and-click set-up functions enable one to create a collection of authorized processes that automate task assignments and the movement of application artifacts.

ALM streamlines change management by simplifying the handling of changes and the necessary proceedings.  This in turn means changes can be analyzed and prioritized.  The approval management functions require that official authorizations be secured before any changes are permitted to go forth.  ALM’s automated logging functions greatly simplify the tracking of software changes.  This is significant, since changes can be tracked from the time a request is received up to the time a solution is released to production.

Every member that is part of the global development team would be duly notified regarding any required assignments as well as any happenings that would have an impact on their efforts.

Organize the Distribution of Dependent Change Components among Teams and Platforms

It’s no secret that when there are changes within just one system of a cohesive enterprise, those changes can impact other systems.  ALM offers multi-platform support, which ensures that modifications made on disparate platforms by geographically-dispersed teams can be navigated through the application lifecycle jointly.  A Bill of Materials Process, or BOMP, serves as an on-board feature that permits users to create file portfolios that incorporate components from various platforms.  This means those portfolios can travel through the lifecycle as a unit.  Additionally, some ALM solutions ensure that the parts within the assemblies are deployed to the suitable platforms at each stage of the lifecycle.

Protect Intellectual Property

An ALM solution allows for access and function control over all managed artifacts.  Managers can easily determine and authorize any access to vital intellectual property because ALM operates on a role-based control system.  The role-based structure streamlines administrative operations, since a system administrator no longer has to assign individual rights to each single user.  Additionally, a role-based system delivers a clearly-defined overview of access rights across groups of individuals.

Track and Authenticate Service Level Agreements

The overall project plan, absolutely, must remain on schedule while administering accountability for established deliveries; and this can be easily realized through ALM’s ability to track and authenticate tasks and processes.  The ALM solution caters to satisfying Service Level Agreement (SLA) requirements within an outsourcing contract.  As a result, project management is enhanced by ensuring performance of specific tasks.  Optimizing the user’s ability to track emphasized achievements is made possible due to the consistency between tasks that have been assigned to developers and tasks that are part of the project plan.   Invaluable ALM-generated reports will track response and resolution times; and service-level workflows automate service processes and offer flexibility.  This translates into an acceleration of processes to the respective resources to meet project deadlines.  The ability to track performance against service-level agreements is made possible due to the availability of reports and dashboards that are at one’s fingertips.

Enhance Your Geographically-Distributed Development

As stated, ALM is beauty in motion; and aside from promoting high levels of communication and coordination, it utilizes management strategies designed to plow through any obstructions that have the potential to compromise success.  ALM’s centralized repository is designed to present ideas, designs, dialogue, requirements, tasks and much more to team-members who require or desire instant access to data.  Development procedures and tasks can be precisely and efficiently automated and managed thanks to ALM’s cohesive workflow capabilities.  All vital intellectual property is stored and safeguarded in a central repository.  With this caliber of protection, the risk of loss and unauthorized access is greatly reduced.  When remote software development is in sync with local development, project management becomes seamless and highly coordinated.  Integrated monitoring, tracking and auditing through reports and dashboards helps management meet project deadlines.  It would behoove any enterprise that wishes to reap the rewards of GDD to fully embrace ALM as its solution; it is truly mission-critical.

Application Lifecycle Management Solutions

Application lifecycle management programs deliver a wide range of enterprise software change and configuration management facilities.  ALM solutions can support the needs of multiple groups of geographically-distributed developers.  Business process automation services, designed to automate and enforce ongoing service delivery processes throughout enterprise organizations, are a vital component of ALM solutions.  Within those groups of geographically-distributed developers, the product continues to show its strengths, since it:  provides permission-based assignment and enforcement services; offers role-based interfaces that support developers, software engineers, project managers, IT specialists, etc.; delivers enterprise application inventory-management services; oversees and coordinates large software inventories and configurations; guards user access; manages differing versions of application code; supports concurrent development projects; and coordinates a variety of release management facilities.

Mike Miranda writes about topics ranging from legacy modernization to application lifecycle management, data management, big data and more.