Database Replication Moves Data Efficiently Over Internet

We now have this huge computer network called the Internet and we usually think of it in terms of Web pages, but it has potential for a lot more than just providing online marketing material for us to look at. Organizations with offices across multiple locations can make use of high-speed Internet connections and replicated databases to transfer data from one place to the other, and keep one another up-to-date on what’s going on.

Replicated databases, as the name implies, are sets of data that are kept in more than one place. Since it isn’t practical to simply copy entire databases, technologies exist to keep multiple databases in sync by copying only those parts of the database that have changed. As I’ve said in past articles, at NicomIT, we work mainly with Microsoft technologies, so this article deals primarily with Microsoft SQL Server databases, although similar types of technologies are used in other database systems. Oracle, for example, offers synchronization, and there is even the capability to send data from an Oracle database to a Microsoft SQL Server databases in a replicated fashion.

We usually use what’s known as a Virtual Private Network, or VPN, running over the Internet to transmit data in replicated scenarios. A VPN can be set up on any computer and offers a secure channel, via passwords and/or data encryption, for data to flow. So you get the ease and pervasiveness of the Internet with the security of the VPN. Replicated database applications can be scaled from very simple to very complex, depending on the size and frequency of data traffic.

There are basically three types of replication: snapshot, transactional, and merged. Snapshot replication consists of simply sending a read-only copy of a database from one place to another. This would be used where periodic updates are required (say once a day or once a week) and the destination is only interested in the resultant data, not the transactions that were processed.

Transactional Replication keeps replicated databases in sync by sending every transaction from one place to the other. These may be sent in batches on a periodic basis, or done immediately. This later scenario is the most common, and requires the different locations to be always connected together.

Merged Replication is also very common, and is used where there is a centralized server and decentralized clients that are not always connected. This type of technology is similar to what you use if you have a Palm Pilot or other such PDA (personal digital assistant) that
you occasionally have to synchronize with your office computer. On occasion the clients have to be synchronized with the server, and changes made at either end are brought across to the other end. There must be rules put in place to deal with conflicts which occur when changes are made to the same record by different people.

We’re finding Merged Replication to be the most common in our client base, where multiple sites and/or agents in the field keep track of data that must be brought back to the central office, and vice-versa. One of our clients, for example, moves inventory in vans and we are setting them up so that their inventory application is available to the people in the vans, yet head office is kept up-to-date via their centralized system. Other uses we have found for replicated databases are creating subsets of data for reporting and business intelligence purposes, and for doing real-time instantaneous backups of databases in offsite locations, so that no data is lost in the case of a fire or some other disaster.

I should also note that a computer application does not have to be built with replication in mind. Replication can be retrofitted to existing applications if the need arises.

The bottom line: Replicating databases is a secure way of keeping data synchronized across multiple locations, and the network you need to send the data from one place to the other is the ubiquitous Internet.