The term "data pipeline" refers to a sequence of processes that collect raw data and convert it into a format usable by software applications. Pipelines can run in batch or in real time, can be deployed in the cloud or on premises, and can be built with open-source or commercial tooling.
Data pipelines are like the physical pipes that carry water from a river to your home: they move data from source systems into destinations such as data lakes or warehouses, where it can be analyzed for insight. In the past, transferring data meant manual procedures such as daily file uploads and long waits before insights arrived. Data pipelines replace those manual steps and let organizations move data more efficiently and with less risk.
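To make the idea concrete, here is a minimal batch-pipeline sketch in Python: it extracts rows from a CSV file, transforms them, and loads them into a SQLite table standing in for a warehouse. The file name, column names, and table schema are illustrative assumptions, not details from the text above.

```python
import csv
import sqlite3

def extract(path):
    """Collect raw data: read rows from a source CSV file."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows):
    """Convert raw rows into the shape the downstream application expects."""
    for row in rows:
        yield (row["order_id"], row["customer"], float(row["amount"]))

def load(records, db_path="warehouse.db"):
    """Load transformed records into a table standing in for the warehouse."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, customer TEXT, amount REAL)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    con.commit()
    con.close()

if __name__ == "__main__":
    # One batch run: extract -> transform -> load.
    load(transform(extract("orders.csv")))
```

A real-time variant would run the same extract-transform-load steps continuously against a stream rather than once per batch.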
Accelerate development with a virtual data pipeline
A virtual data pipeline can deliver large infrastructure savings: lower storage costs in the data center and in remote offices, plus reduced hardware, network, and management costs for non-production environments such as test environments. Automating data refresh, masking, and role-based access control, along with the ability to customize and integrate databases, further cuts the time needed to deliver those environments.
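As a simple illustration of the masking step mentioned above, the sketch below replaces sensitive fields with deterministic hashes before data reaches a test environment. The field names and the hashing approach are assumptions for illustration only, not a prescribed masking scheme.

```python
import hashlib

SENSITIVE_FIELDS = {"email", "ssn"}  # assumed sensitive columns

def mask_record(record):
    """Return a copy of the record with sensitive fields replaced by
    short deterministic SHA-256 digests, so joins still work in test data."""
    masked = dict(record)
    for field in SENSITIVE_FIELDS & masked.keys():
        digest = hashlib.sha256(str(masked[field]).encode()).hexdigest()[:12]
        masked[field] = f"masked_{digest}"
    return masked

print(mask_record({"order_id": "42", "email": "alice@example.com", "ssn": "123-45-6789"}))
```

Because the masking is deterministic, the same source value always maps to the same masked value, which preserves referential integrity across tables in the refreshed test copy.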
IBM InfoSphere Virtual Data Pipeline (VDP) is a multi-cloud copy data management solution that decouples test and development environments from production infrastructure. It uses patented snapshot and changed-block tracking technology to capture application-consistent copies of databases and other files. Users can provision masked virtual copies of databases from VDP and mount them in non-production environments, so testing can begin within minutes. This is especially useful for accelerating DevOps and agile practices and for shortening time to market.
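VDP's snapshot internals are proprietary, so the sketch below is only a generic illustration of the changed-block idea: after an initial full snapshot, later captures record only the blocks whose contents differ. The block size, hashing, and in-memory representation are assumptions made for this toy example, not IBM's implementation.

```python
import hashlib

BLOCK_SIZE = 4096  # bytes per block in this toy example

def block_hashes(data):
    """Hash each fixed-size block so changed blocks can be detected cheaply."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]

def changed_blocks(previous_hashes, data):
    """Return the indices and contents of blocks that differ from the last snapshot."""
    current = block_hashes(data)
    return [
        (i, data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE])
        for i, h in enumerate(current)
        if i >= len(previous_hashes) or h != previous_hashes[i]
    ]

# First snapshot captures every block; later captures record only the deltas.
base = b"A" * 8192
snapshot_hashes = block_hashes(base)
updated = b"A" * 4096 + b"B" * 4096
print([i for i, _ in changed_blocks(snapshot_hashes, updated)])  # -> [1]
```

Capturing only changed blocks is what keeps repeated virtual copies cheap in both storage and time compared with full physical copies.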