# ELT Cloud Based Pipeline Architecture
*Sources: [ELT Architecture in the Azure Cloud | by Greg Marsh | Aptitive](https://blog.aptitive.com/elt-architecture-in-the-azure-cloud-50a90681036b) | [How to Build a Data Warehouse for the Insurance Industry | by Greg Marsh | Aptitive](https://blog.aptitive.com/how-to-build-a-data-warehouse-for-the-insurance-industry-adb6399ddb89)*
## Overview
Extract, Load, Transform (ELT) is the process of ingesting data from a variety of raw sources, both structured and semi-/unstructured, and combining them (in the cloud, in this case) into a centralized data repository that business reporting and intelligence tools can access.
To demonstrate, the diagram below displays two data flows: one using relational data and the other using semi-structured flat files. In the end, the two flows merge into a single warehouse.

## Best Practices
1. **Isolate your source data in a "common landing area"**:
The first step of the process is to identify the source tables needed to build out the warehouse and load that information into a staging database (see the first sketch after this list).
2. **Denormalize and combine data into a data hub**:
After staging the data in the common landing area (CLA) or staging schema (STG), the load should trigger a [[Stored Procedure]] (or an equivalent transformation job) that combines the data into common, denormalized tables (second sketch below).
3. **Create a Star Schema Warehouse**:
Finally, the team loads the business layer into the data warehouse by assigning surrogate keys to the dimensions, creating references to those keys in the facts, and structuring the tables in a star schema (third sketch below). If designed correctly, any modern reporting tool will be able to connect to the DW and generate high-performance reports.
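The three sketches below illustrate these steps in Python with pandas and SQLAlchemy; the connection strings, schemas, and table names (`policy`, `claim`, `customer`) are hypothetical placeholders, not anything from the source articles. First, the staging load of step 1:

```python
# Sketch of step 1: copy identified source tables into a staging (STG) schema.
# Connection strings, table names, and the "stg" schema are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

source = create_engine("mssql+pyodbc://source-dsn")    # hypothetical source DB
staging = create_engine("mssql+pyodbc://staging-dsn")  # hypothetical staging DB

SOURCE_TABLES = ["policy", "claim", "customer"]  # tables identified for the warehouse

for table in SOURCE_TABLES:
    # Land an untransformed copy of each source table in the common landing area.
    df = pd.read_sql_table(table, source)
    df.to_sql(table, staging, schema="stg", if_exists="replace", index=False)
```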
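Next, step 2. In production this combine step would typically live in a [[Stored Procedure]]; a pandas stand-in is shown purely for illustration:

```python
# Sketch of step 2: denormalize staged tables into a common "hub" table.
# Table and column names are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

staging = create_engine("mssql+pyodbc://staging-dsn")  # hypothetical staging DB

policy = pd.read_sql_table("policy", staging, schema="stg")
claim = pd.read_sql_table("claim", staging, schema="stg")
customer = pd.read_sql_table("customer", staging, schema="stg")

# Join the staged tables into one wide, denormalized record per claim.
hub_claims = (
    claim
    .merge(policy, on="policy_id", how="left")
    .merge(customer, on="customer_id", how="left")
)
hub_claims.to_sql("claims", staging, schema="hub", if_exists="replace", index=False)
```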
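Finally, step 3: assign surrogate keys to a dimension and reference them from the fact table, yielding a star schema. Again, column names and the warehouse connection are assumed for the sake of the example:

```python
# Sketch of step 3: build a dimension with generated surrogate keys and a fact
# table that references them. Names are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

staging = create_engine("mssql+pyodbc://staging-dsn")
dw = create_engine("mssql+pyodbc://warehouse-dsn")  # hypothetical warehouse DB

hub_claims = pd.read_sql_table("claims", staging, schema="hub")

# Build the customer dimension with a generated surrogate key.
dim_customer = (
    hub_claims[["customer_id", "customer_name"]]
    .drop_duplicates()
    .reset_index(drop=True)
)
dim_customer["customer_key"] = dim_customer.index + 1  # surrogate key

# Build the fact table, swapping the natural key for the surrogate key.
fact_claim = hub_claims.merge(
    dim_customer[["customer_id", "customer_key"]], on="customer_id"
)[["customer_key", "claim_id", "claim_amount"]]

dim_customer.to_sql("dim_customer", dw, if_exists="replace", index=False)
fact_claim.to_sql("fact_claim", dw, if_exists="replace", index=False)
```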

## Collection
The first step is to **collect**. There are a couple of methods to do this: you can dump raw data files into a [[Data Lake]], or pull the data directly into a relational database via more traditional ETL methods.
There needs to be a mechanism in place to orchestrate the transition from the collected input data into a staging area, where all collected data gets housed, logged, and cataloged.
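A minimal sketch of such an intake mechanism, assuming a file-based data lake; the paths, the `land_file` helper, and the JSONL catalog format are hypothetical, not from the source articles:

```python
# Sketch of the collect step: land a raw file in a data lake folder and record
# it in a simple catalog log. Paths and catalog format are hypothetical.
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

LAKE = Path("/datalake/landing")   # hypothetical landing area
CATALOG = LAKE / "catalog.jsonl"   # hypothetical file catalog

def land_file(source_path: str, source_system: str) -> Path:
    """Copy a raw file into the landing area and log it in the catalog."""
    src = Path(source_path)
    dest = LAKE / source_system / src.name
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)
    # Append a catalog entry so every collected file is housed, logged, and cataloged.
    entry = {
        "file": str(dest),
        "source_system": source_system,
        "landed_at": datetime.now(timezone.utc).isoformat(),
        "size_bytes": dest.stat().st_size,
    }
    with CATALOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return dest
```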
*Backlinks:*
```dataview
list from [[ELT Cloud Based Pipeline Architecture]] AND -"Changelog"
```