ETL Project Phases in DataStage

ETL project phases

There are 3 phases in an ETL project.

Phase – I

Data profiling

 

  • Source system analysis is done in this phase.
  • There are 5 types of analysis:
  • CA --> Column Analysis
  • PA --> Primary Key Analysis
  • FA --> Foreign Key Analysis
  • BL --> Baseline Analysis
  • CD --> Cross-Domain Analysis
  • The output tells us whether the data is dirty or not.
  • If the data is dirty, proceed to the next phase.
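To make the profiling checks above more concrete, here is a minimal Python sketch of a column analysis, a primary key check, and a foreign key check. It is only an illustration and not how DataStage performs profiling; the function names, the sample `customers` and `orders` rows, and the rule used to decide that the data is dirty are all assumptions made for this example.

```python
from collections import Counter

def column_analysis(rows, column):
    """Summarise one column: row count, nulls, distinct values, most common value."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }

def is_primary_key_candidate(rows, column):
    """A column is a primary key candidate if it is never null and never repeats."""
    stats = column_analysis(rows, column)
    return stats["nulls"] == 0 and stats["distinct"] == stats["rows"]

def orphan_foreign_keys(child_rows, fk_column, parent_rows, pk_column):
    """Return foreign key values in the child rows that have no matching parent key."""
    parent_keys = {row[pk_column] for row in parent_rows}
    return sorted({
        row[fk_column] for row in child_rows
        if row[fk_column] is not None and row[fk_column] not in parent_keys
    })

# Hypothetical sample data.
customers = [
    {"customer_id": 1, "city": "Hyderabad"},
    {"customer_id": 2, "city": ""},          # missing city
    {"customer_id": 3, "city": "Pune"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 4},      # no such customer
]

city_stats = column_analysis(customers, "city")
pk_ok = is_primary_key_candidate(customers, "customer_id")
orphans = orphan_foreign_keys(orders, "customer_id", customers, "customer_id")

# Assumed rule: any null, key violation, or orphan marks the data as dirty.
is_dirty = city_stats["nulls"] > 0 or not pk_ok or bool(orphans)
print(city_stats, pk_ok, orphans, is_dirty)
```

In this sample the empty city and the order pointing at a missing customer are enough to flag the data as dirty, so it would move on to the cleansing phase.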

Phase – II

Data Quality or cleansing

  • There are 5 stages:
  1. Parsing
  2. Correcting
  3. Standardizing
  4. Matching
  5. Consolidating
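As a rough illustration of the five stages above, the following Python sketch runs a few free-form customer strings through one small function per stage. Everything in it is hypothetical: the lookup tables, the "name, street, city" input layout, the match key, and the "first record wins" survivorship rule are assumptions chosen to keep the example short, not the rules a real cleansing tool would apply.

```python
import re

# Hypothetical lookup tables; real projects keep these as reference data.
CORRECTIONS = {"hydrabad": "hyderabad"}
STANDARD_FORMS = {"st": "street", "rd": "road"}

def parse(raw):
    """Parsing: split a free-form 'name, street, city' string into fields."""
    name, street, city = [part.strip() for part in raw.split(",")]
    return {"name": name, "street": street, "city": city}

def correct(record):
    """Correcting: replace known misspellings of the city."""
    city = record["city"].lower()
    return {**record, "city": CORRECTIONS.get(city, city)}

def standardize(record):
    """Standardizing: single casing, collapsed spaces, expanded abbreviations."""
    words = re.findall(r"\w+", record["street"].lower())
    return {
        "name": " ".join(record["name"].lower().split()),
        "street": " ".join(STANDARD_FORMS.get(w, w) for w in words),
        "city": record["city"].lower(),
    }

def match_key(record):
    """Matching: records that share this key are treated as the same customer."""
    return (record["name"], record["city"])

def consolidate(records):
    """Consolidating: keep one record per match key (assumed rule: first one wins)."""
    golden = {}
    for record in records:
        golden.setdefault(match_key(record), record)
    return list(golden.values())

raw_rows = [
    "Ravi Kumar, 12 MG Rd, Hydrabad",
    "ravi  kumar, 12 M G Road, Hyderabad",   # same customer, written differently
    "Anita Rao, 4 Park St, Pune",
]

golden_copy = consolidate([standardize(correct(parse(r))) for r in raw_rows])
for record in golden_copy:
    print(record)
```

The two spellings of the same customer collapse into one record, which is the idea behind producing a single cleansed copy for the next phase.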

The golden copy is sent to the next phase.

Phase – III

Data Transformation

ETL Process

Example: Suppose Hutch wants to introduce a 10/- recharge price. The top-level manager then needs some information to support that decision.

We have ETL tools and ETL programming tools.

ETL tools extract source data from heterogeneous sources (that is, from different sources).

ETL programming tools extract data from only one external source.

Characteristics of a Data Warehouse

  • Subject-oriented (that is, organized with respect to a subject such as customers or sales)
  • Integrated
  • Non-volatile (read-only)
  • Historical data

Active database: OLTP (time-sensitive, 30 to 90 days)
Archive database: historical data

The data collected from the different sources covers a period of 30 to 90 days. Later it is stored in the archive database, which holds the historical data.
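The move from the active database to the archive can be pictured with a small Python sketch. The 90-day cut-off, the `created` field, and the in-memory lists standing in for the two databases are assumptions for illustration only.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90  # assumed cut-off, taken from the upper end of the 30 to 90 day range

def archive_old_rows(active, archive, today):
    """Move rows older than the retention period from the active list to the archive list."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    keep = [row for row in active if row["created"] >= cutoff]
    archive.extend(row for row in active if row["created"] < cutoff)
    active[:] = keep  # the active store now holds only recent rows

# Hypothetical rows.
today = date(2024, 6, 30)
active_db = [
    {"id": 1, "created": date(2024, 6, 1)},    # recent, stays active
    {"id": 2, "created": date(2024, 1, 15)},   # older than 90 days, gets archived
]
archive_db = []

archive_old_rows(active_db, archive_db, today)
print(active_db, archive_db)
```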

  • ETL is a multilayer process.
  • A data warehouse is a database that collects data from heterogeneous sources as per the business requirements of a top-level manager.
  • Data warehousing is a process that combines ETL activities and BI (Business Intelligence) activities.
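The layers described above can be sketched in a few lines of Python: two sources in different formats are extracted into one shape, transformed, loaded into a small warehouse table, and then queried the way a BI report might. The inlined CSV and JSON data, the `recharge_fact` table, and the in-memory SQLite database are all assumptions made so the example is self-contained; a real DataStage job would be built in the tool itself.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources in different formats, inlined for the example.
CSV_SOURCE = "customer,amount\nravi,10\nanita,25\n"
JSON_SOURCE = '[{"customer": "ravi", "amount": 15}]'

def extract():
    """Extract layer: read every source into a common list-of-dicts shape."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows += json.loads(JSON_SOURCE)
    return rows

def transform(rows):
    """Transform layer: unify casing and convert amounts to integers."""
    return [(row["customer"].strip().title(), int(row["amount"])) for row in rows]

def load(conn, rows):
    """Load layer: write the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS recharge_fact (customer TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO recharge_fact VALUES (?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

# BI-style question on top of the warehouse: total recharge amount per customer.
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM recharge_fact GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```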

OLTP                     OLAP
Transaction              Analysis
Multi-user               Fewer users
Small in size            Large in size
Volatile (read/write)    Non-volatile (read)

  • In ETL, extraction has to be done within a specified time.
  • Loading also has to be done within a specified time.
  • For example, extraction may have to be done between 10:00 pm and 11:00 pm, and loading between 12:00 am and 2:00 am.

--> Extract window: the time period specified by the client in which to hit the source and extract the data is known as the extract window.

--> Load window: the time period specified by the client in which to hit the target and load the data is known as the load window.

--> After extraction, the collected data should be stored in an area called the staging area.

After the data is loaded into the warehouse, the data in the staging area is permanently deleted. This is known as a flush.
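The window and staging ideas above can be sketched as follows. This is a simplified Python illustration, not DataStage scheduling: the window times, the `run_extract` and `run_load` helpers, and the lists standing in for the staging area and the warehouse are all hypothetical.

```python
from datetime import time

def in_window(now, start, end):
    """True if `now` falls inside [start, end); also handles windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end

# Hypothetical windows agreed with the client.
EXTRACT_WINDOW = (time(22, 0), time(23, 0))
LOAD_WINDOW = (time(0, 0), time(2, 0))

staging_area = []   # temporary holding area between extract and load
warehouse = []      # stand-in for the target

def run_extract(now, source_rows):
    """Hit the source only inside the extract window and park the rows in staging."""
    if not in_window(now, *EXTRACT_WINDOW):
        return False
    staging_area.extend(source_rows)
    return True

def run_load(now):
    """Hit the target only inside the load window, then flush the staging area."""
    if not in_window(now, *LOAD_WINDOW):
        return False
    warehouse.extend(staging_area)
    staging_area.clear()   # flush: staged data is deleted after a successful load
    return True

refused = run_extract(time(21, 30), ["row-1"])             # before the extract window
extracted = run_extract(time(22, 15), ["row-1", "row-2"])  # inside the extract window
too_early = run_load(time(23, 30))                         # before the load window
loaded = run_load(time(0, 30))                             # inside the load window
print(refused, extracted, too_early, loaded, warehouse, staging_area)
```

Keeping the window check separate from the extract and load steps means the same helper also covers a window that spans midnight, such as 11:00 pm to 1:00 am.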

Author: TekSlate