DataStage is a comprehensive ETL tool that provides end-to-end ERP solutions.
Some of the most popular ETL tools are:
DataStage has more than 12 years of history.
The first release was in 1997.
1997 – VMARK – UK - - >
Mr. Lee Scheffler - - > father of DataStage
- - > DataStage was called Data Integrator during 1997 - - > Torrent (Data Integrator)
IBM acquired Informix, with its database, in 2000.
It is the combination of Informix + Data Integrator.
- - > Due to the combination with ORCHESTRATE, DataStage acquired parallel capabilities.
It can be configured only on UNIX flavours.
- - > Up to version 7.5.1, server components were configured only on UNIX flavours - - >
Ascential DataStage PX: a virtual environment (like UNIX) had to be created in Windows XP to run DataStage.
It can perform only data transformation.
- - > MKS Toolkit - - > Ascential Suite components
Release
(a) ProfileStage
(b) QualityStage
(c) AuditStage
(d) MetaStage
(e) DataStage PX
(f) DataStage TX as software
2005
- - > IBM acquired the whole of Ascential - - > IBM DataStage Enterprise Edition 7.5x2 - - > (used by 50% of users)
2006
- - > IBM WebSphere DataStage & QualityStage 8.0.1 - - > IDE (Integrated Development Environment) - - > (used by 40% of users)
(a) ProfileStage
(b) QualityStage
(c) AuditStage
(d) MetaStage
(e) DataStage PX
2009
- - > IBM InfoSphere DataStage & QualityStage 8.0.1 - - > improved web services; the server has changed - - > (used by 10% of users)
Reads the data from any Source and loads it to any Target.
Any SRC ↔ Any Target
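As a rough illustration of the "any source to any target" idea, the sketch below reads rows from a CSV source, applies a simple transformation, and loads the result into a database table. It is plain Python, not DataStage code. The in-memory CSV text, the `emp` table, and the function names are all hypothetical and chosen only for this example.

```python
import csv
import io
import sqlite3

# Hypothetical source: a small CSV held in memory (stands in for "any source").
SOURCE_CSV = "ename,dno\na,10\nb,20\nc,10\n"


def extract(text):
    """Read rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Upper-case the name and convert the department number to an integer."""
    return [(row["ename"].upper(), int(row["dno"])) for row in rows]


def load(records, conn):
    """Write the transformed records into a target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS emp (ename TEXT, dno INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", records)
    conn.commit()


# Hypothetical target: an in-memory SQLite database (stands in for "any target").
target = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), target)
loaded = target.execute("SELECT ename, dno FROM emp ORDER BY ename").fetchall()
print(loaded)
```

Swapping the source or the target only means replacing `extract` or `load`; the transformation step stays the same.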
A job designed for one OS can be executed on another.
- - > A platform can generally be either software or hardware.
Hard disk - - > CPU - - > RAM
A machine can have 32–64 CPUs, that is, a hard disk with multiple CPUs.
Node - - > logical CPU (or) instance of a (physical) CPU
- - > Node configuration is software that creates virtual CPUs.
Ex: ETL
Hard disk - - > CPU - - > RAM
SMP
The system does not use the maximum capabilities of the CPU, so node configuration is software that divides the CPU into different nodes. This boosts the capabilities of the CPU.
- - > Horizontal combining
- - > Combining primary rows with secondary rows with respect to key column values
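Horizontal combining can be sketched in a few lines of Python. This is only an illustration of the idea, not DataStage code: it assumes an inner-join style match (primary rows without a matching key are dropped), and the function name `horizontal_combine` is hypothetical. The sample rows are small employee and department records keyed on DNO.

```python
# Primary rows: (ENO, EName, DNO); secondary rows: (DNO, DName, Loc).
employees = [
    (11, "a", 10), (12, "b", 20), (13, "c", 10), (14, "d", 30), (15, "e", 20),
]
departments = [(10, "ACE", "Hyd"), (20, "Meter", "Sec"), (30, "Sales", "Eng")]


def horizontal_combine(primary, secondary, primary_key_idx, secondary_key_idx):
    """Append the matching secondary row's columns to each primary row."""
    # Index the secondary rows by their key column value.
    lookup = {row[secondary_key_idx]: row for row in secondary}
    combined = []
    for row in primary:
        match = lookup.get(row[primary_key_idx])
        if match is not None:  # assumption: unmatched primary rows are dropped
            # Skip the secondary key column so the key is not repeated.
            rest = match[:secondary_key_idx] + match[secondary_key_idx + 1:]
            combined.append(row + rest)
    return combined


joined = horizontal_combine(employees, departments, 2, 0)
for row in joined:
    print(row)
```

Each output row carries the employee columns followed by the department name and location for the same DNO.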
It is a technique of distributing the records across the nodes, based on partitioning techniques.
Note:
- - > Key-based technique assures that the same key column values are collected at the same partition.
DNO = primary key

ENO | EName | DNO |
11 | a | 10 |
12 | b | 20 |
13 | c | 10 |
14 | d | 30 |
15 | e | 20 |

DNO | DName | Loc |
10 | ACE | Hyd |
20 | Meter | Sec |
30 | Sales | Eng |
When combining, i.e. using a horizontal combination, the same key column values are collected at the same partition.
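The key-based idea can be sketched as follows. This is a minimal illustration, not how DataStage implements its partitioners: it assumes three nodes and uses the key value modulo the node count as the partition number. `NUM_NODES` and `partition_by_key` are hypothetical names.

```python
NUM_NODES = 3  # hypothetical number of nodes for this sketch

# Employee rows: (ENO, EName, DNO); department rows: (DNO, DName, Loc).
employees = [
    (11, "a", 10), (12, "b", 20), (13, "c", 10), (14, "d", 30), (15, "e", 20),
]
departments = [(10, "ACE", "Hyd"), (20, "Meter", "Sec"), (30, "Sales", "Eng")]


def partition_by_key(rows, key_idx, num_nodes=NUM_NODES):
    """Send each row to the partition chosen by key value modulo node count."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[row[key_idx] % num_nodes].append(row)
    return partitions


emp_parts = partition_by_key(employees, key_idx=2)
dept_parts = partition_by_key(departments, key_idx=0)

for node in range(NUM_NODES):
    print(f"node {node}: employees={emp_parts[node]} departments={dept_parts[node]}")
```

Because both inputs are partitioned on DNO with the same rule, every employee row lands on the node that also holds its department row, so each node can combine its own rows without looking at the others.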
The partitioned data is once again repartitioned.
Ex:
EName | Dno | Loc |
A | 10 | AP |
B | 20 | TN |
C | 10 | TN |
D | 20 | KN |
E | 30 | TN |
F | 10 | KN |
G | 20 | AP |
Reverse partitioning is also called collecting.
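Collecting can be sketched as the reverse of the partitioning step: several partitions are merged back into a single stream. The sketch below is an assumption-laden illustration in plain Python, not DataStage code. It partitions the example rows on Dno across three hypothetical nodes (key value modulo node count), then collects them in two ways: partition by partition, and one row from each partition in turn. The function names are hypothetical.

```python
from itertools import zip_longest

# Example rows: (EName, Dno, Loc).
rows = [
    ("A", 10, "AP"), ("B", 20, "TN"), ("C", 10, "TN"), ("D", 20, "KN"),
    ("E", 30, "TN"), ("F", 10, "KN"), ("G", 20, "AP"),
]

NUM_NODES = 3  # hypothetical number of nodes
partitions = [[] for _ in range(NUM_NODES)]
for row in rows:
    partitions[row[1] % NUM_NODES].append(row)  # key-based partitioning on Dno

_MISSING = object()  # marks exhausted partitions in the round-robin collector


def ordered_collect(parts):
    """Read every row from partition 0, then partition 1, and so on."""
    return [row for part in parts for row in part]


def round_robin_collect(parts):
    """Read one row from each partition in turn until all are exhausted."""
    collected = []
    for group in zip_longest(*parts, fillvalue=_MISSING):
        collected.extend(row for row in group if row is not _MISSING)
    return collected


print([r[0] for r in ordered_collect(partitions)])
print([r[0] for r in round_robin_collect(partitions)])
```

Both collectors return the same set of rows as the input; only the order of the single output stream differs.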
Performing the extraction, transformation, and loading steps simultaneously.
A channel through which data moves from one stage to another stage
(Server jobs)
Sequential processing
Ex: Suppose we have 3 instructions:
I1 – Fetch (F), Decode (D), Execute (E), Write back (W)
I2 – F, D, E, W
I3 – F, D, E, W
- - > In a sequential process, the instructions run one after another.
Running all instructions in parallel:
T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 |
F | D | E | W | | | | |
 | F | D | E | W | | | |
 | | F | D | E | W | | |
 | | | F | D | E | W | |
 | | | | F | D | E | W |
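The saving from pipelining can be worked out with a short sketch. It assumes every stage takes exactly one time slot and that a new instruction can enter the pipeline in each slot; the function names are hypothetical.

```python
STAGES = ["F", "D", "E", "W"]  # Fetch, Decode, Execute, Write


def sequential_slots(num_instructions):
    """Each instruction finishes all stages before the next one starts."""
    return num_instructions * len(STAGES)


def pipelined_slots(num_instructions):
    """A new instruction enters the pipeline in every time slot."""
    if num_instructions == 0:
        return 0
    return len(STAGES) + num_instructions - 1


def pipeline_schedule(num_instructions):
    """Build one row per instruction, shifted right by one slot each time."""
    total = pipelined_slots(num_instructions)
    schedule = []
    for i in range(num_instructions):
        row = [""] * total
        for offset, stage in enumerate(STAGES):
            row[i + offset] = stage
        schedule.append(row)
    return schedule


print(sequential_slots(3), pipelined_slots(3))
for row in pipeline_schedule(5):
    print(" | ".join(cell or " " for cell in row))
```

Under these assumptions, 3 instructions need 12 slots sequentially but only 6 when pipelined, and 5 instructions fit into the 8 slots T1 to T8.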
7.5x2 | 8.0.1 |
 | 5 client components |
OS-dependent (OS users become DataStage users) | OS-independent (users can be created in DataStage, but one dependent) |
File-based repository (folder) | Database repository (default is DB2) |
No web-based administration | Web-based administration |
 | 5 architecture components |
Can perform phases 3, 4 | Can perform phases 1, 2, 3, 4 |
2-tier | N-tier |
Note:
The features of Manager in 7.5x2 are integrated into the Designer in 8.0.1.
(a) In 7.5x2, the user IDs used to log in for authentication are created in the OS; OS users become DataStage users.
(b) In 8.0.1, they are created in the DataStage environment.
In 7.5x2, everything is stored in a folder in the form of files.
Data is organized in 2 layers.
4. (a) In 7.5x2, it is 2-tier.
S - - > server
C - - > machine
(b) In 8, we can have multiple servers/engines but only 1 repository.
R – C1, C – C2, E1 – C3, E2 – C4, E3 – C4, ..., En – Cn - - > n-tier: components can be configured on n machines.
Manager
Admin
Web Console
Data profiling (CA, PA, FA, Baseline, Cross-domain)