In Informatica, Transformations help to transform the source data according to the requirements of the target system and it ensures the quality of the data being loaded into the target.
Transformations are of two types: Active and Passive.
Table of Contents |
An active transformation Filter transformation can change the number of rows that pass through it from source to target i.e it eliminates rows that do not meet the condition in transformation.
The following are the list of active transformations used for processing the data
Want to acquire industry skills and gain complete knowledge of Informatica? Enroll in Instructor-Led live Informatica Training to become Job Ready! |
A passive transformation does not change the number of rows that pass through it i.e it passes all rows through the transformation.
Transformations can be Connected or Unconnected. The no of rows entering the transformation is the same as no rows exist the transformation. It doesn’t affect the no of rows
The following is the list of passive transformations used for processing data.
A port represents the column of a table (or) file. There is two basic types of ports.
Ort which can receive the data is known as an input port which is represented as :”I”
A port that can provide the data is know as output port which is represented as “o”.
Connected transformation is connected to other transformations or directly to the target table in the mapping.
A transformation which is part of mapping data flow direction is known as connected transformation. It is connected the source and connected to the target. All active and passive transformation can be refined as connected transformation. A connected transformation can receive multiple input ports, can provide the multiple output ports.
An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation, and returns a value to that transformation.
A transformation which is not part of mapping data flow direction is known as unconnected transformations. It is nether connected to source nor connected to the target. An unconnected transformation can receive the multiple input ports, but provides “single output port”.
The following transformations can be defined as unconnected
Following are the list of Transformations available in Informatica:
In the following pages, we will explain all the above Informatica Transformations and their significances in the ETL process in detail.
Read these latest Informatica Interview Questions & Answers that help you grab high-paying jobs |
Aggregator transformation is an Active and Connected transformation. This transformation is useful to perform calculations such as averages and sums (mainly to perform calculations on multiple rows or groups). For example, to calculate total of daily sales or to calculate average of monthly or yearly sales. Aggregate functions such as AVG, FIRST, COUNT, PERCENTILE, MAX, SUM etc. can be used in aggregate transformation.
Expression transformation is a Passive and Connected transformation. This can be used to calculate values in a single row before writing to the target. For example, to calculate discount of each product or to concatenate first and last names or to convert date to a string field.
Filter transformation is an Active and Connected transformation. This can be used to filter rows in a mapping that do not meet the condition. For example, to know all the employees who are working in Department 10 or to find out the products that falls between the rate category $500 and $1000.
Joiner Transformation is an Active and Connected transformation. This can be used to join two sources coming from two different locations or from same location. For example, to join a flat file and a relational source or to join two flat files or to join a relational source and an XML source.
In order to join two sources, there must be atleast one matching port. at least one matching port. While joining two sources it is a must to specify one source as master and the other as detail.
The Joiner transformation supports the following types of joins:
Normal join discards all the rows of data from the master and detail source that do not match, based on the condition.
Master outer join discards all the unmatched rows from the master source and keeps all the rows from the detail source and the matching rows from the master source.
Detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
Full outer join keeps all rows of data from both the master and detail sources.
Lookup transformation is Passive and it can be both Connected and Unconnected as well. It is used to lookup data in a relational table, view, or synonym. Lookup definition can be imported either from source or from target tables.
For example, if we want to retrieve all the sales of a product with an ID 10 and assume that the sales data resides in another table. Here instead of using the sales table as one more source, use Lookup transformation to lookup the data for the product, with ID 10 in the sales table.
Connected lookup receives input values directly from the mapping pipeline whereas UnConnected lookup receives values from LKP expression from another transformation.
Connected lookup returns multiple columns from the same row whereas UnConnected lookup has one return port and returns one column from each row.
Connected lookup supports user-defined default values whereas UnConnected lookup does not support user-defined values.
Normalizer Transformation is an Active and Connected transformation. It is used mainly with COBOL sources where most of the time data is stored in de-normalized format. Also, Normalizer transformation can be used to create multiple rows from a single row of data.
Rank transformation is an Active and Connected transformation. It is used to select the top or bottom rank of data. For example, to select top 10 Regions where the sales volume was very high or to select 10 lowest priced products.
Router is an Active and Connected transformation. It is similar to filter transformation. The only difference is, filter transformation drops the data that do not meet the condition whereas router has an option to capture the data that do not meet the condition. It is useful to test multiple conditions. It has input, output and default groups. For example, if we want to filter data like where State=Michigan, State=California, State=New York and all other States. It’s easy to route data to different tables.
Sequence Generator transformation is a Passive and Connected transformation. It is used to create unique primary key values or cycle through a sequential range of numbers or to replace missing keys.
It has two output ports to connect transformations. By default, it has two fields CURRVAL and NEXTVAL(You cannot add ports to this transformation). NEXTVAL port generates a sequence of numbers by connecting it to a transformation or target. CURVED is the NEXTVAL value plus one or NEXTVAL plus the Increment By value.
Stored Procedure transformation is a Passive and Connected & UnConnected transformation. It is useful to automate time-consuming tasks and it is also used in error handling, to drop and recreate indexes and to determine the space in database, a specialized calculation etc.
The stored procedure must exist in the database before creating a Stored Procedure transformation, and the stored procedure can exist in a source, target, or any database with a valid connection to the Informatica Server. Stored Procedure is an executable script with SQL statements and control statements, user-defined variables and conditional statements.
Sorter transformation is a Connected and an Active transformation. It allows to sort data either in ascending or descending order according to a specified field. Also used to configure for case-sensitive sorting, and specify whether the output rows should be distinct.
Source Qualifier transformation is an Active and Connected transformation. When adding a relational or a flat file source definition to a mapping, it is must to connect it to a Source Qualifier transformation. The Source Qualifier performs the various tasks such as overriding default SQL query, filtering records; join data from two or more tables etc.
Update strategy transformation is an Active and Connected transformation. It is used to update data in target table, either to maintain history of data or recent changes. You can specify how to treat source rows in table, insert, update, delete or data driven.
XML Source Qualifier is a Passive and Connected transformation. XML Source Qualifier is used only with an XML source definition. It represents the data elements that the Informatica Server reads when it executes a session with XML sources.
Advanced External Procedure transformation is an Active and Connected transformation. It operates in conjunction with procedures, which are created outside of the Designer interface to extend PowerCenter/PowerMart functionality. It is useful in creating external transformation applications, such as sorting and aggregation, which require all input rows to be processed before emitting any output rows.
External Procedure transformation is an Active and Connected/UnConnected transformations. Sometimes, the standard transformations such as Expression transformation may not provide the functionality that you want. In such cases External procedure is useful to develop complex functions within a dynamic link library (DLL) or UNIX shared library, instead of creating the necessary Expression transformations in a mapping.
External Procedure returns single value, whereas Advanced External Procedure returns multiple values.
External Procedure supports COM and Informatica procedures whereas AEP supports only Informatica Procedures.
Following articles describing the ETL Tool Transformation Manager(TM) has been sourced from ETL Solutions Ltd. If you need more information, please visit their official website www.ETLsolutions.com or contact them at info@etlsolutions.com.
ETL Solution's product "Transformation Manager(TM)" is a stand-alone Windows® or Linux® software suite of meta-data driven code generator programs that provides for the authoring, testing, and debugging of data transforms between virtually all types of data, whether XML, databases, flat files, Java™ classes or spreadsheets, with flexible deployment options in both J2EE™ and Microsoft®.NET architectures.
References
http://www.learndatamodeling.com/
For Indepth Knowledge on DataWarehouse, Click On:
Types of schemas in DataWarehouse
DataWarehouse Interview Questions
Our course design of tutorials is practical and informative. At TekSlate, we offer resources to help you learn various IT courses. We avail both written material and demo video tutorials. For in-depth knowledge and practical experience explore Online DataWarehouse Training.
You liked the article?
Like: 0
Vote for difficulty
Current difficulty (Avg): Medium
TekSlate is the best online training provider in delivering world-class IT skills to individuals and corporates from all parts of the globe. We are proven experts in accumulating every need of an IT skills upgrade aspirant and have delivered excellent services. We aim to bring you all the essentials to learn and master new technologies in the market with our articles, blogs, and videos. Build your career success with us, enhancing most in-demand skills in the market.