Flat file source

Ratings:
(4)
Views: 0
Banner-Img
Share this blog:

Flat file source MSBI

Scenario: Remove duplicate records from file systems using either sort transformation or aggregate transformation.

Method 1: (Using aggregate transformation) open business intelligence development studio create a new package and rename it a flat file source with aggregate.

In control flow drag and drop data flow task and rename it as avoid duplicate records using aggregate ----------->

In data flow drag and drop flat file source ----------->

Double click on flat file source to configure it ----------->

Click new to create a new flat file connection manager ----------->

Provide connection manager name and description (storing details) ----------->

Click Browse ----------->

Navigate to the path ----------->

Select student details t2t and click open ----------->

Select columns from left pane ----------->

Click ok ----------->

Select columns in flat file source editor ----------->

Rename the output columns as mentioned below

Column 0 – SND

Column 1 – S Name

Column 2 – Qualification

Column 3 – Marks

Click ok ----------->

Drop and drag aggregate transformation and make a connection from flat file source to aggregate ----------->

Double click on aggregate to configure it ----------->

Select or check all available input columns (Sno, Sname, S Qualification, marks) ----------->

Note: Make sure that the operation is group by for all the selected columns

Click ok ----------->

Drag and drop OLEDB destination and make a connection from aggregate to OLEDB destination ----------->

Double click on OLEDB destination to configure it ----------->

Provide destination connection manager ----------->

Click new to create new destination table if it is not exists. ----------->

Rename the table as aggregated data ----------->

Select mappings ----------->

Click ok ----------->

Execute package

Desired to gain proficiency on MSBI? Explore the blog post on MSBI Training to become a pro in MSBI.

Method 2: Remove duplicate records and also sort the data using sort transformation.

Open business intelligence development studio

Create a new package and rename it as avoid duplicate using sort .dtsx. ----------->

In control flow drag and drop data flow task and rename it as DRT sort transformation In connection manager section right click and select new flat file connection. ----------->

Provide connection manager name and description ----------->

Click browse and select student details  .Txt and click open ----------->

Click ok ----------->

Drag and drop flat file source ----------->

Double click on flat file source and select student details flat file connection manager from drop down list. ----------->

Select columns and rename the output columns as mentioned below.

Column 0  -- SNO

Column 1 – SName

Column 2 – Qualification

Column 3 -  Marks

Click ok ----------->

Drag and drop sort transformation and make a connection from flat file source to sort ----------->

Double click on sort to configure it and select all input columns. ----------->

Check remove rows with duplicate sort values to remove duplicate records from the sorted data sets ----------->

Click ok ----------->

Drag and drop OLEDB destination and make a connection from sort to OLEDB destination ----------->

Double click on OLEDB destination ----------->

Provide destination connection manager if exists and click new to create a new destination table if not exists. ----------->

Rename the new table as sorted data ----------->

Click ok ----------->

Select mapping from left panel ----------->

Click ok

Flat file formats:

In SSIS flat file can be configured in 3 ways.

  1. Delimited: Every column is delimited by a special character except the last column which is delimited by a new line character (Carriage return or line feed) {CR} {LF}

Example: 1, Rama, MCA

2, Vijay, Batch

  1. Fixed width: The Columns are defined by fixed width
  2. Ragged Right: The columns are defined by fixed width except the last column which is delimited by new line character

Bulk insert task: Bulk insert task can be used to insert data from a text file or a flat file into SQL server table. Mainly it is advised to use in applications which required quick loading of large amounts of data.

Steps to configure bulk insert table

Prepare the following data with in a text file and rename it as student details. Txt the sample flat file can be used for the above mentioned methods (Method 1 and method 2)

1, Rama, MCA, 72

2, Vijay, Btech, 71

3, Siva, MCA, 70

4, Laxmi, MCA, 80

5, Arpitha, BSC, 90

Open Business intelligence development studio ----------->

Create a new package and rename it as bulk insert .dtsx ----------->

In control flow drag and drop the bulk insert task ----------->

In connection manager section, right click and select new flat file connection ----------->

Provide connection manager name and description ----------->

Click browse and select student details txt ----------->

Click open ----------->

Select advanced tab and rename the column names as mentioned below ----------->

Column 0 – SNO

Column 1 – S Name

Column 2 – Qualification

Column 3 – marks ----------->

Click ok ----------->

Open SSMS (SQL server management studio) to create destination table to store the data which is coming from student details txt ----------->

USE (Adventure works)

GO

(REATE TABLE [DBO] [Bulk Insert]

([SNO] [Varchar] (50),

[S Name] [Varchar] (50) NULL<

[Qualification] [Varchar] (50) NULL<

[Marks] [Varchar] (50) NULL. ----------->

Press F5to execute the query in SSMS ----------->

Double click on bulk insert task to configure it and set the following properties ----------->

Source connection

File – select student details from drop down list

Destination connection

Connection – select local Host, Adventure work

Destination table – select [Adv. works]. [dbo].

[Bulk Insert] from drop down list.

Column delimited – Select comma{,} option since student details flat file is comma {,} delimited select options tab or page

  1. Batch Size: Batch size specifies the no. of rows in a batch. Each batch is copied to the destination server as 1 trans direction. Server commits or role back in case of failure for each batch.

Note: The default value is zero i.e all the data in the specified data file is 1 batch

  1. Last Row: The last row option specifies the row at which the insertion of the input data stops.

The default is zero, indicates the last row in the specified data file.

  1. First Row: Specifies at which the insertion of the input data begins. The default is 1, indicate the 1st row in the specified data file.

TIPS to improve the performance of bulk insert task.

  1. It is advised not to perform any sort operation while loading the data from text file using bulk insert task.
  2. The data will be loaded faster when there are no indexes created on staging table or destination table.

Data Viewers:  Data viewers used to debug the package and also user or developer, can monitor the data which is flowing from source to other stages (transformation or destinations). In SSIS data viewers are categorized into 4 parts

Check out the top MSBI Interview Questions now!
  1. GRID:- Select Grid type and also select Grid tab

Select the columns to be displayed in the grid format.

Histogram: Select histogram type and click histogram tab.

Select the column which you want to model with the histogram.

  1. Histogram graph works only with 1 column at a time
  2. Histogram graph works only with numeric data
  3. Scatter plot (x,y) : Select scatter plot type and also click select scatter plot (x,y) tab and then select a column for the x-axis and a column for y-axis.
  4. Scatter plot (x,y) graph works only with numeric data.
  5. Column chart: Select column chart type and click column chart tab.

Select the column that you want to model in the column chart

  1. The graph only works with 1 column at a time.

 

Steps to configure Data Viewers

Open Business intelligence development studio ----------->

Create a new package and rename it as data viewers .dtsx ----------->

In control flow drag and drop data flow task ----------->

In data flow drag and drop OLEDB  source and configure it. ----------->

Drag and drop OLEDB destination and configure it ----------->

Make a connection from OLEDB source to OLEDB destination. ----------->

Double click on data flow path ----------->

Select data viewers option ----------->

Click add to add any data viewers ----------->

Select grid type ----------->

Select grid tab ----------->

Select the columns which you want to display in grid format and click ok ----------->

Click ok ----------->

Execute package

Note: Data viewers cannot be implemented on production environment. But data viewers can implemented only on development environment for debugging the package.

For an Indepth knowledge on MSBI click on:

You liked the article?

Like: 0

Vote for difficulty

Current difficulty (Avg): Medium

EasyMediumHardDifficultExpert
IMPROVE ARTICLEReport Issue

About Author

Authorlogo
Name
TekSlate
Author Bio

TekSlate is the best online training provider in delivering world-class IT skills to individuals and corporates from all parts of the globe. We are proven experts in accumulating every need of an IT skills upgrade aspirant and have delivered excellent services. We aim to bring you all the essentials to learn and master new technologies in the market with our articles, blogs, and videos. Build your career success with us, enhancing most in-demand skills in the market.

Stay Updated
Get stories of change makers and innovators from the startup ecosystem in your inbox