Showing posts with label ssis. Show all posts
Showing posts with label ssis. Show all posts

Wednesday, March 21, 2012

is that possible to prevent modifying the name of an IDTSInputColumn90?

Hi,

I'm building a custom SSIS data flow component and I create myself input and output columns from a custom property of my pipeline component. That's why I don't want the user to modify the name of the input and output columns by using the advanced editor.

Is there a way either to make input/output column properties - at least the name - readonly or to override any PipelineComponent method to throw an exception like we do when we want to prevent the user from adding/removing input/output colulmns with methods like PipelineComponent.DeleteOutputColumn?

Thanks for your help,

David

Have you tried SetInputColumnProperty? The name is a property, but I have vague recollection that you cannot use that method to set the name, and end up having to do it on the IDTSInputColumn90 reference directly.

Can you detect a change, and pick this up in Validate? If you can detect a change you could return VS_NEEDSNEWMETADATA, and then correct the problem in ReinitializeMetaData. Not as proactive as preventing a change, but maybe it is better than nothing?

|||

I've tried SetInputColumnProperty but it is called only for custom properties...and the name is not a custom property.

I any case, like you said, when the user modifies the name of an input column, Validate is called and I can detect the error...

Thanks

Monday, March 19, 2012

is SSIS wonderful?

Hi guys, I'm using sql 2000 now but eventually will upgrade existing server to sql 2005. currently we only have OLTP dbs. I would like to set up data warehouse & OLAP functionality to drive reporting from the system, is SSIS the product I need? can you please give me the suggestions?

regards

SSIS will certainly help you get your data warehouse going. It is certainly light years ahead of DTS in SQL Server 2000 in terms of functionality and operatability.|||

I know SQL 2005 can do pivot() in the query, isn't enough for reporting? Do I really need to use SSIS ?

|||

NewWorld wrote:

I know SQL 2005 can do pivot() in the query, isn't enough for reporting? Do I really need to use SSIS ?

Reporting really has nothing to do with SSIS. Though you can perform upfront calculations in SSIS and load them into a table to be later used for reports.

Pivot? Pivot is only a small piece of SSIS.|||

I've been using SSIS for a couple of months now and have grown to like it. I'm using it for exactly what you mention -- importing data from (multiple) OLTP databases into a relational data warehouse.

The visual designer is very useful since you can easily see how the data is flowing while running the package. And data viewers are a terrific debugging tool since you can see the data at different points in the data pipeline (much easier than debugging raw SQL scripts).

As far as OLAP goes, you would use SQL Server 2005 Analysis Services to define the structure of your OLAP database, and then use SSIS to get your data from the relational warehouse into the OLAP database (that's usually the last step of my SSIS package -- process the cube).

For reporting, you could use SQL Server 2005 Reporting Services to report on the data in the relational warehouse and/or the OLAP cubes.

Hope this helps, and good luck.

-Larry

|||SSIS - whats that?

You would be much better off creating your own app... lol

Anyways - after using SSIS for 6 months - it is improvement over DTS; BUT MS 2nd hack at ETL tool isnt quite the big time yet --> Some/most Built-in tasks/tools are mediocre or missing (FTP,ZIP,Email etc) and Data Flow although good concept doesnt have a match for SQL based solution --> for example to Update data - you would drop dead using OLE Update Command.

Although - I might sound synical - this is not just based on my own experience banging head against wall trying to figure out SSIS; our consultant with 20 years of DB & programming experience has similar opinion - its his idea using SQL based ETLs which we decided to follow.

Wonderful - joke rite?
Improvement - YES
Easy to get started - YES
Satisfies all your requirements --> Maybe - Maybe Not - BUT be ready for all kinds of errors and mishaps in the journey.

As usual - SSIS forum is the best place to find help in case you get into trouble|||

TheViewMaster wrote:


Anyways - after using SSIS for 6 months - it is improvement over DTS; BUT MS 2nd hack at ETL tool isnt quite the big time yet --> Some/most Built-in tasks/tools are mediocre or missing (FTP,ZIP,Email etc) and Data Flow although good concept doesnt have a match for SQL based solution --> for example to Update data - you would drop dead using OLE Update Command.

I'm going to have to disagree with you here. The data flow (where the OLE DB Command transformation is located) is designed to operate on a row-by-row basis. In a perfect world, how would you propose to implement an "update" feature? Are you looking for more of a GUI-based batch update in the CONTROL flow (as opposed to the Execute SQL task)?

|||

Lawrence Parker wrote:

The visual designer is very useful since you can easily see how the data is flowing while running the package. And data viewers are a terrific debugging tool since you can see the data at different points in the data pipeline (much easier than debugging raw SQL scripts).

Sorry mate - Data Flow data view is no match to SQL solution with using TEMP table to Load data and then Views, SPs, queries to process it which you can use anytime to monitor data movement. Dataview is only design time feature and doesnt help you much if package breaks down in the middle of a job.

Also - SQL solution although a bit more code will make packages easier to upgrade

My personal opinion - SSIS is not quite ready for ETL yet - do Extraxt Load Transform - and you are much better off|||I like being disagreed with on technical stuff - by defending my ideas on importing others we all improve.

OLE DB Command - row by row is way too slow - in a perfect world you would have something like UPDATE mytable FROM ... translated to SSIS.

Also - i forgot in me previous post -
I found a nugget today in Microsoft SQL language -
DELETE FROM x FROM x x ...|||

TheViewMaster wrote:

I like being disagreed with on technical stuff - by defending my ideas on importing others we all improve.

OLE DB Command - row by row is way too slow - in a perfect world you would have something like UPDATE mytable FROM ... translated to SSIS.

Also - i forgot in me previous post -
I found a nugget today in SQL language -
DELETE FROM x FROM x x ...

Right, but you cannot perform an "UPDATE mytable FROM anotherTable" in the data flow. (It shouldn't be expected that you could) You can do that today, and it is the preferred method, in the control flow via an Execute SQL task. Even if they did perform an "UPDATE mytable FROM anotherTable" in the data flow, you certainly wouldn't want that executing for EVERY row, would you? The control flow is the place for that operation. So again, are you asking for a GUI-based task in the control flow to perform batch updates?

|||Perhaps I am asking for wrong thing - but in SQL Stored Procedure all that could be done in one task -
Update existing
Insert new
Delete old
which is the bases of most of our ETL imports to our DB tables

At the moment we are using Data Flow to insert text files to db temp tables (In my case I sneak in occasional Derived column) and then do SQL processing.
I do like control of application (SSIS) to do the work, however benefits of SQL solution have outwighed my initial preference:
SQL has better maintainability - changes can be made to SPs, VIEWS without having to go through of hassle to recompile & deploy package
SQL - simpler processing model - TXT - TEMP table - VIEW - SP IMPORT table - VIEW - SP- Production table

However - SSIS benefits:
less TEMP tables and SQL code
ability to add info in Data Flow - derived column
customized filter duplicates Data Flow transform & able to do "INITCAP" more efficiently than in SQL

50/50|||

TheViewMaster wrote:

Perhaps I am asking for wrong thing - but in SQL Stored Procedure all that could be done in one task -
Update existing
Insert new
Delete old
which is the bases of most of our ETL imports to our DB tables

You're going to get me to 1,000 posts yet tonight, aren't you?! What you are talking about can be done in SSIS, though not inside one data flow. You need to think of the control flow as the foundation for all data work. Using your example, and assuming that order, here's how you'd do it.

Execute SQL task to update existing (update myTable FROM anotherTable) -> Data Flow (to insert new records) -> Execute SQL task (delete old)

Pretty easy, and perhaps faster than your stored procedure... Who knows. The data flow is a highly optimized engine for transporting data from point A (which could be one to many sources -- try that in your stored proc) to point B (which could be one to many destinations). Unfortunately updates aren't really part of data transportation, and is really at the mercy of the database engine.

|||1000 posts here we go - oh wait it's only my 135th post

Anyways - it s more complicated than

update

existing (update myTable FROM anotherTable) -> Data Flow (to

insert new records) -> Execute SQL task (delete old)

You need to determine which are new (lookup) - then which are changed (split) [referencing Jaime's article about SSIS - find columns in table A which are not in table B] - the SQL update, SQL delete (and trust me on this one - in SSIS books and docs they dont mention it at all - trial and error my friend to find out best way)

Somehow I think that INSERT in Store Proc is not that much slower that data flow import (and I would be rather in the mercy of DB engine than in the SSIS engine as the probablity is SSIS will be upgraded sooner)
|||

TheViewMaster wrote:


SQL has better maintainability - changes can be made to SPs, VIEWS without having to go through of hassle to recompile & deploy package

Now I *really* disagree with this statement. Using package configurations, you can create a package such that you'll never have to touch it again. I frequently use configurations to store sql so that I can make changes without touching the package. I can change databases, filenames, SQL, connection strings, etc... all with package configurations and without ever having to touch the package.

|||

TheViewMaster wrote:

1000 posts here we go - oh wait it's only my 135th post

You need to determine which are new (lookup) - then which are changed (split) [referencing Jaime's article about SSIS - find columns in table A which are not in table B] - the SQL update, SQL delete (and trust me on this one - in SSIS books and docs they dont mention it at all - trial and error my friend to find out best way)

Somehow I think that INSERT in Store Proc is not that much slower that data flow import (and I would be rather in the mercy of DB engine than in the SSIS engine as the probablity is SSIS will be upgraded sooner)

You did it! In response to the SQL delete, there are a TON of resources out there - Transact-SQL. Most anything you can do in a database, you can do in SSIS in one way or another... If you understand Transact-SQL, you can make SSIS do some pretty wonderful things!

I'll have to take this up another time, as my night has ended...

Phil

is SSIS wonderful?

Hi guys, I'm using sql 2000 now but eventually will upgrade existing server to sql 2005. currently we only have OLTP dbs. I would like to set up data warehouse & OLAP functionality to drive reporting from the system, is SSIS the product I need? can you please give me the suggestions?

regards

SSIS will certainly help you get your data warehouse going. It is certainly light years ahead of DTS in SQL Server 2000 in terms of functionality and operatability.|||

I know SQL 2005 can do pivot() in the query, isn't enough for reporting? Do I really need to use SSIS ?

|||

NewWorld wrote:

I know SQL 2005 can do pivot() in the query, isn't enough for reporting? Do I really need to use SSIS ?

Reporting really has nothing to do with SSIS. Though you can perform upfront calculations in SSIS and load them into a table to be later used for reports.

Pivot? Pivot is only a small piece of SSIS.|||

I've been using SSIS for a couple of months now and have grown to like it. I'm using it for exactly what you mention -- importing data from (multiple) OLTP databases into a relational data warehouse.

The visual designer is very useful since you can easily see how the data is flowing while running the package. And data viewers are a terrific debugging tool since you can see the data at different points in the data pipeline (much easier than debugging raw SQL scripts).

As far as OLAP goes, you would use SQL Server 2005 Analysis Services to define the structure of your OLAP database, and then use SSIS to get your data from the relational warehouse into the OLAP database (that's usually the last step of my SSIS package -- process the cube).

For reporting, you could use SQL Server 2005 Reporting Services to report on the data in the relational warehouse and/or the OLAP cubes.

Hope this helps, and good luck.

-Larry

|||SSIS - whats that?

You would be much better off creating your own app... lol

Anyways - after using SSIS for 6 months - it is improvement over DTS; BUT MS 2nd hack at ETL tool isnt quite the big time yet --> Some/most Built-in tasks/tools are mediocre or missing (FTP,ZIP,Email etc) and Data Flow although good concept doesnt have a match for SQL based solution --> for example to Update data - you would drop dead using OLE Update Command.

Although - I might sound synical - this is not just based on my own experience banging head against wall trying to figure out SSIS; our consultant with 20 years of DB & programming experience has similar opinion - its his idea using SQL based ETLs which we decided to follow.

Wonderful - joke rite?
Improvement - YES
Easy to get started - YES
Satisfies all your requirements --> Maybe - Maybe Not - BUT be ready for all kinds of errors and mishaps in the journey.

As usual - SSIS forum is the best place to find help in case you get into trouble|||

TheViewMaster wrote:


Anyways - after using SSIS for 6 months - it is improvement over DTS; BUT MS 2nd hack at ETL tool isnt quite the big time yet --> Some/most Built-in tasks/tools are mediocre or missing (FTP,ZIP,Email etc) and Data Flow although good concept doesnt have a match for SQL based solution --> for example to Update data - you would drop dead using OLE Update Command.

I'm going to have to disagree with you here. The data flow (where the OLE DB Command transformation is located) is designed to operate on a row-by-row basis. In a perfect world, how would you propose to implement an "update" feature? Are you looking for more of a GUI-based batch update in the CONTROL flow (as opposed to the Execute SQL task)?

|||

Lawrence Parker wrote:

The visual designer is very useful since you can easily see how the data is flowing while running the package. And data viewers are a terrific debugging tool since you can see the data at different points in the data pipeline (much easier than debugging raw SQL scripts).

Sorry mate - Data Flow data view is no match to SQL solution with using TEMP table to Load data and then Views, SPs, queries to process it which you can use anytime to monitor data movement. Dataview is only design time feature and doesnt help you much if package breaks down in the middle of a job.

Also - SQL solution although a bit more code will make packages easier to upgrade

My personal opinion - SSIS is not quite ready for ETL yet - do Extraxt Load Transform - and you are much better off|||I like being disagreed with on technical stuff - by defending my ideas on importing others we all improve.

OLE DB Command - row by row is way too slow - in a perfect world you would have something like UPDATE mytable FROM ... translated to SSIS.

Also - i forgot in me previous post -
I found a nugget today in Microsoft SQL language -
DELETE FROM x FROM x x ...|||

TheViewMaster wrote:

I like being disagreed with on technical stuff - by defending my ideas on importing others we all improve.

OLE DB Command - row by row is way too slow - in a perfect world you would have something like UPDATE mytable FROM ... translated to SSIS.

Also - i forgot in me previous post -
I found a nugget today in SQL language -
DELETE FROM x FROM x x ...

Right, but you cannot perform an "UPDATE mytable FROM anotherTable" in the data flow. (It shouldn't be expected that you could) You can do that today, and it is the preferred method, in the control flow via an Execute SQL task. Even if they did perform an "UPDATE mytable FROM anotherTable" in the data flow, you certainly wouldn't want that executing for EVERY row, would you? The control flow is the place for that operation. So again, are you asking for a GUI-based task in the control flow to perform batch updates?

|||Perhaps I am asking for wrong thing - but in SQL Stored Procedure all that could be done in one task -
Update existing
Insert new
Delete old
which is the bases of most of our ETL imports to our DB tables

At the moment we are using Data Flow to insert text files to db temp tables (In my case I sneak in occasional Derived column) and then do SQL processing.
I do like control of application (SSIS) to do the work, however benefits of SQL solution have outwighed my initial preference:
SQL has better maintainability - changes can be made to SPs, VIEWS without having to go through of hassle to recompile & deploy package
SQL - simpler processing model - TXT - TEMP table - VIEW - SP IMPORT table - VIEW - SP- Production table

However - SSIS benefits:
less TEMP tables and SQL code
ability to add info in Data Flow - derived column
customized filter duplicates Data Flow transform & able to do "INITCAP" more efficiently than in SQL

50/50|||

TheViewMaster wrote:

Perhaps I am asking for wrong thing - but in SQL Stored Procedure all that could be done in one task -
Update existing
Insert new
Delete old
which is the bases of most of our ETL imports to our DB tables

You're going to get me to 1,000 posts yet tonight, aren't you?! What you are talking about can be done in SSIS, though not inside one data flow. You need to think of the control flow as the foundation for all data work. Using your example, and assuming that order, here's how you'd do it.

Execute SQL task to update existing (update myTable FROM anotherTable) -> Data Flow (to insert new records) -> Execute SQL task (delete old)

Pretty easy, and perhaps faster than your stored procedure... Who knows. The data flow is a highly optimized engine for transporting data from point A (which could be one to many sources -- try that in your stored proc) to point B (which could be one to many destinations). Unfortunately updates aren't really part of data transportation, and is really at the mercy of the database engine.

|||1000 posts here we go - oh wait it's only my 135th post

Anyways - it s more complicated than

update

existing (update myTable FROM anotherTable) -> Data Flow (to

insert new records) -> Execute SQL task (delete old)

You need to determine which are new (lookup) - then which are changed (split) [referencing Jaime's article about SSIS - find columns in table A which are not in table B] - the SQL update, SQL delete (and trust me on this one - in SSIS books and docs they dont mention it at all - trial and error my friend to find out best way)

Somehow I think that INSERT in Store Proc is not that much slower that data flow import (and I would be rather in the mercy of DB engine than in the SSIS engine as the probablity is SSIS will be upgraded sooner)
|||

TheViewMaster wrote:


SQL has better maintainability - changes can be made to SPs, VIEWS without having to go through of hassle to recompile & deploy package

Now I *really* disagree with this statement. Using package configurations, you can create a package such that you'll never have to touch it again. I frequently use configurations to store sql so that I can make changes without touching the package. I can change databases, filenames, SQL, connection strings, etc... all with package configurations and without ever having to touch the package.

|||

TheViewMaster wrote:

1000 posts here we go - oh wait it's only my 135th post

You need to determine which are new (lookup) - then which are changed (split) [referencing Jaime's article about SSIS - find columns in table A which are not in table B] - the SQL update, SQL delete (and trust me on this one - in SSIS books and docs they dont mention it at all - trial and error my friend to find out best way)

Somehow I think that INSERT in Store Proc is not that much slower that data flow import (and I would be rather in the mercy of DB engine than in the SSIS engine as the probablity is SSIS will be upgraded sooner)

You did it! In response to the SQL delete, there are a TON of resources out there - Transact-SQL. Most anything you can do in a database, you can do in SSIS in one way or another... If you understand Transact-SQL, you can make SSIS do some pretty wonderful things!

I'll have to take this up another time, as my night has ended...

Phil

Is SSIS the solution to our problem?

Hi All,

A small part of our project needs to be customizable for each of our clients. Each of our clients have the concept of a "JOB"; The details of which are held in their own database. When our product is installed it must gather Job information from their database and match the appropriate fields to our generic Jobs table. Once this has been done, we can generate a stored procedure that will periodically poll the clients database for new/updated jobs and transfer the data to our database.

What we really need is to come up with something that has a generic UI that will allow us to connect to any data source, enumerate the databases and their tables so it will allow someone to perform the operations above.

My question is can SSIS assist us in achieving this goal?

Any help on this would be greatly appreciated.

Paul.

From a quick reading, it seems like your solution might not require the full extraction, transformation, and loading power of SSIS. Sounds more like an ADO.NET application that would use the appropriate data provider to connect to the selected source, query for schema information and then for Job information, use its internal logic to map fields, then generate the stored procedure.

-Doug

|||Cheers Doug, Very much appreciated...|||

Hi again,

Carrying on from what you said about using ADO.NET for our solution (which i have done) I just wanted to see if there was anybody out there who knows about the following. When you are setting up a database connection in any connection manager, for example the one in visual studio or the one in SQL server management studio. whenever you select a specific provider, a connection string property page seems to be presented for that particular provider. Now i assume that the connection manager somehow gets the connection properties page by calling a method or accessing some property in the provider assembly, buit i just can't find it. I would rather use the providers connection property page rather than create my own if i don't have to, so does anyone know how to access this page?

Any help on this, as always, is greatly appreciated.

Paul.

|||

From classic ADO, it used to be possible to invoke the Data Link Properties dialog and get the connection string back for any [native] OLE DB provider by referencing MSDASC.DLL, the Microsoft Data Access Service Components. Of course with the appropriate managed wrapper you could still do so, but I don't think there's a managed equivalent for .NET Framework Data Providers. However I don't write as much data access code as I used to...if someone corrects me and lets us know that there are managed components available for building connection strings, all the better! I see there's the ConnectionStringEditor class new in the 2.0 Framework that's a UITypeEditor, meaning you could invoke it for a connection string property from a property grid. The MSDN help topic includes a code sample.

-Doug

Is SSIS the solution to our problem?

Hi All,

A small part of our project needs to be customizable for each of our clients. Each of our clients have the concept of a "JOB"; The details of which are held in their own database. When our product is installed it must gather Job information from their database and match the appropriate fields to our generic Jobs table. Once this has been done, we can generate a stored procedure that will periodically poll the clients database for new/updated jobs and transfer the data to our database.

What we really need is to come up with something that has a generic UI that will allow us to connect to any data source, enumerate the databases and their tables so it will allow someone to perform the operations above.

My question is can SSIS assist us in achieving this goal?

Any help on this would be greatly appreciated.

Paul.

From a quick reading, it seems like your solution might not require the full extraction, transformation, and loading power of SSIS. Sounds more like an ADO.NET application that would use the appropriate data provider to connect to the selected source, query for schema information and then for Job information, use its internal logic to map fields, then generate the stored procedure.

-Doug

|||Cheers Doug, Very much appreciated...|||

Hi again,

Carrying on from what you said about using ADO.NET for our solution (which i have done) I just wanted to see if there was anybody out there who knows about the following. When you are setting up a database connection in any connection manager, for example the one in visual studio or the one in SQL server management studio. whenever you select a specific provider, a connection string property page seems to be presented for that particular provider. Now i assume that the connection manager somehow gets the connection properties page by calling a method or accessing some property in the provider assembly, buit i just can't find it. I would rather use the providers connection property page rather than create my own if i don't have to, so does anyone know how to access this page?

Any help on this, as always, is greatly appreciated.

Paul.

|||

From classic ADO, it used to be possible to invoke the Data Link Properties dialog and get the connection string back for any [native] OLE DB provider by referencing MSDASC.DLL, the Microsoft Data Access Service Components. Of course with the appropriate managed wrapper you could still do so, but I don't think there's a managed equivalent for .NET Framework Data Providers. However I don't write as much data access code as I used to...if someone corrects me and lets us know that there are managed components available for building connection strings, all the better! I see there's the ConnectionStringEditor class new in the 2.0 Framework that's a UITypeEditor, meaning you could invoke it for a connection string property from a property grid. The MSDN help topic includes a code sample.

-Doug

Is SSIS (SQL Server Integration Services) a new product or just the upgrade version of DTS?

Hi, all,

I am having a question about SSIS. Is it a totally new product in SQL Server 2005 or it is a upgrade version of previous version DTS? Thanks a lot for any guidance for that.

With best regards,

Yours sincerely,

It is the successor to (or replacement for) DTS, but it is not an upgrade. There is no code from DTS that made it into SSIS. Therefore yes, it is a totally new product.

-Jamie

|||

Hi, Jamie, thanks a lot for your kind guidance.

With best regards,

Yours sincerely,