Monday, March 19, 2012

Conditional Split to send top 10, top20, and top 30 in three different flows ?

I have dataflow that i wish to split. The recordset that i uses is sorted by the Sort Task. After that i'll like to take the Top 10 records and send in one direction, then ill like to take the top 20 and send in another direction and finally Top 30 in a third direction.

How can i do that ?

You can use a script component to give each record a number using the technique described here: http://www.sqlis.com/default.aspx?37

Then you can filter on the values that are created using a conditional split transform component.

Make sense?

-Jamie

|||Yes but it seems to be a bit of a workaround. I was looking for an easier way like using SQL select top 10 * from....|||

cgpl wrote:

Yes but it seems to be a bit of a workaround. I was looking for an easier way like using SQL select top 10 * from....

Yeah I agree, that would be nice. Perhaps you could request it at betaplace or http://lab.msdn.microsoft.com/productfeedback/Default.aspx?

-Jamie|||You could do it all in one script component. Add some extra outputs and then direct the row to the correct output based on the counter value.

I'd demonstrate if I knew what the equivalent of buffer.DirectRow(<outputindex>) was in the script component, but that would be if the syntax for a full component.|||

DarrenSQLIS wrote:

You could do it all in one script component. Add some extra outputs and then direct the row to the correct output based on the counter value.

I'd demonstrate if I knew what the equivalent of buffer.DirectRow(<outputindex>) was in the script component, but that would be if the syntax for a full component.

Oh yeah, Darren's right. And the syntax he's on about is <buffername>.AddRow()

-Jamie|||The easiest way to deal with that, would be if you could disable the "Random" feature in the Row Sample Task. Then you could do a Multicast to 3 different Row Sample Tasks|||

yep, I have the same problem. I have a dataset containing a record for each bill in the past month and I want to flag the top 10 customers according to the aggregated amount pr customer.

Like this in plain SQL:

select top 10 CustName, sum(Amount) as sum_Amount into #ek_temp

from erhverv_kreditnota group by CustName order by sum_Amount

go

update erhverv_kreditnota

set top10 = 'Top 10'

where CustName in (select CustName from #ek_temp)

go

Any way to do this without scripting?

|||

cgpl wrote:

The easiest way to deal with that, would be if you could disable the "Random" feature in the Row Sample Task. Then you could do a Multicast to 3 different Row Sample Tasks

Yeah another excellent idea, you would assume that selecting the first N rows is easier than selecting a random number. Did you submit the idea to Microsoft?

-Jamie|||

jesal wrote:

yep, I have the same problem. I have a dataset containing a record for each bill in the past month and I want to flag the top 10 customers according to the aggregated amount pr customer.

Like this in plain SQL:

select top 10 CustName, sum(Amount) as sum_Amount into #ek_temp

from erhverv_kreditnota group by CustName order by sum_Amount

go

update erhverv_kreditnota

set top10 = 'Top 10'

where CustName in (select CustName from #ek_temp)

go

Any way to do this without scripting?

As explained, no, not really. Is there any reason why you don't want to do this in a script?

-Jamie|||I did!

http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=2bc4ee52-aec6-409d-b39c-6e2fb4945799|||Out of interest, would you want to send:
Rows 1-10 to output1
Rows 1-20 to output2
Rows 1-30 to output3

or

Rows 1-10 to output1
Rows 11-30 to output2
Rows 31-60 to output3

?

I might have a crack at this myself...it sounds quite interesting.

-Jamie|||

I would like the first scenario:

Rows 1-10 to output1
Rows 1-20 to output2
Rows 1-30 to output3

-

There are two reasons why I would like to do this without a script. First I don’t have that much experience with scripting. Second I think that a simple and common task like this should be available “off the shelves”.

I tried scripting based on Jamie’s answer to the thread “Script Transformation” (http://forums.microsoft.com/msdn/ShowPost.aspx?PostID=60464).

1. My problem is that the number of output rows is different from the number of input rows – so I need to limit the script to run on the first 10-30 rows, not the entire recordset.

2. Jamie suggests the syntax <buffername>.AddRow() and Outputbuffer (in the “Script Transformation” thread). I used the default declaration (ByVal Row As Input0Buffer) – but as far as I can see this buffer contains both the input and output and I cannot find Row.AddRow?

Any suggestions to a script that redirects the first 10 rows from input to output?

- Jeppe

|||

jesal wrote:

I would like the first scenario:

Rows 1-10 to output1
Rows 1-20 to output2
Rows 1-30 to output3

-

There are two reasons why I would like to do this without a script. First I don’t have that much experience with scripting. Second I think that a simple and common task like this should be available “off the shelves”.

I tried scripting based on Jamie’s answer to the thread “Script Transformation” (http://forums.microsoft.com/msdn/ShowPost.aspx?PostID=60464).

1. My problem is that the number of output rows is different from the number of input rows – so I need to limit the script to run on the first 10-30 rows, not the entire recordset.

2. Jamie suggests the syntax <buffername>.AddRow() and Outputbuffer (in the “Script Transformation” thread). I used the default declaration (ByVal Row As Input0Buffer) – but as far as I can see this buffer contains both the input and output and I cannot find Row.AddRow?

Any suggestions to a script that redirects the first 10 rows from input to output?

- Jeppe

Jeppe,
This should do it: http://blogs.conchango.com/jamiethomson/archive/2005/07/27/1877.aspx

-Jamie|||Thank you Jamie!

Your "SSIS Nugget: Select Top N in a data-flow" worked great. I have one question though - I couldnt get the "Output0Buffer" to work, but when I changed the SynchronousInputID for the Output to 0 it worked, what is the connection between the declaration of the Output0Buffer and the Synchronous setting?

Thansk again

- Jeppe

No comments:

Post a Comment