Friday, November 18, 2005

Find out the number of unique products in the list (First formula Explained)

For those who are wondering what all those excel formula meant, here is the explanation:
Before I start explaining the complex format, here is how sumproduct was meant to do

Assuming there are 2 sets of numbers as follows:

    A     B
1   6   10
2   3   5

When you entered the Excel Function =SUMPRODUCT(A1:A2,B1:B2) into one of the cells, say D1, you get a result of 75. What the excel function does is that it will multiply A1 with B1 (6 x 10) and A2 with B2 (3 x 5). After which, it will add up the 2 results and give you 75 (60 + 15).

    A
1. Excel
2. VBA
3. Internet Marketing
4. What is a Blog
5. What is a Blog
6. Internet Marketing
7. What is a Blog
8. VBA

In our unique list, the formula initially started as simple as this =sumproduct(1/countif(A1:A8,A1:A8)).
What the countif [countif(A1:A8,A1:A8)]does is actually a shortcut way of populating the formula [=countif(A1:A8,A1)] in B1 and copy the formula all the way down to B8. The results would have been: 1 count of the word Excel (which appears in A1) is found in the list, 2 counts of VBA is found in the list, etc. As the function is inversed, the result turns out to be like this:

   A              B
1. Excel           1/1
2. VBA           1/2
3. Internet Marketing           1/2
4. What is a Blog           1/3 (1/3 because there is 3 counts of the phrase "What is a blog")
5. What is a Blog           1/3
6. Internet Marketing           1/2
7. What is a Blog        1/3
8. VBA           1/2


As a result, the number of unique records are returned,

Excel = 1/1 = 1
VBA = 1/2 + 1/2 = 1
Internet marketing = 1/2 + 1/2 = 1
What is a Blog = 1/3 + 1/3 + 1/3 =1

The total is 4 unique records.

This works perfect provided that the list do not contains blank records. If not, the result 1 divided by 0 would have returned a #div/0! error To circumvent that, the following are added. 1 is changed to [A1:A8<>""] to tell the formula to ignore blank records.

Likewise, the following [(A1:A100="")] is added to make sure that denominator will return a value greater than 0. In this case, when the list contains more than a blank record, the denominator is more than 0 because of this segment of the formula + (A1:A100=""). If the list do not contain any 0, it will still return a number based on the formula COUNTIF(A1:A100,A1:A100). Either way, the denominator will return a value of more than 0.

No comments: