



Count

Jakob Jenkov
Last update: 2016-02-02

The term count, as used in mathematical analysis, means "the number of records (observations)" in a data set. A count may refer either to the total number of records in a data set, or to the number of records in a subset of the data set. I will illustrate both types of counts in this tutorial.

The count of records is interesting both by itself (e.g. the total number of customers) and as part of composite calculations (e.g. what percentage of our customers are from a specific country).

To illustrate count operations on a data set I will use the following example data set:

Item           Amount   Order Id   Customer Id
Hard disk       99.95        790            23
Monitor        195.95        791            45
Mouse           19.95        792            23
Keyboard        29.95        793            23
Hard disk       79.95        794            76
Mouse           17.95        795            34
Keyboard        24.95        796            34
Monitor        249.95        797            67
USB Storage     49.95        798            67
Hard disk      119.95        799            87

Total Count

The term "total count" usually refers to the total number of records in the data set. For the example data set above, the total count is 10.

In other tutorials in this mathematical analysis trail, I will use the following notation for count:

count(data)

This is a functional notation: count is a function applied to data, where data is the data set.
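As a minimal sketch, the total count can be expressed in Python, assuming the example data set is held in memory as a list with one dictionary per record. The field names (item, amount, order_id, customer_id) are my own labels for the table columns, not part of the notation.

```python
# Example data set from the table above, one dict per record.
# The key names are assumptions chosen for this sketch.
data = [
    {"item": "Hard disk",   "amount":  99.95, "order_id": 790, "customer_id": 23},
    {"item": "Monitor",     "amount": 195.95, "order_id": 791, "customer_id": 45},
    {"item": "Mouse",       "amount":  19.95, "order_id": 792, "customer_id": 23},
    {"item": "Keyboard",    "amount":  29.95, "order_id": 793, "customer_id": 23},
    {"item": "Hard disk",   "amount":  79.95, "order_id": 794, "customer_id": 76},
    {"item": "Mouse",       "amount":  17.95, "order_id": 795, "customer_id": 34},
    {"item": "Keyboard",    "amount":  24.95, "order_id": 796, "customer_id": 34},
    {"item": "Monitor",     "amount": 249.95, "order_id": 797, "customer_id": 67},
    {"item": "USB Storage", "amount":  49.95, "order_id": 798, "customer_id": 67},
    {"item": "Hard disk",   "amount": 119.95, "order_id": 799, "customer_id": 87},
]


def count(data):
    """Return the total number of records in the data set."""
    return len(data)


total = count(data)
print(total)  # 10
```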

Subset Count

A count operation may also count only the subset of records that match certain criteria. For instance, in the example data set above, the count of keyboard orders is 2, and the count of hard disk orders is 3. Similarly, the number of distinct customers is 6.
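The keyboard and hard disk counts can be reproduced with a short Python sketch. It assumes the example data set is an in-memory list of dictionaries; the key names are hypothetical labels for the table columns.

```python
# Example data set from the table above, one dict per record.
# The key names are assumptions chosen for this sketch.
orders = [
    {"item": "Hard disk",   "amount":  99.95, "order_id": 790, "customer_id": 23},
    {"item": "Monitor",     "amount": 195.95, "order_id": 791, "customer_id": 45},
    {"item": "Mouse",       "amount":  19.95, "order_id": 792, "customer_id": 23},
    {"item": "Keyboard",    "amount":  29.95, "order_id": 793, "customer_id": 23},
    {"item": "Hard disk",   "amount":  79.95, "order_id": 794, "customer_id": 76},
    {"item": "Mouse",       "amount":  17.95, "order_id": 795, "customer_id": 34},
    {"item": "Keyboard",    "amount":  24.95, "order_id": 796, "customer_id": 34},
    {"item": "Monitor",     "amount": 249.95, "order_id": 797, "customer_id": 67},
    {"item": "USB Storage", "amount":  49.95, "order_id": 798, "customer_id": 67},
    {"item": "Hard disk",   "amount": 119.95, "order_id": 799, "customer_id": 87},
]

# Count only the records whose item matches the one we are interested in.
keyboard_orders = sum(1 for o in orders if o["item"] == "Keyboard")
hard_disk_orders = sum(1 for o in orders if o["item"] == "Hard disk")

print(keyboard_orders)   # 2
print(hard_disk_orders)  # 3
```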

I will be using this notation for subset count in other tutorials in this mathematical analysis trail:

count(data, criteria)

The criteria part specifies how the subset is selected. The criteria will typically be expressed in text, like:

count(data, "customers with more than 1 order");

count(data, "customers that bought a keyboard");

This notation is not directly executable by a computer. A computer cannot easily make sense of the textual selection criteria. In a real computer program you would have to specify the selection criteria using a syntax which a computer could understand.

Exactly what this syntax would be depends on the tools you are using to analyze the data. If you were using a relational database, the syntax could be SQL. If you are keeping all the data in memory and analyzing it with code, the criteria could be another function, and so on. Use your imagination here.
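For the in-memory case, one possible sketch in Python passes the criteria as a function. The count(data, criteria) signature mirrors the notation above, but treating criteria as a callable, and the record key names, are assumptions made for illustration. Note that the two textual examples count customers rather than order records, so the orders are grouped by customer before counting.

```python
from collections import Counter

# Example data set from the table above, one dict per record.
# The key names are assumptions chosen for this sketch.
orders = [
    {"item": "Hard disk",   "amount":  99.95, "order_id": 790, "customer_id": 23},
    {"item": "Monitor",     "amount": 195.95, "order_id": 791, "customer_id": 45},
    {"item": "Mouse",       "amount":  19.95, "order_id": 792, "customer_id": 23},
    {"item": "Keyboard",    "amount":  29.95, "order_id": 793, "customer_id": 23},
    {"item": "Hard disk",   "amount":  79.95, "order_id": 794, "customer_id": 76},
    {"item": "Mouse",       "amount":  17.95, "order_id": 795, "customer_id": 34},
    {"item": "Keyboard",    "amount":  24.95, "order_id": 796, "customer_id": 34},
    {"item": "Monitor",     "amount": 249.95, "order_id": 797, "customer_id": 67},
    {"item": "USB Storage", "amount":  49.95, "order_id": 798, "customer_id": 67},
    {"item": "Hard disk",   "amount": 119.95, "order_id": 799, "customer_id": 87},
]


def count(data, criteria):
    """Count the records in data for which criteria(record) is true."""
    return sum(1 for record in data if criteria(record))


# "customers with more than 1 order":
# first count the orders per customer id, then count the customers with more than one.
orders_per_customer = Counter(o["customer_id"] for o in orders)
repeat_customers = count(orders_per_customer.items(), lambda pair: pair[1] > 1)

# "customers that bought a keyboard":
# collect the distinct customer ids, then keep those with at least one keyboard order.
customer_ids = {o["customer_id"] for o in orders}
keyboard_customers = count(
    customer_ids,
    lambda cid: any(o["customer_id"] == cid and o["item"] == "Keyboard" for o in orders),
)

print(repeat_customers)    # 3
print(keyboard_customers)  # 2
```

Passing the criteria as a function keeps count itself generic: the same function works for record-level criteria (e.g. count(orders, lambda o: o["item"] == "Mouse")) and for criteria applied to grouped data.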





Copyright  Jenkov Aps