The term sum, as used in mathematical analysis, means "the sum of values stored in the records of a data set". A sum may refer either to the sum of values in all records in a data set, or to the sum of values in a subset of the records in the data set. I will illustrate both types of sums in this tutorial.
The sum of values in records is interesting by itself, but also as part of composite calculations (e.g. average customer lifetime value, which is the average "sum" of total orders from a customer during their time as a customer with you).
To illustrate sum operations on a data set I will use the following example data set:
|Item||Amount||Order Id||Customer Id|
The total sum of a value in a data set is the sum of that value over all records in the data set. For the example data set above, the total sum of order amounts is 888.5.
In other tutorials in this mathematical analysis trail I will use the following notation for total sum:
sum(data, property)
This is a functional notation, where the name of the function is sum, and the parameters passed to the sum function are the data set (data) and the name of the property of each record to sum. For instance:
sum(data, "amount")
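As a minimal sketch of how the total sum notation could map to code, the Python function below sums one property over every record. The name total_sum, the dictionary-based record layout, and the three records are assumptions made for illustration; they are not the example data set of this tutorial, so the result differs from the total given above.

```python
# Hypothetical records for illustration only; NOT the tutorial's example data set.
data = [
    {"item": "book", "amount": 10.0, "order_id": 1, "customer_id": 23},
    {"item": "pen",  "amount": 2.5,  "order_id": 1, "customer_id": 23},
    {"item": "lamp", "amount": 40.0, "order_id": 2, "customer_id": 7},
]


def total_sum(data, prop):
    """Return the sum of record[prop] over every record in data."""
    return sum(record[prop] for record in data)


print(total_sum(data, "amount"))  # prints 52.5
```

The function is named total_sum rather than sum so that it does not shadow Python's built-in sum, which it uses internally.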
The subset sum of a value in a data set is the sum of that value over a subset of the records in the data set. For the example data set above, the subset sum of orders made by the customer with customer id 23 is 149.85.
In the other tutorials in this mathematical analysis trail I will use the following notation for subset sum:
sum(data, property, criteria)
The data parameter to the sum function is the data set. The property parameter is the name of the value to sum from each record. The criteria parameter is the criterion used to select which records to sum the values for. For example:
sum(data, "amount", "records by customers with more than 3 orders")
In this example the property to sum is the "amount" property. The records to sum from are "records by customers with more than 3 orders". This criterion is not directly executable by a computer. In a real program you might have to use a criteria syntax that is executable by a computer, like SQL or a lambda expression of some kind. Exactly what syntax to use depends on what tools you use to store the data set.
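One way to make the criteria executable is to pass it as a lambda expression. The Python sketch below assumes records stored as dictionaries and a hypothetical function name subset_sum; the records are made up for illustration and are not the example data set of this tutorial.

```python
# Hypothetical records for illustration only; NOT the tutorial's example data set.
data = [
    {"item": "book", "amount": 10.0, "order_id": 1, "customer_id": 23},
    {"item": "pen",  "amount": 2.5,  "order_id": 1, "customer_id": 23},
    {"item": "lamp", "amount": 40.0, "order_id": 2, "customer_id": 7},
]


def subset_sum(data, prop, criteria):
    """Sum record[prop] over the records for which criteria(record) is true."""
    return sum(record[prop] for record in data if criteria(record))


# The criteria is a lambda that selects the records of one customer.
by_customer_23 = lambda record: record["customer_id"] == 23

print(subset_sum(data, "amount", by_customer_23))  # prints 12.5
```

A criteria such as "records by customers with more than 3 orders" would need an extra step that first counts the orders per customer, and the lambda would then test each record's customer id against that count.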