Sum

 Jakob Jenkov Last update: 2016-02-02

The term sum, as used in mathematical analysis, means "the sum of values stored in the records of a data set". A sum may refer either to the sum of values in all records in a data set, or to the sum of values in a subset of the records in the data set. I will illustrate both types of sums in this tutorial.

The sum of values in records is interesting both by itself and as part of composite calculations (e.g. average customer lifetime value: the average sum of all orders placed by a customer during their time as a customer with you).

To illustrate sum operations on a data set I will use the following example data set:

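| Item        | Amount | Order Id | Customer Id |
|-------------|--------|----------|-------------|
| Hard disk   | 99.95  | 790      | 23          |
| Monitor     | 195.95 | 791      | 45          |
| Mouse       | 19.95  | 792      | 23          |
| Keyboard    | 29.95  | 793      | 23          |
| Hard disk   | 79.95  | 794      | 76          |
| Mouse       | 17.95  | 795      | 34          |
| Keyboard    | 24.95  | 796      | 34          |
| Monitor     | 249.95 | 797      | 67          |
| USB Storage | 49.95  | 798      | 67          |
| Hard disk   | 119.95 | 799      | 87          |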

Total Sum

The total sum of a value in a data set is the sum of that value from all records in the data set. For the example data set above, the total sum of order amounts is 888.50.

In other tutorials in this mathematical analysis trail I will use the following notation for total sum:

```
sum(data, property)
```

This is a functional notation, where the name of the function is `sum`, and the parameters passed to the `sum` function are the data set (`data`) and the name of the `property` of each record to sum. For instance:

```
sum(data, "amount")
```
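As a minimal sketch, the notation above could be implemented in Python roughly as follows. The function name `sum_property` and the dictionary keys (`item`, `amount`, `order_id`, `customer_id`) are hypothetical names chosen for this illustration, not part of the notation itself. The amounts are stored as `Decimal` values so that the result is exact rather than subject to floating-point rounding.

```python
from decimal import Decimal

# The example data set from this tutorial. The key names are hypothetical.
data = [
    {"item": "Hard disk",   "amount": Decimal("99.95"),  "order_id": 790, "customer_id": 23},
    {"item": "Monitor",     "amount": Decimal("195.95"), "order_id": 791, "customer_id": 45},
    {"item": "Mouse",       "amount": Decimal("19.95"),  "order_id": 792, "customer_id": 23},
    {"item": "Keyboard",    "amount": Decimal("29.95"),  "order_id": 793, "customer_id": 23},
    {"item": "Hard disk",   "amount": Decimal("79.95"),  "order_id": 794, "customer_id": 76},
    {"item": "Mouse",       "amount": Decimal("17.95"),  "order_id": 795, "customer_id": 34},
    {"item": "Keyboard",    "amount": Decimal("24.95"),  "order_id": 796, "customer_id": 34},
    {"item": "Monitor",     "amount": Decimal("249.95"), "order_id": 797, "customer_id": 67},
    {"item": "USB Storage", "amount": Decimal("49.95"),  "order_id": 798, "customer_id": 67},
    {"item": "Hard disk",   "amount": Decimal("119.95"), "order_id": 799, "customer_id": 87},
]


def sum_property(data, prop):
    """Return the sum of record[prop] over all records in data."""
    # Start from Decimal("0") so an empty data set also yields a Decimal.
    return sum((record[prop] for record in data), Decimal("0"))


total = sum_property(data, "amount")
print(total)  # 888.50
```

The function is named `sum_property` rather than `sum` only to avoid shadowing Python's built-in `sum`, which it uses internally.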

Subset Sum

The subset sum of a value in a data set is the sum of that value from a subset of the records in the data set. For the example data set above, the subset sum of order amounts for the customer with customer id 23 is 149.85.
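The figure for customer 23 can be checked with a short Python sketch. The list of `(amount, customer_id)` pairs below is just a compact, hypothetical way of writing down the two relevant columns of the example data set.

```python
from decimal import Decimal

# (amount, customer_id) pairs taken from the example data set.
orders = [
    (Decimal("99.95"), 23),
    (Decimal("195.95"), 45),
    (Decimal("19.95"), 23),
    (Decimal("29.95"), 23),
    (Decimal("79.95"), 76),
    (Decimal("17.95"), 34),
    (Decimal("24.95"), 34),
    (Decimal("249.95"), 67),
    (Decimal("49.95"), 67),
    (Decimal("119.95"), 87),
]

# Select the subset: only the records belonging to customer 23.
customer_23_amounts = [amount for amount, customer_id in orders if customer_id == 23]

# Sum the selected amounts: 99.95 + 19.95 + 29.95.
subset_total = sum(customer_23_amounts, Decimal("0"))
print(subset_total)  # 149.85
```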

In the other tutorials in this mathematical analysis trail I will use the following notation for subset sum:

```
sum(data, property, criteria)
```

The `data` parameter to the `sum` function is the data set. The `property` parameter is the name of the value to sum from each record. The `criteria` parameter specifies which records to include in the sum. For example:

```
sum(data, "amount", "records by customers with more than 3 orders")
```

In this example the `property` to sum is the "amount" property. The records to sum from are the "records by customers with more than 3 orders". This criterion is not directly executable by a computer. In a real program you would have to express the criteria in a syntax that a computer can execute, like SQL or a lambda expression of some kind. Exactly what syntax to use depends on what tools you use to store the data set.
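As one possible illustration of an executable criterion, the Python sketch below passes the criteria as a function that is called once per record. The names `sum_where` and `more_orders_than`, and the dictionary keys, are hypothetical and chosen only for this example. In the example data set no customer has more than 3 orders, so that criterion selects no records. A threshold of 2 is shown as well, which selects the three orders of customer 23.

```python
from collections import Counter
from decimal import Decimal

# The example data set from this tutorial (item names omitted for brevity).
# The key names are hypothetical.
data = [
    {"amount": Decimal("99.95"),  "order_id": 790, "customer_id": 23},
    {"amount": Decimal("195.95"), "order_id": 791, "customer_id": 45},
    {"amount": Decimal("19.95"),  "order_id": 792, "customer_id": 23},
    {"amount": Decimal("29.95"),  "order_id": 793, "customer_id": 23},
    {"amount": Decimal("79.95"),  "order_id": 794, "customer_id": 76},
    {"amount": Decimal("17.95"),  "order_id": 795, "customer_id": 34},
    {"amount": Decimal("24.95"),  "order_id": 796, "customer_id": 34},
    {"amount": Decimal("249.95"), "order_id": 797, "customer_id": 67},
    {"amount": Decimal("49.95"),  "order_id": 798, "customer_id": 67},
    {"amount": Decimal("119.95"), "order_id": 799, "customer_id": 87},
]


def sum_where(data, prop, criteria):
    """Sum record[prop] over the records for which criteria(record) is true."""
    return sum((r[prop] for r in data if criteria(r)), Decimal("0"))


# Count the orders per customer once, so the criterion can look the count up.
orders_per_customer = Counter(r["customer_id"] for r in data)


def more_orders_than(n):
    """Build a criterion selecting records of customers with more than n orders."""
    return lambda r: orders_per_customer[r["customer_id"]] > n


print(sum_where(data, "amount", more_orders_than(3)))  # 0
print(sum_where(data, "amount", more_orders_than(2)))  # 149.85
```

Passing the criteria as a function keeps `sum_where` independent of how the subset is defined. A data set stored in a database would more likely express the same selection in that database's query language.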
