Module picalo.Trending

The Trending module contains functions that highlight trends in data. Because fraud most often shows up as changes over time, these functions are useful for examining how values move across a sequence of records.
Function Summary
Table average_slope(table, ycol, xcol)
Computes the average of the slopes between the points given.
Table cusum(table, col)
Calculates a cusum: the running total of the differences between successive values in a column, reported at each row in the table.
Table handshake_slope(table, ycol, xcol)
Computes the slope between every pair of points given.
Table highlow_slope(table, ycol, xcol)
Computes a slope based on the minimum Y and the X that goes with it and the maximum Y and the X that goes with it.
Table regression(table, ycol, xcol)
Computes the regression line for the points given.
  slope(X1, Y1, X2, Y2)
Calculates the slope between the two points given.

Variable Summary
tuple __functions__ = ('cusum', 'highlow_slope', 'average_slop...

Function Details

average_slope(table, ycol, xcol=None)

Computes the average of the slopes between the points given. If xcol is None, x values are generated as 0, 1, 2, 3, and so on.

Example:
>>> from picalo import *
>>> table = Table([('col000', unicode), ('col001', int), ('col002', int)], [
...     ['Dan', 10, 8],
...     ['Sally', 12, 12],
...     ['Dan', 11, 15],
...     ['Sally', 12, 14],
...     ['Dan', 11, 16],
...     ['Sally', 15, 15],
...     ['Dan', 16, 15],
...     ['Sally', 13, 14]])

>>> results = Trending.average_slope(table, 2, 1)
>>> results.view()
+-----------------+
|  Average Slope  |
+-----------------+
| -0.559523809524 |
+-----------------+
Parameters:
table - The table to calculate the slope on
           (type=Table)
ycol - The y column to use.
           (type=str)
xcol - The x column to use. This is optional.
           (type=str)
Returns:
The average slope between points
           (type=Table)
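
The calculation can be sketched in plain Python without Picalo. This is a minimal sketch, assuming the function averages the slopes between consecutive rows; the helper name average_slope_sketch is hypothetical and is not part of Picalo. Under that assumption it reproduces the value shown in the example above.

```python
def average_slope_sketch(ys, xs=None):
    """Mean of the slopes between consecutive (x, y) points.

    Hypothetical helper, not Picalo's implementation. Needs at least two
    points and assumes no two consecutive points share the same x value.
    """
    if xs is None:
        xs = list(range(len(ys)))  # default x values: 0, 1, 2, ...
    points = list(zip(xs, ys))
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]
    return sum(slopes) / len(slopes)


avg_ys = [8, 12, 15, 14, 16, 15, 15, 14]    # col002 from the example
avg_xs = [10, 12, 11, 12, 11, 15, 16, 13]   # col001 from the example
avg_result = average_slope_sketch(avg_ys, avg_xs)
print(round(avg_result, 6))  # prints -0.559524
```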

cusum(table, col)

Calculates a cusum: the running total of the differences between successive values in a column, reported at each row in the table. The cusum calculation gives a sense of the overall direction of a curve.

Example:
>>> table = Table([('col000', int), ('col001', int)], ([5,6], [3,2], [4,6]))
>>> cusum = Trending.cusum(table, 0)  # cusum the first column (5, 3, 4)
>>> cusum.view()
+--------------+
| col000_cusum |
+--------------+
|            0 |
|           -2 |
|           -1 |
+--------------+
Parameters:
table - The table to be cusumed.
           (type=Table or TableArray)
col - The column name or index to cusum.
           (type=str)
Returns:
A new table containing a single column for the cusum value.
           (type=Table)
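
The cusum values in the example are consistent with a running total of the differences between successive values, starting at 0. The following is a minimal sketch under that assumption; cusum_sketch is a hypothetical name and works on a plain list rather than a Picalo table.

```python
from itertools import accumulate


def cusum_sketch(values):
    """Running total of the differences between successive values.

    Hypothetical helper, not Picalo's implementation. The first entry is
    always 0 because there is no previous value to compare against.
    """
    if not values:
        return []
    diffs = [0] + [b - a for a, b in zip(values, values[1:])]
    return list(accumulate(diffs))


cusum_values = cusum_sketch([5, 3, 4])  # first column of the example table
print(cusum_values)  # prints [0, -2, -1]
```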

handshake_slope(table, ycol, xcol=None)

Computes the slope between every pair of points given. If xcol is None, x values are generated as 0, 1, 2, 3, and so on.

For example, assume 5 points. The slopes from points 1 to 2, 1 to 3, 1 to 4, 1 to 5, 2 to 3, 2 to 4, 2 to 5, 3 to 4, 3 to 5, and 4 to 5 are calculated. The sum of those slopes is divided by the total number of points to give an idea of the general trend.
Parameters:
table - The table to calculate the handshake slope on.
           (type=Table)
ycol - The y column to use.
           (type=str)
xcol - The x column to use. This is optional.
           (type=str)
Returns:
A picalo table containing one cell: the calculated slope.
           (type=Table)
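
The pairing described above (every point with every later point, like handshakes in a room) can be sketched with itertools.combinations. This is an illustrative sketch only: handshake_slopes_sketch is a hypothetical name, the inputs are plain lists, and the final division follows the wording of the description literally (number of points). Whether Picalo computes it exactly this way is an assumption.

```python
from itertools import combinations


def handshake_slopes_sketch(ys, xs=None):
    """Slopes between every pair of (x, y) points.

    Hypothetical helper, not Picalo's implementation. Assumes all x
    values are distinct so that no slope is undefined.
    """
    if xs is None:
        xs = list(range(len(ys)))  # default x values: 0, 1, 2, ...
    points = list(zip(xs, ys))
    return [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
    ]


hs_ys = [2, 4, 6, 8, 10]                 # 5 points on a straight line
hs_slopes = handshake_slopes_sketch(hs_ys)
print(len(hs_slopes))                    # prints 10, one slope per pair

# Divisor taken from the description above (total number of points).
hs_trend = sum(hs_slopes) / len(hs_ys)
```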

highlow_slope(table, ycol, xcol=None)

Computes a slope based on the minimum Y and the X that goes with it and the maximum Y and the X that goes with it. Returns the X that goes with the minimum Y, the minimum Y, the X that goes with the maximum Y, the maximum Y, and the slope.

If xcol is None, x values are generated as 0, 1, 2, 3, and so on.

Example:
>>> table = Table([('col000', unicode), ('col001', int), ('col002', int)], [
...     ['Dan', 10, 8],
...     ['Sally', 12, 12],
...     ['Dan', 11, 15],
...     ['Sally', 12, 14],
...     ['Dan', 11, 16],
...     ['Sally', 15, 15],
...     ['Dan', 16, 15],
...     ['Sally', 13, 14]])

>>> results = Trending.highlow_slope(table, 2, 1)
>>> results.view()
+------+------+------+------+-------+
| MinX | MinY | MaxX | MaxY | Slope |
+------+------+------+------+-------+
|   10 |    8 |   11 |   16 |   8.0 |
+------+------+------+------+-------+
Parameters:
table - The table to calculate the slope on
           (type=Table)
ycol - The y column to use. The maximum and minimum are taken from this column.
           (type=str)
xcol - The x column to use. This is optional.
           (type=str)
Returns:
A table with the first record giving the x and y of the minimum y, the x and y of the maximum y, and the slope between the two points.
           (type=Table)
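
A plain-Python sketch of the same idea follows. It assumes that ties for the minimum or maximum y are resolved by taking the first occurrence, which the description does not specify; highlow_slope_sketch is a hypothetical name, not a Picalo function. With the example data it reproduces the row shown above.

```python
def highlow_slope_sketch(ys, xs=None):
    """Slope between the point with the lowest y and the point with the highest y.

    Hypothetical helper, not Picalo's implementation. Ties are broken by
    first occurrence (an assumption). The two points must have different
    x values.
    """
    if xs is None:
        xs = list(range(len(ys)))  # default x values: 0, 1, 2, ...
    i_min = min(range(len(ys)), key=lambda i: ys[i])
    i_max = max(range(len(ys)), key=lambda i: ys[i])
    min_x, min_y = xs[i_min], ys[i_min]
    max_x, max_y = xs[i_max], ys[i_max]
    slope = (max_y - min_y) / (max_x - min_x)
    return min_x, min_y, max_x, max_y, slope


hl_ys = [8, 12, 15, 14, 16, 15, 15, 14]    # col002 from the example
hl_xs = [10, 12, 11, 12, 11, 15, 16, 13]   # col001 from the example
hl_result = highlow_slope_sketch(hl_ys, hl_xs)
print(hl_result)  # prints (10, 8, 11, 16, 8.0)
```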

regression(table, ycol, xcol=None)

Computes the regression line for the points given. Returns the slope, intercept, correlation, and r-squared value of the regression line for the points. If xcol is None, x values are generated as 0, 1, 2, 3, and so on.

Example:
>>> from picalo import *
>>> table = Table([('col000', unicode), ('col001', int), ('col002', int)], [
...     ['Dan', 10, 8],
...     ['Sally', 12, 12],
...     ['Dan', 11, 15],
...     ['Sally', 12, 14],
...     ['Dan', 11, 16],
...     ['Sally', 15, 15],
...     ['Dan', 16, 15],
...     ['Sally', 13, 14]])

>>> results = Trending.regression(table, 1)
>>> results.view()
+----------------+---------------+-----------------+------------------+
|     Slope      |   Intercept   |   Correlation   |     RSquared     |
+----------------+---------------+-----------------+------------------+
| 0.619047619048 | 10.3333333333 | 0.0428888771398 | 0.00183945578231 |
+----------------+---------------+-----------------+------------------+
Parameters:
table - The table to calculate the simple regression on
           (type=Table)
ycol - The y column to use
           (type=str)
xcol - The x column to use. This is optional.
           (type=str)
Returns:
A table containing the slope, intercept, correlation, and rSquared
           (type=Table)
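
The slope and intercept can be illustrated with an ordinary least-squares fit in plain Python. This is a sketch, assuming a standard least-squares line; regression_sketch is a hypothetical name and not Picalo's code. It covers only the slope and intercept, since the documentation does not state how the correlation and r-squared columns are derived. With the example's y column and the default x values it gives the slope and intercept shown above.

```python
def regression_sketch(ys, xs=None):
    """Least-squares slope and intercept for the given points.

    Hypothetical helper, not Picalo's implementation. Requires at least
    two distinct x values.
    """
    if xs is None:
        xs = list(range(len(ys)))  # default x values: 0, 1, 2, ...
    n = len(ys)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


reg_ys = [10, 12, 11, 12, 11, 15, 16, 13]   # col001 from the example
reg_slope, reg_intercept = regression_sketch(reg_ys)
print(round(reg_slope, 6), round(reg_intercept, 6))  # prints 0.619048 10.333333
```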

slope(X1, Y1, X2, Y2)

Calculates the slope between the two points given.
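
As a minimal sketch, the slope between two points is the change in y divided by the change in x. The helper below is hypothetical (slope_sketch is not a Picalo name), and raising an error for a vertical line is an assumption; the documentation does not say how Picalo handles X1 equal to X2.

```python
def slope_sketch(x1, y1, x2, y2):
    """Rise over run between (x1, y1) and (x2, y2).

    Hypothetical helper, not Picalo's implementation.
    """
    if x2 == x1:
        # Assumed behavior: a vertical line has no defined slope.
        raise ValueError("slope is undefined when x1 == x2")
    return (y2 - y1) / (x2 - x1)


two_point_slope = slope_sketch(10, 8, 11, 16)
print(two_point_slope)  # prints 8.0
```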

Variable Details

__functions__

Type:
tuple
Value:
('cusum', 'highlow_slope', 'average_slope', 'regression', 'handshake_slope')

Generated by Epydoc 2.1 on Mon Aug 20 05:38:17 2007 http://epydoc.sf.net