You might ask how Picalo compares to different competing products. We do not believe that Picalo has any direct competition (i.e. products exactly like it), but there are several products that are related.

Following are comparisons to:

 

MS Excel

Coming soon...

 

MS Access

Coming soon...

 

ACL and IDEA

ACL and IDEA are the primary applications used by auditors around the world. Both have long histories, and they have played an important role in fraud detection. However, they are built around audit procedures, which limits their usefulness in general-purpose data analysis and especially in fraud detection.

Picalo has many features in common with these two applications, but it also differs from them in many ways. The following side-by-side comparison describes some of these similarities and differences.

Picalo: Primary user is the general-purpose data analyst. The first specific user we're targeting is the control compliance and fraud investigator.
ACL and IDEA: Primary user is the general auditor.

Picalo: Analyses eventually become Detectlets, which allow non-technical detectors to perform analyses written by technically skilled analysts around the world. Detectlets guide the user through data selection, analysis, and results interpretation using the familiar wizard interface.
ACL and IDEA: Analyses are generally done manually using menu options. When users write scripts and share them with others, the analyses remain script based and usually require technical auditors to run them.

Picalo: Open source, which means you can download updates and install the software for free. It also means that, eventually, thousands of people around the world may participate in development of the application. Come join us! Share your expertise with the world! We may compete on work, but let's collaborate on the tools.
ACL and IDEA: Expensive and based on traditional licensing schemes.

Picalo: Newer and less tested. While the routines are pretty stable, the user interface has quirks at times. Should you trust it? In short, no. You should never trust any application, including ACL and IDEA! You should always print control totals and verify your analysis. Overall, though, the internal analysis routines are well tested and should provide correct results. When bugs are found, we'll correct them and release an update immediately.
ACL and IDEA: More tested and used by tens of thousands of auditors worldwide. Updates are less frequent, but the applications are more mature. In particular, the user interfaces of these applications are smoother (for now :).

Picalo: Includes routines for fraud analysis. Unique features not found in ACL and IDEA include fuzzy text matching (much better than simple soundex), selection of outliers using many different methods, trending analysis using different methods, time analysis smoothing, and Detectlets. Skim through the manual to see many of its abilities.
ACL and IDEA: Include routines for general auditing, with some application to fraud analysis. Unique features not found in Picalo include different methods of sampling (useful for auditing but absolutely wrong in fraud detection), more complex graphing, data import wizards, and a more mature interface.

Picalo: Written in Python, an interpreted language. The advantage of this language is that we can add features to the application very quickly. The disadvantage is that the table structures and analyses run slower than in compiled languages. At some point in the future, we'll rewrite the Table data structure in C++ to increase speed considerably. Until then, procedures will probably run a little slower in Picalo than in ACL or IDEA.
ACL and IDEA: Written in compiled languages (we think -- we can't see their source code). The advantage of this approach is faster execution of analyses. The disadvantage is that application feature changes are harder to implement, forcing a slower release cycle.

Picalo: Scripting is based on the excellent and popular Python language, which provides thousands upon thousands of available libraries to read files, scrape web pages, take out your garbage, scan seized hard drives, analyze email, connect to servers, and do about anything else you can think of. Python itself continues to move forward with new features and libraries outside of the Picalo program. A web search on "Python Language" produces over 65 million web pages right now.
ACL and IDEA: Scripting is based on custom languages created by the respective companies. These languages exist only within ACL or IDEA. In our opinion, why reinvent the wheel when powerful, easy-to-learn, object-oriented languages like Python are readily available? Custom languages will never be able to overtake fast-moving scripting languages like Python because their user bases and development teams are vastly smaller.
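The advice above to print control totals and verify your analysis can be illustrated with a minimal sketch in plain Python. This is not Picalo's API: the records, the field names, and the control_totals helper are all hypothetical and exist only to show the idea of reconciling the parts of an analysis back to the whole.

```python
from decimal import Decimal

# Hypothetical invoice records; the field names are illustrative only.
invoices = [
    {"id": 1, "vendor": "Acme", "amount": Decimal("120.50")},
    {"id": 2, "vendor": "Bolt", "amount": Decimal("2300.00")},
    {"id": 3, "vendor": "Acme", "amount": Decimal("75.25")},
    {"id": 4, "vendor": "Crest", "amount": Decimal("1000.00")},
]


def control_totals(records, field):
    """Return (record count, sum of `field`) for a list of dict records."""
    return len(records), sum((r[field] for r in records), Decimal("0"))


# Split the data the way an analysis might: large invoices versus the rest.
threshold = Decimal("1000")
large = [r for r in invoices if r["amount"] >= threshold]
small = [r for r in invoices if r["amount"] < threshold]

total_count, total_amount = control_totals(invoices, "amount")
large_count, large_amount = control_totals(large, "amount")
small_count, small_amount = control_totals(small, "amount")

# The parts must reconcile to the whole. If they do not, records were
# lost or duplicated somewhere in the analysis.
assert large_count + small_count == total_count
assert large_amount + small_amount == total_amount

print("records:", total_count, "amount:", total_amount)
```

Decimal is used here instead of floating point so that currency sums reconcile exactly; the same check works for any tool, whether the totals come from Picalo, ACL, IDEA, or a spreadsheet.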

 

Numarray

Those who have followed the development of Python over the years have seen Numarray (formerly Numeric) become its preferred matrix-type data structure. So why another toolkit?

First, realize that Picalo and Numarray are not mutually exclusive. In fact, although the two are not currently linked, Picalo could easily serve as an excellent front end to Numarray. It allows direct access to the Python interpreter, and it provides a nice wxWidgets table interface to your data. If someone would like to make this link (replacing Picalo's native table structure with Numarray), please feel free!

However, there are definite differences between Picalo and Numarray. These differences arise mostly because of the differing goals of the two projects. Numarray's primary goal is to support scientific applications that require huge matrix operations. Picalo's primary goal is to support the analysis of business data. This leads to the following differences:

Picalo: Goal is to support analysis of business data (i.e. relational, SQL-type data).
Numarray: Goal is to support fast matrix operations for scientific data.

Picalo: Tables can hold arbitrary data types, although columns are usually of a single type. Column 1 might hold names, column 2 might hold id numbers, column 3 might hold birthdates, and so forth.
Numarray: Arrays (matrices) are of a single type -- either all integer, all floating point, etc.

Picalo: Columns are named for the data they contain. Cells can be accessed via column name, such as mytable[5]['name'] to get the name in the 6th record.
Numarray: Arrays are grids and are accessed via their numerical indices, such as myarray[5][5].

Picalo: Tables can be joined (think SQL join), stratified, sorted, and filtered by value. Operations like selection of outliers, trending, and summarization are supported.
Numarray: Mathematics is a primary goal of the library. Complex matrix mathematics like addition, subtraction, multiplication, sine, cosine, and matrix reduction are supported.

Picalo: Tables can be saved to and loaded from TSV, CSV, relational database connections, and XML documents. Picalo also has a native file format.
Numarray: Saving and loading is beyond the scope of the project.

Picalo: Includes a user-oriented GUI written with wxWidgets.
Numarray: A data type used in Python programs and scripts.
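The difference between the two access styles above can be sketched in plain Python. This sketch uses neither Picalo's Table class nor Numarray: a list of dictionaries stands in for a business-style table and a nested list stands in for a numeric array, and all names and values are made up for illustration.

```python
import datetime

# A business-style table: each record is keyed by column name, and the
# columns hold different types (text, integers, dates).
people = [
    {"name": "Ann", "id": 101, "birthdate": datetime.date(1970, 3, 14)},
    {"name": "Bob", "id": 102, "birthdate": datetime.date(1985, 7, 2)},
    {"name": "Cho", "id": 103, "birthdate": datetime.date(1962, 11, 30)},
]

# Access a cell by record number and column name.
second_name = people[1]["name"]

# Filter and sort by column values, similar in spirit to SQL WHERE
# and ORDER BY.
born_before_1980 = [r for r in people if r["birthdate"].year < 1980]
by_birthdate = sorted(people, key=lambda r: r["birthdate"])

# A matrix-style grid: every cell has the same type and is addressed
# purely by numerical position.
grid = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
cell = grid[1][2]

# Element-wise addition of the grid with itself, the kind of operation
# a matrix library is built to make fast.
doubled = [[a + b for a, b in zip(row, row)] for row in grid]

print(second_name, cell)
```

The record-oriented form makes questions like "who was born before 1980?" easy to express, while the grid form suits arithmetic over every cell at once, which mirrors the differing goals of the two projects.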
Disclaimer: I hope I have not disrespected Numarray in this analysis. The comparison is given to clarify for users how the toolkits are different -- not to put one above the other.
 

Copyright © 2008 Picalo.org. All rights reserved.