Archive for August, 2006
Writing about the origins of TM1, I realized that I knew very little about the developments that led up to the invention and introduction of this seminal tool. So I decided to take a detour in this account of the origins of spreadsheet OLAP. I called Manny Perez, who filled in some of the details, as follows:
In the mid-80s, Manny was managing a departmental IT group at Exxon International Company. Oil supply and demand planning was an overwhelmingly manual process, done with paper, pencil and calculators. The only computer assistance was a rudimentary mainframe system that helped add up numbers from different offices.
The system was expensive and had few features; it seemed to Manny that it could be improved. With a degree in mathematics, it was clear to him that this was a multidimensional problem, and he started to play with solutions. At first, he thought a relational database might work, but it soon became apparent that there was no way to attain the necessary speed by simply manipulating relational tables. The data could be stored in a relational model, but to perform calculations it needed to reside in memory.
Using the advanced technology of the time, Manny put together a solution using IBM’s Time Sharing Option (TSO), which made it unnecessary to get the full use of the mainframe. He wrote it in PL/1, a high-level language supporting sophisticated data structures. Manny’s program kept the data in memory, and allowed users to pivot the “cube” and get the necessary totals. There was no spreadsheet component – the spreadsheet had not yet been invented – and the system was more of a productivity aid than a complete solution. Nonetheless, this in-house system was certainly one of the first OLAP systems ever implemented.
Innovation blues
The users liked the new system, which was much more flexible, easier to use and cheaper to run than the existing system. But to Manny’s surprise, his own IT department opposed it. The data model was unorthodox. Relational databases were the tool of choice. He had created a nameless beast without precedent, which had no place in the corporate IT world. A conflict ensued between the users and the IT department. The users prevailed and the system went into production.
This was Manny’s first experience with the political landscape that TM1 was to inhabit for many years. An in-memory multidimensional model was the right tool for the job. But it was outside of the mainstream, unblessed by research labs and universities, unknown to IBM and other large vendors. Self-respecting IT departments wanted nothing to do with it. But users, eager for useful tools to assist their work, embraced it and found every possible way to bring it into their organizations.
Manny was not the first or the last inventor to face such challenges. These are the eternal challenges of innovators; probably the first person to tame fire was driven from the communal cave. The inventors of the relational database went through a similar time in the wilderness. Institutions are conservative and skeptical of the new, no matter how apparent the benefits.
The apple on the head
Manny, though, saw the usefulness of this new model and the users’ enthusiasm. Frustrated with the opposition in his job, Manny got the idea that this could be his way to become an entrepreneur. He could create a similar system on TSO elsewhere, and sell it to others. But mainframe time, even time-shared, was too expensive and the risks were high. The plunge would be very difficult.
Just then, Manny saw VisiCalc, and he saw his opportunity. The spreadsheet was the logical front-end for his database. What is more, it ran on the new microcomputers, which were cheap enough that he could develop on one. And he could sell his product to microcomputer users. Even where these users were part of large organizations, they had a freedom to choose software that other computer users did not have. They just might buy the product.
Manny bought an early IBM PC with 256K of RAM and two floppy disks. He programmed the first version in Microsoft Pascal.
It had a spreadsheet of its own linked to a multi-dimensional database, and a cube browser. It allowed the user to create multiple cubes and share dimensions. With this product he started Sinper Corporation, and the rest has followed.
Manny says that from that point until now, everything else has been driven by user demand. We have to accept his word for this, but it diminishes nothing. The ability to listen to, understand and respond to users is a rare skill.
Followers of this site will see that we have veered a bit from our announced plans — material seems to come in in a reverse order, as it were. But not to worry, we will cover it all, just stay tuned. As the poet said, “Theory is gray, but green is the tree of life.”
August 19th, 2006
I wish I had a penny for every person who has asked me what Applix TM1 is and why we like it so much. Even enthusiastic users often know little of its origins. I thought readers of this website might want to know a little more. I meant to write a short piece, but it just grew, so I’ve split it into parts. Let me know what you think. . . . David
In the beginning there was the spreadsheet
TM1 began as a part of the spreadsheet “revolution” in 1984. Dan Bricklin had invented the spreadsheet with VisiCalc in 1979; Lotus 1-2-3 followed in 1983. This new type of computer application exponentially increased the ability of users to apply computer power to problems of their choice without expert intervention. Modelled on a simple grid of paper, the spreadsheet was easy to grasp and easy to use. It could be applied to models large and small, to all manner of reports, fair-sized databases and large calculations involving many variables and formulas.
Unlike many key innovations in computing, the spreadsheet had no roots or theory from the academy, or from the research labs of AT&T, IBM, Xerox or the like. It had no direct predecessors. It used the unusual but easy declarative programming model, replacing thousands of lines of FORTRAN with a series of formulas that a high school student could understand.
Success on the bleeding edge
On the bleeding edge of technology, the spreadsheet could easily have faded back to wait for its moment like many other brilliant ideas. But it had some unique advantages. It was extremely accessible to users, immediately useful, and, perhaps most important, appeared just as microcomputers were coming on the scene. Bricklin’s website tells the story well.
The spreadsheet became the tool of choice for a wide variety of bright and creative people in many fields – financial managers, of course, but also planners, business analysts, economists and scientists.
Trouble in Paradise
But as spreadsheet use expanded, and people started to use it for more and bigger projects, the simple paradigm of the ruled sheet of paper became unwieldy.
- Large sheets were slow, and pushed the limits of memory. To get around this, users split single spreadsheets into galaxies of sheets linked by external formulas and macros. But this sacrificed transparency; complex systems became large and bug-prone.
- The classic spreadsheet has two dimensions, but the data it represents often has three or more. A simple table showing accounts and months is easy. But how does a two-dimensional spreadsheet show a book of income statements for many companies that transact business with each other – or break down a statement by product and region?
- The spreadsheet combines data and program elements in a single object. How can the user (or the auditor) easily tell the difference, and how can errors buried deep in the sea of formulas be found? All of these problems have solutions, but they require labor and user discipline.
- Spreadsheets are usually used by a team, passed around from one to the other, with changes being made in different places. This easily results in a situation where everybody has a slightly different version of the sheet, and the truth is hard to find.
The next step
Many developers at the time looked for a logical next step for the spreadsheet.
Manuel (Manny) Perez, the “father” of TM1 — now CTO at Applix — came up with a unique combination of features that addressed all of these problems.
Manny created a product that integrated a database into the spreadsheet, keeping an extremely tight, formula-driven link between the two. The database was not a conventional relational database, but a special kind of multidimensional database designed for the types of things spreadsheets often do. Each value in the database was identified by a set of strings, one from each of its user-defined dimensions. For example, a budget number was identified as the intersection of “2006”, “January”, “Sales”, “Striped paint”, and “Boston”. The dimensions could specify how to “consolidate” numbers. Using this information, if the user asked for “Eastern Region,” the database would calculate a total of the numbers for “Boston”, “New York” and “Washington”, for instance.
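For readers who think in code, the addressing and consolidation scheme described above can be sketched in a few lines of Python. This is only an illustration, not TM1's actual implementation: the dimension order, the `get_value` and `leaves` names, and the budget figures are all assumptions made up for the example.

```python
from itertools import product

# Assumed dimension order for this sketch; a real cube defines its own.
DIMENSIONS = ("year", "month", "account", "product", "city")

# Leaf values, each identified by one string per dimension.
# The numbers are invented purely for illustration.
cube = {
    ("2006", "January", "Sales", "Striped paint", "Boston"): 100.0,
    ("2006", "January", "Sales", "Striped paint", "New York"): 250.0,
    ("2006", "January", "Sales", "Striped paint", "Washington"): 75.0,
}

# Consolidation rules: a parent element lists the children it totals.
consolidations = {
    "Eastern Region": ["Boston", "New York", "Washington"],
}


def leaves(element):
    """Expand an element into the leaf elements it consolidates."""
    children = consolidations.get(element)
    if children is None:
        return [element]  # already a leaf
    result = []
    for child in children:
        result.extend(leaves(child))  # consolidations may nest
    return result


def get_value(*coords):
    """Return the value at an intersection, summing over consolidations."""
    if len(coords) != len(DIMENSIONS):
        raise ValueError("expected one element per dimension")
    total = 0.0
    # Visit every combination of leaf elements under the requested coordinates.
    for leaf_coords in product(*(leaves(c) for c in coords)):
        total += cube.get(leaf_coords, 0.0)  # empty cells count as zero
    return total
```

With these made-up figures, asking for “Boston” returns the stored number, while asking for “Eastern Region” in the same position returns the total of the three cities. Nothing is stored for the region itself; the total is worked out from the consolidation rule when it is requested.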
These things may seem routine to OLAP users today, but at the time the term OLAP had not been coined and only a few products did anything similar. A good industry history can be found here.
Already in its early versions, TM1 included a multitude of useful features for its user-developers, including the ability to work with more than one “cube” (or multidimensional table), the ability to store strings in cubes, and the “process” command, which uses a spreadsheet to import ASCII data files into the database.
Spreadsheet OLAP
Manny’s TM1 spreadsheet had all the usual functionality, but additionally included a set of special functions to retrieve data from and send it to the database. These formulas worked with strings, so a value could be retrieved based upon the text contained elsewhere in the spreadsheet. And they worked just like all other spreadsheet formulas, so that new data could be brought in by simply changing the value of a cell and recalculating.
This innovation solved most of the problems with spreadsheets until then. It separated data from formulas. It made many very large spreadsheets unnecessary by storing their data in the database. And of course it was multidimensional from the ground up.
Additionally, Manny consistently implemented all aspects of use and administration to work from spreadsheets, making the product very friendly to spreadsheet users.
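The string-driven retrieval formulas described in this section can be illustrated with a toy model. The sketch below is not the TM1 formula language: `db_read`, `db_send`, `recalculate`, the cell names and the figures are all hypothetical, chosen only to show how a formula that takes its coordinates from other cells picks up new data on recalculation.

```python
# Toy database: values keyed by (month, account, city). Figures are made up.
database = {
    ("January", "Sales", "Boston"): 100.0,
    ("February", "Sales", "Boston"): 120.0,
}


def db_read(*coords):
    """Hypothetical retrieval function: look up a value by string coordinates."""
    return database.get(tuple(coords), 0.0)


def db_send(value, *coords):
    """Hypothetical send function: write a value back to the database."""
    database[tuple(coords)] = value


# A tiny sheet: plain cells hold text, formula cells hold callables.
sheet = {
    "A1": "January",
    "A2": "Sales",
    "A3": "Boston",
    # The formula reads its coordinates from the text in A1..A3.
    "B1": lambda cells: db_read(cells["A1"], cells["A2"], cells["A3"]),
}


def recalculate(sheet):
    """Return displayed values, evaluating formulas against the plain cells."""
    plain = {name: v for name, v in sheet.items() if not callable(v)}
    return {name: (v(plain) if callable(v) else v) for name, v in sheet.items()}
```

In this sketch, changing the text in cell A1 from “January” to “February” and calling `recalculate` again makes B1 show a different number, without any data having been copied into the sheet. The sheet holds only labels and formulas; the numbers stay in the database.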
A move to client-server technology
In 1989, Manny took a further leap, implementing the database side as a server for spreadsheet clients, thus making TM1 a multi-user tool for group collaboration protected by security. This solved the last spreadsheet problem mentioned above: the multiple versions of the truth contained in different documents.
While early releases included the proprietary spreadsheet, TM1 soon made the leap to enabling Lotus and then Excel spreadsheets to serve as clients. Following this came a series of enhancements and improvements, each remarkable in its own way. Manny kept a very close line of communication with TM1’s key power users and seemed to be exactly on time and in tune with the evolving needs of the product’s base.
Spreadsheet roots
TM1 continued to be closely connected to its spreadsheet roots.
- Like the spreadsheet from which it sprang, TM1 is a productivity tool for power users – in some ways, a quintessential “horizontal” tool useful in a wide variety of applications.
- Like the spreadsheet, TM1 found an enthusiastic home in financial applications.
- Like the spreadsheet, TM1 enabled clever users to produce one-off applications of tremendous power.
- Finally, like the spreadsheet, TM1 embodied new concepts which were not originally pioneered by the academies or research institutes.
TM1 quickly caught on with a small group of sophisticated spreadsheet programmers, and developed a very loyal following. Many thought TM1 was the “next big thing”, the logical successor to Lotus 1-2-3, but time was to prove them wrong.
– In the next section, we will discuss some of the further history of TM1 and current developments. We will also reflect on some of TM1’s obstacles and successes, and muse on the many ways in which it was ahead of its time.
August 8th, 2006
Following our post on the new PALO open source package, Jedox president Kristian Raue agreed to answer a few questions about the company’s plans for the product.
How is PALO planning to relate to the broader world of spreadsheet OLAP?
By July 2006 more than 10,000 people had downloaded Palo and we have 50 new downloads each day. So I think we have already had a large impact in the spreadsheet OLAP market. With the availability of a 32- and 64-bit version and with the availability of a Linux version for Palo our target market is potentially bigger than the market of other well-known players in the spreadsheet OLAP field. With Palo 1.5 later this year, we will have access rights, element attributes, an Embedded Transaction Engine and also an advanced engine that promises even more speed. And after 1.5 we will have a 2.0 release in 2007, hopefully 1st Quarter 2007.
What is your corporate outlook? Do you intend to make most of your money on the client software? Are you planning an “enterprise” Excel client or something of the kind?
Jedox makes money by selling Worksheet-Server, which is our OLAP-enabled multi-user Excel-to-Web solution for corporate use. We also earn money by selling support and consultancy work regarding Palo and Worksheet-Server implementations. At this time we don’t plan on making money by selling some advanced or enterprise version of the Palo Excel-Client or Palo Server. With this business model closely bound to Open-Source technologies we managed to grow more than 100% in revenue each year.
What is your relation to the German spreadsheet OLAP maker MIS-AG?
There are basically two relationships to MIS AG. In 2002 I sold my shares of Intellicube AG, which was the initial developer of OnVision, to MIS AG. And second, Peter Raue, formerly president of MIS AG, was my brother. He unfortunately passed away in 2004.
Is the codebase of PALO completely new, or does it derive from Alea?
It is completely new, we have never seen or touched the code base of ALEA (or Applix TM1).
Do you intend to relate in some way to Microsoft beyond Excel? What about Analysis Services?
I loved Excel from the very first day that it appeared back in 1987. Apart from that, there is not much relation to Microsoft. You can connect Palo and MS AS using Cubeware.
Can we expect to see some unique features in the product?
Yes. In some ways, Palo is unique already today for example with its Linux version. In September we will have the ETE (Embedded Transaction Engine), which will allow the server to trigger other processes when server events occur. Such processes could include, for example, PHP scripts or user programs. It will be very interesting to see where clever users and developers go with such unprecedented capabilities. With 15 years of intensive experience in the spreadsheet OLAP market we will deliver more unique features in future releases.
What about the clients you are producing? Are you planning to develop for Open Office too?
For Palo there are APIs available in all directions (C, C++, .NET, PHP and Java at www.jpalo.net). We used the .NET API to build our freeware Excel client. So far, we are not supporting Open Office, but that might change in the future, or somebody else may volunteer to do it.
How will Jedox respond if a group decides to produce a free web client that competes with your commercial one?
We would love to see such a development, which would expand the visibility of Palo. By the way, we also like Google Spreadsheets and Excel 12 Server. These products help us develop the market for Excel-to-Web solutions.
How hard is it for a developer to put a web client together using open source tools like PHP?
Easy. Have a look at the demo source code for PHP that you can download with the Palo SDK.
What about involving a community of developers? Are you intending to involve others outside of the company, or do you intend to do the development in-house?
MOLAP is about speed. So we currently decided to develop the core engine of Palo ourselves to make sure it is fast. For all other aspects of Palo (clients, ETL, reporting tools, etc.) we are very open and supportive of others outside the company. Look at the jpalo project, for example; it is developed by Tensegrity.
Have you had feedback or reviews so far?
Very positive feedback from our customers so far. In October the first book about Palo will appear. Palo is already used successfully in large companies and organizations, for example check this.
What about documentation?
We are continuously improving the documentation and welcome suggestions about missing parts. Also the book - which will probably be in German and probably also in English and French - will help a lot. If you have a problem with missing information, simply use the Palo Forum. This forum is closely monitored by our team. If you think you have found a bug, please report it to our bug tracker.
In our review, we cited an apparent lack of open source infrastructure for PALO, noting that there is no site on Savannah or Sourceforge, or automated bug tracking. Do you have any comments on this?
Both the forum and the public bug tracker have been available since the very first day that Palo came out.
August 3rd, 2006
Every now and then we see a request for the OLAP Council benchmark, last updated in 1998; since nobody else seems to have all the materials in one place, we have decided to put them here as a service.
- The specification can be downloaded here.
- The data generation program is here.
While it may be somewhat dated now, this is still the only such benchmark there has been. Our friends Erik Thomsen and George Spofford of DSS Labs produced it for the OLAP Council, which was an industry association of many of the leading OLAP companies. Thomsen and Spofford, who worked with us for a while, have maintained a unique focus on theory in an industry not noted for its enthusiastic support for theory.
The diversity of OLAP products ensured that results of benchmark runs could be — and were — conditioned and spun in various ways by the sponsoring companies. Nonetheless, the benchmark continues to represent a set of basic tasks which OLAP products can be expected to perform, and which can be measured and timed.
The OLAP Council no longer exists, and neither does its function of certifying benchmark auditors, so this is hardly a current project. We are making available on our site the benchmark specification, and also a little program that generates data files to run the specification.
August 1st, 2006
This article says that almost nobody reads an Internet news post more than 36 hours old. Well, that article is almost a month old, but it’s still very timely.
This is by way of saying that now that this site has begun to take on a life, it seems that we’re producing new content about once a week. We’re no fireballs, but hopefully the infrequency is balanced by the quality and subject matter. How are we doing? Send us email at Vector Space.
August 1st, 2006