Posts filed under 'Enterprise software'
Writing about the origins of TM1, I realized that I knew very little about the developments that led up to the invention and introduction of this seminal tool. So I decided to take a detour in this account of the origins of spreadsheet OLAP. I called Manny Perez, who filled in some of the details, as follows:
In the mid-80s, Manny was managing a departmental IT group at Exxon International Company. Oil supply and demand planning was an overwhelmingly manual process, done with paper, pencil and calculators. The only computer assistance was a rudimentary mainframe system that helped add up numbers from different offices.
The system was expensive and had few features; it seemed to Manny that it could be improved. With a degree in mathematics, it was clear to him that this was a multidimensional problem, and he started to play with solutions. At first, he thought a relational database might work, but it soon became apparent that there was no way to attain the necessary speed by simply manipulating relational tables. The data could be stored in a relational model, but to perform calculations it needed to reside in memory.
Using the advanced technology of the time, Manny put together a solution using IBM’s Time Sharing Option (TSO), which made it unnecessary to get the full use of the mainframe. He wrote it in PL/1, a high-level language supporting sophisticated data structures. Manny’s program kept the data in memory, and allowed users to pivot the “cube” and get the necessary totals. There was no spreadsheet component – the spreadsheet had not yet been invented – and the system was more of a productivity aid than a complete solution. Nonetheless, this in-house system was certainly one of the first OLAP systems ever implemented.
Innovation blues
The users liked the new system, which was much more flexible, easier to use and cheaper to run than the existing system. But to Manny’s surprise, his own IT department opposed it. The data model was unorthodox. Relational databases were the tool of choice. He had created a nameless beast without precedent, which had no place in the corporate IT world. A conflict ensued between the users and the IT department. The users prevailed and the system went into production.
This was Manny’s first experience with the political landscape that TM1 was to inhabit for many years. An in-memory multidimensional model was the right tool for the job. But it was outside of the mainstream, unblessed by research labs and universities, unknown to IBM and other large vendors. Self-respecting IT departments wanted nothing to do with it. But users, eager for useful tools to assist their work, embraced it and found every possible way to bring it into their organizations.
Manny was not the first or the last inventor to face such challenges. These are the eternal challenges of innovators; probably the first person to tame fire was driven from the communal cave. The inventors of the relational database went through a similar time in the wilderness. Institutions are conservative and skeptical of the new, no matter how apparent the benefits.
The apple on the head
Manny, though, saw the usefulness of this new model and the users’ enthusiasm. Frustrated with the opposition in his job, Manny got the idea that this could be his way to become an entrepreneur. He could create a similar system on TSO elsewhere, and sell it to others. But mainframe time, even time-shared, was too expensive and the risks were high. The plunge would be very difficult.
Just then, Manny saw VisiCalc, and he saw his opportunity. The spreadsheet was the logical front-end for his database. What is more, it ran on the new microcomputers, which were cheap enough that he could develop on one. And he could sell his product to microcomputer users. Even where these users were part of large organizations, they had a freedom to choose software that other computer users did not have. They just might buy the product.
Manny bought an early IBM PC with 256K of RAM and two floppy disks. He programmed the first version in Microsoft Pascal.
It had a spreadsheet of its own linked to a multi-dimensional database, and a cube browser. It allowed the user to create multiple cubes and share dimensions. With this product he started Sinper Corporation, and the rest has followed.
Manny says that from that point until now, everything else has been driven by user demand. We have to accept his word for this, but it diminishes nothing. The ability to listen to, understand and respond to users is a rare skill.
Followers of this site will see that we have veered a bit from our announced plans; material seems to be arriving in reverse order, as it were. But not to worry, we will cover it all, just stay tuned. As the poet said, “Theory is gray, but green is the tree of life.”
August 19th, 2006
I mentioned the new open source OLAP server PALO (OLAP backwards) in an earlier post. This TM1 workalike is still at an early stage, but any new offering in this area - especially if it is free - is of great interest to spreadsheet OLAP users.
In this post, I will give readers some idea of what this software offers, what it promises and what it lacks. I’ve put together some screenshots and added some thoughts on this entry.
Some of the people behind PALO were earlier involved with MIS AG’s spreadsheet product Alea, and PALO looks quite similar. There’s a long history here which may be worth telling elsewhere, but what matters in this context is that the designers of this software “get it” about how OLAP and spreadsheets should work together.
What’s more, the product is not merely cheap, it’s actually free - released under the orthodox GNU General Public License (GPL), which restricts distribution only by forbidding the recipients to restrict further redistribution, including source code.
But don’t plan to replace your corporate TM1 applications with a free replacement anytime soon - PALO is missing many key TM1 features, including security, an advanced calculation engine, fast imports, views, subsets and others. Some of the missing features are in the Roadmap, others may take a long time to come.
And while the server is available under the GPL, clients come under a different set of licenses and restrictions, including the free but proprietary Excel client and several completely commercial clients as well, including a web server called the Worksheet Server.
Still, this is a very interesting entry in the open source world. The server runs on Linux and Windows; the most “native” client, of course, is Excel, but it comes with a Linux Eclipse add-in to browse tables as well. You can download and install Palo from here.
Palo has a few flashy screen shots on its web site, but real spreadsheet OLAP users will want to know a bit more. I’ve put together a few screen shots to show how it looks.
PALO is a server-based system, with an Excel front end on the model pioneered by Applix TM1. The server holds the data; spreadsheets connect to the server over the network and bring in data as needed.
Here are some screenshots showing a simple session:
Connecting to a local server:

Selecting a view:

Simply selecting “paste” provides the following Excel spreadsheet.

The “PALO.DATAC” formula for the value contains a reference to the server and to the dimension elements that identify the correct intersection.
The string “Gross Profit” is also a reference to a database value, in this case a dimension element. It uses the “PALO.ENAME” formula, which pulls out a dimension element.
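The way a formula of this kind resolves a cell can be pictured with a small sketch. The Python below is not PALO's implementation or its API: the cube layout, the element names and figures, the roll-up weights and the `datac` helper are all assumptions made for illustration. Leaf values sit in a dictionary keyed by one element per dimension, and a coordinate that names a consolidated element is answered by summing over its children.

```python
from itertools import product

# Hypothetical hierarchy: consolidated element -> its children.
# Names and figures are invented, not taken from PALO's demo database.
CHILDREN = {
    "Gross Profit": ["Turnover", "Cost of Sales"],
    "Europe": ["France", "Germany"],
}

# Weight applied to a child when rolling up (Cost of Sales is subtracted).
WEIGHTS = {("Gross Profit", "Cost of Sales"): -1.0}

# Leaf cells keyed by (region, measure).
CELLS = {
    ("France", "Turnover"): 500.0,
    ("France", "Cost of Sales"): 300.0,
    ("Germany", "Turnover"): 700.0,
    ("Germany", "Cost of Sales"): 450.0,
}


def leaves(element):
    """Expand an element into (leaf element, weight) pairs."""
    if element not in CHILDREN:
        return [(element, 1.0)]
    result = []
    for child in CHILDREN[element]:
        w = WEIGHTS.get((element, child), 1.0)
        result.extend((leaf, w * lw) for leaf, lw in leaves(child))
    return result


def datac(*coordinates):
    """Return the value at an intersection, consolidating where needed."""
    total = 0.0
    for combo in product(*(leaves(c) for c in coordinates)):
        key = tuple(leaf for leaf, _ in combo)
        weight = 1.0
        for _, w in combo:
            weight *= w
        total += weight * CELLS.get(key, 0.0)
    return total
```

In this sketch, `datac("France", "Turnover")` reads one stored cell, while `datac("Europe", "Gross Profit")` is computed from four. A drill-down corresponds to replacing a consolidated element with its children and asking again.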

If we double-click on “Gross Profit” we get a “drill-down” as follows:

Similarly, if we double-click on “Variance” we get the following:

Looking at this view, we can see as well that the “title elements” - products, region, month and year are also defined by “PALO.ENAME” formulas.
Double-click on one of these, and we get a selection box like the following:

This is a normal outline view, and we can open it by clicking on the “+” signs, to produce a view as follows:

Selecting, for example, France, we get the following:

We can make “stacked slices” by dragging the “Product” dimension to the “Column titles” area as follows:

This will give us a spreadsheet like this:

And double-clicking on “All products” will open that dimension out as follows:

Dragging and dropping a few other dimensions and elements around, we look at the same data in a completely different way, something like the following:

This sort of an OLAP view is very effective in uncovering data problems - in this case, it seems the test database hasn’t been updated since 2004.
The nice thing about these spreadsheets is that they are not “magic” - we can save them and retrieve them, and on recalculation they will again retrieve the latest data from the server. We can insert a column, add a regular spreadsheet formula and format it, and everything will behave the way we expect, like this:

If we take every dimension down to the bottom level, we can type in new values which are then sent to the database. So if we want to look at budgets and enter actuals, we can work with a view like this:

If we put the cursor on January Actuals for Desktop L, we can type in the number “323”, press the Enter key, and get this screen:

The number has been sent to the server, and we are looking at a view of server data again. Of course, in the absence of security, there’s no way to make sure the user doesn’t “correct” actuals to values she likes better either.
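The rule that input is only possible once every dimension is at its bottom level can be sketched as follows. This is a toy model in Python, not PALO code: the `Cube` class, its methods and the “Qtr 1” consolidation are hypothetical, and only the “Desktop L” value of 323 comes from the example above. Leaf cells are stored, consolidated cells are computed on read, and a write to a consolidated intersection is refused.

```python
class Cube:
    """Toy cube: leaf cells are stored, consolidated cells are computed."""

    def __init__(self, children):
        self.children = children  # consolidated element -> child elements
        self.cells = {}           # (product, month, version) -> value

    def is_leaf(self, element):
        return element not in self.children

    def write(self, coordinates, value):
        """Store a value; reject intersections that include a consolidation."""
        if not all(self.is_leaf(e) for e in coordinates):
            raise ValueError("can only write at bottom-level intersections")
        self.cells[tuple(coordinates)] = float(value)

    def read(self, coordinates):
        """Read a cell, summing over the children of any consolidated element."""
        coordinates = tuple(coordinates)
        for i, element in enumerate(coordinates):
            if not self.is_leaf(element):
                return sum(
                    self.read(coordinates[:i] + (child,) + coordinates[i + 1:])
                    for child in self.children[element]
                )
        return self.cells.get(coordinates, 0.0)


# Hypothetical month hierarchy; the typed-in value mirrors the example above.
cube = Cube({"Qtr 1": ["Jan", "Feb", "Mar"]})
cube.write(("Desktop L", "Jan", "Actual"), 323)
```

Note that nothing in this sketch checks who is writing, which is exactly the gap the missing security layer leaves open.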
These slice spreadsheets can be saved and retrieved, and when recalculated with the PALO add-in, they will reflect the current status of the database from the server.
Aside from spreadsheets, though, there is no way to store views. And there is no way to store subsets either, although subsets for views can be manipulated with some basic operations. Here’s an element selection window:

Shortcomings
In short, the product looks like it’s got the basics. But it has critical shortcomings too. Of course, there are all those missing features. But looking at the roadmap, it seems they are well understood and at least some of them will be here soon. Other weaknesses concern me more.
- A key shortcoming is documentation. I tried to connect from my Excel client to a Linux server without success: the connection software asks for a login and a port number, but there’s no documentation on how to set these. Indeed, while there’s documentation on the API and the source code is open, the user side is almost completely undocumented.
- There’s little evidence that the developers of PALO are taking advantage of the internet infrastructure for open source development, or trying to attract a community of contributors outside of their walls. There’s no site on SourceForge or Savannah, which host much other GPL software. There is no wiki and no automated bug reporting such as Bugzilla or Launchpad; their Linux software is not packaged for Red Hat or Debian. Prospective users may wonder whether the software will develop rapidly or have a long life if it depends on the fortunes of the Jedox company, and does not foster feedback and contributions from a broad community.
- There are unresolved questions about how the free software model of the server will blend with the commercial angles of the clients. Is anyone thinking of a client for Open Office Calc, for example? How will Jedox respond if a group decides to produce a free web client that competes with their commercial one?
In sum, PALO has enough features to make it interesting. It probably won’t challenge the corporate sweet spot of its commercial competitors, and it hasn’t shown a deep involvement with the open source community.
On the other hand, the availability of a free spreadsheet OLAP product may open up new horizons for this genre, especially in smaller and non-profit environments. And it might attract the interest of open source heavy hitters like Sun or IBM, which would turn a very interesting corner indeed.
Barring that, the future of PALO will depend upon how the community of users, developers and implementors take to it, and how well they are able to take advantage of its open source paradigm. We will be following this with interest.
July 25th, 2006
Let me start this post by saying that Vector Space has no involvement or business interest in DQS, a startup with a product in beta. But when we heard about it from a former colleague, we thought it was worth a deeper look. Perhaps you’ll find it interesting too.
DQS proposes a new way of bringing together the information resources of large enterprises.
To understand why we’re interested in DQS, we must peek at the dark side of IT. Consider the following common scenario:
Our consultants go to a meeting with a major client.
“I need to analyze corporate operations,” says the customer. “Sales, cost of sales, profitability by channel, that sort of thing, month by month.
“Our company makes 123 different products, it sells them in 54 regions, with 37 sales channels, and produces them at 19 plants. Although it’s a lot of information, we’re already capturing it all in our systems. All I need you to do is pull it together and put it on my computer screen so I can work with the big picture. Can you do it by the end of the month?”
One might imagine every serious enterprise already has this information. After all, how does anyone run a company without knowing what they make on each product? Changes in sales for their channels? Seasonal and regional variations in demand?
One might also imagine that such a facility is easily created.
But imagination would be wrong, on both counts.
Companies are run without this vital information every day.
And applications that draw information from all over the enterprise pose daunting challenges, because of the way IT systems are typically structured.
The information archipelago
Every large company has many systems. Just a little digging usually shows that corporate information resources reside in a patchwork of purpose-built systems.
- There are systems for different business functions: a sales system, a production system, another for inventory;
- Local offices create their own systems to meet their special needs, especially where the company operates in several countries;
- Where there have been mergers and acquisitions, which is to say at most large companies, legacy systems of the merger parties continue for years, resulting in further division even within the same function. So it is not unusual to find multiple sales systems and multiple production systems within one company.
To make things worse, these systems run on different hardware using different operating systems and different database technologies. Even where they run on software from the same vendor (SAP, for example), implementation can be different and incompatible. And in many cases, similar concepts (product, for example) can be represented by different codes, or tracked at different levels of detail.
Each of the incompatible legacy systems embodies years of staff effort, and represents a complex study in its own right, leading to major staffing headaches. Someone working on one system can only move to another after a significant period of retraining and apprenticeship.
Integration can be shockingly difficult.
Bridge-building vs. consolidation
For analysis, and for every other enterprise function, communication among systems is a constant issue. Some IT departments spend a majority of their resources building bridges connecting their data archipelago, especially in the wake of a merger.
Analytics consultants like ourselves spend no end of time (and customer money) working on what we call ETL – Extraction, Transformation and Loading – in other words, systems reconciliation.
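A minimal sketch of what such a reconciliation involves, in Python. Everything in it is assumed for illustration: the two source layouts, the field names, the product codes and the mapping table are invented, not drawn from any client system. Each source gets its own transform into a shared (product, month, value) shape, and the load step adds the results together.

```python
from collections import defaultdict

# Hypothetical extracts from two sales systems that code products differently.
SYSTEM_A = [{"prod": "A-100", "month": "2006-01", "sales": 1200.0}]
SYSTEM_B = [{"item_no": "00100", "period": "200601", "amount": 800.0}]

# Hand-maintained mapping from each system's product code to a shared code.
PRODUCT_MAP = {("A", "A-100"): "WIDGET", ("B", "00100"): "WIDGET"}


def transform_a(row):
    """System A already uses YYYY-MM periods; only the product code changes."""
    return PRODUCT_MAP[("A", row["prod"])], row["month"], row["sales"]


def transform_b(row):
    """System B stores periods as YYYYMM and names its fields differently."""
    period = row["period"]
    month = f"{period[:4]}-{period[4:]}"
    return PRODUCT_MAP[("B", row["item_no"])], month, row["amount"]


def load(rows_a, rows_b):
    """Combine both sources into totals keyed by (product, month)."""
    totals = defaultdict(float)
    for product, month, value in map(transform_a, rows_a):
        totals[(product, month)] += value
    for product, month, value in map(transform_b, rows_b):
        totals[(product, month)] += value
    return dict(totals)


consolidated = load(SYSTEM_A, SYSTEM_B)
```

The code itself is trivial; the expense lies in building and maintaining the mapping table and one transform per source, and that work is repeated for every pair of systems that has to be bridged.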
Seeking a detour around the data swamp, many companies initiate a new master system, designed to replace the many existing systems. But since the old systems are needed for ongoing business, a new system is – at least in the short run – simply one more system for the stew.
Mega-system risks
Building a new comprehensive system is also very risky. Big-ticket items costing in the many millions of dollars, such facilities typically require a large dedicated staff and multiple years to complete. To achieve success, they require impeccable design and execution, and a very stable development team.
Even under the best of circumstances, enterprise systems projects only begin to display measurable results towards the end. But they can only succeed with a reliable, unfailing, continuous commitment by top management and investors over long months and years. Impressive persuasive powers are required to maintain such a commitment years in advance of results.
A new system of this size is critically vulnerable to management changes, to further mergers and acquisitions, and to the financial fortunes of the organization. And the costs of such projects can depress the bottom line dramatically.
The DQS idea
The founders of DQS have a different idea. A great deal of useful business logic and experience is embodied in the well-tested and highly-debugged legacy systems of every company. The problem is not the systems, which typically work very well, but their mutual incompatibility.
What if, instead of building bridges system by system, we designed a method for systems to communicate in general? After all, the incompatibilities between systems are not endless – they can be enumerated and solved in a general way.
What if each computer in the enterprise could be retrofitted with a program whose only purpose is to communicate with the others, using a common communication protocol? And what if these communication links could be made extremely easy to access, so that the information resources of the enterprise appear to reside on one virtual system even though they really live on multiple systems?
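The general shape of that idea can be sketched in a few lines of Python. To be clear, this is not DQS’s technology, which we have not seen in detail; the class names, the `query` method and the sample figures are all hypothetical. It only illustrates the pattern of wrapping each system in an adapter that speaks one common interface, so that a caller sees a single virtual system.

```python
from typing import Protocol


class Adapter(Protocol):
    """Common interface every retrofitted system is assumed to expose."""

    def query(self, measure: str, period: str) -> float: ...


class MainframeSales:
    """Stand-in for a legacy system with its own codes and period format."""

    def __init__(self):
        self._records = {("SALES", "200606"): 1500.0}

    def query(self, measure, period):
        # Translate the common request into this system's native keys.
        return self._records.get((measure.upper(), period.replace("-", "")), 0.0)


class RegionalSales:
    """Stand-in for a local office system that keeps rows in a list."""

    def __init__(self):
        self._rows = [{"measure": "sales", "period": "2006-06", "value": 400.0}]

    def query(self, measure, period):
        return sum(
            r["value"]
            for r in self._rows
            if r["measure"] == measure and r["period"] == period
        )


class VirtualSystem:
    """Presents many adapters as if they were one system."""

    def __init__(self, adapters):
        self.adapters = list(adapters)

    def query(self, measure, period):
        return sum(a.query(measure, period) for a in self.adapters)


enterprise = VirtualSystem([MainframeSales(), RegionalSales()])
```

Here the translation work is done once per system, inside its adapter, rather than once per pair of systems.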
John Walsh, Chairman and Chief Strategy Officer, wrote to me in a recent email:
We are in beta now with our enterprise version of what we refer to as an Optimized Service Oriented Architecture. We have spent the last four and a half years patenting and developing the technology. We were granted a special patent reserved for technologies that are “in the national interest”. We were told by the patent office that only 10 of these patents have been awarded in past 40 years. We have several other patents pending.
The technology enables you to quickly create an inventory of re-usable objects – data structures, software, and processes – that can be used in designing, testing and implementing applications and processes. The technology is language and operating environment agnostic. We do not rely on XML or Web Services standards.
We have a run time environment that creates a linked network out of disparate computers.
The interface allows users to collaborate and share distributed data, software, processes and computers without regard to geography.
Our goal is to “industrialize” Information Technology. Our models are the manufacturing and construction industries in which products are designed and then assembled from components built by hundreds or thousands of different vendors located all around the world.
This is the premise of DQS. It looks promising to us, and we’re curious to see how it plays out. We’ll bring more to you as we learn more ourselves – stay tuned.
June 27th, 2006