Year: 2010

Dave Kellogg on Palantir

I recently began reading the blog written by Dave Kellogg, the CEO of Mark Logic, a company devoted to XML-based content management. I think I first noticed them when I discovered what cool technology EMC got when it bought X-hive, which has now become Documentum xDB/XML Store. Mark Logic and X-hive were of course competitors in the XML database market. In a recent blog post he reflects on the Palantir product after attending their Government Conference.

The main subject of his post is different business models for a startup. That is not my area of expertise and I don’t have any particular opinion on it, although I tend to agree with him, and it was interesting to read his reflections on how other companies such as Oracle (yet another competitor to Mark Logic and xDB) have approached this.

Instead my thinking is based around his analysis of the product that Palantir offers and how that technology relates to other technology. I think most people (including Kellogg) mainly view Palantir as a visualisation tool because you see all these nice graphs, bars, timelines and maps displaying information. What they tend to forget is that there is a huge difference between a tool that ONLY does visualisation and one that actually lets you modify the data (or rather the contextual data around it, such as metadata and relations) within those perspectives. There are many different tools around Social Network Analysis, for instance. However, many of them assume that you already have databases full of data just waiting to be visualised and explored. Nothing new here. This is also what many people use Business Intelligence toolkits for: accessing data that is already there in warehouses, although the effort of getting it there from transaction-oriented systems (like in retail) is not small in any way. However, the analyst using these visualisation-heavy toolkits accesses the data read-only and only adds analysis of data that is already structured.

Here is why Palantir is different. It provides access to raw data such as police reports, military reports and open source data, most of it in unstructured or semi-structured form. When the data comes into the system it is not viewable in all these fancy visualisation windows Palantir has. Instead, the whole system rests on a collaborative process where people perform basic analysis, which includes manual annotation of words in reports. This digital marker pen allows users to create database objects or connect to existing ones. Sure, this is supported by automatic features such as entity extraction, but if you care about data quality you do not dare to run them in automatic mode. Only after all this is done can you start exploring the annotated data and the linkages between objects.
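To make the marker-pen idea concrete, here is a minimal sketch of what annotating a report could look like as data. It is purely illustrative: the class names, fields and the "same name and kind means same object" rule are my own assumptions and say nothing about how Palantir actually stores things.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A database object such as a person, place or organisation."""
    name: str
    kind: str
    # Each mention is a (report_id, start, end) tuple pointing into a report.
    mentions: list = field(default_factory=list)


class AnnotationStore:
    """Hypothetical in-memory store, not Palantir's actual data model."""

    def __init__(self):
        self.entities = {}

    def annotate(self, report_id, text, phrase, kind):
        """Mark `phrase` in `text` and link it to a new or existing entity."""
        start = text.find(phrase)
        if start == -1:
            raise ValueError(f"{phrase!r} not found in report {report_id}")
        key = (phrase.lower(), kind)
        # Connect to the existing object if an analyst already created it,
        # otherwise create a new one.
        entity = self.entities.setdefault(key, Entity(phrase, kind))
        entity.mentions.append((report_id, start, start + len(phrase)))
        return entity

    def co_mentioned(self, entity):
        """Names of other entities annotated in the same reports."""
        reports = {m[0] for m in entity.mentions}
        return sorted(
            other.name
            for other in self.entities.values()
            if other is not entity
            and reports & {m[0] for m in other.mentions}
        )


store = AnnotationStore()
report_1 = "John Doe met a contact in Oslo."
report_2 = "A second meeting took place in Oslo."
john = store.annotate("r1", report_1, "John Doe", "person")
oslo = store.annotate("r1", report_1, "Oslo", "place")
oslo_again = store.annotate("r2", report_2, "Oslo", "place")
```

The second annotation of "Oslo" connects to the object created by the first one, and it is only once such links exist that a question like `store.co_mentioned(john)` has anything to return.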

However, I do agree with Dave Kellogg that if people think BI is hard, this is harder. The main reason is that you have to have a method or process to do this kind of work. There are no free lunches, and there is no point dreaming about full automation here. People also need training and the right mindset to be able to work efficiently. Having played around with TIBCO Spotfire lately, I feel that there is a choice between integrated solutions like Palantir, which has features from many software areas (BI, GIS, ECM, search etc.), and dedicated toolkits with your own integration. Powerful BI with data mining is best done in BI systems, but those will probably never provide the integration between features that vendors like Palantir offer. An open architecture based on SOA can probably make integration easier in many ways.

Why iPhone OS (iPad) is ECM…

I like Twitter. It exposes me to a lot of interesting thoughts from the interesting and smart people that I follow. Today I read a post called Why the iPad Matters – Its the Beginning of the End by Carl Frappaolo. It talks a lot about why the iPad brings a new promise for content delivery: a complete digital chain. It made me think about one of the things that is unique to the iPod/iPhone/iPad, namely the lack of a folder-based file system exposed to users. Surprisingly (maybe), it is the lack of it that makes the whole user experience much better.

So how does this relate to ECM then? Well, I guess many of us ECM evangelists (or “Ninjas”, as I heard today) have been in endless meetings and briefings explaining the value of metadata and the whole “context infrastructure” around each object in an ECM system, which can hold fine-grained permissions, lifecycles, processes, renditions and so forth. I have even found myself explaining the ECM concept using iTunes as an analogy. You tag the songs with metadata and access them through playlists, which in essence are virtual folders where each song can be visible in many playlists. That is the same concept as the “Show in folder” flag in Documentum. Metadata can even power Smart Playlists, which in essence are just saved search queries, something we have added as a customization in Documentum Digital Asset Manager (DAM). So in essence the iTunes Library (should we call it a repository 🙂) is a light version of an ECM system. Before continuing, I really wonder why I have to customize Documentum to get the GUI features that iTunes provides…?
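The playlist analogy is easy to sketch in a few lines of Python. This is only an illustration under my own assumptions: the `Song` fields, the sample library and the `smart_playlist` helper are made up for the example and are neither iTunes nor Documentum APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Song:
    title: str
    artist: str
    genre: str
    rating: int  # 0 to 5 stars


# The library holds each song exactly once (hypothetical sample data).
LIBRARY = [
    Song("Song A", "Artist X", "Jazz", 5),
    Song("Song B", "Artist Y", "Rock", 3),
    Song("Song C", "Artist X", "Jazz", 2),
    Song("Song D", "Artist Z", "Rock", 5),
]

# Ordinary playlists are virtual folders: lists of references to songs,
# so the same song can show up in several of them without being copied.
favourites = [LIBRARY[0], LIBRARY[3]]
road_trip = [LIBRARY[3], LIBRARY[1]]


def smart_playlist(**criteria):
    """Return a saved query that matches songs on exact metadata values."""
    def run(library):
        return [
            song for song in library
            if all(getattr(song, key) == value
                   for key, value in criteria.items())
        ]
    return run


# The saved search is defined once and evaluated every time it is opened.
top_jazz = smart_playlist(genre="Jazz", rating=5)
current = top_jazz(LIBRARY)
```

Because `top_jazz` stores the criteria rather than the result, a five-star jazz song added to the library later turns up the next time the playlist is opened, with no one filing it anywhere.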

So iTunes abstracts away the folder-based file system on a Mac or Windows PC, but as long as you are using Mac OS X or Windows the file system is still there, right? Some people even get really frustrated by iTunes and just can’t get their head around the fact that there is no need to move files around manually when syncing them to iPhone OS-powered devices. And here comes the beauty: on these devices there is no folder-based file system to access. There is just the iPod App for music, the Photos App for photos and so forth. All your content is suddenly displayed in context and filtered based on metadata and that App’s specific usage.

To some degree that means that iPhone OS-based devices can not only make content delivery digital but also provide a much better user interface, powered by all these ECM features that we love (and have a hard time explaining). Suddenly we have an information flow based entirely on metadata instead of folder names and file names. Maybe that will make ECM not only fun but also much quicker at answering the dreaded “What’s in it for me?” question.

Now, can someone quickly write an iPad App for Documentum so I can make my point 🙂 It will be a killer app, believe me!

CPU, Cores and software licenses

In an article in ComputerWorld there is a good discussion of the license models of different software vendors. There seems to be a mix of per-socket pricing and some notion of a CPU, where each CPU corresponds to a number of processor cores. In EMC’s case, for instance, a CPU license corresponds to 2 cores, and Oracle has a similar model. The number of processor cores is steadily increasing and soon 6-8 cores per socket will be common on server hardware. I agree with the article that these models need some kind of revision. This is especially true if you sign longer contracts, where this development can lead to some interesting issues. Server hardware needs to be replaced sooner or later for power, storage or plain performance reasons. It is not uncommon that the idea is to get fewer but more powerful servers in order to save power and cooling.

The interesting effect is that even if you can consolidate software applications onto less hardware, each of them oversteps its license in terms of server cores. What about virtualisation then? Well, that is of course also the future, so that power can be load-balanced between applications more easily. However, that means that the license model must allow for using virtualisation to throttle down to any number of cores per licensed application. In Oracle’s case, again, that usually means a requirement to run their own virtualisation product even if you have a VMware investment.
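A bit of arithmetic shows how quickly this bites. The sketch below uses the 2-cores-per-CPU-license ratio mentioned above; the server counts and core counts are hypothetical numbers picked for illustration, not figures from the article or from any vendor's price list.

```python
import math

CORES_PER_CPU_LICENSE = 2  # ratio mentioned above for EMC


def licenses_needed(servers, sockets_per_server, cores_per_socket,
                    cores_per_license=CORES_PER_CPU_LICENSE):
    """CPU licenses required when every physical core must be covered."""
    total_cores = servers * sockets_per_server * cores_per_socket
    return math.ceil(total_cores / cores_per_license)


# Hypothetical old fleet: 4 servers, 2 sockets each, dual-core CPUs.
old_fleet = licenses_needed(servers=4, sockets_per_server=2, cores_per_socket=2)

# Hypothetical replacement: half as many servers, but 8 cores per socket.
new_fleet = licenses_needed(servers=2, sockets_per_server=2, cores_per_socket=8)

print(old_fleet, new_fleet)  # prints "8 16"
```

Under these assumptions, halving the number of servers doubles the number of CPU licenses, which is exactly the kind of surprise a long contract does not anticipate.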

The Long Tail of Enterprise Content Management

Question: Can we expect a much larger amount of the available content to be consumed or used by at least a few people in the organisations?

Shifting focus from bestsellers to niche markets
In 2006 the editor-in-chief of Wired magazine, Chris Anderson, published his book ”The Long Tail – Why the Future of Business is Selling Less of More”. Perhaps the text printed at the top of the cover, ”How Endless Choice is Creating Unlimited Demand”, is the best summary of the book. What follows may have been said many times before, but I felt a strong need to put my reflections into text after reading it. The book put a vital piece of the puzzle in place for me when I saw the connections to our efforts to implement Enterprise 2.0 within an ECM context.

Basically Chris Anderson sets out to explain why companies like Amazon, Netflix, Apple iTunes and several others make a lot of money selling small amounts of a very large set of products. It turns out that even out of millions of songs, books and movies, nearly all are rented or bought at least once. Three things make this possible:

– Democratization of production, which means that the tools and means to produce songs, books and movies are available to almost everybody at a relatively low cost.
– Democratization of distribution, where companies can broker large amounts of digital content because the cost of keeping a large stock of digital content is very low compared to real products on real shelves in real warehouses.
– Connecting supply and demand, so that all this created content meets its potential buyers. The tools for that are search functions, rankings and collaborative reviews.
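The effect of these three forces can be illustrated with a toy calculation. The sketch assumes that sales fall off as 1/rank (a Zipf-like curve, which is my simplification and not a figure from the book) and that a physical store only has shelf space for the 1,000 best-selling titles; both the function names and the numbers are hypothetical.

```python
def zipf_sales(catalogue_size, top_seller_units=10_000):
    """Units sold per title, assuming sales fall off as 1/rank."""
    return [top_seller_units / rank for rank in range(1, catalogue_size + 1)]


def tail_share(sales, shelf_size):
    """Fraction of all units sold by titles ranked below `shelf_size`."""
    total = sum(sales)
    return sum(sales[shelf_size:]) / total


SHELF_SIZE = 1_000  # hypothetical capacity of a physical store

# A store that can only stock the hits sells nothing from the tail.
physical = tail_share(zipf_sales(SHELF_SIZE), SHELF_SIZE)

# An online catalogue of a million titles with the same demand curve.
online = tail_share(zipf_sales(1_000_000), SHELF_SIZE)

print(f"physical store: {physical:.0%}, online catalogue: {online:.0%}")
```

With this particular curve, a bit under half of all units sold online come from titles that the physical store could never have stocked. The exact share depends entirely on the assumed curve, but the shape of the argument is the same: each niche title sells little, and together they add up.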

What this effectively means is that the hit culture, where everything is focused on a small set of bestsellers, is replaced by vast numbers of small niches. That probably has an effect on society as a whole, since the time when a significant part of the population was exposed to the same thing at the same time is over. This is also reflected in the explosion in the number of specialised TV channels and TV/video-on-demand services that let viewers choose not only which show to watch but also when to watch it.

Early Knowledge Management and the rise of Web 2.0
Back in the late 90s Knowledge Management efforts thrived, with great aspirations of getting a grip on the knowledge assets of companies and organisations. Although there are many views and definitions of Knowledge Management, many of them focused on increasing the capture of knowledge, on the assumption that applying the captured knowledge would lead to better efficiency and better business. However, partly because of technical immaturity, many of these projects did not reach their ambitious goals.

Five or six years later the landscape had changed completely on the web with the rise of YouTube, Flickr, Google, Facebook and many other Web 2.0 services. They provided a radically lowered threshold for contributing information, and the whole web changed from a focus on consuming information to producing and contributing it. This was in fact just democratization of production, but in this case not only of products to sell but of information of all kinds.

With the large-scale hubs of YouTube, Flickr and Facebook the distribution aspect of the Long Tail was covered, since all this new content was also spread in clever ways to friends in our networks or to niche ”consumers” finding information based on tagging and recommendations. Maybe my friend network on Facebook is in essence a representation of a small niche market that is interested in following what I am contributing (doing).

Social media goes Enterprise
When this effect started spreading beyond the public internet into the corporate network, the term Enterprise 2.0 was coined by Andrew McAfee. Inside the enterprise people were starting to share information on a much wider scale than before, which in some respects finally made the old KM dreams come true. This time it was not because of formal management plans but because of social factors and networking that really inspired people to contribute.

From an Enterprise Content Management perspective this also means that if we can put all this social interaction and generated content on top of an ECM infrastructure, we can achieve far more than just supporting formal workflows, records management and retention demands. The ECM repository has the potential to become the backbone that provides all kinds of captured knowledge within the enterprise.

The interesting question is whether this also marks a cultural change in what types of information people devote their attention to. One could argue that traditional ECM systems provide a more limited, ”hit-oriented” consumption of information. The absence of good search interfaces, recommendation engines and collaboration probably left most of the information unseen.

Implications for Enterprise Content Management
The social features in Enterprise 2.0 change all that. Suddenly the same effect on exposure can be seen for enterprise content as we have seen for consumer goods. There is no shortage of storage space today. The number of objects stored is already large but will increase a lot, since it is so much easier to contribute. Social features allow exposure of things that are linked to interests, competencies and networks instead of what the management wants to push. People interested in learning have somewhere to go even for niche interests, and those wanting to share can get affirmation when their content is read and commented on by others, even if only by a small number. Advanced searching and exploitation of social and content analytics can create personalised mashup portals and push notifications of interesting content or people.

Could this long tail effect possibly make a difference to the whole knowledge management perspective? This time not from the management aspect of it but rather from the learning aspect. Can we expect a much larger share of the available content to be consumed or used by at least a few people in the organisation? Large organisations have a fairly large number of roles and responsibilities, so there must reasonably be great differences in what information people need and with whom they need to share it. The Long Tail effect in ECM terms could be a way to illustrate how a much larger percentage of the enterprise content is used and reused. More information is not necessarily better, but this can mean more of the right information to more of the right people. Add to that the creative effect of being constantly stimulated by ideas and reflections from others around you and it could be a winning concept.

Sources

Anderson, Chris, ”The Long Tail – Why the Future of Business is Selling Less of More”, 2006
Koerner, Brendan I., ”Driven by Distraction – How Twitter and Facebook make us more productive workers” in Wired Magazine March 20