Category: ECM

EMC World 2010: My presentation on using Documentum in a SOA platform

Yesterday, Monday May 10 at 11 am, I gave a talk at the Momentum 10 conference here at EMC World 2010 in Boston. The presentation focused on our experiences building an experimentation platform for next-generation information and knowledge management (IKM) for a large operational-level military HQ. Contemporary conflicts are complex and dynamic in character and require a new approach to IKM in order to handle all those complexities based on sound management of our digital information. At the core of our platform is EMC Documentum, integrated over an Enterprise Service Bus (ESB) from Oracle. The goal is to maintain access to and traceability of the information while removing stove-piped systems.

I have received quite a few positive reactions from both customers and EMC people after the session, which of course is just great. See, for instance, these notes from the session. All the presentations will be available for download for all participants, but that will most likely take some time. So in the meantime you can download my presentation here instead:

Presentation at EMC World 2010 in Boston

Looking forward to comments and reflections. The file is quite big, but that is because my presentation is heavy on screenshots, and downsampling them to save file size would make it too hard to see what they are showing. Try zooming in to see the details.

EMC World 2010: At Blogger’s Lounge

Sitting in the lounge now with another cup of great latte, relaxing after what felt like a really good presentation earlier today at EMC World 2010. Responses so far have been very positive, which of course feels great. We think we have so many cool ideas, and it is great to be able to show them off to people with a deep interest in Enterprise Content Management.

Alexandra Blogger's lounge at EMC World 2010

Soon it is time for the keynote by Mark Lewis, who seems to be in charge of the newly renamed Information Intelligence Group (formerly the Content Management & Archiving Division).

EMC World 2010: DFS Real World Examples, Best Practices

I had planned to go to a session on the Documentum Roadmap but it was totally full, so we had to go to other sessions. We split up between the BPM Fundamentals session and the Documentum Foundation Services (DFS) Best Practices session by Michael Mohen instead. I am not a developer, so these notes are a bit from the 500 ft level.

He started by discussing the complementary nature of DFS and CMIS, depending on whether development is focused solely on Documentum or not. CMIS is of course the new standard recently approved by OASIS. He argued that some applications, like Records Management, are still best built using DFS, but I guess that also has to do with how people want CMIS to develop. As I understand it, CMIS is not intended to cover the complete feature set of every ECM system; rather, it focuses on the interoperability aspect of building ECM apps on top of multiple repositories.

When it comes to content transfer using DFS, the key considerations are latency, file size, formats and caching needs. Some of the ways to do content transfer are:

  • HTTP
  • Base64
  • UCF
  • MTOM

Most use UCF or MTOM, but it is important to remember that BOCS/ACS requires UCF to work. The message is: don't be afraid to mix HTTP, MTOM and the others. In our solution we do use a mix, but because our content is sometimes rather large this is of course an issue.
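
To make the choice of transfer mode concrete, here is a minimal sketch of how a client built on the DFS Java Productivity Layer might register a service context and select MTOM (or UCF) before obtaining a service. The class names (ContextFactory, ContentTransferProfile, ServiceFactory and so on) follow my reading of the DFS SDK, and the endpoint, repository and credentials are made up, so treat it as a hedged illustration rather than code from our solution.

```java
import com.emc.documentum.fs.datamodel.core.content.ContentTransferMode;
import com.emc.documentum.fs.datamodel.core.context.RepositoryIdentity;
import com.emc.documentum.fs.datamodel.core.profiles.ContentTransferProfile;
import com.emc.documentum.fs.rt.context.ContextFactory;
import com.emc.documentum.fs.rt.context.IServiceContext;
import com.emc.documentum.fs.rt.context.ServiceFactory;
import com.emc.documentum.fs.services.core.client.IObjectService;

public class TransferModeExample {

    public static IObjectService connect() throws Exception {
        // Hypothetical endpoint and credentials
        String module = "core";
        String address = "http://dfs-host:9080/services";

        // Identify ourselves against the repository
        IServiceContext context = ContextFactory.getInstance().newContext();
        context.addIdentity(new RepositoryIdentity("myrepo", "user", "password", ""));

        // Pick MTOM for plain web-service transfer; switch to ContentTransferMode.UCF
        // when BOCS/ACS or asynchronous transfer is required
        ContentTransferProfile transferProfile = new ContentTransferProfile();
        transferProfile.setTransferMode(ContentTransferMode.MTOM);
        context.setProfile(transferProfile);

        // Obtain a remote Object service bound to this context
        return ServiceFactory.getInstance()
                .getRemoteService(IObjectService.class, context, module, address);
    }
}
```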

Notable changes in D6.5/D6.6

  • JBoss 4.2.0 is the new Java Method Server
  • Apache Tomcat support
  • Aspect support
  • Lightweight SysObject (LWSO) support
  • Native 64-bit support and UCF improvements
  • Kerberos support is coming in D6.6

Remote and local calls are available in Java – .NET only provides remote calls.

There are some tools that customers may not be aware of, such as DFS Utilities, developed by John Sweeney of EMC, and DFSX (Extensions):

  • Provide utility classes
  • Based on the DFS Object Model
  • Java-based (1.5 or greater)
  • Only EAR files today

The Test Harness is a JMeter extension with a custom JMeter Sampler built to invoke DFS using the Java Productivity Layer.

Response times are collected for:

  • Create object
  • Get object
  • Checkout object
  • Checkin object
  • Delete object

Over a WAN, DFS sped things up compared to DFC, especially with 300-400 ms ping times… use DFS because it is stateless. This is relevant when using satellite links and the like.

A Sizing Calculator will soon be available for DFS. It is an Excel spreadsheet based on WSDL and SOAP, so if you are using other designs the results may of course vary.

In a speed test between UCF and MTOM, upload speeds for files under 50 MB were similar, although UCF was slightly faster. The cool part of UCF is that it is asynchronous, which for instance means that you can show one page of a document while the rest of it continues loading.

When it comes to ESB implementations, the message was that the majority of implementations are point-to-point for client apps. However, some add SAML for extra security in their ESB implementation, which affects speed a bit.

It seems that DFS is used a lot in .NET environments and together with SharePoint.

MOSS and DFS Examples

  • .Net 3.3
  • SDF and xCP
  • Web part with an inbox and an XForm rendered inside SharePoint

Another example is the use of DFS with Windows Explorer, where some customers want a custom integration for the Windows desktop that essentially provides something like the old Documentum Desktop client. It is called DFS Explorer.

DFS Adobe Flex Example

There is a white paper available that provides a quick start… read more about the session on the community page.

Adobe Flex does not talk directly to DFS but goes through Java. A RESTful interface would be much easier to use for Flex as well as for most AJAX implementations.

Best Practices

  • Leverage the SDK (.Net/Java interop layers)
  • Use UCF for BOCS/ACS
  • If you expect your query results to exceed 500 items, cache and page through the results (see the sketch after this list).
  • DFS is the better choice over a WAN with poor latency.
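
As a rough illustration of the paging advice above, here is a minimal sketch using the DFS Query service from the Java Productivity Layer. The class names (PassthroughQuery, QueryExecution, IQueryService) and the paging attributes reflect my reading of the DFS SDK and may differ in your version, and the endpoint, repository and query are made up, so treat this as an assumption-laden sketch rather than verified code.

```java
import com.emc.documentum.fs.datamodel.core.DataPackage;
import com.emc.documentum.fs.datamodel.core.query.PassthroughQuery;
import com.emc.documentum.fs.datamodel.core.query.QueryExecution;
import com.emc.documentum.fs.datamodel.core.query.QueryResult;
import com.emc.documentum.fs.rt.context.IServiceContext;
import com.emc.documentum.fs.rt.context.ServiceFactory;
import com.emc.documentum.fs.services.core.client.IQueryService;

public class PagedQueryExample {

    private static final int PAGE_SIZE = 100; // stay well below the 500-result mark

    public static void queryInPages(IServiceContext context) throws Exception {
        IQueryService queryService = ServiceFactory.getInstance()
                .getRemoteService(IQueryService.class, context, "core",
                        "http://dfs-host:9080/services"); // hypothetical endpoint

        PassthroughQuery query = new PassthroughQuery();
        query.setQueryString("SELECT r_object_id, object_name FROM dm_document");
        query.addRepository("myrepo"); // hypothetical repository name

        int startingIndex = 0;
        while (true) {
            QueryExecution execution = new QueryExecution();
            execution.setStartingIndex(startingIndex);
            execution.setMaxResultCount(PAGE_SIZE);

            QueryResult result = queryService.execute(query, execution, null);
            DataPackage page = result.getDataPackage();
            int returned = page.getDataObjects().size();
            if (returned == 0) {
                break; // no more results
            }

            // ... cache or process this page of results here ...

            startingIndex += returned;
        }
    }
}
```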

A feature which is not well documented is that you can set requiresAuthentication="false" on your annotated service implementation to allow browsing of repositories and basic information, such as the data dictionary, without logging in.
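
If I understood this correctly, it is done with the service annotation on a custom DFS service class, roughly as in the sketch below. The service class, namespace and method are made-up placeholders, and the attribute names follow my reading of the DFS SDK annotations, so verify against the documentation before relying on it.

```java
import com.emc.documentum.fs.rt.annotations.DfsPojoService;

// Hypothetical custom service that exposes repository and data-dictionary
// browsing without requiring the caller to authenticate first.
@DfsPojoService(targetNamespace = "http://example.com/services",
                requiresAuthentication = false)
public class RepositoryBrowserService {

    public String[] listRepositories() {
        // ... call the underlying API and return the available repository names ...
        return new String[0];
    }
}
```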

There is also a lesser-known Services Catalog Viewer, which is an optional install:

  • Explore services available within the intranet
  • DSCR is a registry for consumer discovery
  • Based on the UDDI v2 standard
  • A standard web app
  • Default port is 9010
  • Built on jUDDI, the open source UDDI implementation

You can also compare this with the notes from the last conference by Word of Pie.

Next stop: EMC World 2010 in Boston

It is time again to enjoy the company of fellow ECM people at EMC's conference, which is in Boston, MA this year. Although most of the conference is focused on their storage hardware, there is a good “sub-conference” called Momentum where all the Documentum people gather to share experiences. I have said this before, but this has so far been by far the best tech conference I have attended. Most sessions are very interesting, and EMC is a fairly open company, so you usually leave with a decent idea of where they are going over the next year. For us this is critical, because what is in the next release sometimes dictates what kind of experiments we can run in our Battle Lab at the Joint Concept Development and Experimentation Centre (JCDEC) back in Sweden.

I will try to blog and tweet as usual, and I am registered at the Blogger's Lounge this year as well. Looking forward to some great vanilla latte there while trying to scribble down the latest from the sessions. At this wiki you can see who will be blogging from EMC World this year. Be sure to check it out, because social media is a great tool to get not only facts but also comments from people in the business. I guess the ECN Online Documentum community will also be a good place to find news from the conference.

And finally, I will be speaking about our experiences of integrating Documentum in a SOA architecture to support an operational-level military HQ. The session is at 11 am on Monday. Please stop by and say hi if you can!

Can BPM meet Enterprise 2.0 over Adaptive Case Management?

The project that I am running at JCDEC involves a lot of internal “marketing” targeting both end users and the people in charge of our IT projects. Lately I have found myself explaining the difference between workflow processes built with Documentum Process Engine and TaskSpace on the one hand, and EMC's new clients CenterStage Pro and Media WorkSpace on the other. My best argument so far has been that BPM/workflow is well suited for formal, repeatable processes in the HQ, while Enterprise 2.0 clients take care of ad-hoc and informal processes. Keith Swenson explains the Taylorism-based Scientific Management concept as the foundation of Business Process Management in this blog post in a good way. He goes on to build a bridge over to the ad-hoc work that nowadays is done by what is called the knowledge worker. Documentum CenterStage is a tool intended for the knowledge worker, which can also be seen as the Enterprise 2.0 way of working.

However, Keith continues to steer us towards a concept called Adaptive Case Management, which is supposed to address those more agile and dynamic ways of working, in contrast to the slow-changing, well-defined business processes that are deployed in traditional BPM systems. To my understanding this focuses a lot on the fact that the users themselves (instead of a process designer) need to be able to control templates, process steps and various other things in order to support more dynamic work such as criminal investigations or medical care.

Adaptive Case Management is also, as I understand it, a concept in the book “Mastering the Unpredictable”. The idea is to focus on the unpredictable nature of some work situations, but also to reflect a bit on to what degree things are unpredictable or not. In this presentation by Jacob Ukelson the argument is that the main bulk of work is unpredictable, which also means that process modeling using traditional BPM most likely won't work.

Some people argue that there is no need to redefine BPM and that all these three-letter acronyms do not contribute much to the understanding of the problem and the solutions. I think I disagree, and the reason is that there are no silver-bullet products that cover everything you need. Most organisations start somewhere and roll out systems based on their most pressing needs. I believe that these systems have some similarities in what they are good and bad at. Having bought an ECM, BI, CRM or ERP system usually says something about which business problems have been addressed. As SOA architectures mature and the ambition to reduce stove-pipes increases, the complementary character of these systems starts to matter. It also matters which of these vendors you choose, because the consolidation into a few larger vendors means choosing between different approaches.

To me all of this means an opportunity to leverage the strong points of different kinds of platforms. Complex, sure, but if you have the business requirements it is probably better than building it all from scratch. So I think that when companies quickly roll out Enterprise 2.0 platforms from smaller startup vendors, they soon discover that they risk creating yet another stove-pipe, in this case consisting of social information. Putting E 2.0 capabilities on top of an ECM platform then makes a lot of sense, in order to integrate social features with existing enterprise information. The same most likely goes for BI, CRM and so on.

When it comes to BPM, the potential lies in extending formal processes with social and informal aspects. However, it is likely that E 2.0-style capabilities will let new ways of working evolve and emerge. Sooner or later they need to be formalised, maybe into a project or a community of interest. Being able to leverage the capabilities of the BPM platform in terms of monitoring, and some kind of best practice in the form of templates, is not far-fetched. To some degree I believe that Adaptive Case Management solutions should sometimes be used instead of just a shared CenterStage space, because you need these added formal aspects but still want to retain some flexibility. Knowledge worker-style work can then be done on top of a BPM infrastructure while at the same time utilising the ECM infrastructure for all content objects involved in the process. Having a system like Documentum, which is good at content-centric human workflow processes, makes a lot of sense here.

So is Documentum xCP a way to address this middle ground between process modeling-based processes and knowledge worker-style support in CenterStage? The mantra is “configure instead of coding”, which implies a much more dynamic process. I have not played around with xCP yet – so far we have only deployed processes developed from scratch instead of trying out the case management templates that come with the download.

Not all companies will want to do this, but I think some will soon see the merits of integrating ECM, BI, E 2.0 and BPM/ACM solutions using SOA. The hard part, I believe, is to find software and business-method support for the agile and dynamic change management of these systems. The key to achieving this is to be able to support various degrees of ad-hoc work, where at one end the user does everything herself and at the other a more traditional developer codes modules. Being able to more dynamically change/model/remodel not only processes but also the data model for content types in Documentum is a vital capability in order to respond to business needs in a way that maintains trust in the system. This is not a task for IT but something done by some kind of Information and Knowledge Management (IKM) specialist. This SOA-based integration of different sets of products can give them proper means of doing their work.

So employ E 2.0-style features in task management clients, and make sure that E 2.0-style clients include tasks from BPM/ACM in their activity streams or unified inboxes. Make sure that all of this is stored in an ECM platform with full auditing capabilities, which then needs to be off-loaded to a data warehouse so it can be dynamically analysed using interactive data visualisation, statistics and data mining. I hope we can show a solution for that in our lab soon.

Dave Kellogg on Palantir

I recently began reading the blog written by Dave Kellogg, who is the CEO of Mark Logic, a company devoted to XML-based content management. I think I first noticed them when I discovered what cool technology EMC got when it bought X-Hive, which has now become Documentum xDB/XML Store. Mark Logic and X-Hive were of course competitors in the XML database market. In a recent blog post he reflects on the Palantir product after attending their Government Conference.

The main scope of his blog post is different business models for a startup. That is not my expertise, and I don't have any particular opinion about it, although I tend to agree, and it was interesting to read his reflections on how other companies such as Oracle (yet another competitor to Mark Logic and xDB) have approached this.

Instead, my thinking centres on his analysis of the product that Palantir offers and how that technology relates to other technology. I think most people (including Kellogg) mainly view Palantir as a visualisation tool, because you see all these nice graphs, bars, timelines and maps displaying information. What they tend to forget is that there is a huge difference between a tool that ONLY does visualisation and one that actually lets you modify the data (more precisely, the contextual data around it, such as metadata and relations) within those perspectives. There are many different tools for Social Network Analysis, for instance. However, many of them assume that you already have databases full of data just waiting to be visualised and explored. Nothing new here. This is also what many people use Business Intelligence toolkits for: accessing data that is already in warehouses, although the effort of getting it there from transaction-oriented systems (like in retail) is not small in any way. The analyst using these visualisation-heavy toolkits accesses data read-only and only adds analysis on top of data that is already structured.

Here is why Palantir is different. It provides access to raw data such as police reports, military reports and open source data, most of it in unstructured or semi-structured form. When it comes into the system it is not immediately viewable in all those fancy visualisation windows Palantir has. Instead, the whole system rests on a collaborative process where people perform basic analysis, which includes manual annotation of words in reports. This digital marker pen allows users to create database objects or connect to existing ones. Sure, this is supported by automatic features such as entity extraction, but if you care about data quality you do not dare to put them in fully automatic mode. After all this is done you can start exploring the annotated data and the linkages between objects.

However, I do agree with Dave Kellogg that if people think BI is hard, this is harder. The main reason is that you have to have a method or process for doing this kind of work. There are no free lunches – no point in dreaming about full automation here. And people need training and the right mindset to be able to work efficiently. Having played around with TIBCO Spotfire lately, I feel that there is a choice between integrated solutions like Palantir, which have features from many software areas (BI, GIS, ECM, search etc.), and dedicated toolkits with your own integration. Powerful BI with data mining is best done in BI systems, whereas they will probably never provide the integration between features that vendors like Palantir offer. An open architecture based on SOA can probably make that integration easier in many ways.

Why iPhone OS (iPad) is ECM…

I like Twitter. It exposes me to a lot of interesting thoughts from interesting and smart people that I follow. Today I read a post called Why the iPad Matters – It's the Beginning of the End by Carl Frappaolo. It talks a lot about why the iPad brings a new promise for content delivery – a complete digital chain. It made me think about one of the things which is unique to the iPod/iPhone/iPad – the lack of a folder-based file system exposed to users. Surprisingly (maybe) it is this lack that makes the whole user experience much better.

So how does this relate to ECM then? Well, I guess many of us ECM evangelists (or “ninjas”, as I heard today) have been in endless meetings and briefings explaining the value of metadata and the whole “context infrastructure” around each object in an ECM system that can hold fine-grained permissions, lifecycles, processes, renditions and so forth. I have even found myself explaining the ECM concept using iTunes as an analogy. You tag the songs with metadata and access them through playlists, which are in essence virtual folders where each song can be visible in many playlists. That is the same concept as the “Show in folder” flag in Documentum. Metadata can even power Smart Playlists, which are in essence just saved search queries – something we have added as a customization in Documentum Digital Asset Manager (DAM). So in essence the iTunes Library (should we call it a repository? 🙂) is a light version of an ECM system. Before continuing, I really wonder why I have to customize Documentum to get the GUI features that iTunes provides…?

So iTunes abstracts away the folder-based file system on a Mac or Windows PC, but as long as you are using Mac OS X or Windows the file system is still there, right? Some people even get really frustrated by iTunes and just can't get their head around the fact that there is no need to move files around manually when syncing them to iPhone OS-powered devices. And here comes the beauty: on these devices there is no folder-based file system to access. Just the iPod app for music, the Photos app for photos and so forth. All your content is suddenly displayed in context and filtered based on metadata and that app's specific usage.

To some degree that means that iPhone OS-based devices can not only make content delivery digital but also provide a much better user interface, powered by all those ECM features that we love (and have a hard time explaining). Suddenly we have an information flow based entirely on metadata instead of folder names and file names. Maybe that will not only make ECM fun but also make it much quicker to answer the dreaded “What's in it for me?” question.

Now, can someone quickly write an iPad app for Documentum so I can make my point? 🙂 It will be a killer app, believe me!

The Long Tail of Enterprise Content Management

Question: Can we expect a much larger amount of the available content to be consumed or used by at least a few people in the organisations?

Shifting focus from bestsellers to niche markets
In 2006 the editor-in-chief of Wired magazine, Chris Anderson, published his book ”The Long Tail – Why the Future of Business is Selling Less of More”. Maybe the text printed on the top of the cover, ”How Endless Choice is Creating Unlimited Demand”, is the best summary of the book. This might have been said many times before, but I felt a strong need to put my reflections into text after reading it. It put a vital piece of the puzzle in place when I saw the connections to our efforts to implement Enterprise 2.0 within an ECM context.

Basically, Chris Anderson sets out to explain why companies like Amazon, Netflix, Apple iTunes and several others make a lot of money by selling small quantities of a very large set of products. It turns out that even with millions of songs/books/movies on offer, nearly all of them are rented or bought at least once. What makes this possible comes down to three things:

– Democratization of production, which means that the tools and means to produce songs, books and movies are available to almost everybody at a relatively low cost.
– Democratization of distribution, where companies can broker large amounts of digital content because the cost of keeping a large stock of digital content is very low compared to real products on real shelves in real warehouses.
– Connecting supply and demand, so that all this created content meets its potential buyers; the tools for that are search functions, rankings and collaborative reviews.

What this effectively means is that the hit culture, where everything is focused on a small set of bestsellers, is replaced with vast numbers of small niches. That probably has an effect on society as a whole, since the time when a significant share of the population was exposed to the same thing at the same time is over. It is also reflected in the explosion in the number of specialised TV channels and TV/video-on-demand services that let viewers choose not only which show to watch but also when to watch it.

Early Knowledge Management and the rise of Web 2.0
Back in the late 90s, Knowledge Management efforts thrived, with great aspirations of getting a grip on the knowledge assets of companies and organisations. Although there are many views and definitions of Knowledge Management, many of them focused on increasing the capture of knowledge, with the idea that the application of that captured knowledge would lead to better efficiency and better business. However, partly because of technical immaturity, many of these projects did not reach their ambitious goals.

Five or six years later the landscape on the web had changed completely with the rise of YouTube, Flickr, Google, Facebook and many other Web 2.0 services. They provided a radically lowered threshold for contributing information, and the whole web changed from a focus on consuming information to producing and contributing it. This was in fact just democratization of production again, but in this case of information of all kinds, not only products to sell.

Using the large-scale hubs of YouTube, Flickr and Facebook, the distribution aspect of the Long Tail was covered, since all this new content was also spread in clever ways to friends in our networks, or to niche ”consumers” finding information based on tagging and recommendations. Maybe my friend network on Facebook is in essence a representation of a small niche market that is interested in following what I am contributing (doing).

Social media goes Enterprise
When this effect started spreading beyond the public internet into the corporate network, the term Enterprise 2.0 was coined by Andrew McAfee. Inside the enterprise, people were starting to share information on a much wider scale than before, and in some respects the old KM dreams finally came into being. This time not because of formal management plans, but based on social factors and networking that really inspired people to contribute.

From an Enterprise Content Management perspective this also means that if we can put all this social interaction and generated content on top of an ECM infrastructure, we can achieve far more than just supporting formal workflows, records management and retention demands. The ECM repository has the potential to become the backbone that provides all kinds of captured knowledge within the enterprise.

The interesting question is whether this also marks a cultural change in what types of information people devote their attention to. One could argue that traditional ECM systems provide a rather limited, ”hit-oriented” consumption of information. The absence of good search interfaces, recommendation engines and collaboration probably left most of the information unseen.

Implications for Enterprise Content Management
The social features in Enterprise 2.0 change all that. Suddenly the same effect on exposure can be seen for enterprise content, just as we have seen it for consumer goods. There is no shortage of storage space today. The number of objects stored is already large, but it will increase a lot since it is so much easier to contribute. Social features allow exposure of things that are linked to interests, competencies and networks, instead of what management wants to push. People interested in learning have somewhere to go even for niche interests, and those wanting to share can get affirmation when their content is read and commented on by others, even if only by a small number. Advanced search and the exploitation of social and content analytics can create personalised mashup portals and push notifications of interesting content or people.

Could this long tail effect make a difference to the whole knowledge management perspective? This time not from the management aspect of it, but rather from the learning aspect. Can we expect a much larger share of the available content to be consumed or used by at least a few people in the organisation? Large organisations have a fairly large number of roles and responsibilities, so there must reasonably be great differences in what information people need and with whom they need to share it. The Long Tail effect in ECM terms could be a way to illustrate how a much larger percentage of the enterprise content is used and reused. It is not necessarily so that more information is better, but this can mean more of the right information to more of the right people. Add to that the creative effect of being constantly stimulated by ideas and reflections from others around you, and it could be a winning concept.

Sources

Anderson, Chris, ”The Long Tail – Why the Future of Business is Selling Less of More”, 2006
Koerner, Brendan I., ”Driven by Distraction – How Twitter and Facebook Make Us More Productive Workers”, Wired Magazine, March 20