Interesting thoughts around the Information Continuum

In a blog post called ”The Information Continuum and the Three Types of Subtly Semi-Structured Information” Mark Kellogg discusses what we really mean with unstructured, semi-structured and structured information. In my project we have constant discussions around this and how to look upon the whole aspect of chunking down content into reusable pieces that in itself needs some structured in order to be just that – reusable. At first we were ecstatic over the metadata capabilities in our Documentum platform because we have made our unstructured content semi-structured which in itself is a huge improvement. However, it is important to see this as some kind of continuum instead of three fixed positions.

One example is of course the PowerPoint/Keynote/Impress-presentation which actually is not one piece. Mark Kellogg reminded me of the discussions we have had around those slides being bits of content in a composite document structure. It is easy to focus on the more traditional text-based editing that you see in Technical Publications and forget that presentations have that aspect in them already. To be honest when we first got Documentum Digital Asset Manager (DAM) in 2006 and saw the Powerpoint Assembly tool we became very enthusiastic about content reuse. However, we found that feature a little bit too hard to use and it never really took off. What we see in Documentum MediaWorkSpace now is a very much remamped version of that which I look forward to play around with. I guess the whole thing comes back to the semi-structured aspect of those slides because in order to facilitate reuse they somehow need to get some additional metadata and tags. Otherwise it is easy the sheer number of slides available will be too much if you can’t filter it down based on how it categories but who has created them.

Last year we decided to take another stab at composite document management to be able to construct templates referring to both static and dynamic (queries) pieces of content. We have made ourselves a rather cool dynamic document compsotion tool on top of our SOA-platform with Documentum in it. It is based on DITA and we use XMetaL Author Enterprise as the authoring tool to construct the templates, the service bus will resolve the dynamic queries and Documentum will store and transform the large DITA-file into a PDF. What we quickly saw was yet another aspect of semi-structured information since we need a large team to be able to work in parallell to ”connect” information into the finished product. Again, there is a need for context in terms of metadata around these pieces of reusable content that will end up in the finished product based on the template. Since we depend of using a lot of information coming in from outside the organisation we can’t have strict enforcement of the structure of the content. It will arrive in Word, PDF, Text, HTML, PPT etc. So there is a need to transform content into XML, chunk it up in reusable pieces and tag it so we can refer to it in the template or use queries to include content with a particular set of tags.

This of course bring up the whole problem with the editing/authoring client. The whole concept of a document is be questioned as it in itself is part of this Continuum. Collaborative writing in the same document has been offered by CoWord, TextFlow and the recently open source Google tool Etherpad and will now be part of the next version of Microsoft Office. Google Wave is a little bit of a disrupting force here since it merges the concept of instant messaging, asynchronous messaging (email) and collaborative document editing. Based on the Google Wave Federation protocol it is also being implemented in Enterprise Applications such as Novell Pulse.

So why don’t just use a wiki then? Well, the layout tools is nowhere as rich as what you will find in Word processors and presentation software and since we are dependent on being able to handle real documents in these common format it becomes a hassle to convert them into wiki format or even worse try to attach them to a wiki page. More importantly a wiki is asynchronous in nature and that is probably not that user friendly compared to live updates. The XML Vendors have also went into this market with tools like XMetaL Reviewer which leverages the XML infrastructure in a web-based tool that almost in real-time allow users to see changes made and review them collaboratively.

This lead us into the importance of the format we choose as the baseline for both collaborative writing and the chunk-based reusable content handling that we like to leverage. Everybody I talk to are please with the new Office XML-formats but say in their next breath that the format is complex and a bit nasty. So do we choose OpenOffice, DITA or what? What we choose as some real impact on the tool-end of our solutions because You probably get most out of a tool when it is handling its native format or at least the one it is certified to support. Since it is all XML when can always transform back and forth using XSLT or XProc.

Ok, we have the toolset and some infrastructure in place for that. Now comes my desire to not stove-pipe this information in some close system only used to store ”collaborative content”. Somehow we need to be able to ”commit” those ”snapshots” of XML-content that to some degree consitutes a document. Maybe we want to ”lock it” down so we know what version of all of that has been sent externally or just to know what we knew at a specific time. Very important in military business. That means that it must be integrated into our Enterprise Content Management-infrastructure where it in fact can move on the continuum into being more unstructured since it could even be stored as a single binary document file. Some we need to be able to keep the tracability so you know what versions of specific chunks was used and who connected them into the ”document”. Again, just choosing something like Textflow or Etherpad will not provide that integration. MS Office will of course be integrated with Sharepoint but I am afraid that implementation will not support all the capabilities in terms of tracability and visualisation that I think you need to make the solution complete. Also XML-content actually like to live in XML-databases such as Mark Logic Server and Documentum XML Store so that integration is very much need more or less out of the box in order to make it possible to craft a solution.

We will definitely look into Documentum XML Technologies more deeply to see if we can design an integrated solutions on top of that. It looks promising especially since a XProc Pipeline for DITA is around the corner.

Reblog this post [with Zemanta]

EMC World 2010: Chiming in with Word of Pie about the future of Documentum

We have got a written reaction to Mark Lewis’ keynote held at EMC World 2010 in Boston. I both feel and have the passion around Enterprise Content Management and it is great that Laurence Hart spent so much time and effort on talking to people to craft this post. Someone need to say things even if they are not always easy to hear. So I will try to not repeat what he said in this blog post but rather try to provide my perspective which comes from what I have learned about Information and Knowledge Management over the past years. ECM and Documentum is a very critical component to move that IKM vision from the Powerpoint stage into reality. In our case an experimentation platform that allows to put our ideas to improve the ”business” of staff work in a large military HQ into something people can try, learn and be inspired from. Also, this turned out to be a long blog post which calls for an summary on top:

The Executive Summary (or message to EMC IIG) of this blog post:

  • Good name change but make sure You live up to your name.
  • A greater degree of agility is very much needed but do not simplify the platform so much that implementing an ECM-strategy is impossible.
  • Case Management is not the umbrella term, it is just one of many solutions on top of Documentum xCP
  • The whole web has gone Social Media and Rich Media. The Enterprise is next. Develop what You have and stay relevant in the 2010-ies!
  • Be more precise when it comes to the term ”collaboration”. There is a whole spectrum to support here.
  • Be more bold and tell people that Documentum offers an unique architectural approach to informtion management – stop comparing clients.
  • Tell people that enabling Rich Media, Case Management, E 2.0 and (Team) Collaboration on one platform is both important and possible.
  • I am repeating myself here: You want to sell storage, right? Make sure Video Management is really good in Documentum!

The name change

Before I start I just need to reflect on the name change from Content Management and Archiving into Information Intelligence Group (IIG). I agree with Pie…the had to be changed to make it more relevant in 2010 and a focus on information (as in information management which is more than storage ILM) is the right way to go. The intelligence part of it is of course a bit fun because of my own profession but still it implies doing smart things with information and that should include everything from building context with Enterprise 2.0 features to advanced Content and Information Analytics. You have the repository to store all of that – now make sure you continue to invest in analytics engine to generate structure and visualisation toolkit to make use of all the metadata and audit trails. Maybe do something with TIBCO Spotfire.

Documentum xCP – lowering the threshold and creating a more agile platform

Great. Documentum needs to be easier to deploy, configure and monitored. Needed to get know customers on board easier and make existing ones be able to do smarter things with it in less time. However, it is easy to fall into the trap of simplifying things to much here. To me there is nothing simple around implementing Enterprise Content Management (ECM) as a concept and as a method in an organization. One major problem with Sharepoint and other solutions is that they are way to easy to install so people actually are fooled into skipping the THINKING part of implementing ECM and think it is just ”next-next-finish”. All ECM-systems needs to be configured and adapted to fit the business needs of the organisation. Without that they will fail. xCP can offer a way to do that vital configuration (preceeded by THINKING) a lot more easier and also more often. We often stress how it is important to have the technical configuration move as close to any changes in Standard Operating Procedures (SOP) as possible. If Generals want to change the way they work and the software does not support it they will move away from using the software. Agility is the key.

In our vision the datamodel needs to be much more agile. Value lists need to updated often – sometimes based on ad hoc folksonomy tagging. Monitoring of the use of metadata and tags will drive that. Attributes or even object types need to be updated more often. Content need to be ingested quickly while providing structure later on (think XML Store with new schemas here). xCP is therefore a welcome thing but make sure it does not compromise the core of what makes Documentum unique today.

The whole Case Management thing

Probably the thing that most of us reacted against in the Mark Lewis Keynote was the notion that ECM-people in reality just have done Case Management all the time. I recently spend some time reflecting on that in another blog post here called ”Can BPM meet Enterprise 2.0 over Adaptive Case Management?”. There is clearly a continuum here between supporting very formal process flows and very ad-hoc Knowledge Worker-style work. They clearly seem different and while they likely meet over Adaptive Case Management but to me it makes no sense to have that term cover the whole spectrum – even for EMC Marketing 🙂

I immediately saw that Public Sector Investigative work is often used as an example of Case Management. Case Management in especially done by law enforcement agencies is fundamentally different from work done by Intelligence Agencies because in Case-based Police investigations there is usually some legal requirement to NOT share information between cases unless authorised by managers. This is of not the case (!) for all Case Management applications but from a cultural perspective it is important that Case Management-work by the Police is not a line of business that should be used as an example of information sharing. It is even so that the underlying concept actually is at ends with any concept of unified enterprise content management strategy where information should be shared. That is why workgroup-oriented tools such as i2 Analyst’s Workstation have become so popular there.

The point here is that it is important to not disable sharing in the architectural level because again it is what constitutes a good ECM-system that content can be managed in a unified way. Don’t be fooled by requirements for that – use the powerful security model to make it possible. Then Law Enforcement Agencies can use it as well. However, there must be more to ECM than Case Management – as Word of Pie suggests it is just ONE of many solutions on top of the Documentum xCP platform. A platform which is agile enough to quickly build advanced solutions for ECM on top.

Collaboration vs Sharing and E.20

So, Collaboration is used everywhere now but the real meaning with it actually varies a bit. First there are two kind of collaboration modes:

  • Synchronous (real-time)
  • Asynchronous (non-real time – ”leave info and pick up later)

Obviously neither Documentum nor Sharepoint is in real-time part of the business. For that you will need Lotus Sametime, Office Communications Server, Adobe Connect Pro or similar products. However, Google Wave provides a bit of confusion here since it integrates instant messaging and collaborative document editing/writing.

However, I am bit bothered by the casual notion of anything as a collaboration tool like Sharepoint and for that sake eRoom is getting. To further break this down I believe there is a directness factor in collaboration. Team collaboration has a lot of directness where you collaborate along a given task with collegues. That is not the same as many of the Social Media/Enterprise 2.0 features which does not have a clear recipient of the thing you are sharing. And sharing is the key since you basically are providing a piece of information in case anyone wants/needs it. That is fundamentally different from sending an email to project members or uploading the latest revision to the project’s space. Andrew McAffe has written about this concept and uses the concept of a bullseye representing strong and weak ties to illustrate this effect.

My point is that it is important that tools for team collaborations from an information architecture standpoint can become part of the more weaker indirect sharing concept. That is the vehicle to utilze the Enterprise 2.0 effect in a large enterprise. Otherwise we have just created another set of stove-pipes or bubbles of information that is restricted to team members. I am not saying that all information should be this transparent but I will argue that based on a ”responsibility to provide”-concept (see US Intel Community Information Sharing Policy) restricting that sharing of information should be exception – not the norm.

Sure as Word of Pie points out in his article ”CenterStage, the Latest ex-Collaboration Tool from EMC” there are definitely things missing from the current Centerstage release compared to both Sharepoint and EMC’s old tool eRoom. However, as Andrew Goodale points out in the comments I also think it is a bit unfair because both eRoom and at least previous versions of Sharepoint (which many are using) actually lacks all these important social media features that serves to lower the threshold and increase participation by users. They also provide critical new context around the information objects that was not available before in DAM, WebTop or Taskspace. Centerstage also provides a way to consume them in terms of activity streams, RSS-feeds and faceted search. Remember that Centerstage is the only way to surface those facets from Documentum Search Server today.

So, I am also a bit disappointed that things are missing in Centerstage that should be there and I also really want to stress the importance of putting resources into that development. Those features in there are critical for implementing all serious implementations of an ECM-strategy and the power of Documentum is that they all sits in the same repository architecture with a service layer to access them. Maybe partner with Socialcast to provide a best practice implementation to support a more extensive profile page and microblogging. Choose a partner for Instant Messaging in order to connect the real-time part of collaboration into the platform. Again, use your experience from records management and retention policies to make those real-time collaboration activities saved and managed in the repository.

Be bold enough to say you are an Sharepoint alternative – but for the right reasons

I’m not an IT-person, I come into this business with a vision change the way a military HQ handles information so I see Enterprise Content Management more as a concept than a technology platform. However, when I have tried to execute our vision it becomes very clear that there is a difference between technology vendors and I like to think that difference comes from internal culture, experience, and vision of the company. It is the ”why” behind why the platform looks like it does and has the features it has. So as long you are not building everything from scratch for yourself it actually matters a lot which company you chose to deliver the platform to make your ECM vision happen. That means that there IS a difference between Documentum and Sharepoint in the way the platform works and we need to be able to talk about that. However, what I see now is that most people focus on the client side of it and try to embrace it is a popular collaboration tool. Note that I say tool – not platform. All those focuses on the client side of it where the simplified requirement is basically a need for a digital space to share some documents in. However, the differentiator is not whether Centerstage or Sharepoint meets that requirement – both do. The differentiator is whether you have a conceptual vision on how to manage the sum of all information that an organization have and to what degree those concepts can be implemented in technology. That is where the Documentum platform is different from other vendors and why it is different from Sharepoint. Sharepoint is sometimes a little bit to easy to get started with which unfortunately means there is no ECM-strategy behind the implementation and when the organisation have thousands of Sharepoint sites (silos) after a year or so that is when that choice of platform really starts to differ.

This week at EMC World has been a great one as usual and there is no shortage of brilliant technical skills and development of features in the platform. What I guess bothers me and some other passionate ECM/Documentum-people is the message coming out from the executive level at IIG. In the end, that is where the strategic resource decision are made and where the marketing message being constructed. I think now there is a lot more to do on the vision and marketing level than actually needs to be done on the platform itself. The hard part seem to be proud of what the platform is today, realize it’s potential to remain the most capable and advanced on the market and use that to stay relevant in many applications of ECM – not just Case Management.

Rich Media – A lot of content to manage and storage to sell

One of the strong points of Documentum is that it can manage ALL kind of content in a good way and that includes of course rich media assets such as photos, videos and audio files. Don’t look upon this as some kind of specialised market only needed by traditional ”creative” markets. This is something everybody needs now. All companiens (and military units for that sake) have an abundance of digital still and video cameras where a massive amount of content needs to be managed just as all the rest of the content. There is a need for platform technologies that actually ”understands” that content and can extract metadata from it so that this content can be navigated and found easily. It is also important to assist users in repurposing this content so it can be displayed easily without consuming all bandwith and also easily be included in presentations and other documents. This is also very much relevant from a training and learning perspective where screencams and recorded presentations has so much potential. It does not have to be a full Learning Management System but at least an easy way to provide it. Maybe have a look at your dear friend Cisco and their Show and Share application. Oh, it is marketed as a Social Video System – the connections to Centerstage (and not just MediaWorkspace) is a bit too obvious. Make sure you can provide Flickr and Youtube for the Enterprise real soon. People will love it. Again, on one very capable platform.

Media Workspace is a really cool application now. Even if it does not have all the features of DAM yet (either) it is such a sexy interface on Documentum. The new capabilites of handling presentations and video are just great. Be sure to look more at Apple iPhoto and learn how to leverage (and create) metadata to support management of content based on locations, people and events. A piece of cake on top of a Documentum repository. Now it is a bit stuck in the Cabinet/Folder hierarchy as the main browsing interface.


I agree with Word of Pie that there is a lack of vision – an engaging one that we all can buy into and sell back home to our management. In my project we seem to have such a vision and for us Documentum is a key part of that. I just hoped that EMC IIG would share that to a greater degree. From our responses back home in Sweden and here at EMC World people seem to both want and like it (have a look at my EMC World presentation and see what you think). We can do seriously cool and fun stuff that will make management of content so much more efficient which should be of critical importance for every organisation today. At least in the military one thing is for sure and that is that we won’t get more people. We really have to work smarter and that is what a vision like this will provide a roadmap towards.

So be proud of what you do best EMC IIG and make sure to deliver INTEGRATED solutions on top of that. For those who care that will mean a world of difference in the long run and will gather looks of envy for those who did not get it.


EMC World 2010: What is New and What’s Coming in Documentum xCP?

This session was presented by John McCormick on Tuesday morning.

The three pillars are:

  • Information Governance
  • xCP
  • Information Access

EMC wants to help customers to get maximum leverage from their information and Deliver the leading application composition platform for information management and case processing.

Intelligence Case Management:

Data, People, Content, Collaboration, Reporting, Policies, Events, Communication, Process

Case Management: Argues that it is a discipline of information management which is:

  • Non-deterministic
  • Driven by Human Decsionmaking
  • Driven by Content status

xCP Product Priniciples

  • Enable Intelligent business decisions (content and business process analytics)
  • Composition and configuration over coding
  • Enable performance through responsiveness and usability
  • Delight application builders and systems integrations
  • Beyond Documents: People, process and information in context
  • Leverage the private cloud
  • Build a future-proof product (move to declarative composition model)

The goal is collapse all the existing products that makes up xCP into fewer ones.

It is about reusable components, compositions tools, xCelerators

Resusable components:

  • Activities (templates)
  • Forms
  • UI


  • Process Builder
  • Forms Builder
  • Taskspace for the UI

What is coming next…

There are different version numbering for xCP and the Documentum platform and this is how they relate:

  • xCP 1.5 – D 6.6 (June 2010?)
  • xCP 1-6 – D 6.7
  • xCP 2-0 – D7 (next-gen Case Management)

Focus for Documentum 6.6

  • Real-world performance testing
  • Composer 6.6 (dependency checking, simplfifcation
  • Taskspace is getting better in 6.6
  • Improved manageability (workflow agents behaves more gracefully)
  • Forms Enhancement (conditional required fields, better relationship management)
  • ATMOS Integration

Documentum 6.7

  • Final release of D6 family (Q1 2011)
  • Licence Management improvements
  • Improved Search ( integration of DSS)
  • Public Sector Readiness (Section 508 improvements for Taskspace)
  • Composer Improvements (xCP application (and no manual installs and version ingestions)

6.5 SP2/SP3 and 6.6 ready for Documentum Search Server (DSS)

Integration of cloud storage ATMOS D 6.6

As soon as DSS is out the whole platform is supported on a virtualized environment.

vSphere integration & Certification (D 6.7)

Documentum 7 (xCP 2.0) Sneak peak – Increased Business Agility

  • Composition is simpler
  • Deployment is faster
  • Case workers are more productive

Improving the tooling

  • Single Composition Tool – xCP Composition Tool probably based on Eclipse
  • Modeling view
  • Compose a page/screen

Deployments is Faster

  • Leverage the private cloud
  • Everything is virtualized
  • Deploy to an already installed environment directly from xCP composition tool to a VMWare instance

User Experience

  • Better insights into cases
  • Better viewing experience
  • Integrated capture
  • There will be a new Web Services based UI
  • Easy to search and add content to a case
  • Easier inline viewing

EMC World 2010: DFS Real World Examples, Best Practices

I had planned to go to a session around the Documentum Roadmap but it was totally full so we had to go to another session. We split up and went to the BPM Fundamentals and the Documentum Foundation Services (DFS) Best Practices session by Michael Mohen instead. I am not a developer so this is a little from the 500ft level

He started by discussed the complementary nature between DFS and CMIS depending on how focused development is to only Documentum or not. CMIS is of course the new standard recently approved by OASIS. He argued that some applications like Records Management is still best done using DFS but I guess that also has to do with how people want CMIS to develop. As I understand it is not intended to contain ALL feature and the COMPLETE set of features in all ECM-systems and rather focus on the interoperabiltiy aspect of building ECM-apps based on multiple repositories.

When it comes to Content Transfer when using DFS the key considerations are latency, size of the file, formats and caching needs. Some of the ways to do content transfer is:

  • HTTP
  • Base64
  • UCF
  • MTOM

Most use UCF or MTOM  but it is important to remember that BOCS/ACS requires UCF to work. The message is to don’t be afraid to mix between HTTP, MTOM and others. In our solution we do use a mix but because we sometimes have rather large content size this of course an issue.

Notable changes in D6.5/D6.6

  • JBoss 4.2.0 is the new methods server
  • Apache Tomcat support
  • Aspect Support
  • LWSO support
  • Native 64-bit support and UCF Improvements
  • Kerberos is coming D6.6

Remote and local calls in Java – .Net does only provide remote calls

There are some applications that customers may not be aware of such as DFS Utilities developed by John sweeney, EMC and DFSX (Extension)

  • Provides utility classes
  • Based on DFS Object MOdel
  • Java-based 1.5 or greater
  • Only EAR-files today

Test Harness is JMeter extension which has custom JMeter Sampler built to invoke DFS using the Java Productivity Layer

Responsetimes collected for:

  • CreateObject
  • Get Object
  • Checkout object
  • Check in Object
  • Delete Object

Over a WAN DFS speeded up DFC especially when you have 300-400 ping times…use DFS because it is state-less. Relevant when using satellite links and such.

Sizing Calculator is soon available for DFS. It is an Excel spreadsheet. The sheet is sased on WSDL and SOAP so if we are using other designs results may vary of course.

In a speed test etween UCF and MTOM upload speeds under 50 Mb were similar. However, UCF was slightly faster. The cool part of UCF is that it is asynchronous which for instance mean that you can show one page of a document and continue loading the rest of it.

When it comes to ESB-implementations the message was that the majority of implementions is point-point for clients apps. However some have SAML for added security in their ESB implementation which affects speed a bit.

It seems that DFS is used a lot in a .Net environment and together with Sharepoint.

MOSS and DFS Examples

.Net 3.3

SDF and xCP

Webpart with an inbox rendered and Xform inside Sharepoint.

Another example is the use of DFS and Windows Explorer where some want custom integration for the Windows Desktop and essentially provides something like the old Document Desktop client. It is called DFS Explorer.

DFS Adobe Flex Example

There is an white paper available to provide a quickstart…read more about the session at the community page.

Adobe does not talk directly to DFS but through Java. Restful would much easier to use for Flex as well as most AJAX-implementations.

Best Practices

  • Leverage the SDK (.Net/Java interop layers)
  • Use UCF for BOCS/ACS
  • If you expected your query to exceed 500 you must cache and cycle through results.
  • DFS is better on WAN with poor latency.

A feature which is not well documented is to set requiresAuthentication=”false” on your annotated services implementation to browse through repositories and basic info such as data dictionary.

There is also a less known Services Catalog Viewer which is an optional install

  • Explore services available within the internet
  • DSCR is registry for consumer discover.
  • UDDI v2 standard
  • Standard Web app
  • Default port is 9010
  • Judy open source UDDI

You can also compare this with the notes from last conference by Word of Pie.