Getting social with EMC Documentum?

EMC World 2014 and the sub-conference for EMC's Information Intelligence Group, called Momentum, is hours from kicking off here in Las Vegas. In the light of last week's announcement from Cisco that it is dropping its WebEx Social (formerly Cisco Quad) product and instead partnering with Jive Software, I want to examine this space in relation to EMC Documentum. We are all talking Enterprise Social or Enterprise 2.0 of course, and when it comes to Enterprise Content Management systems, that is something we like to see as a social layer on top of established technologies for storing unstructured information, one that makes interaction and collaboration more seamless.

What is this enterprise social thing then?

I would argue that what we need is support for asynchronous, or non-realtime, collaboration added to the features we can offer enterprise users. However, we should offer that not as some separate silo of information, often run in the cloud, but as an integrated solution (with an on-premise option for some of us) where we can collaborate asynchronously around documents that we already have stored in ECM systems like EMC Documentum.

  • Enterprise Social = Asynchronous collaboration = Non-realtime collaboration

Why does it matter for enterprise content management?

ECM is all about storing unstructured information in a way that provides context around it so it can be used efficiently in our business processes. That context includes metadata based on various taxonomies, security, lifecycles, workflow capabilities and alternative formats, but also information extracted from the content that allows for ”discovered metadata”, classification and of course search. To me, Enterprise Social is simply about adding an additional set of context around information, based on people collaborating around it, often without any formalized workflow support. That context is then very helpful in providing additional perspectives or views on the information based on what people have thought of it rather than what it actually contains.

What kind of features are we talking about then?

The social features that we usually expect from enterprise social applications are not just the pieces of added collaborative context but also the visualization views based on those attributes, which include:

  • Comments
  • Sharing (links to content in ECM or general links which is social bookmarking in a sense)
  • Likes
  • Favorites
  • Tags (ad hoc metadata)
  • Tag clouds
  • Status updates
  • Questions/answers
  • Wikis
  • Blogs
  • @messaging
  • Private messaging (hence replacing email)
  • Document collections (again based on links to ECM)
  • Activity Streams (both based on social actions but also from ECM like recently uploaded documents, workflow tasks, versioned documents etc)
  • Ideas and Thanks
  • Related content (others have also viewed or similar content to this)

What about synchronous or real-time collaboration?

Having some kind of asynchronous collaboration features in place of course makes it natural to include integration with real-time collaboration support, with features like:

  • Presence
  • Text chat
  • Audio chat (or VOIP)
  • Video chat
  • Desktop sharing
  • Meeting rooms with text, audio, video, presentation sharing

The ability to go seamlessly from seeing a comment on a document from a specific user, noticing the green presence icon next to the name, to launching a text or audio chat is of course powerful and helpful, instead of relying on three different applications to do the same (ECM, Social and chat).

What has EMC Documentum done in the social area before then?

Quite a lot, to be honest, but also to some degree too much, in the sense of not having a coordinated product effort over the years. First of all, eRoom was/is a product based on just that: collaboration. It provided a way to set up spaces for collaboration, often used for projects. After it was acquired, an integration effort was made to make eRoom use the same repository technology as the rest of the Documentum stack, which made a lot of sense for those wanting to eliminate the information silo. A popular feature was the data tables; even though their social nature is not that obvious, providing ”basic Excel” on a web page is a useful feature.

Second, applications based on the Web Development Kit (WDK), and especially Digital Asset Manager, included a fairly cool set of additions called Documentum Collaborative Services. What it did was provide that asynchronous collaboration layer both inside the user interface and in the information model of Content Server. It included features like:

  • Rooms (basically supercharged folders)
  • Rich media formatted texts that appeared as ”banners” on top of every folder view (could be used to explain the purpose of the folder)
  • Discussion threads
  • Data tables
  • Calendars

After that came the first true effort to provide a modern social layer for Documentum: the Centerstage client, which provided a modern AJAX-based interface with features like:

  • comments
  • tags
  • tag clouds
  • activity streams
a new way of previewing content inspired by Cover Flow in iTunes
  • wikis
  • blogs
  • discussions

In addition to that it also featured a new search capability based on xPlore and nice faceting based on discovered metadata. From a technology standpoint, so-called Centerstage spaces included document libraries and discussion components, placed in a Spaces cabinet which in other clients looked like folders. Even more powerful, these collaboration objects could be set up as a result of an activity in a process.

Finally, we are now in the era of the next-generation client called Documentum D2, with an even more modern architecture built on configurable workspaces that can be set up differently for different groups of people. It actually includes some social capabilities, like being able to view collaborative workspaces created in Centerstage, but also commenting on and favoriting pieces of content. Another collaborative feature is support for annotations on/in documents, which is further enhanced if you use any of the third-party viewers that work with Documentum D2. Finally, the feature which allows you to subscribe to changes to a document is back again (it was available in Webtop/DAM), which actually is a powerful collaboration feature.

So there has been a fair set of collaborative features in the Documentum product over the years. The catch, however, is that not all of them are available in the current set of products.

What about Documentum and other providers of social platforms?

In May 2011 there was an announcement from EMC and Cisco about teaming up around social features by providing an integration between Documentum and what was then called Cisco Quad. In essence it meant being able to connect Documentum to document library sets within Cisco Quad using the CMIS interface. Cisco Quad then provided seamless integration with Cisco's support for real-time collaboration, such as WebEx Meetings and Jabber.
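To make the CMIS part of such an integration a bit more concrete, here is a minimal Python sketch that builds a CMIS query URL against a repository. The host and context path are purely hypothetical, and the parameter names follow the CMIS 1.1 browser binding as I read it (the 2011-era integration would more likely have used the AtomPub binding), so verify against your server's CMIS service document before use.

```python
from urllib.parse import urlencode

def cmis_query_url(repo_url, statement):
    """Build a CMIS browser-binding query URL.  Parameter names follow
    the CMIS 1.1 browser binding as I understand it; the endpoint used
    in the example below is purely hypothetical."""
    params = {"cmisselector": "query", "statement": statement, "maxItems": 25}
    return repo_url + "?" + urlencode(params)

url = cmis_query_url(
    "http://dctm.example.com/cmis/repo",  # hypothetical CMIS endpoint
    "SELECT cmis:objectId, cmis:name FROM cmis:document",
)
```

A consuming application like Quad would then simply issue an HTTP GET against such a URL and render the result set as a document library, which is exactly the loose coupling CMIS was designed for.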

Around the same time it was announced that VMware had acquired the microblogging platform Socialcast, which provides internal activity streams to customers but lacks features like wikis and blogs. We have also seen a few official signs that Socialcast technology is making its way into the platform, and even prototypes of Socialcast-powered activity streams within Documentum D2. CMSWire has in fact publicly requested that Chairman Joe Tucci give Socialcast to the Information Intelligence Group, which was discussed again this weekend in a new article from CMSWire.

Finally, it is worth mentioning that EMC itself uses social software from Jive Software to power its own socially-enabled community called ECN, which means that experience with that product exists within EMC, although not necessarily in the Information Intelligence Group where Documentum belongs. The hard part is to both surface Documentum content in Jive and provide social interface components in D2 and xCP.

What are IBM and Alfresco doing then?

I spent last week at IBM Impact 2014 and had the opportunity to speak with some executives about both their Enterprise Content Management software based on IBM FileNet and their collaboration software IBM Connections and IBM Sametime. They currently provide an integration between their Content Navigator product and Case Manager where social features like comments can be added to content stored in FileNet. There is also presence and chat integration from Content Navigator to Sametime, as well as the possibility to go to Sametime Meetings directly from Connections. A lot of the integration is based on CMIS, but it is of course possible to use REST integration for all three clients. I guess the interesting observation is that even if the integration exists, there are still three (or four if you count Case Manager) web interfaces for ECM, Social and real-time collaboration.

Since Cisco is now abandoning WebEx Social, an integration with a product like IBM Connections (and therefore also IBM Sametime) makes a lot of sense if this thing with VMware Socialcast does not play out. All of these have REST interfaces, which would make integration feasible.

I guess I have to mention Alfresco as well here, since it has a tendency to surface as an alternative to Documentum and FileNet from time to time. Alfresco is sometimes marketed as Social ECM, and the community edition offers a basic set of social features on top of the documents. You can favorite, like and comment (without notifications), and there is an activity stream to let you see what has happened in sites you are a member of. In addition, each site contains support for discussions, wikis, bookmarks and data lists (similar to data tables). Finally, you can do ”Dropbox”-like sharing of content, but the integration with desktop and mobile apps is nowhere near what EMC has achieved with Syncplicity. A decent benchmark for social features in ECM, but not a dedicated social interface like Socialcast, WebEx Social, Connections or Jive. There is also no integration with enterprise real-time collaboration tools like WebEx, Lync and Sametime. Still, Alfresco seems like something you use if you can leverage other web/cloud-based services.


EMC World 2013 & Momentum 12: EMC Documentum Roadmap Session

Presented by Patrick Walsh, Principal Product Manager Documentum Platform, and Aaron Aubrecht, VP Product Management & XPO, IIG.

This session will focus on the core platform. Last year they tried to fit in everything, ran 20 minutes over time and still left half of the slides unshown.

Talks about the need for IT to deliver business capability, not just apply patches to Documentum. My personal reflection is that many IT shops do not have a business perspective today, maybe because the budget-efficiency requirements placed on them make reducing costs the main priority.

Few people in the room had actually upgraded to Documentum 7, and that is a problem: too much modern capability is left unused in many organisations. That is why the separation of platform and client upgrades is being pushed now.

What’s new in Documentum 7

So, repeating the same message of what is new in D7: performance improvements due to Intelligent Session Management (ISM) and type caching. ISM reduces memory usage by up to 65% by multiplexing communications between the application server, Content Server and the database. Type caching brings similar memory usage improvements.

Talks about xMS with automated deployment of a new D7 and xCP 2.0 stack for VMware private cloud environments. Deployment down to hours via XML-based blueprints describing the deployment parameters. Includes embedded deployment and configuration of Hyperic agents. We have yet to try this, but I really hope that the blueprints represent a best-practice starting point for developing our own blueprints.

Also improved content intelligence with xPlore 1.3. It includes large-file support through partial indexing, inline content classification, added date-range search capability for metadata, and of course the recommendation engine. It also features a scriptable command-line interface for automation, and you can control xPlore from third-party tools via the Admin UI.

Crypto algorithms switched from DES to AES, which seems about time! It also leads to improved performance.

Finally the EMC Syncplicity Connector for Documentum which allows for external sharing of information with security enforced at the endpoint.

What’s next for Documentum 7.1

Will come in Q4 2013. Full minor version.

Expanded infrastructure certifications:

  • Solaris 11 (with Oracle 11g R2)
  • AIX 7.1 TL2 (with DB2 Enterprise 9.7 FP7)
  • Windows Server 2012 (with SQL Server 2012 and Oracle 11g R2)
WebSphere 8.5 is supported in D7.1, while D7 was supported alongside tcFabric App Server, Tomcat 7 and Oracle WebLogic
RHEL 6.x (x64) in D7.1; native 64-bit, multithreaded architecture with Intelligent Session Management & type caching

xMS 1.1 is coming in D7.1

Smarter deployments (automatic discovery of services and components in existing environments). Orchestration for externally managed VMs or physical hosts. Clustering support with HA and load balancing.

Web administration UI is coming with automated software patching.

Documentum REST Services API (Q3 2013):

  • Standards based
  • Consumer-agnostic
  • Mobile friendly
Everything is a resource
  • Scalability
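As a hedged illustration of what consuming such a REST API might look like, this Python sketch prepares (but does not send) a request for a repositories resource. The /dctm-rest context path, the .json suffix and the use of basic authentication are my assumptions based on the session, not confirmed product details; check the official documentation when the API ships.

```python
import base64
import urllib.request

def repositories_request(host, user, password):
    """Build (without sending) a request for the repositories feed of the
    announced Documentum REST Services.  The /dctm-rest path and .json
    suffix are assumptions -- verify against the shipped documentation."""
    req = urllib.request.Request(f"http://{host}/dctm-rest/repositories.json")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/json")
    return req

req = repositories_request("dctm.example.com", "dmadmin", "secret")
```

The "consumer-agnostic" and "mobile friendly" bullets above essentially follow from this shape: any client that can form an HTTP request with an Accept header can talk to the repository.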

Enhanced trust and security. Continue to harden Documentum.

  • Stronger authentication security. Non-anonymous SSL.
  • Authentication plug-in for Jasig Central Authentication Service (CAS)
  • SSL Option for internal JMS – Content Server Communication

Great to see CAS support coming.

xPlore 1.4 is coming with faster response times for large result sets, improved diagnostics and automation for easier deployment.

Upgrading to Documentum 7

Talks about the strategy going forward. They want to reduce the number of configurations to test code against, so expect a narrower set of combinations of operating systems, app servers and databases.

Talks about the possibility to upgrade platform and clients separately.

The Enterprise Migration Appliance (EMA) is a response to the fact that migrations are complex projects. It works not on the API level but on the database level, unlike traditional API-based methods. It is a virtual appliance with a complete server running in a vSphere/ESXi environment. They also promote migration solutions from both fme and euroscript.

There is an EMC Documentum 7.0 Rapid Success Program. To register go to: http://bit.ly/D70RSP by May 17, 2013.

Vision for the Documentum platform:

Best in-class ECM:

ViPR integration
  • Rapid content access through addressable caching

Trusted Platform

  • Mobile SSO via SAML and OAuth
  • Federated Identity Management and Dynamic User Enrollment for virtual trust zones

As much cloud as you need:

Dynamic Scaling with xMS
  • Cloud-based performance management and monitoring
  • Content contribution and bi-directional sync with Syncplicity

Business Intelligence: Sometimes a problematic term

I often find myself between the world of military language and the completely different language used in the information technology domain. Naturally, it didn't take long before I understood that term mapping, or translation, was the only way around it, and that I often can act as a bridge in discussions: understanding that when one side says one thing, it needs to be translated or explained to make sense in the other domain.

Being an intelligence officer, I find the term Business Intelligence extremely problematic. The CIA has a good article that dives into the importance of defining intelligence, but also some of the problems. In short, I think the definition used in the Department of Defense (DoD) Dictionary of Military and Associated Terms illustrates the core components:

The product resulting from the collection, processing, integration, evaluation, analysis, and interpretation of available information concerning foreign nations, hostile or potentially hostile forces or elements, or areas of actual or potential operations. The term is also applied to the activity which results in the product and to the organizations engaged in such activity (p.234).

The important thing is that in order to be intelligence (in my area of work) it both has to have gone through some sort of processing and analysis AND only cover things foreign – that is, information of a certain category.

When I first encountered the term business intelligence, at the University of Lund in southern Sweden, it represented activities done in a commercial corporation to analyse the market and competitors. It almost sounded like a way to take the methods and procedures from military intelligence and just apply them in a corporate environment. Still, it was not at all focused on structured data gathering and statistics/data mining.

So when speaking about Business Intelligence (BI) in a military or governmental context, it can often cause some confusion. From an IT perspective it involves a set of technical products doing Extract-Transform-Load and Data Warehousing, as well as the front-end products used by analysts to query and visualise the data. Here comes a first, more philosophical issue when seeing this in the light of the definition of intelligence above: as long as the main output is to gather data and visualise it using Enterprise Reporting or Dashboards directly to the end user, it ends up in a grey area whether or not I would consider that something that is processed. In that use case, Business Intelligence sometimes claims to be more (in terms of analytical ambitions) than a person with an intelligence background would expect.

Ok, so just displaying some data is not the same thing as doing in-depth analysis of the data, using statistical and data-mining technology to find patterns, correlations and trends. One of the major players in the market, SAS Institute, has seen exactly that and has tried to market its offering as something more than ”just” Business Intelligence by renaming it Business Analytics. The idea is to achieve ”proactive, predictive, and fact-based decision-making”, where the important word, I believe, is predictive. That means Business Analytics claims not just to visualise historic data but also to make predictions about the future.

An article from BeyeNETWORK also highlights the problematic nature of the term business intelligence, because it is so often connected with data-warehousing technology, and more importantly because only part of an organisation's information is structured data stored in a data warehouse. Coming from the ECM domain I completely agree, but it says something about the problems of thinking both that BI covers all the data we need to do something with and that BI is all we need to support decision-makers. The article also discusses what analysis and analytics really mean. Wikipedia says this about data analysis:

Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making.

The question is then what the difference is between analysis and analytics. The word business is in these terms as well, because a common application of business intelligence is measuring the performance of an organisation through processes that are being automated and therefore to a larger degree measurable. The BeyeNETWORK article suggests the following definition of business analytics:

“Business analysis is the process of analyzing trusted data with the goal of highlighting useful information, supporting decision making, suggesting solutions to business problems, and improving business processes. A business intelligence environment helps organizations and business users move from manual to automated business analysis. Important results from business analysis include historical, current and predictive metrics, and indicators of business performance. These results are often called analytics.”

Looking at the suite of products covered under the BI umbrella, that approach downplays the fact that these tools and methods have applications beyond process optimization. In law enforcement, intelligence, pharmaceuticals and other fields, there is huge potential to use these technologies to understand and optimize not only the internal processes but, more importantly, the world around them that they are trying to understand: seeing patterns and trends in crime rates over time and geography, using data mining and statistics to improve understanding of a conflict area, or understanding the results of years of scientific experiments. Sure, there are toolsets that are marketed more along words like statistics, for use in economics and political science, but those applications could really use the capabilities of the BI platform rather than something run on an individual researcher's notebook.

In this article from Forbes it seems that IBM is also using business analytics instead of business intelligence to signal a move from simple dashboard visualizations towards predictive analytics. This can of course be related to IBM's acquisition of SPSS, which is focused on that area of work.

From the book written by Davenport and Harris, 2007

However, neither the notion of Business Intelligence nor that of Business Analytics says anything about what kind of data is actually being displayed or analyzed. From a military intelligence perspective, it means that BI/BA tools and methods are just one out of many analytical methods employed on data describing ”things foreign”.

In my experience, misunderstandings can come from the other end as well. Consider a military intelligence branch using, here it comes, BI software to analyse incoming reports. From an outsider's perspective it can of course seem like what makes their activity into (military) intelligence is that they use some form of BI tools and then present graphs, charts and statistical results to the end user. As a result, I have heard over and over again that people believe we should also ”conduct intelligence” in, for instance, our own logistics systems to uncover trends, patterns and correlations. That is wrong, because intelligence specialists are skilled both in analytical methods (in this case BI) and in the area or subject they are studying. However, since these tools are called Business Intelligence, the risk for confusion is of course high, just because of the word intelligence in there. What that person means, of course, is that BI/BA tools seem useful for analysing logistics data as well as data on ”things foreign”. A person doing analysis of logistics should of course be a logistics expert rather than an expert in insurgency activities in failed states.

So let's say that what we currently know as the BI market evolves even more and really claims to be predictive; a logical argument at the executive level, since the investment must provide something more than just self-serve dashboards. From a military intelligence perspective that becomes problematic, since all those activities do not need to be predictive. In fact, it can be very dangerous if someone is led to believe that everything can be predicted in contemporary, complex and dynamic conflict environments. The smart intelligence officer rather needs to understand when to use predictive BI/BA and when she or he definitely should not.

So Business Intelligence is a problematic term because:

  • It is a very wide term for both a set of software products and a set of methods
  • It is closely related to data warehousing technology
  • It includes the term intelligence which suggests doing something more than just showing data
  • Military Intelligence only covers ”things foreign”.
  • The move towards expecting prediction (by renaming it to Business Analytics) is logical but dangerous in a military domain.
  • BI still can be a term for open source analysis of competitors in commercial companies.

I am not a native English speaker, but I do argue that we must be careful with such a strong word as intelligence and use it only when it is really justifiable. Of course it is too late for that, but it is still worth reflecting on.


EMC & Greenplum: Why it can be important for Documentum and ECM

The Greenplum logotype (image via Wikipedia)

Recently EMC announced it is acquiring the company Greenplum, which many people interpret as a sign that EMC is putting more emphasis on the software side of the house. Greenplum focuses on data-warehousing technology for the very large datasets called ”big data”, where the most public examples are Google, Facebook, Twitter and the like. The immediate reaction to this move is of course that it is a sign of market consolidation and a desire to play among the largest players like Oracle/Sun, IBM and HP by being able to offer a more complete hardware/software combo stack to customers. Oracle/Sun of course has its Exadata machine as an appliance-based model for data-warehousing capability. Chuck Hollis comments on this move by highlighting how it is a logical move that fits nicely both with EMC storage technology and of course with the virtualisation technology coming out of VMware. To highlight the importance, EMC will create a new Data Computing Product Division out of Greenplum. As a side note, I think it is better to keep the old name to keep the ”feeling” around the product, just as Documentum is a better name than the previous Content Management & Archiving Division. After an initial glance, Greenplum seems to be an innovative company that can solve business problems where the established big RDBMS vendors do not seem able to scale enough.

With my obvious focus on Enterprise Content Management, I would like to reflect on how I think, or maybe hope, this move will matter to that area of business. In our project we started looking deeper into data warehousing and business intelligence in January this year. Before that, our focus was on implementing a service-oriented architecture with Documentum as a core component. We already knew that in order to meet our challenges around advanced information management, we needed to use AND integrate different kinds of products to solve different business needs: ECM for the unstructured content, Enterprise Search to provide a more advanced search infrastructure, GIS technology to handle maps and all spatial visualisation, and so on. Accepting that there is no silver bullet, we instead try to use the right tool for the right problem and let each vendor do what it is best at.

Replicate data but stored differently for different use cases
SOA fanatics have a tendency to want very elegant solutions where everything is a service and every piece of information is requested as needed. That works fine for steady, near-realtime solutions where the assumption is that a small piece of information is needed at each moment. However, it breaks down when you get larger sets of data needed for longer-term analysis, something which is fairly common for intelligence in support of military operations. If each analyst requests all that data over a SOAP interface, it does not scale well, and the specialised tool each analyst uses is not exploited to its full potential. The solution is to accept that the same data needs to be replicated in the architecture for performance reasons, sometimes as a cache – common for GIS solutions to get responsive maps. However, there is often a need for different storage and information models depending on the use case. A massive audit trail stored in an OLTP system based on a SQL database, like Documentum, will grow big, and accessing it for analysis can be cumbersome; the whole system can be slowed down just because of one analysis. Instead, we quickly understood the need for a more BI-optimized information model to be able to do massive user-behaviour analytics with acceptable performance. In the end it is a usability issue, I think. Hence the need for a data warehouse to offload data from an ECM system like Documentum. In fact this applies not only to the audit-trail part of the database: analysing the sum of all metadata on the actual content objects also makes for excellent content analytics. The audit trail reveals user interaction and behaviour, while the content-analytics part gives a helicopter perspective on what kind of data is stored in the platform. Together, the joined information provides quite powerful content/information and social analytics.
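The offload idea above can be sketched in a few lines. This is a toy illustration using SQLite in place of both the OLTP database and the warehouse; the table and column names are illustrative, not the actual Documentum audit-trail schema, and a real pipeline would of course use a proper ETL tool.

```python
import sqlite3

# Stand-in for the OLTP database holding raw audit events.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE audit (user_name TEXT, event TEXT, object_id TEXT)")
src.executemany("INSERT INTO audit VALUES (?, ?, ?)", [
    ("alice", "dm_getfile", "doc1"),
    ("alice", "dm_getfile", "doc2"),
    ("bob",   "dm_save",    "doc1"),
])

# Stand-in for the warehouse: pre-aggregated facts optimised for analysis,
# so analytical queries never touch the production system.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE fact_access (user_name TEXT, event TEXT, n INTEGER)")
rows = src.execute(
    "SELECT user_name, event, COUNT(*) FROM audit GROUP BY user_name, event"
).fetchall()
dw.executemany("INSERT INTO fact_access VALUES (?, ?, ?)", rows)

# Analysts query the compact fact table instead of the raw audit trail.
top = dw.execute(
    "SELECT user_name, n FROM fact_access ORDER BY n DESC LIMIT 1"
).fetchone()
```

The point is the shape of the data, not the database engine: the warehouse stores the same events in a form where a behaviour question is one cheap aggregate query rather than a scan of the production audit trail.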

Add a DW-store to Documentum?
The technology coming from X-Hive has now become both the stand-alone XML database xDB and the Documentum XML Store that sits beside the File Store and the relational database manager. That provides a choice to store information as a document/file in the File Store, as structured information in the SQL database, or as XML documents in the XML Store. Depending on the use case, we can choose the optimal storage together with different ways of accessing it. There are some remarkable performance numbers for running XQueries on XML documents in the XML Store, as presented at EMC World 2010. Without knowing whether it makes sense from an architecture perspective, I think it would be interesting to have a Data Warehouse Store as yet another component of the Documentum platform. To some degree it is already there, within the Business Process Management components, where the Business Activity Monitor in reality is a data warehouse for process analytics: analysis is offloaded from the SQL database and the information is stored in a different way to power the dashboards in Taskspace.

Other potential pieces in the puzzle for EMC
I realize that Greenplum technology is mainly about scalability and big-data applications, but to me it would make sense to also use the technology, just like xDB in Documentum, as a data-warehousing store for the platform: a store focused on taking care of structured data in a coherent platform together with the unstructured data that is already there. Of course it would need a good front-end for using the data in the warehouse for visualisation, statistics and data mining. Interestingly, Rob Karel has an interesting take on that in his blog post. During EMC World 2010, EMC announced a partnership with Informatica around Master Data Management (MDM) and Information Lifecycle Management (ILM), which also was a move towards the structured-data area. Rob Karel suggests that Informatica could be the next logical acquisition for EMC, although there seem to be more potential buyers for it. Finally, he suggests picking up TIBCO, both to strengthen EMC's BPM offering and of course to get access to the Spotfire data visualisation, statistics and data-mining platform.

We have recently started working with Spotfire to see how we can use their easy-to-use technology to provide visualisations of content and audit-trail data in Documentum. So far we are amazed at how powerful yet easy to use it is. In a matter of days we have even been able to create a statistics-server-powered visualisation showing the likelihood of pairs of documents being accessed together. Spotfire could then be used to replace Documentum Reporting Services and the BAM solution in Taskspace. Their server components are written in Java, but the GUI is based on .NET, which is somewhat of a limitation but maybe something EMC can live with on the GUI side. The Spotfire Web Player runs fine on Macs with Safari, at least.
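The co-access analysis mentioned above can be illustrated with a tiny Python sketch. This is a toy stand-in for the statistics-server computation, assuming sessions have already been extracted from the audit trail; counting how often two documents appear in the same session is the raw material for a likelihood estimate.

```python
from collections import Counter
from itertools import combinations

def pair_counts(sessions):
    """Count how often each pair of documents is accessed in the same
    session.  Sessions are lists of document ids; extracting them from
    the audit trail is assumed done upstream."""
    counts = Counter()
    for docs in sessions:
        for pair in combinations(sorted(set(docs)), 2):
            counts[pair] += 1
    return counts

# Illustrative data: three user sessions touching documents a, b and c.
sessions = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
counts = pair_counts(sessions)
```

Dividing a pair count by the number of sessions containing either document gives a simple co-access likelihood, which is exactly the kind of derived measure that is cheap to visualise in a tool like Spotfire.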

An opportunity to create great Social Analytics based on ECM
I hope the newly created Information Intelligence Group (IIG) at EMC sees this opportunity and can convince the management at EMC that there are these synergies beyond going for the expanding big data and cloud computing market that is on the rise. In the booming Enterprise 2.0 market, up-and-comers like Jive Software have added Social Analytics to their offering. Powering Centerstage with real enterprise-class BI is one way of staying ahead of competitors with much less depth in their platform from an ECM perspective. Less advanced social analytics solutions based on dashboards will probably satisfy the market for a while, but I agree with James Kobielus that there will be a need for analysts in the loop, and these analysts expect more capable BI tools just like Spotfire. It resonates well with our conceptual development, which suggests that a serious approach to advanced information management requires specialists focusing on governing and facilitating the information in the larger enterprise. It is not something I would leave to the IT department; it is part of the business side but with the focus on information rather than the information technology.


At the Apache Lucene Eurocon in Prague

Today I am in Prague to attend the Apache Lucene Eurocon conference hosted by Lucid Imagination, the commercial company behind the Lucene/Solr project. There seem to be almost 200 people here attending the conference. I am looking forward to speaking tomorrow, Friday afternoon, about our experiences of using search in general and Solr in particular. The core of our integration is a connector that listens to a queue on our Enterprise Service Bus, which is part of Oracle SOA Suite 10g (the Oracle middleware platform).
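The essence of such a connector is small: take an event message off the bus and turn it into a Solr update payload. A hedged sketch follows; the message format and field names are assumptions for illustration, not our actual schema, and the real connector would of course post the payload to a Solr update endpoint rather than print it:

```python
import json

def to_solr_add(message: str) -> str:
    """Turn a (hypothetical) ESB event message into a Solr JSON add payload."""
    event = json.loads(message)
    doc = {
        "id": event["object_id"],
        "title": event.get("title", ""),
        "content_type": event.get("format", "unknown"),
    }
    # Solr's JSON update format accepts a list of documents to add.
    return json.dumps([doc])

payload = to_solr_add('{"object_id": "09_0001", "title": "Quarterly report"}')
print(payload)
```

Keeping the connector this thin is what makes the queue-based decoupling attractive: the repository side only needs to emit events, and the search side only needs to consume them.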


EMC World 2010: Chiming in with Word of Pie about the future of Documentum

We have got a written reaction to Mark Lewis' keynote held at EMC World 2010 in Boston. I both feel and share the passion around Enterprise Content Management, and it is great that Laurence Hart spent so much time and effort talking to people to craft this post. Someone needs to say these things even if they are not always easy to hear. So I will try not to repeat what he said but rather provide my perspective, which comes from what I have learned about Information and Knowledge Management over the past years. ECM and Documentum are very critical components in moving that IKM vision from the PowerPoint stage into reality. In our case that means an experimentation platform that allows us to put our ideas for improving the ”business” of staff work in a large military HQ into something people can try, learn from and be inspired by. Also, this turned out to be a long blog post, which calls for a summary on top:

The Executive Summary (or message to EMC IIG) of this blog post:

  • Good name change, but make sure you live up to your name.
  • A greater degree of agility is very much needed, but do not simplify the platform so much that implementing an ECM strategy becomes impossible.
  • Case Management is not the umbrella term; it is just one of many solutions on top of Documentum xCP.
  • The whole web has gone Social Media and Rich Media. The Enterprise is next. Develop what you have and stay relevant in the 2010s!
  • Be more precise when it comes to the term ”collaboration”. There is a whole spectrum to support here.
  • Be bolder and tell people that Documentum offers a unique architectural approach to information management – stop comparing clients.
  • Tell people that enabling Rich Media, Case Management, E 2.0 and (Team) Collaboration on one platform is both important and possible.
  • I am repeating myself here: you want to sell storage, right? Make sure Video Management is really good in Documentum!

The name change

Before I start I just need to reflect on the name change from Content Management and Archiving to Information Intelligence Group (IIG). I agree with Pie… the name had to be changed to make it more relevant in 2010, and a focus on information (as in information management, which is more than storage and ILM) is the right way to go. The intelligence part of it is of course a bit fun given my own profession, but still it implies doing smart things with information, and that should include everything from building context with Enterprise 2.0 features to advanced Content and Information Analytics. You have the repository to store all of that – now make sure you continue to invest in analytics engines to generate structure and visualisation toolkits to make use of all the metadata and audit trails. Maybe do something with TIBCO Spotfire.

Documentum xCP – lowering the threshold and creating a more agile platform

Great. Documentum needs to be easier to deploy, configure and monitor. That is needed to get new customers on board more easily and to let existing ones do smarter things with it in less time. However, it is easy to fall into the trap of simplifying things too much here. To me there is nothing simple about implementing Enterprise Content Management (ECM) as a concept and as a method in an organization. One major problem with SharePoint and other solutions is that they are way too easy to install, so people are actually fooled into skipping the THINKING part of implementing ECM and believe it is just ”next-next-finish”. All ECM systems need to be configured and adapted to fit the business needs of the organisation. Without that they will fail. xCP can offer a way to do that vital configuration (preceded by THINKING) a lot more easily and also more often. We often stress how important it is to have the technical configuration follow any changes in Standard Operating Procedures (SOP) as closely as possible. If Generals want to change the way they work and the software does not support it, they will move away from using the software. Agility is the key.

In our vision the data model needs to be much more agile. Value lists need to be updated often – sometimes based on ad hoc folksonomy tagging. Monitoring of the use of metadata and tags will drive that. Attributes or even object types need to be updated more often. Content needs to be ingested quickly while structure is provided later on (think XML Store with new schemas here). xCP is therefore a welcome thing, but make sure it does not compromise the core of what makes Documentum unique today.

The whole Case Management thing

Probably the thing most of us reacted against in the Mark Lewis keynote was the notion that ECM people in reality have just done Case Management all along. I recently spent some time reflecting on that in another blog post here called ”Can BPM meet Enterprise 2.0 over Adaptive Case Management?”. There is clearly a continuum between supporting very formal process flows and very ad hoc Knowledge Worker-style work. The two seem clearly different, and while they likely meet over Adaptive Case Management, to me it makes no sense to have that term cover the whole spectrum – even for EMC Marketing 🙂

I immediately noticed that Public Sector investigative work is often used as an example of Case Management. Case Management as done by law enforcement agencies is fundamentally different from work done by intelligence agencies, because in case-based police investigations there is usually a legal requirement to NOT share information between cases unless authorised by managers. This is of course not the case (!) for all Case Management applications, but from a cultural perspective it is important that Case Management work by the police is not held up as an example of information sharing. The underlying concept is actually at odds with any unified enterprise content management strategy where information should be shared. That is why workgroup-oriented tools such as i2 Analyst's Workstation have become so popular there.

The point here is that it is important not to disable sharing at the architectural level, because again, what constitutes a good ECM system is that content can be managed in a unified way. Don't be fooled by such requirements – use the powerful security model to meet them. Then law enforcement agencies can use it as well. However, there must be more to ECM than Case Management – as Word of Pie suggests, it is just ONE of many solutions on top of the Documentum xCP platform. A platform that is agile enough to quickly build advanced ECM solutions on top of.

Collaboration vs Sharing and E 2.0

So, Collaboration is used everywhere now, but its real meaning actually varies a bit. First, there are two kinds of collaboration modes:

  • Synchronous (real-time)
  • Asynchronous (non-real-time – ”leave info and pick it up later”)

Obviously neither Documentum nor SharePoint is in the real-time part of the business. For that you will need Lotus Sametime, Office Communications Server, Adobe Connect Pro or similar products. However, Google Wave provides a bit of confusion here since it integrates instant messaging and collaborative document editing/writing.

However, I am a bit bothered by the casual labelling that anything like SharePoint, and for that matter eRoom, is getting as a ”collaboration tool”. To break this down further, I believe there is a directness factor in collaboration. Team collaboration has a lot of directness, where you collaborate on a given task with colleagues. That is not the same as many of the Social Media/Enterprise 2.0 features, which do not have a clear recipient of the thing you are sharing. And sharing is the key, since you basically are providing a piece of information in case anyone wants or needs it. That is fundamentally different from sending an email to project members or uploading the latest revision to the project's space. Andrew McAfee has written about this and uses a bullseye representing strong and weak ties to illustrate the effect.

My point is that it is important, from an information architecture standpoint, that tools for team collaboration can become part of the weaker, indirect sharing concept. That is the vehicle for utilising the Enterprise 2.0 effect in a large enterprise. Otherwise we have just created another set of stovepipes or bubbles of information restricted to team members. I am not saying that all information should be this transparent, but I will argue that, based on a ”responsibility to provide” concept (see the US Intel Community Information Sharing Policy), restricting the sharing of information should be the exception – not the norm.

Sure, as Word of Pie points out in his article ”CenterStage, the Latest ex-Collaboration Tool from EMC”, there are definitely things missing from the current Centerstage release compared to both SharePoint and EMC's old tool eRoom. However, as Andrew Goodale points out in the comments, I also think it is a bit unfair, because both eRoom and at least previous versions of SharePoint (which many are using) actually lack all these important social media features that serve to lower the threshold and increase participation by users. They also provide critical new context around the information objects that was not available before in DAM, Webtop or Taskspace. Centerstage also provides a way to consume them in terms of activity streams, RSS feeds and faceted search. Remember that Centerstage is the only way to surface those facets from Documentum Search Server today.

So, I am also a bit disappointed that things are missing in Centerstage that should be there, and I really want to stress the importance of putting resources into that development. Those features are critical for any serious implementation of an ECM strategy, and the power of Documentum is that they all sit in the same repository architecture with a service layer to access them. Maybe partner with Socialcast to provide a best-practice implementation of a more extensive profile page and microblogging. Choose a partner for Instant Messaging in order to connect the real-time part of collaboration to the platform. Again, use your experience from records management and retention policies to make sure those real-time collaboration activities are saved and managed in the repository.

Be bold enough to say you are a SharePoint alternative – but for the right reasons

I'm not an IT person; I came into this business with a vision to change the way a military HQ handles information, so I see Enterprise Content Management more as a concept than a technology platform. However, when I have tried to execute our vision it has become very clear that there is a difference between technology vendors, and I like to think that difference comes from the internal culture, experience and vision of the company. It is the ”why” behind why the platform looks the way it does and has the features it has. So as long as you are not building everything from scratch yourself, it actually matters a lot which company you choose to deliver the platform that makes your ECM vision happen. That means there IS a difference between Documentum and SharePoint in the way the platform works, and we need to be able to talk about that. However, what I see now is that most people focus on the client side and try to embrace SharePoint as a popular collaboration tool. Note that I say tool – not platform. All of those focus on the client side, where the simplified requirement is basically a need for a digital space to share some documents in. However, the differentiator is not whether Centerstage or SharePoint meets that requirement – both do. The differentiator is whether you have a conceptual vision for managing the sum of all information that an organization has, and to what degree those concepts can be implemented in technology. That is where the Documentum platform is different from other vendors and why it is different from SharePoint. SharePoint is sometimes a little bit too easy to get started with, which unfortunately means there is no ECM strategy behind the implementation, and when the organisation has thousands of SharePoint sites (silos) after a year or so, that is when the choice of platform really starts to matter.

This week at EMC World has been a great one as usual, and there is no shortage of brilliant technical skills and development of features in the platform. What I guess bothers me and some other passionate ECM/Documentum people is the message coming out from the executive level at IIG. In the end, that is where the strategic resource decisions are made and where the marketing message is constructed. I think there is now a lot more to do on the vision and marketing level than actually needs to be done on the platform itself. The hard part seems to be taking pride in what the platform is today, realizing its potential to remain the most capable and advanced on the market, and using that to stay relevant in many applications of ECM – not just Case Management.

Rich Media – A lot of content to manage and storage to sell

One of the strong points of Documentum is that it can manage ALL kinds of content in a good way, and that of course includes rich media assets such as photos, videos and audio files. Don't look upon this as some specialised market only needed by traditional ”creative” industries. This is something everybody needs now. All companies (and military units for that matter) have an abundance of digital still and video cameras, producing a massive amount of content that needs to be managed just like all the rest. There is a need for platform technologies that actually ”understand” that content and can extract metadata from it, so that it can be navigated and found easily. It is also important to assist users in repurposing this content so it can be displayed easily without consuming all the bandwidth, and also be included easily in presentations and other documents. This is also very relevant from a training and learning perspective, where screencams and recorded presentations have so much potential. It does not have to be a full Learning Management System, but at least an easy way to provide it. Maybe have a look at your dear friend Cisco and their Show and Share application. Oh, it is marketed as a Social Video System – the connections to Centerstage (and not just Media Workspace) are a bit too obvious. Make sure you can provide Flickr and YouTube for the Enterprise real soon. People will love it. Again, on one very capable platform.

Media Workspace is a really cool application now. Even if it does not have all the features of DAM yet (either), it is such a sexy interface on Documentum. The new capabilities for handling presentations and video are just great. Be sure to look more at Apple iPhoto and learn how to leverage (and create) metadata to support management of content based on locations, people and events. A piece of cake on top of a Documentum repository. Right now it is a bit stuck in the Cabinet/Folder hierarchy as the main browsing interface.


I agree with Word of Pie that there is a lack of vision – an engaging one that we all can buy into and sell back home to our management. In my project we seem to have such a vision, and for us Documentum is a key part of it. I just wish EMC IIG shared it to a greater degree. From our responses back home in Sweden and here at EMC World, people seem to both want and like it (have a look at my EMC World presentation and see what you think). We can do seriously cool and fun stuff that will make management of content so much more efficient, which should be of critical importance for every organisation today. At least in the military one thing is for sure: we won't get more people. We really have to work smarter, and that is what a vision like this will provide a roadmap towards.

So be proud of what you do best, EMC IIG, and make sure to deliver INTEGRATED solutions on top of that. For those who care, that will mean a world of difference in the long run and will gather looks of envy from those who did not get it.


Dave Kellogg on Palantir

I recently began reading the blog written by Dave Kellogg, the CEO of Mark Logic, a company devoted to XML-based content management. I think I came to notice them when I discovered what cool technology EMC got when it bought X-hive, which has now become Documentum xDB/XML Store. Mark Logic and X-hive were of course competitors in the XML database market. In a recent blog post he reflects on the Palantir product after attending their Government Conference.

The main scope of his blog post is different business models for a startup. That is not my expertise and I don't have any particular opinion on it, although I tend to agree, and it was interesting to read his reflections on how other companies such as Oracle (yet another competitor to Mark Logic and xDB) have approached this.

Instead my thinking centers on his analysis of the product that Palantir offers and how that technology relates to other technology. I think most people (including Kellogg) mainly view Palantir as a visualisation tool because you see all these nice graphs, bars, timelines and maps displaying information. What they tend to forget is that there is a huge difference between a tool that ONLY does visualisation and one that actually lets you modify the data (actually, the contextual data around it, such as metadata and relations) within those perspectives. There are many different tools for Social Network Analysis, for instance. However, many of them assume that you already have databases full of data just waiting to be visualised and explored. Nothing new here. This is also what many people use Business Intelligence toolkits for: accessing data in warehouses that is already there, although the effort of getting it there from transaction-oriented systems (like in retail) is not small in any way. However, the analyst using these visualisation-heavy toolkits accesses data read-only and only adds analysis of data that is already structured.

Here is why Palantir is different. It provides access to raw data such as police reports, military reports and open source data, most of it in unstructured or semi-structured form. When it comes into the system it is not viewable in all the fancy visualisation windows Palantir has. Instead, the whole system rests on a collaborative process where people perform basic analysis, which includes manual annotation of words in reports. This digital marker pen allows users to create database objects or connect to existing ones. Sure, this is supported by automatic features such as entity extraction, but if you care about data quality you do not dare to put them in automatic mode. After all this is done you can start exploring the annotated data and the linkages between objects.

However, I do agree with Dave Kellogg that if people think BI is hard, this is harder. The main reason is that you have to have a method or process for this kind of work. There are no free lunches – no point in dreaming about full automation here. And people need training and the right mindset to be able to work efficiently. Having played around with TIBCO Spotfire lately, I feel there is a choice between integrated solutions like Palantir, which has features from many software areas (BI, GIS, ECM, Search etc.), and dedicated toolkits with your own integration. Powerful BI with data mining is best done in BI systems, but they will probably never provide the integration between features that vendors like Palantir offer. An open architecture based on SOA can probably make integration easier in many ways.


Why iPhone OS (iPad) is ECM…

I like Twitter. It exposes me to a lot of interesting thoughts from interesting and smart people that I follow. Today I read a post called Why the iPad Matters – It's the Beginning of the End by Carl Frappaolo. It talks a lot about why the iPad brings a new promise for content delivery – a complete digital chain. It made me think about one of the things that is unique about the iPod/iPhone/iPad: the lack of a folder-based file system exposed to users. Surprisingly (maybe), it is the lack of it that makes the whole user experience much better.

So how does this relate to ECM then? Well, I guess many of us ECM evangelists (or ”Ninjas”, as I heard today) have been in endless meetings and briefings explaining the value of metadata and the whole ”context infrastructure” around each object in an ECM system that can hold fine-grained permissions, lifecycles, processes, renditions and so forth. I have even found myself explaining the ECM concept using iTunes as an analogy. You tag the songs with metadata and access them through playlists, which are in essence virtual folders where each song can be viewable in many playlists. That is the same concept as the ”Show in folder” flag in Documentum. Metadata can even power Smart Playlists, which in essence are just saved search queries – something we have added as a customization in Documentum Digital Asset Manager (DAM). So in essence the iTunes Library (should we call it a repository? 🙂 is a light version of an ECM system. Before continuing, I really wonder why I have to customize Documentum to get the GUI features that iTunes provides…
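A Smart Playlist really is just a stored predicate over metadata. Here is a toy sketch of the idea (illustrative only; not DQL and not our actual DAM customization, and the metadata fields are made up):

```python
# A "repository" of objects with metadata, and saved searches as predicates.
library = [
    {"name": "report.doc", "author": "jdoe", "rating": 5},
    {"name": "memo.doc", "author": "asmith", "rating": 2},
    {"name": "brief.ppt", "author": "jdoe", "rating": 4},
]

# Each saved search is a reusable filter over metadata, like a Smart Playlist.
saved_searches = {
    "jdoe_top": lambda o: o["author"] == "jdoe" and o["rating"] >= 4,
}

def run(search_name: str) -> list:
    """Evaluate a saved search against the library and return matching names."""
    return [o["name"] for o in library if saved_searches[search_name](o)]

print(run("jdoe_top"))  # ['report.doc', 'brief.ppt']
```

The same object can appear in any number of saved searches without being moved anywhere, which is exactly the virtual-folder point made above.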

So iTunes abstracts away the folder-based file system on a Mac or Windows PC, but as long as you are using Mac OS X or Windows the file system is still there, right? Some people even get really frustrated by iTunes and just can't get their head around the fact that there is no need to move files around manually when syncing them to iPhone OS-powered devices. And here comes the beauty: on these devices there is no folder-based file system to access. Just the iPod app for music, the Photos app for photos and so forth. All your content is suddenly displayed in context and filtered based on metadata and that app's specific usage.

To some degree that means that iPhone OS-based devices not only can make content delivery digital but also can provide a much better user interface, powered by all these ECM features that we love (and have a hard time explaining). Suddenly we have an information flow entirely based on metadata instead of folder names and file names. Maybe that will make ECM not only fun but also much quicker at answering the dreaded ”What's in it for me?” question.

Now, can someone quickly write an iPad App for Documentum so I can make my point 🙂 It will be a killer app, believe me!


CPU, Cores and software licenses

In an article in ComputerWorld there is a good discussion of license models for different software vendors. There seems to be a mix of per-socket pricing and some notion of a CPU, where each CPU license corresponds to a number of processor cores. In EMC's case, for instance, a CPU license corresponds to 2 cores, and Oracle has a similar model. The number of processor cores is steadily increasing, and soon 6-8 cores per socket will be common on server hardware. I agree with the article that these models need some kind of revision. This is especially true if you sign longer contracts, where this development can lead to some interesting issues. Server hardware needs to be replaced sooner or later for power, storage or just performance reasons. It is not uncommon that the idea is to get fewer but more powerful servers in order to save power and cooling.
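The arithmetic behind the problem is simple to work through. A small sketch under the 2-cores-per-CPU-license model mentioned above (the server configurations are illustrative, not from the article):

```python
import math

CORES_PER_CPU_LICENSE = 2  # e.g. EMC's model as described above

def licenses_needed(sockets: int, cores_per_socket: int) -> int:
    """CPU licenses required for one server under a per-core license model."""
    total_cores = sockets * cores_per_socket
    return math.ceil(total_cores / CORES_PER_CPU_LICENSE)

# Consolidating onto a denser box doubles the core count and thus the
# license count, even though the socket count stays the same.
print(licenses_needed(2, 4))  # 2-socket quad-core server -> 4 licenses
print(licenses_needed(2, 8))  # 2-socket 8-core server -> 8 licenses
```

This is why swapping old servers for fewer but denser ones can silently push an application over its licensed core count.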

The interesting effect then is that even if you can consolidate software applications onto fewer machines, each application may overstep its license in terms of server cores. What about virtualisation then? Well, that is of course also the future, so that capacity can be load-balanced between applications more easily. However, that means the license model must allow virtualisation to throttle each licensed application down to its permitted number of cores. In Oracle's case, that usually means a requirement to run their own virtualisation product even if you have a VMware investment.


The Long Tail of Enterprise Content Management

Question: Can we expect a much larger amount of the available content to be consumed or used by at least a few people in the organisations?

Shifting focus from bestsellers to niche markets
In 2006 the editor-in-chief of Wired magazine, Chris Anderson, published his book ”The Long Tail – Why the Future of Business is Selling Less of More”. Maybe the text printed on the top of the cover, ”How Endless Choice is Creating Unlimited Demand”, is the best summary of the book. This might have been said many times before, but I felt a strong need to put my reflections into text after reading it. It put a vital piece of the puzzle in place, showing the connections to our efforts to implement Enterprise 2.0 within an ECM context.

Basically, Chris Anderson sets out to explain why companies like Amazon, Netflix, Apple iTunes and several others make a lot of money selling small amounts of a very large set of products. It turns out that out of even millions of songs/books/movies, nearly all of them are rented or bought at least once. What makes this possible comes down to three things:

– Democratization of production, which means that the tools and means to produce songs, books and movies are available to almost everybody at a relatively low cost.
– Democratization of distribution, where companies can broker large amounts of digital content because the cost of keeping a large stock of digital content is very low compared to real products on real shelves in real warehouses.
– Connecting supply and demand, so that all this created content meets its potential buyers; the tools for that are search functions, rankings and collaborative reviews.

What this effectively means is that the hit culture, where everything is focused on a small set of bestsellers, is replaced with vast amounts of small niches. That probably has an effect on society as a whole, since the time when a significant share of the population was exposed to the same thing at the same time is over. That is also reflected in the explosion of specialised TV channels and TV/video-on-demand services that let viewers choose not only which show to watch but also when to watch it.

Early Knowledge Management and the rise of Web 2.0
Back in the late '90s, Knowledge Management efforts thrived with great aspirations of getting a grip on the knowledge assets of companies and organisations. Although there are many views and definitions of Knowledge Management, many of them focused on increasing the capture of knowledge, on the assumption that applying that captured knowledge would lead to better efficiency and better business. However, partly because of technical immaturity, many of these projects did not reach their ambitious goals.

Five or six years later the landscape had changed completely on the web with the rise of YouTube, Flickr, Google, Facebook and many other Web 2.0 services. They provided a radically lowered threshold for contributing information, and the whole web shifted from a focus on consuming information to producing and contributing it. This was in fact just democratization of production, but in this case not only of products to sell but of information of all kinds.

With the large-scale hubs of YouTube, Flickr and Facebook, the distribution aspect of the Long Tail was covered, since all this new content was also spread in clever ways to friends in our networks or to niche ”consumers” finding it based on tagging and recommendations. Maybe my friend network on Facebook is in essence a representation of a small niche market interested in following what I am contributing (doing).

Social media goes Enterprise
When this effect started spreading beyond the public internet into the corporate network, the term Enterprise 2.0 was coined by Andrew McAfee. Inside the enterprise, people were starting to share information on a much wider scale than before and in some respects finally made the old KM dreams come into being. This time not because of formal management plans but because of social factors and networking that really inspired people to contribute.

From an Enterprise Content Management perspective, this also means that if we can put all this social interaction and generated content on top of an ECM infrastructure, we can achieve far more than just supporting formal workflows, records management and retention demands. The ECM repository has the potential to become the backbone for all kinds of captured knowledge within the enterprise.

The interesting question is whether this also marks a cultural change in what types of information people devote their attention to. One could argue that traditional ECM systems provide more of a limited, ”hit-oriented” consumption of information. The absence of good search interfaces, recommendation engines and collaboration probably left most of the information unseen.

Implications for Enterprise Content Management
The social features in Enterprise 2.0 change all that. Suddenly the same exposure effect can be seen on enterprise content as we have seen on consumer goods. There is no shortage of storage space today. The amount of objects stored is already large but will increase a lot, since it is so much easier to contribute. Social features allow exposure of things linked to interests, competencies and networks instead of what management wants to push. People interested in learning have somewhere to go even for niche interests, and those wanting to share can get affirmation when their content is read and commented on by others, even if only by a small number. Advanced searching and exploitation of social and content analytics can create personalised mashup portals and push notifications of interesting content or people.

Could this long tail effect possibly make a difference to the whole knowledge management perspective? This time not from the management aspect but from the learning aspect. Can we expect a much larger amount of the available content to be consumed or used by at least a few people in the organisations? Large organisations have a fairly large number of roles and responsibilities, so there must reasonably be great differences in what information people need and with whom they need to share it. The Long Tail effect in ECM terms could be a way to illustrate how a much larger percentage of the enterprise content is used and reused. It is not necessarily so that more information is better, but this can mean more of the right information to more of the right people. Add to that the creative effect of being constantly stimulated by ideas and reflections from others around you, and it could be a winning concept.


Anderson, Chris, ”The Long Tail – Why the Future of Business is Selling Less of More”, 2006
Koerner, Brendan I., ”Driven by Distraction – How Twitter and Facebook make us more productive workers”, Wired Magazine, March 2010