Month: November 2011

Reflections from Momentum 2011 Berlin

EMC (IIG) the company

  • A real tech company
  • Responsive employees
  • Easy to get access inside the company
  • Willing to share information
  • Sometimes hard to figure out “who is who” in EMC Information Intelligence Group (IIG)

As a customer it is important how the company feels. My experience is that EMC is a company where you can find tech-savvy people who really like what they are doing. And they are good at it. The general experience is that employees are interested in listening to us and very responsive to our needs. It is easy to quickly get access to both key business people and people in engineering. On the other hand, that is often required because the product is quite complicated. On the negative side, the company is big, which means that things are not always coordinated, and it can sometimes be difficult to figure out who is who among all the different product managers, general managers, solutions directors and architects.

EMC IIG seems open and transparent to me. Sure there are disclaimers but they are talking openly about most things and there is no NDA at the conference.

 

Strategy

I feel a big difference this year – maybe because I have been away for over a year due to my year at the National Defense College. The big difference is that EMC Information Intelligence Group finally seems to get it. For real. Away from the idea that Case Management is something different from Enterprise Content Management. A realization that nice-looking, usable user interfaces are a key thing. Understanding that the cloud is a key component of EMC IIG's future. Communicating that configuration instead of coding is the real power of xCP, and not just for the interfaces but for the whole application. Finally working to get decent analytics to make use of the contextual information that already exists around objects in the repository. Somehow it feels like there is a new executive team in place that wants to be a little bit bolder and wants to move IIG in a certain direction.

EMC has made numerous acquisitions since it bought Documentum, but now it feels like they are finding out that they have lots of different pieces of technology within the company that together can form a bigger whole.

Working with EMC-owned VMware, not only to provide certification for all Documentum components but also to leverage the power of their virtualization infrastructure, both to ease deployment and to enable efficient use of infrastructure.

Working with VMware-owned Socialcast to include activity streams in Documentum user interfaces.

Working with RSA to enhance the security features of the platform.

Working with Greenplum to power analytics but also to provide a new perspective on handling big data with smart on top of it – big information.

 

Towards a unified client

  • Client situation is a mess today
  • C6 acquisition was a good move
  • A unified client is coming along
  • Wonderful to see the focus on iOS apps

The user interface of Documentum is frankly a mess nowadays. It is the result of too many teams working in their own bubbles, creating user interfaces for different customer groups. WDK-based Webtop with its DAM cousin. Taskspace, which is also WDK but gains some power from Forms Builder and xCP technologies. ExtJS-based Centerstage, which looks great but is a bit late and light on features. Feature-rich Media Workspace, which is based on Flex in a world where Adobe Flash is obviously losing traction and HTML5 is taking off. Steve Jobs really made a difference here, it turns out. On top of that there are desktop applications for OS X and Windows as well as an Outlook client. It is not that I think there is no need for different clients. There is. Especially from a training perspective, where some companies require almost zero training whereas others can accept more extensive training.

The inclusion of C6 Technologies into Documentum is a welcome move and I heard lots of positive reactions to that. However, the key thing is that EMC IIG is now firmly committed to unifying all clients on one technology stack, which of course will focus a lot on configurability. So in the end it could very well mean that the number of clients will be much bigger, but they will just be different configurations based on very specific user needs. The unified client will most likely be based on C6 and ExtJS technologies, which means that Flex is going away quickly. So are WDK and Taskspace, but over a longer time frame. So think of D2 as a Webtop replacement and X3 as the new Centerstage with lots of widgets, including ones for rich media management. Probably we will see the C6 iPad client replace the existing Documentum client as well. Expect an iPhone client soon too.

Speaking about iOS. To me it is almost like a new world compared to my first EMC World in 2007. Everybody at EMC was using Blackberries and Macs were hardly seen. Now the iPad app is out, Peggy talks about “everybody loves their iPads”, Macs are in booths and on stage, there are several Documentum apps and almost all contest prizes consist of iPads. Macgirlsweden is both happy and astonished at this development :)

 

Policy-based deployment with monitoring

Ok, so Documentum is not easy to deploy. It takes a while but as Jeroen put it: “You guys want to do complicated stuff!”. I think he is right and it might sometimes be a good thing since you have to stop and think (not like Sharepoint, which is way too easy to install in that sense). You choose Documentum because you have a complicated process to support, large amounts of content and an ECM vision. Still, agility really needs to be improved and that will also simplify deployment. So improvement is important for several reasons.

The first part of that is the xCelerated Management System which in essence lets you describe and model your applications and your deployment needs. Tools then translate these policies onto your VMware-powered infrastructure and deploy the whole Documentum platform based on your needs, taking into account the number of users, the type of content, the type of processes and what kind of high availability demands you have. Finally all of this is monitored using a combination of open-source Hyperic and the Integrien engine they got through an acquisition. Integrien now seems to have become VMware vCenter Operations. That architecture will in my opinion set EMC Documentum way ahead of its competitors, especially if it can provide some additional agility when the Next-Generation Information Server (NGIS) comes.

 

Analytics and Search

  • xPlore is looking good
  • Thesaurus-support is a good thing
  • QBS is great
  • Custom-pipeline support based on UIMA is great

A dear subject of mine where EMC IIG finally seems to get its act together. They have their own search server called xPlore which is based on open-source Lucene and their own powerful XML database xDB. A really smart move now that FAST, Autonomy, and Endeca have been bought by the other IT giants.

xPlore 1.2 provides some really cool features, both in terms of baseline search capabilities like thesaurus support and in terms of more text-analytics-oriented features. The content processing pipeline now supports extensions based on UIMA, which opens up for connecting other entity extraction engines to xPlore. Another really cool feature is Query-Based Subscriptions (QBS), which really leverages the Documentum repository. Create a search query based on a combo of free text and metadata. Save it and set it up to run at different intervals and notify you of any new content that has been ingested. You can even use it to fire off a workflow in order to have somebody take action. Hopefully we will see some xCP integration in the xPlore 1.3 release where the search experience and indexing are linked to the characteristics of the xCP Application Model.
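To make the query-based subscription idea concrete, here is a minimal Python sketch. It is not the xPlore or Documentum API; the `Document` and `Subscription` classes and the in-memory list standing in for the repository are hypothetical names made up for illustration. It only shows the principle: a saved query combining free text and metadata, re-run at intervals, reporting only content it has not reported before.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical stand-in for a repository object
    doc_id: str
    text: str
    metadata: dict


@dataclass
class Subscription:
    """A saved query: free-text terms plus required metadata values."""
    terms: list
    metadata: dict
    seen: set = field(default_factory=set)

    def matches(self, doc):
        text = doc.text.lower()
        has_terms = all(t.lower() in text for t in self.terms)
        has_meta = all(doc.metadata.get(k) == v for k, v in self.metadata.items())
        return has_terms and has_meta

    def run(self, repository):
        """Return matching documents that were not reported by an earlier run."""
        new_hits = [d for d in repository
                    if d.doc_id not in self.seen and self.matches(d)]
        self.seen.update(d.doc_id for d in new_hits)
        return new_hits
```

A scheduler would call `run` at the chosen interval and either send a notification or start a workflow for each returned document.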

In his Innovation Speech the Chief Architect Jeroen van Rotterdam also showcased a modified Centerstage which used a recommendation engine based on a Hidden Markov model to suggest similar content to users based on similarity in context and similarity in content. A really powerful feature that makes EMC IIG live up to its name: Information Intelligence Group. Jeroen also mentioned that they are working on video and audio analytics including speech-to-text which is then indexed into xPlore. That will most likely arrive in the iPad client first.

Another cool thing that is coming for the Content Intelligence Services (CIS) component is automated metadata extraction based on rules and taxonomy cold-start, which means that you could start generating a taxonomy based on your existing content.

Next-Generation Information Server (NGIS)

It seems that there has been a big investment in the xDB technology and therefore it is a key component in NGIS. No surprise there since Jeroen is one of the founders of the company that EMC bought. That could also mean that future installations of Documentum will not require a traditional SQL RDBMS, which would not be such a bad thing. One less license and one less skill set to manage. NGIS is being designed with both the cloud and “big information” in mind. The idea is to be able to use different datastores such as Atmos, Greenplum, Isilon etc. together with NGIS. I really like the term “big information”, which is a way to take what we now know as “big data” to the next level where it also covers unstructured data and documents. Since there is a wave of information coming over us now it seems smart to design this for huge datasets from the beginning. After all we need to manage it whether we like it or not. As Peter Hinssen put it at the final keynote: “It is not information overload – it is a filter failure”. We CAN handle vast amounts of data if we design the architecture right. Another interesting concept is to bring processing to the data (nodes) instead of what we do today, when we have a central processing node which we pipe all data through. Everybody is realising that the first releases of NGIS will not be feature-complete in comparison with Documentum Content Server, but I also wonder what the cloud focus really means for NGIS. I hope it means cloud as a technical concept and not only public cloud, meaning that NGIS will only be available for OnDemand at first. On the other hand, an early access program is now opening up and that will most likely be run on premise. NGIS will be important for Documentum to retain its position as the leader in ECM technology. In the light of the other innovation going on it can be a bright future.

Cloud and EMC OnDemand

So now you can run a complete Documentum stack in the cloud. A great thing which I think will broaden the market a bit. It is much easier to get up and running and you can focus on core ECM capabilities instead of installing server OS and DBMS and managing storage. A good thing is the ability to have extra power available if needed. Provisioning of a full platform is said to happen in 6-8 hours depending on configuration. Deployment will be in a vCube where all Documentum servers will be managed as images. Each customer gets its own vCube. It will be possible to run a vCube on premise but that means that EMC still manages the configuration over the internet even though it is running on your hardware. There will be some limitations on the level of customization that you can do in order to have EMC take responsibility for the vCube. Remember all server OS and DBMS licenses are included in the vCube. All together the cloud initiative is driving huge improvements in configuration and deployment which all aspects of Documentum will gain from.

 

Venue and atmosphere

  • Keep working on the IIG and Documentum community feeling

Another Momentum conference has ended and it is time to reflect on our experiences from this event. This was my second European conference but I have attended four EMC World conferences. I keep hearing that they are different and also stories from the old Momentum conferences before EMC acquired Documentum. During my first EMC World events I really felt that the Documentum community was lost among a wave of storage people roaming around. However, the Momentum brand has been strengthened and I believe the difference between the US and the European conference is much smaller now. I think the main difference is the crowd and the atmosphere. The locations in Europe are a bit smaller in scale and the event sites also look physically different. All in all, EMC IIG did a very good job organizing this event with no visible friction from my point of view.

 

Practical things

  • More power outlets
  • Dedicated wifi in the keynote area (to allow use of Social media)
  • Set up a blogger’s lounge based on the EMC World concept

In general EMC created a very well organized event but there is some room for improvement anyway. One thing is the meals area. For some reason the Americans prefer round tables “en masse” whereas this event was located in the ordinary breakfast restaurant in the hotel. Tables were straight ones with 2-8 seats each. To me that did not invite as many spontaneous lunch encounters as I experience at EMC World. People tend to stay in their small groups and eat in those as well.

Another recurring issue is of course the shortage of power outlets, which I found really strange at an IT conference and with EMC’s strong push for social media interactions. Even though iPads are much more common now (even at EMC events) I think the conference experience would be more productive with a decent number of outlets and a capable Wifi network. My best experience so far is still a Microsoft conference around FAST Search in Vegas where all 1200 participants had tables with outlets.

There was a social media center but I felt it was way too small compared to the spacious EMC World blogger’s lounge. There are still quite few people who are using social media during the conference and a good lounge would encourage interaction IRL between us. Consider creating badges where your Twitter name and blog address are printed.

 

Social events

  • Make them about networking
  • Make it possible to talk – have areas without very loud music
  • Make sure those with allergies can eat and eat safely.

First of all, I don’t drink alcohol at all. So in that sense I may not be representative of the group at large. Still, since this is a professional conference I do have some opinions on what utility these social events could have. Of course, it should be a more relaxed time and a possibility to have some fun. However, I do like to see these events as a very good opportunity for networking between all of us at the conference. Locating these events in nightclubs with very loud music is therefore not an ideal setting for networking. I think the EMC World social events in the US are better that way. Spending the night in Universal Studios for instance was a very different experience from Ewerke in Berlin. Not just because there are terrific and fun rides there but also because there were lots of places to sit down, eat good food and talk a lot. I had a great evening there last time talking a lot about the future of content analytics with EMC staff and customers. So at least provide areas where people can talk to each other. Make the events more of a continuation of the conference day. Make sure that it is in theme – any entertainment should have some connection to ECM. Maybe a stand-up around our community or a show with music with dedicated lyrics about us. Also, it would be great to have more non-alcoholic alternatives than orange juice, coke and Fanta. Also, I am allergic to nuts and I had a small incident where I accidentally ate something with nuts in it. Provide good information and possibly alternatives for those of us with allergies.

DISCLAIMER: All opinions here are my own and do not represent any official view of my employer. Information is based on notes and conversations and may contain errors.


Notes from the Momentum 2011 session “Current and Future Architecture of Documentum”

These are notes from the session with Jeroen van Rotterdam, Chief Architect, IIG Services. They may contain errors and everything presented in these sessions is subject to change from an EMC perspective.

The focus of the Documentum 6.7 release was quality and performance improvements.

Gives an example from a classic HA Configuration consisting of:

LoadBalancer

4 Web Servers

1 DocBroker

2 Content Servers

 

He sometimes gets the question: “Why is it so hard to deploy DCTM?” He smiled and exclaimed “You guys want to do complicated stuff”.

 

The current components of the Content Server Repository:

–       Content Files (FS)

–       Metadata (RDBMS)

–       XML Store (xDB)

–       xPlore Full Text (xDB)

External sources

–       Centera

–       Atmos

 

Gives another example of a customer with 20k users

Branch Office Caching Server

–       Predictive caching (push content)

–       Distributed write option (async and sync). Local write and then sync up.

The idea is to monitor users in a similar type of context.

A user usually starts with an activity and then stays in that process flow, which therefore becomes his/her context. Content related to that context can then be pushed to servers close to the user.

 

xMS

xMS is yet another acronym which in this case means xCelerated Management System

–       Define requirement – Blueprints

–       Describe them independent of deployment options

–       Automatically deploy blueprint to a target

 

In the Run component there can be:

-multiple VM Clusters running on

-multiple ESX

-virtual machines are created based on the blueprints and will be assigned ESX servers
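As a rough illustration of the blueprint idea, here is a small Python sketch. Everything in it is an assumption made for the example: the `Blueprint` fields, the sizing rule of 5000 users per Content Server, and the round-robin placement are not taken from xMS, which was not shown at that level of detail. It only shows requirements being described independently of the target and then translated into virtual machines assigned to ESX servers.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Blueprint:
    # Requirements only, with no reference to where they will be deployed
    users: int
    high_availability: bool


def plan_vms(blueprint, users_per_server=5000):
    """Translate requirements into a list of VM names (made-up sizing rule)."""
    servers = max(1, -(-blueprint.users // users_per_server))  # ceiling division
    if blueprint.high_availability:
        servers = max(servers, 2)  # never a single point of failure
    return [f"content-server-{i + 1}" for i in range(servers)]


def assign_to_hosts(vms, esx_hosts):
    """Spread the planned VMs over the available ESX servers, round-robin."""
    hosts = cycle(esx_hosts)
    return {vm: next(hosts) for vm in vms}
```

The same blueprint could be handed to a different target, with different hosts, without being rewritten, which is the point of keeping the description independent of the deployment.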

 

The final component is what they call the Telemetry Project

-Monitor the runtime using open-source Hyperic

They have created hyperic adaptors to the Documentum products.

Integrated with the Integrien product (which now seems to be VMware vCenter Operations)

Policy also includes upscaling configuration so it is easy to add more power to a configuration.

Automatic remedies like firing up an additional virtual machine

Total amount of metrics

 

Session optimizations

DFC Session Pooling

DFC frees session to pool if idle for 5 seconds

Expensive to switch context for users (to make sure they don’t see what the other users were doing)
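A minimal Python sketch of the pooling behaviour described in these notes. This is not DFC code; `Session`, `SessionPool` and the injectable `clock` are hypothetical names for the example. A session that has been idle for 5 seconds goes back to the pool, and handing it to a different user is counted as a context switch, which is the expensive step.

```python
import time

IDLE_SECONDS = 5  # idle threshold mentioned in the session


class Session:
    def __init__(self):
        self.user = None
        self.last_used = 0.0
        self.context_switches = 0


class SessionPool:
    def __init__(self, clock=time.monotonic):
        self.clock = clock   # injectable so the behaviour can be tested
        self.free = []       # sessions available for reuse
        self.active = {}     # user -> session currently held

    def acquire(self, user):
        self.reclaim_idle()
        session = self.active.get(user)
        if session is None:
            session = self.free.pop() if self.free else Session()
            if session.user != user:
                if session.user is not None:
                    # The expensive part: wipe state so the new user
                    # cannot see what the previous user was doing
                    session.context_switches += 1
                session.user = user
            self.active[user] = session
        session.last_used = self.clock()
        return session

    def reclaim_idle(self):
        """Move sessions idle for IDLE_SECONDS or more back to the pool."""
        now = self.clock()
        for user, session in list(self.active.items()):
            if now - session.last_used >= IDLE_SECONDS:
                del self.active[user]
                self.free.append(session)
```

A user who keeps working within the idle window keeps the same session and avoids the context switch entirely.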

 

Platform DFS Services/Platform Rest/Application Services

SOAP DFS/REST

DFC

Two types of services

Core Platform and CMIS on top of that

Generate Application Services based on modeling from xCP stack (simple to use REST services will be generated for a specific part of the model)

 

Builder Tools:

–       Application Modeling

–       UI Builder

Semantic Model of the Application

Generate Optimized Runtime
– Indices etc

The value of xCP is not just the UI; the application services and the optimized runtime are also of great value. Argues that xCP is sometimes misunderstood in that sense.

 

Dormant State Feature D7

Needed to support cloud deployment

No downtime

Bring the server to a dedicated state for changes (read-only, stopped audit trail, stopped indexing).

Partial availability for users in this state.

The idea is to spread update load on different content servers

Rolling upgrade – continuous operation – apply patches one by one

Snapshot of the vApp is possible because it is in a safe state
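The rolling upgrade idea can be sketched in a few lines of Python. The `ContentServer` class and the version strings are placeholders I made up; the real dormant state involves much more (read-only mode, stopped audit trail, stopped indexing). The sketch only shows that servers are patched one at a time, so at least one server stays fully available as long as there are two or more.

```python
from dataclasses import dataclass


@dataclass
class ContentServer:
    name: str
    version: str
    dormant: bool = False


def rolling_upgrade(servers, new_version):
    """Patch servers one by one.

    Returns the lowest number of fully available servers seen during the run.
    """
    min_available = len(servers)
    for server in servers:
        server.dormant = True          # safe state: read-only, no indexing
        available = sum(not s.dormant for s in servers)
        min_available = min(min_available, available)
        server.version = new_version   # snapshot and patch while dormant
        server.dormant = False         # back in full service
    return min_available
```

With the two Content Servers from the HA example, this never drops below one fully available server.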

 

NGIS – Public Cloud

Goal is a full-blown multi-tenant architecture

Tremendous investment in xDB over the past years.

Argues that xPlore now beats search vendors FAST, Autonomy and Endeca, and since all of them have been bought by big players, EMC now has access to solid search technology of its own.

Tenant-level backup in xDB 10

xDB/xPlore

–       XACML Security

–       Tree compression (previous version is stored as a change)

–       Search over history (storing complex graph that allow you to query all the versions)

–       Distributed Query Execution

 

Big Data becomes Big Information when you Put Smart on top of the data

Bring processing to the data rather than data to the processing

Impossible with the huge amounts of data of tomorrow to bring data to (central) processing nodes.
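A toy Python sketch of bringing processing to the data. The `DataNode` class and the word-count task are my own illustration, not anything shown for NGIS. Each node computes a small summary over the documents it holds, and only those summaries travel to the coordinator, never the documents themselves.

```python
from collections import Counter


class DataNode:
    def __init__(self, documents):
        self.documents = documents  # the data stays on this node

    def local_count(self):
        """Run the processing where the data lives and return a summary."""
        counts = Counter()
        for doc in self.documents:
            counts.update(doc.lower().split())
        return counts


def distributed_word_count(nodes):
    total = Counter()
    for node in nodes:
        # Only the small per-node summary crosses the network
        total.update(node.local_count())
    return total
```

The central alternative would ship every document to one node first, which is exactly what stops scaling when the data volumes grow.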

 

Plain Hadoop will not work in this case…plain MapReduce is optimized for back-end.

We need real-time MapReduce processing; a lot of research is ongoing right now.

Stream-based (looking at Yahoo).

 

SmartContainers (next year)

Kazeon is integrated into NGIS

Offering a builder to model your metadata to generate the run-time

Early access program is available.