Category: Technology

Dave Kellogg on Palantir

I recently began reading the blog written by Dave Kellogg, the CEO of Mark Logic, a company devoted to XML-based content management. I think I first noticed them when I discovered what cool technology EMC got when it bought X-hive, which has now become Documentum xDb/XML Store. Mark Logic and X-hive were of course competitors in the XML database market. In a recent blog post Kellogg reflects on the Palantir product after attending their Government Conference.

The main scope of his blog post is different business models for a startup. That is not my expertise and I don't have any particular opinion there, although I tend to agree, and it was interesting to read his reflections on how other companies such as Oracle (yet another competitor to Mark Logic and xDb) have approached this.

Instead my thinking is based on his analysis of the product that Palantir offers and how that technology relates to other technology. I think most people (including Kellogg) mainly view Palantir as a visualisation tool because you see all these nice graphs, bars, timelines and maps displaying information. What they tend to forget is that there is a huge difference between a tool that ONLY does visualisation and one that actually lets you modify the data (more precisely, the contextual data around it, such as metadata and relations) within those perspectives. There are many different tools around Social Network Analysis, for instance. However, many of them assume that you already have databases full of data just waiting to be visualised and explored. Nothing new here. This is also what many people use Business Intelligence toolkits for: accessing data in warehouses that is already there, although the effort of getting it there from transaction-oriented systems (like in retail) is not small in any way. Still, the analyst using these visualisation-heavy toolkits accesses data read-only and only adds analysis of data that is already structured.

Here is why Palantir is different. It provides access to raw data such as police reports, military reports and open source data, most of it in unstructured or semi-structured form. When it comes into the system it is not viewable in all the fancy visualisation windows Palantir has. Instead, the whole system rests on a collaborative process where people perform basic analysis, which includes manual annotation of words in reports. This digital marker pen allows users to create database objects or connect to existing ones. Sure, this is supported by automatic features such as entity extraction, but if you care about data quality you do not dare to run them in fully automatic mode. After all this is done you can start exploring the annotated data and the linkages between objects.
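
To make the distinction concrete, here is a minimal sketch of the kind of data model such a digital marker pen produces. These types are entirely hypothetical (Palantir's actual model is not public in this detail); the point is that an annotation is not just a highlight but a link between a span of text and a database object, which can in turn be related to other objects.

    // Hypothetical types, for illustration only -- not Palantir's API.
    import java.util.ArrayList;
    import java.util.List;

    class Entity {
        final String type;                       // e.g. "Person", "Organisation"
        final String name;
        final List<Relation> relations = new ArrayList<>();
        Entity(String type, String name) { this.type = type; this.name = name; }

        void relateTo(Entity target, String kind) {
            relations.add(new Relation(target, kind));
        }
    }

    class Relation {
        final Entity target;
        final String kind;                       // e.g. "works-for", "called"
        Relation(Entity target, String kind) { this.target = target; this.kind = kind; }
    }

    // The "digital marker pen": a span of words in a source report,
    // linked to a new or existing entity object in the database.
    class Annotation {
        final String reportId;                   // the raw report being analysed
        final int start, end;                    // character offsets of the marked words
        final Entity entity;
        Annotation(String reportId, int start, int end, Entity entity) {
            this.reportId = reportId; this.start = start; this.end = end;
            this.entity = entity;
        }
    }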

However, I do agree with Dave Kellogg that if people think BI is hard, this is harder. The main reason is that you have to have a method or process to do this kind of work. There are no free lunches – no point in dreaming about full automation here. And people need training and the right mindset to be able to work efficiently. Having played around with TIBCO Spotfire lately I feel that there is a choice between an integrated solution like Palantir, which combines features from many software areas (BI, GIS, ECM, Search etc), and dedicated toolkits with your own integration. Powerful BI with data mining is best done in BI systems, but those will probably never provide the integration between features that vendors like Palantir offer. An open architecture based on SOA can probably make integration easier in many ways.

Why iPhone OS (iPad) is ECM…

I like Twitter. It exposes me to a lot of interesting thoughts from interesting and smart people that I follow. Today I read a post called Why the iPad Matters – Its the Beginning of the End by Carl Frappaolo. It talks a lot about why the iPad brings a new promise for content delivery – a complete digital chain. It made me think about one of the things that is unique to the iPod/iPhone/iPad: the lack of a folder-based file system exposed to users. Surprisingly (maybe) it is this lack that makes the whole user experience much better.

So how does this relate to ECM then? Well, I guess many of us ECM evangelists (or "Ninjas" as I heard today) have been in endless meetings and briefings explaining the value of metadata and the whole "context infrastructure" around each object in an ECM system that can hold fine-grained permissions, lifecycles, processes, renditions and so forth. I have even found myself explaining the ECM concept using iTunes as an analogy. You tag the songs with metadata and access them through playlists, which are in essence virtual folders where each song can be visible in many playlists. That is the same concept as the "Show in folder" flag in Documentum. Metadata can even power Smart Playlists, which in essence are just saved search queries – something we have added as a customization in Documentum Digital Asset Manager (DAM). So in essence the iTunes Library (should we call it a repository? 🙂) is a light version of an ECM system. Before continuing I really wonder why I have to customize Documentum to get the GUI features that iTunes provides…?
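
To illustrate the analogy: in Documentum terms a Smart Playlist is just a DQL statement saved and re-run on demand. Here is a minimal sketch using the Documentum Foundation Classes (DFC); the genre and rating attributes are made-up custom properties, and session setup is omitted.

    import com.documentum.com.DfClientX;
    import com.documentum.fc.client.IDfCollection;
    import com.documentum.fc.client.IDfQuery;
    import com.documentum.fc.client.IDfSession;
    import com.documentum.fc.common.DfException;

    public class SmartPlaylist {
        // "All rock songs rated 4 or higher, newest first" as a saved query
        // (genre and rating are illustrative custom attributes).
        static final String PLAYLIST_DQL =
            "SELECT r_object_id, object_name FROM dm_document "
          + "WHERE genre = 'Rock' AND rating >= 4 "
          + "ORDER BY r_creation_date DESC";

        public static void list(IDfSession session) throws DfException {
            IDfQuery query = new DfClientX().getQuery();
            query.setDQL(PLAYLIST_DQL);
            IDfCollection rows = query.execute(session, IDfQuery.DF_READ_QUERY);
            try {
                while (rows.next()) {
                    System.out.println(rows.getString("object_name"));
                }
            } finally {
                rows.close();
            }
        }
    }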

So iTunes abstracts away the folder-based file system on a Mac or Windows PC, but as long as you are using Mac OS X or Windows the file system is still there, right? Some people even get really frustrated by iTunes and just can't get their heads around the fact that there is no need to move files around manually when syncing them to iPhone OS-powered devices. And here comes the beauty: on these devices there is no folder-based file system to access. Just the iPod app for music, the Photos app for photos and so forth. All your content is suddenly displayed in context and filtered based on metadata and that app's specific usage.

To some degree that means that iPhone OS-based devices not only can make content delivery digital but can also provide a much better user interface, powered by all these ECM features that we love (and have a hard time explaining). Suddenly we have an information flow entirely based on metadata instead of folder names and file names. Maybe that will make ECM not only fun but also able to answer the dreaded "What's in it for me?" question much more quickly.

Now, can someone quickly write an iPad App for Documentum so I can make my point 🙂 It will be a killer app, believe me!

CPU, Cores and software licenses

In an article in ComputerWorld there is a good discussion around license models for different software vendors. There seems to be a mix of per-socket pricing and some notion of a CPU where each CPU corresponds to a number of processor cores. In EMC's case, for instance, a CPU license corresponds to 2 cores, and Oracle has a similar model. The number of processor cores is steadily increasing and soon it will be common with 6-8 cores per socket on server hardware. I agree with the article that these models need some kind of revision. This is especially true if you sign longer contracts, where this development can lead to some interesting issues. Server hardware needs to be replaced sooner or later for power, storage or just performance reasons. It is not uncommon that the idea is to get fewer but more powerful servers in order to save power and cooling.

The interesting effect then is that even if you can consolidate software applications onto fewer servers, each application can overstep its license in terms of server cores. What about virtualisation then? Well, that is of course also the future, so computing power can be load-balanced between applications more easily. However, that means that the license model must allow for using virtualisation to throttle down to any number of cores per licensed application. In Oracle's case again that usually means a requirement to run their own virtualisation product even if you have a VMware investment.
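
A back-of-the-envelope example of the consolidation trap, assuming the 2-cores-per-CPU-license rule cited above: the application estate stays the same, but the license count grows with the core count.

    public class CoreLicenses {
        // Licenses needed for one server under an N-cores-per-license model,
        // rounded up since a partially used license must still be bought.
        static int licenses(int sockets, int coresPerSocket, int coresPerLicense) {
            int totalCores = sockets * coresPerSocket;
            return (totalCores + coresPerLicense - 1) / coresPerLicense;
        }

        public static void main(String[] args) {
            // Before: four 2-socket dual-core servers -> 4 * 2 = 8 CPU licenses
            System.out.println(4 * licenses(2, 2, 2));
            // After consolidating onto two 2-socket six-core servers
            // -> 2 * 6 = 12 CPU licenses, for the same applications
            System.out.println(2 * licenses(2, 6, 2));
        }
    }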

The Long Tail of Enterprise Content Management

Question: Can we expect a much larger amount of the available content to be consumed or used by at least a few people in the organisations?

Shifting focus from bestsellers to niche markets
In 2006 the editor-in-chief of Wired magazine, Chris Anderson, published his book ”The Long Tail – Why the Future of Business is Selling Less of More”. Maybe the text printed on the top of the cover, ”How Endless Choice is Creating Unlimited Demand”, is the best summary of the book. This might have been said many times before, but I felt a strong need to put my reflections into text after reading this book. It put a vital piece of the puzzle in place when I saw the connections to our efforts to implement Enterprise 2.0 within an ECM context.

Basically Chris Anderson sets out to explain why companies like Amazon, Netflix, Apple iTunes and several others make a lot of money selling small amounts of a very large set of products. It turns out that out of even millions of songs/books/movies, nearly all of them are rented or bought at least once. What makes this possible comes down to three things:

– Democratization of production, which means that the tools and means to produce songs, books and movies are available to almost everybody at a relatively low cost.
– Democratization of distribution, where companies can broker large amounts of digital content because the cost of having a large stock of digital content is very low compared to real products on real shelves in real warehouses.
– Connecting supply and demand so that all this created content meets its potential buyers; the tools for that are search functions, rankings and collaborative reviews.

What this effectively means is that the hit culture, where everything is focused on a small set of bestsellers, is replaced with vast amounts of small niches. That probably has an effect on society as a whole, since the time when a significant share of the population was exposed to the same thing at the same time is over. That is also reflected in the explosion of the number of specialised TV channels and TV/video-on-demand services that let viewers choose not only which show to watch but also when to watch it.

Early Knowledge Management and the rise of Web 2.0
Back in the late 90s, Knowledge Management efforts thrived with great aspirations of getting a grip on the knowledge assets of companies and organisations. Although there are many views and definitions of Knowledge Management, many of them focused on increasing the capture of knowledge, with the idea that applying that captured knowledge would lead to better efficiency and better business. However, partly because of technical immaturity, many of these projects did not reach their ambitious goals.

Five or six years later the landscape on the web had changed completely with the rise of YouTube, Flickr, Google, Facebook and many other Web 2.0 services. They provided a radically lowered threshold for contributing information, and the whole web shifted from a focus on consuming information to producing and contributing it. This was in fact just democratization of production, but in this case not only of products to sell but of information of all kinds.

Using the large-scale hubs of YouTube, Flickr and Facebook, the distribution aspect of the Long Tail was covered, since all this new content was spread in clever ways to friends in our networks or to niche ”consumers” finding it based on tagging and recommendations. Maybe my friend network on Facebook is in essence a representation of a small niche market interested in following what I am contributing (doing).

Social media goes Enterprise
When this effect started spreading beyond the public internet into the corporate network, the term Enterprise 2.0 was coined by Andrew McAfee. Inside the enterprise, people were starting to share information on a much wider scale than before, and in some respects the old KM dreams finally came into being. This time not because of formal management plans, but based on social factors and networking that really inspired people to contribute.

From an Enterprise Content Management perspective this also means that if we can put all this social interaction and generated content on top of an ECM infrastructure, we can achieve far more than just supporting formal workflows, records management and retention demands. The ECM repository has the possibility to become the backbone that provides all kinds of captured knowledge within the enterprise.

The interesting question is if this also marks a cultural change in what types of information people devote their attention to. One could argue that traditional ECM systems provide more of a limited ”hit-oriented” consumption of information. The absence of good search interfaces, recommendation engines and collaboration probably left most of the information unseen.

Implications for Enterprise Content Management
The social features in Enterprise 2.0 change all that. Suddenly the same effect on exposure can be seen for enterprise content as we have seen for consumer goods. There is no shortage of storage space today. The number of objects stored is already large but will increase a lot, since it is so much easier to contribute. Social features allow exposure of things linked to interests, competencies and networks instead of what management wants to push. People interested in learning have somewhere to go even for niche interests, and those wanting to share can get affirmation when their content is read and commented on by others, even if only by a small number. Advanced search and the exploitation of social and content analytics can create personalised mashup portals and push notifications of interesting content or people.

Could this long tail effect possibly make a difference to the whole knowledge management perspective? This time not from the management aspect of it but rather the learning aspect. Can we expect a much larger amount of the available content to be consumed or used by at least a few people in the organisations? Large organisations have a fairly large number of roles and responsibilities, so there must reasonably be a great difference in what information people need and with whom they need to share it. The Long Tail effect in ECM terms could be a way to illustrate how a much larger percentage of the enterprise content is used and reused. It is not necessarily so that more information is better, but this can mean more of the right information to more of the right people. Add to that the creative effect of being constantly stimulated by ideas and reflections from others around you, and it could be a winning concept.

Sources

Anderson, Chris, ”The Long Tail – Why the Future of Business is Selling Less of More”, 2006
Koerner, Brendan I, ”Driven by Distraction – How Twitter and Facebook make us more productive workers”, Wired Magazine, March 20

Are Etiquette and Netiquette different? Should they be?

Lately I have started to think about how social rules really work IRL (in real life) versus in digital media. As with everything else in society, all these rules vary to some degree between situations and are affected by who you are interacting with. The question is what is considered good tone and what is considered rude nowadays. Humans are really good at sending signals “between the lines”, using diplomatic language with hints and insinuations, and using body language to signal different emotions, which other humans are then differently skilled at interpreting or even caring about at all.

In normal day-to-day conversations around a table it is generally considered rude to ignore what someone is saying or to refrain from answering direct questions. Over the phone or a voice chat it is similar, but body language isn't communicated (unless you are using video chat) and you can therefore afford to look bored, make faces or whatever while somebody is talking at the other end. As long as we are doing synchronous (real-time) voice communication, a lot of the IRL social rules seem to apply.

When the mobile phone rings you either answer or you don't, but most people choose to call back at a later time to see what that person had on their mind. To me that is a good example of a social rule in modern society. Can one expect someone to call back if we have bothered to call them? Or is the social rule that if it is important (enough) you expect someone to try again? Is a repeated set of calls in a short space of time therefore a sign of urgency?

Getting a text message (SMS) notifying me that I have a voice message usually also signifies a sense of urgency or importance, and usually results in me calling back. However, I believe this is another area where we see a change in social interactions, because the mobile phone is always with us and always on. Many people today bring their phone everywhere, which includes meetings, visits to friends and dinners. That means it has become regarded as OK not to answer because you are not able to talk at that specific time. Reasonably, that has also meant that people choose not to answer when someone is calling and they don't feel like talking to them.

Text chats nowadays seem to bridge synchronous and asynchronous communication. In one sense they are real-time, because you can interact very rapidly, and if both are typing really fast it can become a fast-paced discussion. In general I also think that in the early days of Instant Messaging (IM) the significance of a text chat was higher than it is today. If I got that pop-up window with a bleep I usually switched my focus to it and bothered to answer directly. Today we see IM going really mainstream and becoming part of corporate infrastructures, often with the argument of replacing some emails. That means that IM text chats are in some respects a replacement for asynchronous messaging (often email), where you type something up that does not really require an immediate response but is something you want your co-worker to be aware of. You know that people are in meetings and talk to people around them and therefore can't be expected to pay attention to all incoming IM messages right away. That IM message has then become an asynchronous message that gets read minutes or even hours later. Socially, that must mean there is an acceptance of IM messages not being answered directly, so it is not considered rude behaviour. However, I do believe it is a little bit rude to ignore replying to an IM at all, or at least not mentioning it in an email or the next IM chat.

I personally think it is really cool to be online at all times, but the question is if that also means a commitment (personal domain) or responsibility (corporate domain) to answer and interact as soon as you can. In my personal domain I think the way IM is used has changed a bit over the years. In the early days of iChat we were chatting intensely and often, but nowadays it has almost shifted so that an IM is sent only when you have something important to say, something almost “worthy” of a phone call – but just almost. The way IM works, it usually doesn't require your full attention the way a phone call does. Nowadays you do IM while doing something else.

I wonder if this means that all means of communication change in how we see them as requiring our attention, or in how important they are to us. When I got my first Internet connection back in 1995, I think I considered an email to be somewhat the same as an old-fashioned letter. It was carefully drafted and sent with some sense of importance, and thus required an answer. Over time email also became a way to share information "for your information" rather than something requiring a direct response. Email became a way to share information more casually. Compared to writing a letter it is so much easier to copy a text or just send a link to a web page. More of anything can often mean that the sense of exclusiveness goes away somewhat – unless you are in love of course, when I guess more love messages only make things better in most cases.

Social media (such as Facebook) brings the sharing aspect of information to a whole new level. Nowadays you can share your current situation, expressing what you are doing, how you feel and what you are about to do. The thing that differentiates social media networks from web pages with information is that they usually assume you have some sort of relationship to the people you are sharing your information with. The information is personalised and therefore to a higher degree targeted by you. Just as you expect a reaction to something you say over dinner about what is happening in your life, I guess many people who post their “status” on Facebook hope for or desire some kind of reaction to it. Congratulations when good things happen and expressions of compassion when bad things happen in their life. So as we are getting more and more information about people around us, the question is how we handle the social rules around all this social information. Is it rude not to read or try to keep up to date about someone you know? Do you expect comments from these people on events in your life? Is a Facebook message something that requires an answer, just as we might think of an email or an IM chat?

No matter how reactions or responses arrive, the increase of information streams (should we call personal ones life streams?) coming from sources you have chosen will most likely affect the people who consume them to some degree. In an era of mass information and a need for affirmation, it can be confusing when different people apply different social rules to all these communication possibilities. Some people apply the social rules of IRL strictly and get offended when others don't follow them. Others are very relaxed about the whole thing and don't feel obliged to do anything at all. The thing that confuses me is when the level of obligation is determined by someone's particular view of a specific tool rather than by their relationship to the person at the other end.

EMC World 2009: Enterprise Search Server (ESS)

To me one of the biggest pieces of news delivered during the conference was the new generation of Documentum full-text indexing called the Enterprise Search Server (ESS). This marks the first official message that EMC Documentum will move away from the OEM version of FAST ESP, which has been in use since Documentum 5.3 (2005). The inclusion of FAST back then meant that Documentum got a solution where metadata from the relational database was merged with text from the content file into an XML file (FTXML) that could be queried using DQL. Before diving into the features of the new technology, I guess everyone wonders about the reason for this decision. The main reasons are said to be:

  • Performance. One FAST full-text node supports up to around 20 million objects in the repository (some customers commented that their experience was closer to 10 M…) and it requires in-memory indices. With Documentum installations containing billions of objects that means 100+ nodes, and that has been a hard sell in terms of hardware requirements.
  • Virtualisation. Apparently talks with Microsoft/FAST about the requirement to support all Documentum products on VMware made no progress. This has been a customer demand for some time. MS/FAST cites intensive I/O demands as the reason why they were not interested in certifying the full-text index on virtualisation.
  • NAS support.
  • More flexible High Availability (HA) options. Today FAST can be clustered by adding new nodes, which leads to a requirement of having the same number of nodes for backup/high availability.

From a performance standpoint I personally think that the current implementation of FAST leads to a slow end-user experience when searching in Documentum. One reason for this is that a search is first sent to FAST, which delivers a result set irrespective of my permissions. The whole result set must then be filtered by querying the relational database. That takes time. This is also a reason why we have integrated an external search engine based on the more modern FAST ESP 5.x server with the Security Access Module, which means that ACLs are indexed and filtering can be done in one step when searching in the external FAST Search Front-end (SFE). More about how that is solved in ESS later on.
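
The difference between the two approaches can be sketched like this (my own simplification, not EMC's code): with post-filtering, every hit costs a permission lookup against the database before the user sees anything, while an ACL-aware index applies the filter in a single pass inside the engine.

    import java.util.List;
    import java.util.stream.Collectors;

    interface SearchEngine {
        List<String> search(String query);                 // hits, ignoring permissions
        List<String> search(String query, String userAcl); // ACL-aware, filtered in-index
    }

    interface PermissionStore {
        boolean canRead(String user, String objectId);     // one database lookup per hit
    }

    class SecurityFiltering {
        // Two-step (FAST as used today): filter the full result set afterwards.
        static List<String> postFiltered(SearchEngine engine, PermissionStore db,
                                         String user, String query) {
            return engine.search(query).stream()
                         .filter(id -> db.canRead(user, id)) // slow on large result sets
                         .collect(Collectors.toList());
        }

        // One-step (indexed ACLs, as in the SFE setup or natively in ESS).
        static List<String> indexFiltered(SearchEngine engine, String userAcl,
                                          String query) {
            return engine.search(query, userAcl);
        }
    }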

From a business perspective, EMC outlines these challenges it sees a need to satisfy:

  • End users expect Google/Yahoo search paradigms.
  • IT managers want low cost, scalability, ease of deployment and easy administration.
  • Requirements for large-scale, distributed deployments with multilingual support.
  • Enterprise requirements such as low-cost HA, backup/restore and SAN/NAS support.

The new ESS is based on the xDb technology coming from the acquisition of the company X-hive and leverages the open source full-text indexing technology in the Lucene project. The goal for ESS is to build on the existing open indexing architecture in Documentum. The idea is to create a solution that really scales, of course with some trade-offs when it comes to space vs query performance.
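
Since the full-text layer is Lucene, the lowest level of indexing and querying looks roughly like plain Lucene usage. Here is a self-contained sketch (written against a recent Lucene release; class names such as ByteBuffersDirectory differ in older versions, and ESS naturally wraps all of this in its own services). The field names are Documentum-flavoured but purely illustrative.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.store.ByteBuffersDirectory;
    import org.apache.lucene.store.Directory;

    public class LuceneSketch {
        public static void main(String[] args) throws Exception {
            Directory dir = new ByteBuffersDirectory();      // in-memory index
            StandardAnalyzer analyzer = new StandardAnalyzer();

            // Index one object: metadata fields plus extracted content text.
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                Document doc = new Document();
                doc.add(new StringField("r_object_id", "0900000180001234", Store.YES));
                doc.add(new TextField("object_name", "Quarterly report", Store.YES));
                doc.add(new TextField("content", "Revenue grew in the third quarter", Store.NO));
                writer.addDocument(doc);
            }

            // Query content and get the stored metadata back.
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                Query q = new QueryParser("content", analyzer).parse("revenue AND quarter");
                for (ScoreDoc hit : searcher.search(q, 10).scoreDocs) {
                    System.out.println(searcher.doc(hit.doc).get("object_name"));
                }
            }
        }
    }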

ESS supports structured and unstructured search by leveraging a best-of-breed XML database and the XQuery standards. It is designed with enterprise readiness, scalability, ingestion throughput and high quality of search as core features. It also provides the Advanced Data Management functionality (control over where data is placed on disk) necessary for large-scale systems. The intention is to give EMC the ability to continue to develop and provide the new search features and functionality required by their customer base.

It is architected for greater scalability and gives a smaller footprint than the current full-text search, and it scales both horizontally (more nodes) and vertically (more servers on the same node). It is designed to support tens to hundreds of millions of objects per node.

This allows for solutions such as archiving, where there can be a billion-plus emails/documents, while preserving high quality of search and still achieving scale. The query response time can be throttled up or down based on needs – priority can be shifted between indexing and querying.

The installation procedure is also simplified, and EMC promises that a two-node deployment can be up and running in less than 20 minutes. The solution is also designed to make it easy to add new nodes to an installation.

ESS is much more than a simple replacement of the full-text engine. It will focus on delivering these additional features compared to existing solutions:
– Low-cost HA (N+1 server based)
– Disaster Recovery
– Data Management
– VMware Support
– NAS Support
– New Administration Framework

The new admin features include a new ESS Admin interface with a look and feel very similar to CenterStage. Since the intention is to support ESS on non-Documentum installations, it is a separate web client. The framework also supports Web Services, a Java API and JMX, and it is open for administration using OpenView, Tivoli, MMC etc.

The server consists of:

  • ESS API
  • Indexing Services will have document batching capability, callback support to indicate when content becomes searchable, and a Content Processing Pipeline with text extraction and linguistic analysis via CPS.
  • Search Services. This will provide search on metadata, content or both (XQuery based) as well as multiple search options such as batching, spooling, filters, language, analyser etc. It will return results in an XML format and provides term highlighting, summaries and relevancy. The thread execution management supports multi-query and parallel query. It also includes low-level security filtering.
  • Content Processing Services is responsible for language detection, text extraction and linguistic analysis. The CPS can be local or remote (co-located with content for improved performance). It will have a pluggable architecture to support various analysers and/or text extractors, with out-of-the-box support for the Basis RLP and Apache Snowball analysers. However, only one analyser can be configured per ESS. (My question: can I have different analysers on different nodes?) Content processing can be extended by plugins; a small sketch of what plugging an analyser amounts to follows this list.
  • Node and Data Management Services is the primary interface for all data and node management within ESS. It provides the ability to control routing of documents and placement of collections and indices on disk. It deals with index management and supports bind, detach, attach, merge, freeze, read-only etc.
  • Analytics includes APIs and a data model for logging, metrics and auditing, ingestion and search analysis, and facet computation services.
  • Admin Services. The example shown was really powerful, where an admin could view all searches made by a user over time and see how long it took to the first result set. The ones with longer times could be explored by viewing the query to analyse why they took so long.
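
As promised in the Content Processing Services bullet: at the Lucene layer, plugging an analyser essentially means choosing a language-specific analysis chain before indexing, something like this sketch (Snowball-based stemmers ship with Lucene's analyzers module; the language detection step here is just a placeholder for what CPS actually does):

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.en.EnglishAnalyzer;
    import org.apache.lucene.analysis.sv.SwedishAnalyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;

    public class AnalyzerPlugin {
        // Pick an analysis chain based on the detected language of the content.
        static Analyzer forLanguage(String languageCode) {
            switch (languageCode) {
                case "en": return new EnglishAnalyzer();  // English stemming
                case "sv": return new SwedishAnalyzer();  // Snowball Swedish stemmer
                default:   return new StandardAnalyzer(); // tokenisation only
            }
        }
    }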

Below that the xDb can be found, and at the bottom the Lucene indices. The whole solution is 100% Java, and xDb stores XML documents in a persistent DOM format and supports XQuery and XPath. Indices consist of a combination of native B-tree indices plus Lucene. xDb supports single- and multi-node architectures and has support for multi-statement transactions with full ACID support. In addition it supports XQFT (see an introduction to it here), a proposed standard full-text extension to XQuery, which includes the following (a minimal query example follows the list):

  • LQL via a full-text extension
  • Logical full-text operators
  • Wildcard option
  • Any/all options
  • Positional filters
  • Score variables
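
To give a feel for XQFT, here is what such a query could look like, executed through the vendor-neutral XQJ API (javax.xml.xquery) since I have not seen xDb's own client classes documented; the collection name and element names are made up. Note the score variable and the ftand full-text operator from the list above.

    import javax.xml.xquery.XQConnection;
    import javax.xml.xquery.XQDataSource;
    import javax.xml.xquery.XQException;
    import javax.xml.xquery.XQExpression;
    import javax.xml.xquery.XQResultSequence;

    public class FullTextQuery {
        // Rank reports that mention both words, best match first.
        static final String XQUERY =
            "for $r score $s in collection('reports')/report"
          + "  [. contains text ('revenue' ftand 'quarter')]"
          + " order by $s descending"
          + " return $r/title";

        public static void run(XQDataSource ds) throws XQException {
            XQConnection conn = ds.getConnection();
            try {
                XQExpression expr = conn.createExpression();
                XQResultSequence result = expr.executeQuery(XQUERY);
                while (result.next()) {
                    System.out.println(result.getItemAsString(null));
                }
            } finally {
                conn.close();
            }
        }
    }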

ESS includes native security, which means that security is replicated into the search server and security filtering is done at a low level in the xDb database. This makes searches on large result sets efficient and enables facet computation on entire result sets.

Native facet computation is a key feature in ESS, which is of course linked to the new search interface in CenterStage, based on facets in an iTunes-like interface. Facets are of course nothing new, but it is good that EMC has finally realised that they are a powerful but still easy way to give users "advanced search".

ESS leverages a distributed content architecture (for instance using BOCS) by only sending the raw text (DFTXML) over the network instead of the binary file, which can be much larger in many cases (such as big PowerPoint files). ESS also utilizes the new Content Processing Services (CPS) as well as ACS.

The new solution also makes it possible to do hot backups without taking the index server down first, as is required today. Backup and restore can be done on a sub-index level. The new options for High Availability include:

  • Active/active shared data (the only one available for FAST)
  • Active/passive with clusters
  • N+1 Server based

Things I would like to see but have not heard about yet:

  • Word frequency analysis (word clouds based on document content)
  • Clustering and categorisation (maybe done by Content Intelligence Services)
  • Synonym management
  • Query-expansion management
  • How document similarity is handled by vector-space search (I guess done by Lucene?)
  • Boosting & Blocking of specific content connected to a query
  • Multiple search-views (different settings for synonyms, boost&blocking etc)
  • Visualisation of entity extraction and other annotations
  • Functionality or at least an API to manually edit entity extraction within the index. Semi-automatic solutions are the best.
  • Freshness management.
  • Speech-to-text integration (maybe from Audio/Video Transformation Services)

Personally I think this is a much needed move to really improve the internal search in Documentum and make much better use of the underlying information infrastructure. It will be interesting to see what effect this has on Microsoft/FAST's ambitions to support the Documentum connector. Maybe the remaining resources (no OEM to develop) can focus on bringing the connector from the old 5.3 API to a modern 6.5 API. I still see a need for utilising multiple search engines, but as ESS gains more advanced features the rationale for an expensive external solution can change. The beta for Content Intelligence Studio will be one important step in outlining the overall enterprise search architecture for big ECM solutions. Part of this is of course tracking what Autonomy brings to market in the near future.

Another thing worth mentioning is that during the past four conferences I have heard quite a few complaints about the stability of the current FAST-based full-text index. It crashes/stops regularly, often without anybody knowing before users start complaining about strange search results.

A public beta will be released in Q3 2009 and customers are invited to participate. Participants will receive a piece of hardware with ESS pre-installed and pre-configured, and after a few configuration changes in Content Server it should be up and running.

Customers will have the option of upgrading their existing FAST full-text index or running the new ESS side by side with FAST. EMC will also market ESS for non-Documentum solutions.

Be sure to also read Word of Pie’s notes as well as my previous notes from FAST Forward 09 around the future of FAST ESP.

EMC World 2009: Reflections from the Momentum conference

A very hectic week has passed by and EMC World 2009 is over. Just as I did last year, I felt like reflecting a bit on the conference.

First of all, many thanks to EMC for listening to us and improving a lot of things from last year. I have been to EMC World 07 and 08, and on both occasions I felt a little lost as a Documentum customer among all these storage and virtualisation people. Back then I heard people referring with love to past Momentum conferences where the sense of community was there. In November 08 I had the chance to go to Momentum in Prague as a speaker, and it was actually a bit different from EMC World. Suddenly all the focus was on Documentum.

Things well done

So the establishment of a Content Management & Archiving (CMA) community was just what we all needed. We all got yellow ribbons with the text "Momentum" to attach to our badges, which made us much more visible to each other. We got all the sessions in the same area, which meant no more running around and the chance to bump into people with those ribbons. Instead of a very thick catalogue with all sessions merged together into a giant schedule, we got our own CMA Show Guide, which was really easy to use and made life much easier for me. Next to all the sessions we had a beautiful Momentum Lounge which was manned all day. You could even meet CMA executives for drinks after sessions on Wednesday and Thursday. It had nice sofas and chairs together with soft red lighting, which made it quite cosy. In the solutions exhibition all CMA booths were gathered in the same area, with a separate graphic profile from the rest of the EMC booths. Around the CMA booth you found all the CMA partners co-located. Finally we had our own CMA party on Monday evening, which was well attended as far as I saw. In addition to that, we finally seem to have a working online community both for Documentum and XML technologies.

It was also a great thing to create a Bloggers' Lounge where everybody who blogged and twittered could register. Outside the lounge there was a large screen displaying what we were all saying more or less live. And the vanilla latte served there was a life saver! On Tuesday the barista started making mine as soon as I passed the entrance 🙂 What a service! I think EMC actually made social media into a working business tool here. Really something to build on. If you have not done it, search for #emcworld on Twitter to see what it was all about.

I attended one Product Advisory Forum (PAF) around the new Enterprise Search Server (ESS) and that was a great experience. Ed Bueche and Aamir Farooq did a great job of inspiring great discussions between us customers and the engineering team. I attended PAFs in Prague as well, and those were also a great part of the conference.

We had access to wireless internet all around the conference area, and that is vital for a conference like this. Especially for those of us who blog and tweet!

Things to improve

First of all, EMC is a company whose tagline says "Where Information Lives" and touts itself as an information infrastructure company. I assume that all means digital information, and if there is something we Documentum people care about, it is information management. It therefore makes a lot of sense to take notes and search the web on a laptop during sessions. After all, we are IT nerds 🙂 Please get us some rooms with a sufficient number of power outlets!

Why not extend it further and use your own technology to integrate tweets and blog posts with the conference schedule, so we can more or less interact live around sessions? It would even make sense, for me at least, to be able to register which sessions I am attending (voluntarily of course) using the online community profile that already exists, which would make it even easier.

There seemed to be fewer sessions in general, and in particular I believe the number of developer-oriented ones has become significantly smaller. I am not a coder myself, so I actually think it makes sense to have sessions focused on people writing code and others, at different levels of advancement, for those of us focusing on architectures, features and business cases. Another thing I noted is that there is no call for papers for EMC World the way it works for Momentum (Europe). I think use cases from customers are an important part of the conference and it would be great to find a way to get them back in.

Please also have a look at what Word of Pie had to say about this year’s conference.

See you next year in Boston!

EMC World 2009: What is new with Digital Asset Management

Media Work Space
Controlled release on June 30th, targeted at internal use by EMC Marketing; General Availability will come later this year. Still licensed with DAM. The new release will support Images, Presentations, Audio and Video.

It will introduce a new gridless view which lists all objects as a list with columns for attributes. The gridless view can also show thumbnails at the left end of each line. There will also be a storyboard view much like the one in today's Digital Asset Manager.

MWS will now have support for comments – which can interact with CenterStage comments.

The Personalised Dashboard includes the following views:

  • QuickFlows
  • Most Popular Assets
  • Recently Viewed Assets
  • Recently Updated Assets

To me that looks like they have started to think in terms of information analytics… There is now also a feature to show the cumulative rating among users.

They see a need for customisations, and an SDK or something similar will be released during 2009.

The Inbox allows you to open a quickflow, which actually looked really nice with attached images as thumbnails below. It looked rather similar to an email message, which is the right way to go I think.

QuickSearch now supports searching on any indexed data.

Advanced Search has a tab called General and then tabs for Presentation, Video, Audio and Images, which allow for higher-level restriction of the search.
You can search on properties, for instance images with certain pixel dimensions…

There is a new presentation slide view which looks way more flexible than the current PowerPoint assembly. It actually looks like viewing/reviewing slides can now be done completely without opening the application.

The view below the preview of the slides has tabs for Metadata, Versions, Renditions, Comments, Permissions and Relationships.

Slides can be rated, and metadata can be edited just by clicking in the fields.

The video view supports thumbnails but also preview of the video utilizing FlipFactory. It looked like the previewer was using Flash.

FileSharing Services, My Documentum and Documentum for Outlook will be merged into a new My Documentum product and then moved into the Knowledge Worker group. The Documentum connectors for InDesign & Quark XPress are also part of My Documentum, but from the Digital Asset Management side of the house.

Many companies have 3D data which comes from different CAD systems. Therefore they have started to develop CAD integration within Documentum with the support of a Right Hemisphere integration (press release) which supports viewing data from 80 CAD/PLM systems.

The solution allows customers to request and repurpose derivatives.
Flat Iron Solutions has a demonstration in the exhibition area at EMC World 2009.

Content Transformation Services

These are mainly bug fixes and some performance improvements for the OEM products they are using, mainly on the image side of the house.

CTS now includes support for Adobe CS3 & CS4.
There is an SDK for CTS which can be used to handle custom encoders… From my point of view the obvious question is whether or not it makes sense to develop support for GIS formats.

The next release of MWS will probably be available in September 2009.

There is technology available in the platform to support annotations on video files, but it is not yet exposed.

The ability to show forms in a Flex environment is something they are working on, and it seems fairly important, especially for those of us who use both TaskSpace and DAM with Forms.

VISION
The main areas which they focus on are:

Web Experience Management
Customer Comms Management (build websites based on preferences)
Customer Intelligence Management
Marketing Process Management
Brand Management includes:
– Presentation
– Video
– Image
– Collateral
– 3D Image
– Agency Collaboration

MidYear
– New version of Presentation Assembly

End of Year
– MWS Pro
– Integrated Collaboration and Publishing
– Campaign Management
– Marketing and Web Metrics Tracking
– KPI
– Rapid Setup of Brand

D7 – 2010
– MWS Field Edition
– SalesForce integration
– Support of Personalised Customer

MWS Pro
– Different Libraries as Tabs

Q1 2010 MWS & DAM Sp3

Last time to see two Space Shuttles on the pads

In preparation for the upcoming mission to service the Hubble Space Telescope there is now a unique sight at the Kennedy Space Center (KSC) in Florida. Space Shuttle Endeavour has been moved to pad 39B as a rescue vehicle in case something goes wrong with Space Shuttle Atlantis during the servicing mission. It is truly a unique sight and will never be seen again, since the pad is being turned over to the next-generation rocket program at NASA and Shuttle missions will end in 2010. The sight is almost like a real version of the motion picture Armageddon, which also has two shuttles on the pads, although somewhat spaced-up future versions 🙂

Here is another great picture of them

Where the FAST Enterprise Search Platform (ESP) is going now…

I have spent the last week in Las Vegas attending the FAST Forward 09 conference. About a year ago the Norwegian company FAST Search & Transfer was acquired by Microsoft, and customers all over the world, like me, wondered what would happen. Some thought it was great to have a huge company with its R&D resources to take the platform forward, while others like me feared a technology transition which would include cancelling support for other operating systems and integrating with nothing but Microsoft technology.

It was very clear that the Microsoft marketing department had a lot to say about the conference and about what messages were to be conveyed. Somewhere behind all that you could still see some of the old FAST mentality, but it was really toned down. To me the conference was about convincing existing customers that MS is committed to Enterprise Search and giving Sharepoint customers some idea of what Enterprise Search is all about.

It is clear that the product line is diversifying in a common Microsoft strategy:

Solutions for Internet Business

  • FAST Search for Internet Business
  • FAST Search for Sharepoint Internet sites
  • FAST AdMomentum

Solutions for Business Productivity

  • FAST Search for Sharepoint
  • FAST Search for Internal Applications

FAST Search for Sharepoint won't be available until Office Wave 14 (incl. Sharepoint) is released, so in the meantime there will be a product called FAST ESP for Sharepoint that can be used today and will have a license migration path towards FAST Search for Sharepoint. That product will have a product license of around 25,000 USD, and then additional Client Access Licenses (CALs) will follow in standard MS manner.

So what does all of this mean for those of us who would like to see FAST ESP continue as an enterprise component in a heterogeneous environment? Well, MS has committed to 10 years of support for current customers, I guess as a gesture towards those who are worried. Over and over again I heard representatives talking about how important those high-end installations on other operating systems are. The same message appeared when it came to connectors and integration with Enterprise Content Management systems like EMC Documentum. Still, most if not all demos were connected to Sharepoint and/or other MS-specific technologies.

The technical roadmap means that the past year has been devoted to rewriting their next-generation search platform from Java to .Net. The first product that will be released is the Content Integration Studio (CIS), which consists of a Visual Studio component (I guess it was earlier in Eclipse) and a server-side execution engine. This will only be available on Windows since it is deeply connected to the .Net environment. It looks like a promising product, with support for flows instead of a linear pipeline to handle the processing of information before it is handed off to the index engine. CIS therefore sits in front of FAST ESP, and a combination of actions in flows and in old pipelines can be executed. Information from CIS is written to ESP, which then creates the index and also processes queries to it.

What I think we can expect is that new innovation is focused on creating a modular architecture where CIS is the first module. Features in ESP will then be gradually reengineered in a .Net environment, creating a common search platform some years into the future. It will likely mean that we will still see one or two upgrades to the core ESP as we know it today, to enable it to function together with the new components. Content Fusion will most likely be the next module that extends ESP, but on a .Net architecture.

When it comes to the presentation logic, where we today have the FAST Search Front-End (SFE), we will see it either as Web Parts for Sharepoint or as AJAX Aerogel from MS. These are currently developed using Javascript but will include Silverlight later on.

These will initially be offered in both an IIS and a Tomcat flavour, and possibly others if there is demand. They will initially be integrated with ESP and Unity, thus opening up a new approach to developing a search experience on top of them.

In general I don't like the Microsoft approach of insisting on owning the whole technology stack by themselves and refusing to invest in other standards-based projects. Instead of developing their own AJAX libraries they could have used ExtJS or even Google Web Toolkit. While it is not open source, MS argues that it is a very permissive licence that has many of the same qualities. A good thing is that MS is committed to making sure that this framework works on all major browsers, including Firefox, Safari and Chrome. It is interoperable with jQuery.

In summary I think it is kind of a mixed experience. The new features being developed are truly needed to keep FAST one of the most advanced search engines available. I think many of the features look really promising and I can't wait to get my hands on them. On the other hand, it is clear that things are going proprietary (FAST ESP had a lot of open source in it) and being aligned in a Microsoft stack, thus gradually minimizing options. That includes how new technologies are being implemented (MS ones instead of open source), what operating systems it will run on, and what the support for developing presentation logic looks like. It means I have to have people who know both Java and .Net, both Flash and Silverlight (possibly JavaFX), and both ExtJS/GWT and MS AJAX/Aerogel.

We are deeply invested in the EMC Documentum platform and would of course like to continue using ESP as a way to add advanced capabilities and performance to our architecture. However, I think I will over time get sick and tired of Microsoft sales people trying to convince me to use Sharepoint instead of Documentum. For anybody who knows how both platforms work it is almost a joke, but I will most likely have to keep explaining and explaining. I just hope that we can have a decent connector developed for Documentum.

To read more you can go to the FAST Forward Blog, which has many interviews, look at videos at the Microsoft Press Room, and check out the chatter on ffc09-tagged tweets on Twitter. And finally, here is what CMS Watch has to say about it.