Year: 2010

Business Intelligence: Sometimes a problematic term

I often find myself between the world of military language and the quite different language used in the information technology domain. It didn’t take long before I understood that term mapping, or translation, was the only way around it, and that I can often act as a bridge in discussions: understanding that when one side says something, it needs to be translated or explained to make sense in the other domain.

For an intelligence officer, the term Business Intelligence is of course extremely problematic. The CIA has a good article that dives into the importance of defining intelligence, as well as some of the problems in doing so. In short, I think the definition used in the Department of Defense (DoD) Dictionary of Military and Associated Terms illustrates the core components:

The product resulting from the collection, processing, integration, evaluation, analysis, and interpretation of available information concerning foreign nations, hostile or potentially hostile forces or elements, or areas of actual or potential operations. The term is also applied to the activity which results in the product and to the organizations engaged in such activity (p.234).

The important thing is that in order to be intelligence (in my area of work) it both has to have gone through some sort of processing and analysis AND it must only cover things foreign – that is, information of a certain category.

When I first encountered the term business intelligence at the University of Lund in southern Sweden, it represented activities done in a commercial corporation to analyse the market and competitors. It almost sounded like a way to take the methods and procedures from military intelligence and apply them in a corporate environment. Still, it was not at all focused on structured data gathering and statistics/data mining.

So when speaking about Business Intelligence (BI) in a military or governmental context, it can often cause some confusion. From an IT perspective it involves a set of technical products doing Extract-Transform-Load and Data Warehousing, as well as the front-end products used by analysts to query and visualise the data. Here comes a first, more philosophical, issue when seeing this in the light of the definition of intelligence above. As long as the main activity is to gather data and visualise it using Enterprise Reporting or Dashboards directly to the end user, it ends up in a grey area whether or not I would consider it processed. In that use case, Business Intelligence sometimes claims to be more (in terms of analytical ambition) than a person with an intelligence background would expect.

Ok, so just displaying some data is not the same thing as doing in-depth analysis of it, using statistical and data mining technology to find patterns, correlations and trends. One of the major players in the market, SAS Institute, has seen exactly that and has tried to market what they offer as something more than “just” Business Intelligence by renaming it Business Analytics. The idea is to achieve “proactive, predictive, and fact-based decision-making”, where the important word, I believe, is predictive. Business Analytics thus claims not just to visualise historic data but also to make predictions about the future.

An article from BeyeNETWORK also highlights the problematic nature of the term business intelligence: it is often closely tied to data warehousing technology, and, more importantly, only part of an organisation’s information is structured data stored in a data warehouse. Coming from the ECM domain I completely agree, and it says something about the problem of assuming both that BI covers all the data we need to work with and that BI is all we need to support decision-makers. The article also discusses what analysis and analytics really mean. Wikipedia says this about data analysis:

Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making.

The question is then what the difference is between analysis and analytics. The word business appears in both terms because a common application of business intelligence is measuring performance across an organisation, through processes that are being automated and therefore measurable to a larger degree. The BeyeNETWORK article suggests the following definition of business analytics:

“Business analysis is the process of analyzing trusted data with the goal of highlighting useful information, supporting decision making, suggesting solutions to business problems, and improving business processes. A business intelligence environment helps organizations and business users move from manual to automated business analysis. Important results from business analysis include historical, current and predictive metrics, and indicators of business performance. These results are often called analytics.”

Looking at the suite of products covered under the BI umbrella, that approach downplays the fact that these tools and methods have applications beyond process optimisation. In law enforcement, intelligence, pharmaceuticals and other fields there is huge potential to use these technologies not only to understand and optimise internal processes but, more importantly, the world around them that these organisations are trying to understand: seeing patterns and trends in crime rates over time and geography, using data mining and statistics to improve understanding of a conflict area, or making sense of the results of years of scientific experiments. Sure, there are toolsets marketed more along the lines of statistics for use in economics and political science, but those applications could really use the capabilities of a BI platform rather than something run on an individual researcher’s notebook.

In this article from Forbes it seems that IBM is also using business analytics instead of business intelligence, to move from simpler dashboard visualisations towards predictive analytics. This can of course be related to IBM’s acquisition of SPSS, which is focused on exactly that area of work.

From the book by Davenport and Harris (2007)

However, neither the notion of Business Intelligence nor that of Business Analytics says anything about what kind of data is actually being displayed or analysed. From a military intelligence perspective, this means that BI/BA tools and methods are just one of many analytical methods employed on data describing “things foreign”.

In my experience, misunderstandings can come from the other end as well. Consider a military intelligence branch using – here it comes – BI software to analyse incoming reports. From an outsider’s perspective it can seem as if what makes their activity (military) intelligence is that they use some form of BI tool and present graphs, charts and statistical results to the end user. As a result, I have heard over and over again that we should also “conduct intelligence” on, for instance, our own logistics systems to uncover trends, patterns and correlations. That is wrong, because intelligence specialists are skilled both in analytical methods (in this case BI) and in the area or subject they are studying. However, since these tools are called Business Intelligence, the risk of confusion is of course high, simply because of the word intelligence. What such a person means is that BI/BA tools seem useful for analysing logistics data as well as data on “things foreign”. A person analysing logistics should of course be a logistics expert rather than an expert on insurgency activities in failed states.

So let’s say that what we currently know as the BI market evolves even further and really stakes a claim to be predictive – a logical argument at the executive level, where the investment must provide something more than just self-serve dashboards. From a military intelligence perspective that becomes problematic, since not all of those activities need to be predictive. In fact, it can be very dangerous if someone is led to believe that everything can be predicted in contemporary, complex and dynamic conflict environments. The smart intelligence officer instead needs to understand when to use predictive BI/BA and when she or he definitely should not.

So Business Intelligence is a problematic term because:

  • It is a very wide term covering both a set of software products and a set of methods.
  • It is closely related to data warehousing technology.
  • It includes the term intelligence, which suggests doing something more than just showing data.
  • Military intelligence only covers “things foreign”.
  • The move towards expecting prediction (by renaming it Business Analytics) is logical but dangerous in a military domain.
  • BI can still mean open-source analysis of competitors in commercial companies.

I am not a native English speaker, but I do argue that we must be careful with a word as strong as intelligence and use it only when it is really justifiable. Of course it is too late for that now, but it is still worth reflecting on.


EMC & Greenplum: Why it can be important for Documentum and ECM

The Greenplum logotype (image via Wikipedia)

Recently EMC announced it was acquiring the company Greenplum, which many people interpret as EMC putting more emphasis on the software side of the house. Greenplum focuses on data warehousing technology for the very large datasets known as “big data” applications, where the most public examples are Google, Facebook, Twitter and the like. The immediate reaction to this move is of course that it is a sign of market consolidation and a desire to play among the largest players like Oracle/Sun, IBM and HP by being able to offer a more complete hardware/software stack to customers. Oracle/Sun of course has its Exadata machine as an appliance-based model for data warehousing capability. Chuck Hollis comments on the move by highlighting how it is a logical one that fits nicely both with EMC storage technology and, of course, with the virtualisation technology coming out of VMware. To underline the importance, EMC will create a new Data Computing Product Division out of Greenplum. As a side note, I think it is better to keep the old name and the “feeling” around the product, just as Documentum is a better name than the previous Content Management & Archiving Division. After an initial glance, Greenplum seems to be an innovative company that can solve business problems where the established big RDBMS vendors do not seem able to scale enough.

With my obvious focus on Enterprise Content Management, I would like to reflect on how I think, or maybe hope, this move will matter to that area of business. In our project we started looking deeper into data warehousing and business intelligence in January this year. Before that, our focus was on implementing a service-oriented architecture with Documentum as a core component. We already knew that in order to meet our challenges around advanced information management we needed to use AND integrate different kinds of products to solve different business needs: ECM for the unstructured content, Enterprise Search to provide a more advanced search infrastructure, GIS technology to handle maps and all spatial visualisation, and so on. Accept that there is no silver bullet; instead, use the right tool for the right problem and let each vendor do what it does best.

Replicate data but store it differently for different use cases
SOA fanatics have a tendency to want very elegant solutions where everything is a service and every piece of information is requested as needed. That works fine for steady, near real-time solutions where the assumption is that only a small piece of information is needed at any moment. However, it breaks down when you have larger sets of data that are needed for longer-term analysis, something fairly common for intelligence in support of military operations. If each analyst requests all that data over a SOAP interface, it does not scale well, and the specialised tool each analyst uses is not exploited to its full potential. The solution is to accept that the same data sometimes needs to be replicated in the architecture for performance reasons, sometimes as a cache – common in GIS solutions to get responsive maps. Often, though, different use cases also need different storage and information models. A massive audit trail stored in an OLTP system based on a SQL database, like Documentum, will grow big, and accessing it for analysis can be cumbersome; a single analysis can slow the whole system down. We quickly understood the need for a more BI-optimised information model to be able to do massive user-behaviour analytics with acceptable performance. In the end I think it is a usability issue. Hence the need for a data warehouse to offload data from an ECM system like Documentum. In fact, this applies not only to the audit trail part of the database: analysing the sum of all metadata on the actual content objects also makes for excellent content analytics. The audit trail reveals user interaction and behaviour, while the content analytics part gives a helicopter perspective on what kind of data is stored in the platform. Together, the joined information provides quite powerful content/information and social analytics.
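
To make the offloading idea concrete, here is a minimal sketch in Python with SQLite standing in for both databases. The table and column names are invented for illustration, not Documentum’s actual audit-trail schema: raw events are copied out of the transactional store and pre-aggregated into a shape that analytical queries can hit without touching the OLTP side.

```python
import sqlite3

# Illustrative sketch only: "oltp" stands in for the production
# repository database and "dw" for the separate warehouse store.
oltp = sqlite3.connect(":memory:")
oltp.execute("CREATE TABLE audit_trail (user_name, event_name, object_id, time_stamp)")
oltp.executemany("INSERT INTO audit_trail VALUES (?, ?, ?, ?)", [
    ("anna", "dm_getfile", "doc-1", "2010-08-01 09:13"),
    ("anna", "dm_checkin", "doc-2", "2010-08-01 10:40"),
    ("bob",  "dm_getfile", "doc-1", "2010-08-02 08:02"),
])

dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE audit_fact (user_name, event_name, object_id, event_day)")
dw.execute("""CREATE TABLE daily_activity (
                  event_day, user_name, events,
                  PRIMARY KEY (event_day, user_name))""")

# 1. Offload raw events to the warehouse so analysis never
#    touches the OLTP database.
rows = oltp.execute("SELECT user_name, event_name, object_id, "
                    "date(time_stamp) FROM audit_trail").fetchall()
dw.executemany("INSERT INTO audit_fact VALUES (?, ?, ?, ?)", rows)

# 2. Pre-aggregate into the shape the BI front-end actually queries.
dw.execute("""INSERT OR REPLACE INTO daily_activity
              SELECT event_day, user_name, COUNT(*)
              FROM audit_fact GROUP BY event_day, user_name""")

for row in dw.execute("SELECT * FROM daily_activity ORDER BY event_day"):
    print(row)
```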

Add a DW-store to Documentum?
The technology coming from X-Hive has now become both the stand-alone XML database xDB and the Documentum XML Store that sits beside the File Store and the relational database manager. That provides a choice: store information as a document/file in the File Store, as structured information in the SQL database, or as XML documents in the XML Store. Depending on the use case, we can choose the optimal storage together with different ways of accessing it. There are some remarkable performance numbers for running XQueries on XML documents in the XML Store, as presented at EMC World 2010. Without knowing whether it makes sense from an architecture perspective, I think it would be interesting to have a Data Warehouse Store as yet another component of the Documentum platform. To some degree it is already there within the Business Process Management components, where the Business Activity Monitor is in reality a data warehouse for process analytics: analysis is offloaded from the SQL database, and the information is stored in a different way to power the dashboards in Taskspace.

Other potential pieces in the puzzle for EMC
I realise that Greenplum technology is mainly about scalability and big data applications, but to me it would make sense to also use the technology, just like xDB in Documentum, as a data warehousing store for the platform: a store focused on taking care of structured data in a coherent platform together with the unstructured content that is already in there. Of course it would need a good front-end for using the data in the warehouse for visualisation, statistics and data mining. Interestingly, Rob Karel has an interesting take on that in his blog post. During EMC World 2010, EMC announced a partnership with Informatica around Master Data Management (MDM) and Information Lifecycle Management (ILM), which was also a move towards the structured data area. Rob Karel suggests that Informatica could be the next logical acquisition for EMC, although there seem to be more potential buyers for them. Finally, he suggests picking up TIBCO, both to strengthen EMC’s BPM offering and of course to get access to the Spotfire data visualisation, statistics and data mining platform.

We have recently started working with Spotfire to see how we can use their easy-to-use technology to provide visualisations of content and audit trail data in Documentum. So far we are amazed at how powerful yet easy to use it is. In a matter of days we were even able to create a statistics-server-powered visualisation showing the likelihood of pairs of documents being accessed together. Spotfire could then be used to replace Documentum Reporting Services and the BAM solution in Taskspace. Their server components are written in Java but the GUI is based on .NET, which is somewhat of a limitation, but maybe something EMC could live with on the GUI side. The Spotfire Web Player runs fine on Macs with Safari, at least.
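
To illustrate the kind of analysis I mean by “pairs of documents being accessed together” – a rough sketch of the idea only, not how Spotfire or its statistics server actually computes it – one can count how often two documents appear in the same session in the audit trail and normalise by how often each appears at all:

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: (session_id, document_id) pairs extracted
# from an audit trail export.
events = [
    ("s1", "docA"), ("s1", "docB"), ("s1", "docC"),
    ("s2", "docA"), ("s2", "docB"),
    ("s3", "docB"), ("s3", "docC"),
]

sessions = {}
for session, doc in events:
    sessions.setdefault(session, set()).add(doc)

doc_count = Counter(doc for docs in sessions.values() for doc in docs)
pair_count = Counter()
for docs in sessions.values():
    for a, b in combinations(sorted(docs), 2):
        pair_count[(a, b)] += 1

# Jaccard-style co-access score: sessions with both / sessions with either.
for (a, b), both in pair_count.most_common():
    score = both / (doc_count[a] + doc_count[b] - both)
    print(f"{a} + {b}: {score:.2f}")
```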

An opportunity to create great Social Analytics based on ECM
I hope the newly created Information Intelligence Group (IIG) at EMC sees this opportunity and can convince the management at EMC that these synergies exist beyond going after the expanding big data and cloud computing market that is on the rise. In the booming Enterprise 2.0 market, upcomers like Jive Software have added Social Analytics to their offering. Powering Centerstage with real enterprise-class BI is one way of staying ahead of competitors with much less depth in their platforms from an ECM perspective. Less advanced social analytics solutions based on dashboards will probably satisfy the market for a while, but I agree with James Kobielus that there will be a need for analysts in the loop, and these analysts expect more capable BI tools, like Spotfire. It resonates well with our conceptual development, which suggests that a serious approach to advanced information management requires specialists focused on governing and facilitating information in the larger enterprise. It is not something I would leave to the IT department; it belongs on the business side, with the focus on information rather than information technology.


iPhone/iPad and mobile access to ECM

Behold the iPad in All Its Glory (image via Wikipedia)

Inspired by my recent discovery of a Documentum client for iPhone and iPad by Flatiron Solutions, I wanted to do some research into what is going on when it comes to mobile access to Enterprise Content Management systems using iPhone OS. It turns out there are a few solutions out there, but first I would like to dwell a little on the rationale for all of this.

First of all, we are of course going more and more mobile. Sales of laptop computers are increasing at the expense of stationary ones. Wireless high-speed internet is no longer just available as AirPort/WiFi but also as 3G/4G connections using phones and dongles for laptops. Nothing new here. Another recent change is Web 2.0 and its work-related counterpart Enterprise 2.0, which is now gaining a lot of traction among companies and organisations. It is all about capitalising on the Web 2.0 effects, but in an enterprise context: a lower threshold to produce information, and an even lower one to participate with comments and ratings based on relationships to people. All this drives consumption of information even more, as the distance between producer and consumer is shorter than ever before.

Then comes the new smartphone (basically following the introduction of the iPhone), which actually makes sense to use for a number of tasks that were previously possible but not very pleasant. To me, the bigger form factor of the iPad opens even more possibilities where mobile access meets Enterprise 2.0 based on ECM. Not only does the device make sense on the move, it also has really good support for collaboration and sharing on the move.

It seems the open-source community is doing well here. Alfresco is an open-source ECM system created by the founders of Documentum and Interwoven, and there are actually a few solutions for accessing Alfresco on the iPhone. This SlideShare presentation outlines one solution:

iPhone Integration with Alfresco – Open Source ECM

Another is Freshdoc for the iPhone, developed by Zia Consulting. The company also seems to have presented a Fresh Docs for FileNet iPad application at the IBM IOD (Information on Demand) Conference in Rome, Italy, May 19–21. It is open source and can be downloaded from Google Code.
Yet another product that provides iPad access is the open-source Saperion ECM. Open Text Social Media also provides an iPhone app for their platform. Another company that seems to have an iPhone app in the works is Nuxeo.
Cara for iPhone is also available from Generiscorp – an application that uses CMIS to connect to repositories with CMIS support, which includes both Documentum and Alfresco.
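
CMIS is what makes such a generic client possible: one protocol against any compliant repository. A minimal sketch using Apache Chemistry’s cmislib library for Python (the service URL, credentials and query are placeholders) shows the idea:

```python
# Minimal sketch using Apache Chemistry's cmislib (pip install cmislib).
# The service URL and credentials are placeholders; any CMIS-compliant
# repository (Documentum, Alfresco, ...) exposes a similar AtomPub endpoint.
from cmislib import CmisClient

client = CmisClient("http://server/cmis/atom", "user", "password")
repo = client.defaultRepository
print(repo.name)

# CMIS Query Language is deliberately SQL-like.
results = repo.query("SELECT cmis:name FROM cmis:document "
                     "WHERE cmis:name LIKE 'Report%'")
for doc in results:
    print(doc.getName())
```
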
In our application, mobile access is of somewhat less importance, but the iPad changes that to some degree. Even if you perhaps cannot offer mobile over-the-air access, enabling users to have large-screen multi-touch interfaces like the iPad is of course very interesting. From a Documentum perspective, the only thing we have seen in the mobile area from EMC itself is a BlackBerry client for Centerstage (check p. 22 in the PDF) (there is also a BlackBerry client available for IRM). I understand that BlackBerry is popular in the US, but in terms of being visionary, having a nice iPhone OS app is important, I think. As I have said before, there are many similarities between how information is handled on the iPad and how an ECM system like Documentum handles information. It is all about metadata.

In light of the fact that Flatiron’s iPhone app iECM is so far not said to be a product for purchase but rather a proof of concept, I wonder whether EMC or some partner would be the best way to provide a long-term iPhone OS app for Documentum.


EMC World 2010: Next-generation Search: Documentum Search Services

Presented by Aamir Farooq

Verity: largest index 1 M docs

FAST: largest index 200 M docs

Challenging requirements today all require trade-offs. Instead of trying to plug in third-party search engines, EMC chose to build an integrated search engine for content and case management.

Flexible Scalability being promoted.

Tens to Hundreds of Millions of objects per host

Routing of indexing streams to different collections is possible.

Two instances can be up and running in less than 20 min!

Online backup/restore is possible with DSS, instead of offline only as with FAST.

FAST only supported Active/Active HA. DSS offers more options:

Active/Passive

Native security. Replicates ACLs and groups to DSS.

All fulltext queries leverage native security

Efficient deep facet computation within DSS with security enforcement. Security in facets is vital.

Enables effective searches on large result sets (underprivileged users are not allowed to see most hits in the result set).

Without DSS, facets are computed over only the first 150 results pulled into client apps.

100x more with DSS
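
To illustrate why security-enforced facet computation inside the engine matters (a purely illustrative sketch, not how DSS is implemented): facets computed client-side over the first page of hits are drawn from a tiny, possibly unrepresentative sample, while engine-side computation can trim by permissions first and then count over the whole result set.

```python
# Illustrative only: contrast facet counts computed client-side over
# the first 150 hits with engine-side counts over all permitted hits.
from collections import Counter

# (doc_type, groups allowed to read). The first 2000 hits happen to be
# staff-only, so an analyst's client-side facets see almost nothing.
docs = [("memo", {"staff"})] * 2000 \
     + [("report", {"analysts"})] * 500 + [("memo", {"analysts"})] * 300

def client_side_facets(user_groups, first_n=150):
    # Facets over only the first page of results pulled into the client.
    return Counter(t for t, acl in docs[:first_n] if acl & user_groups)

def engine_side_facets(user_groups):
    # Security trimming first, then facets over the whole result set.
    return Counter(t for t, acl in docs if acl & user_groups)

print(client_side_facets({"analysts"}))  # empty and misleading
print(engine_side_facets({"analysts"}))  # report: 500, memo: 300
```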

All metrics for all queries are saved and can be used in analytics. Run reports in the admin UI.

DSS Feature Comparison

DSS supports 150 formats (500 versions)

The only thing lacking now is a thesaurus (coming in v1.2).

Native 64-bit support for Linux and Windows (core DSS is 64-bit).

Virtualisation support on VMware.

Fulltext Roadmap

DSS 1.0 GA is compatible with D6.5 SP2 or later. Integration with CS 1.1 for facets, native security and XQuery.

Documentum FAST is in maintenance mode.

D6.5 SP3, 6.6 and 6.7 will be the last releases that support FAST.

From 2011 DSS will be the search solution for Documentum.

Index Agent Improvements

Guides you through reindexing or simply processing new indexing events.

Failure thresholds. Configure how many error messages you allow.

One Box Search: as you add more terms it does OR instead of AND between the terms.

Wildcards are not allowed OOTB, but that can be changed.

Recommendations for upgrade/migration

  • Commit to Migrate
  • No additional license costs – included in Content Server
  • Identify and Mitigate Risks
  • 6.5 SP2 or later supported
  • No change to DQL – XQuery available.
  • Points out that both xDB and Lucene are very mature projects
  • Plan and analyze your HA and DR requirements

Straight migration. Build indices while FAST is running. Switch from FAST to DSS when indexing is done. Does not require multiple Content Servers.

Formal Benchmarks

  • Over 30 M documents spread over 6 nodes
  • Single node with 17 million documents (over 300 GB index size)
  • Performance: 6 M documents in FAST took two weeks. 30 M with DSS also took two weeks, but with a lot of stops.
  • Around 42% faster ingest for a single node compared to FAST

The idea is to use XProc to do extra processing of the content as it comes into DSS.

Conclusion

This is a very welcome improvement to one of the few weak points in the Documentum platform. We were selected to be part of the beta program, so I would have loved to tell you now how great an improvement it really is; however, we were forced to focus on other things in our SOA project first. Hopefully I will come back in a few weeks or so and tell you how good the beta is. We have an external Enterprise Search solution powered by Apache Solr, and I often get the question whether DSS will make it unnecessary. For the near future I think it will not, because the search experience is also about the GUI. We believe in multiple interfaces targeted at different business needs and roles, and our own Solr GUI has been configured to meet our needs from a browse-and-search perspective. From a Documentum perspective, the only client today that leverages the faceted navigation is Centerstage, which is focused on asynchronous collaboration – a key component in our thinking as well, but for different purposes. Also, even though DSS is based on two mature products (as I experienced at Lucene Eurocon this week), I think the ability to tweak and monitor the search experience will, at least initially, be much better in our external Solr than in the new DSS admin tool, although the latter seems like a great improvement over what the FAST solution offers today.

Another interesting development will be how the xDB inside DSS relates to the “internal” XML Store in terms of integration. Initially they will be two servers, but maybe in the future you can start doing things with them together – especially if next-generation Documentum replaces the RDBMS, as Victor Spivak mentioned as a way forward.

In the end, having a fast search experience in Documentum from now on is so important!

Further reading

Be sure to also read the good summaries from Technology Services Group and Blue Fish Development Group with their takes on DSS.


At the Apache Lucene Eurocon in Prague

Today I am in Prague to attend the Apache Lucene Eurocon conference hosted by Lucid Imagination, the commercial company behind the Lucene/Solr project. There seem to be almost 200 people here attending the conference. I am looking forward to speaking tomorrow, Friday afternoon, about our experiences of using search in general and Solr in particular. The main part of our integration is a connector that lets Solr listen to a queue on our Enterprise Service Bus, which is part of Oracle SOA Suite 10g (the Oracle middleware platform).

More Presentation Support Tools but fewer (PowerPoint) slide shows

In a recent article called “We Have Met the Enemy and He Is PowerPoint”, Elisabeth Bumiller describes a big outcry to stop using PowerPoint because it supposedly makes us more stupid in decision-making. I agree, and can just reiterate a quote from the top US intelligence official in Afghanistan, Maj Gen Michael Flynn, in the report “Fixing Intel: A Blueprint for Making Intelligence Relevant in Afghanistan”:

“The format of intelligence products matters. Commanders who think PowerPoint storyboards and color-coded spreadsheets are adequate for describing the Afghan conflict and its complexities have some soul searching to do.”

These are quite harsh words directed towards the commanders in ISAF and the US component in Afghanistan, but I think he is right. The underlying issue is a desire to simplify things that should not be simplified, combined with a lack of vision when it comes to tool support for higher levels of military command. Basically, the tools supposed to support that kind of planning are either general-purpose tools like Microsoft Office or highly specialised military applications that exist in their own stove-pipes.

Oversimplifications
With PowerPoint comes a method, and that method mainly consists of boiling information down to single bullets. Perfect for fine-tuned marketing messages meant to leave just a few critical words or terms in the heads of the recipients. Not that good for complex reasoning about complex issues like modern conflicts. PowerPoint sets out to convey a message, when we should instead focus on creating situations that improve our understanding.

Static representations
Most PowerPoint presentations are very static in nature. They usually represent a manually crafted snapshot of a given situation, which means they can become outdated very quickly. As time goes on there are more and more static presentations that should be regularly updated but usually never are. Either they disappear into the file shares if the organisation lacks an Enterprise Content Management system, or there is no process for monitoring which presentations need updating – usually because all traceability was lost when they were created. Some companies have implemented dynamic areas in their presentations where, for instance, weekly sales figures are updated when the presentation opens, but that is far from keeping track of the origins of each bullet, diagram and image.

Labour-intensive work
As described in the article, quite a few junior officers spend time collating information and transforming it into presentations. To start with, much can be done to support this kind of “research work”, where users navigate and search for relevant pieces of information. However, after the information has been collated, the next part of the work starts: transforming it into presentations using a template of some kind. Decision-makers usually have an opinion about how they want their presentations set up, so that they recognise the structure of the information from time to time. Add to that the fact that most organisations have a graphical profile to adhere to, which implies common styling and formatting of the content. To me, all this really calls for a more semi-automated way of compiling this information. I am not saying that all content can be templated, far from it, but where it is possible it would save a lot of time – hopefully time that could be spent thinking instead of searching and formatting in PowerPoint.

Lack of interactivity
Another problem with these static representations is that, since they usually take hours to compile, the flexibility in the actual briefing situation is usually low. If the decision-maker suddenly asks to filter the information from another perspective, in say a graph, the unfortunate answer will be: “We will get back to you in half an hour or so.” Not exactly the best conditions for inspiring reflections that put complex problems in a new light. Spotfire has even written a paper around this, called “Minority Reports – How a new form of data visualization promises to do away with the meetings we all know and loathe”. The ability to introduce dynamic, interactive data can give us a whole new environment for meetings, especially if we also have access to large multi-touch walls that invite more than one person to easily manipulate and interact with the data.

Format matters
The general is right: format matters. There is a need for several different formats of the same information. Maj Gen Flynn calls for more products based on writing, which allows people to follow more complex reasoning. That tackles the simplification aspect of the problem. However, there is still a need to do things together in a room, and handing out written reports in 12-point Times New Roman is not the answer. In fact, we really need a revolution in the visualisation of all the information we have decided to store digitally, especially since we are increasingly able to give unstructured information structure with metadata, and to collect data in XML-based data structures. We really need more presentation and visualisation support to be able to work productively with our information. However, we need less PowerPoint, because it is a very time-consuming way to do things that can be done much better with another set of tools. Multi-channel publishing is an established concept in marketing, where the same content is repurposed for print, web, mobile phones and large digital signage screens. We need to think in a similar way about what we use PowerPoint for today. There are even complete toolsets, such as EMC Document Sciences, which – surprise – are based on templates for customised market communications where static content meets dynamic content from databases, in this case built around common design tools such as Adobe InDesign.

The Space Shuttle Columbia experience
One tragic example where the use of PowerPoint was a contributing factor is the loss of Space Shuttle Columbia. The Columbia Accident Investigation Board (CAIB) took the help of Professor Edward Tufte of Yale University to analyse the communication failure that, in the end, left NASA unaware of the seriousness of the foam strike. The board makes the following finding, which is fully in line with General Flynn’s observations:

At many points during its investigation, the Board was surprised to receive similar presentation slides from NASA officials in place of technical reports. The Board views the endemic use of PowerPoint briefing slides instead of technical papers as an illustration of the problematic methods of technical communication at NASA.

Tufte goes on to argue that the low resolution of the PowerPoint slides forces technical terms to be abbreviated, adding ambiguity, and that the usual large font size in headlines also forces shortening. He further notes that the typography and hierarchies of the bullet organisation added confusion, and that in NASA’s case more advanced typographic features to handle mathematics and other technical formatting were needed.

During the Return to Flight work later on, this was further emphasised with the following statement:

“Several members of the Task Group noted, as had CAIB before them, that many of the engineering packages brought before formal control boards were documented only in PowerPoint presentations,”

Unfortunately, this is something I can relate to in my line of business. The main form of documentation is slide shows being emailed around. Since you know they will be emailed around without you being there to talk through them, I believe many people add a lot of extra text, turning the slides into some kind of in-between creatures: neither slide shows nor reports. At least the added words hopefully reduce ambiguity to some degree. I have now started to record my presentations with my own voice to help mitigate this.

The physical resolution is usually too low
To further add to the Columbia findings, I have serious issues with how briefing rooms are usually set up today. They usually have only one projector, with a resolution between 1024×768 and 1280×1024. Many laptops today have widescreen displays, which when used in “clone mode” make the image on a 4:3 projector look really skewed. When projectors do handle widescreen formats, especially at higher resolutions, they are never used because:

  • Users are given computers with sub-par graphics cards that really cannot handle full HD (1920×1080) resolution.
  • Users don’t know anything other than “cloning” their screen: what you see on the laptop is what you see on the projector, which in essence limits the projector’s resolution to whatever the laptop handles. Again, because users have been given cheap computers.
  • The resolution has to be turned down from the highest one “because everything became too small to see”. The reason is that the physical screen is too small, which makes the projector sit too close and the actual pixels too small to see from most of the room.

Combine that with PowerPoint templates with big font sizes, and we have a situation where not a lot of information can be displayed, which I think also adds to the oversimplification problem.

Why the Afghan “Spaghetti image” is actually rather good

The NYT article contains an image from the Afghanistan conflict with hundreds of nodes connected by arrows in different colours, and this is given as an example of the problems with using PowerPoint. To start with, I am not even sure the image was made in PowerPoint, at least not originally. A more likely candidate is Consideo, which is a MODELLING tool, not a PRESENTATION tool. The problem with the image is that when it enters the PowerPoint world it becomes static, with no connections to underlying data. Imagine instead that the image was a dynamic, interactive visualisation, with relationship objects powering the lines. Metadata would allow filtering based on object and relationship attributes. Suddenly the image is just one of almost endless perspectives on the conflict. Imagine if all the nodes were also connected to underlying data such as reports and written analysis. Then it becomes easier, even for an outsider, to start understanding the image. We also need to understand that some visualisations are not intended for the decision-maker: sometimes, in order to understand them, you need to have been in the room most of the time, so you know how the discussions went. So this image is potentially rather good, because it does not contain oversimplified bullets but is instead something you could probably stare at for hours while reflecting. However, it MUST NOT be an image that is manually updated in PowerPoint – it has to be a generated visualisation on top of databases.

Still valid for marketing
The almighty master of presentations, Steve Jobs, who actually uses Apple Keynote instead of PowerPoint, will most likely continue using that format. He delivers a very precise marketing message with slides that contain very little text at all. The rest of us, who are not selling iPads, need to start figuring out a smarter way to do business. Newer versions of ever more complex MS PowerPoint are simply not the answer. It is so general-purpose that it no longer fits anyone – at least if you care about your own time and data quality. It helps to some degree that both Keynote and PowerPoint use XML today, which means the technical ability to use them as just a front-end exists. The real issue has to do with information architecture and usage.

Conclusion
Oh, so how to do this, then? Use Enterprise Content Management systems to manage your content, and move to a concept where content is handled as XML so it can be reused and repurposed while preserving traceability. Have a look at my other blog post around “The Information Continuum” to get an idea of how. Since we store all of our information digitally, there is a need for much more in terms of visualisation and presentation support tools – not less. However, we need to find a way to present lines of reasoning with a drill-down capability that utilises that traceability. Maybe presentations will to some degree become more like renditions, with links back to text, data, graphs, images or whatever. We need to accept that in many cases it is not realistic to boil everything down to a summary; instead we should be able to explore the data ourselves. Now, let us set up our mindset, software and meeting rooms to do just that!

Interesting thoughts around the Information Continuum

In a blog post called “The Information Continuum and the Three Types of Subtly Semi-Structured Information”, Mark Kellogg discusses what we really mean by unstructured, semi-structured and structured information. In my project we have constant discussions around this, and around the whole idea of chunking content down into reusable pieces, which themselves need some structure in order to be just that – reusable. At first we were ecstatic over the metadata capabilities of our Documentum platform, because making our unstructured content semi-structured was in itself a huge improvement. However, it is important to see this as a continuum rather than three fixed positions.

One example is of course the PowerPoint/Keynote/Impress presentation, which is actually not one piece. Mark Kellogg reminded me of the discussions we have had around slides being bits of content in a composite document structure. It is easy to focus on the more traditional text-based editing you see in technical publications and forget that presentations already have that aspect. To be honest, when we first got Documentum Digital Asset Manager (DAM) in 2006 and saw the PowerPoint assembly tool, we became very enthusiastic about content reuse. However, we found the feature a little too hard to use and it never really took off. What we see in Documentum MediaWorkSpace now is a very much revamped version of it, which I look forward to playing around with. I guess the whole thing comes back to the semi-structured aspect of those slides: to facilitate reuse they somehow need additional metadata and tags. Otherwise, the sheer number of slides available will easily become too much if you cannot filter them down based on categories and on who created them.

Last year we decided to take another stab at composite document management, to be able to construct templates referring to both static and dynamic (query-based) pieces of content. We have built ourselves a rather cool dynamic document composition tool on top of our SOA platform, with Documentum in it. It is based on DITA: we use XMetaL Author Enterprise as the authoring tool to construct the templates, the service bus resolves the dynamic queries, and Documentum stores and transforms the large DITA file into a PDF. What we quickly saw was yet another aspect of semi-structured information, since we need a large team to be able to work in parallel to “connect” information into the finished product. Again, there is a need for context, in terms of metadata, around these reusable pieces of content that end up in the finished product based on the template. Since we depend on a lot of information coming in from outside the organisation, we cannot strictly enforce the structure of the content; it will arrive as Word, PDF, text, HTML, PPT and so on. So there is a need to transform content into XML, chunk it up into reusable pieces, and tag it so we can refer to it in the template or use queries to include content with a particular set of tags.
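
To show what “resolving dynamic queries in a template” means in practice, here is a toy sketch. The query attribute convention, the tag-to-chunk lookup and the repository stand-in are all invented for illustration; they are neither standard DITA nor our actual implementation:

```python
# Toy sketch of resolving dynamic query placeholders in a DITA-like map.
import xml.etree.ElementTree as ET

chunks = {  # stand-in for the repository: tag -> matching topic hrefs
    "logistics": ["chunk-017.dita", "chunk-042.dita"],
    "terrain": ["chunk-003.dita"],
}

template = ET.fromstring(
    "<map><title>Daily report</title>"
    "<topicref href='intro.dita'/>"        # static content
    "<topicref query='tag=logistics'/>"    # dynamic content
    "<topicref query='tag=terrain'/></map>")

for ref in list(template.iter("topicref")):
    query = ref.attrib.pop("query", None)
    if query is None:
        continue  # static reference, keep as-is
    tag = query.split("=", 1)[1]
    # Replace the placeholder with one topicref per matching chunk.
    ref.tag = "topicgroup"
    for href in chunks.get(tag, []):
        ET.SubElement(ref, "topicref", href=href)

print(ET.tostring(template, encoding="unicode"))
```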

This of course brings up the whole problem of the editing/authoring client. The very concept of a document is questioned, as it is itself part of this continuum. Collaborative writing in the same document has been offered by CoWord, TextFlow and the recently open-sourced Google tool EtherPad, and will now be part of the next version of Microsoft Office. Google Wave is a bit of a disrupting force here, since it merges the concepts of instant messaging, asynchronous messaging (email) and collaborative document editing. Based on the Google Wave Federation protocol, it is also being implemented in enterprise applications such as Novell Pulse.

So why not just use a wiki then? Well, the layout tools are nowhere near as rich as what you find in word processors and presentation software, and since we depend on being able to handle real documents in these common formats, it becomes a hassle to convert them into wiki format or, even worse, attach them to a wiki page. More importantly, a wiki is asynchronous in nature, and that is probably not as user-friendly as live updates. The XML vendors have also entered this market with tools like XMetaL Reviewer, which leverages the XML infrastructure in a web-based tool that lets users see changes almost in real time and review them collaboratively.

This leads us to the importance of the format we choose as the baseline for both collaborative writing and the chunk-based reusable content handling we would like to leverage. Everybody I talk to is pleased with the new Office XML formats, but says in the next breath that the format is complex and a bit nasty. So do we choose OpenOffice, DITA or something else? What we choose has real impact on the tool end of our solutions, because you probably get the most out of a tool when it handles its native format, or at least one it is certified to support. Since it is all XML, we can always transform back and forth using XSLT or XProc.
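
A trivial example of that last point, using Python’s lxml. The stylesheet is a made-up fragment that maps a DITA-like topic onto XHTML; a real DITA-to-HTML transform is of course far larger:

```python
# Minimal example of the "transform back and forth" idea using lxml
# (pip install lxml). The stylesheet and sample topic are invented.
from lxml import etree

xslt = etree.XML("""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="topic">
    <html><body>
      <h1><xsl:value-of select="title"/></h1>
      <xsl:apply-templates select="body/p"/>
    </body></html>
  </xsl:template>
  <xsl:template match="p">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>""")

topic = etree.XML(
    "<topic><title>Chunk 42</title>"
    "<body><p>Reusable piece of content.</p></body></topic>")

transform = etree.XSLT(xslt)
print(str(transform(topic)))
```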

Ok, we have the toolset and some infrastructure in place for that. Now comes my desire not to stove-pipe this information in some closed system used only to store “collaborative content”. Somehow we need to be able to “commit” those “snapshots” of XML content that to some degree constitute a document. Maybe we want to lock a version down, so we know what has been sent externally, or just to know what we knew at a specific time – very important in the military business. That means it must be integrated into our Enterprise Content Management infrastructure, where it can in fact move along the continuum towards being more unstructured, since it could even be stored as a single binary document file. Somehow we need to keep the traceability, so you know which versions of specific chunks were used and who connected them into the “document”. Again, just choosing something like TextFlow or EtherPad will not provide that integration. MS Office will of course be integrated with SharePoint, but I am afraid that implementation will not support all the traceability and visualisation capabilities I think you need to make the solution complete. Also, XML content actually likes to live in XML databases such as MarkLogic Server and the Documentum XML Store, so that integration is very much needed, more or less out of the box, in order to make it possible to craft a solution.

We will definitely look into the Documentum XML technologies more deeply to see if we can design an integrated solution on top of them. It looks promising, especially since an XProc pipeline for DITA is around the corner.


EMC World 2010: Chiming in with Word of Pie about the future of Documentum

We have a written reaction to Mark Lewis’ keynote at EMC World 2010 in Boston. I too feel and have the passion for Enterprise Content Management, and it is great that Laurence Hart spent so much time and effort talking to people to craft this post. Someone needs to say these things, even if they are not always easy to hear. So I will try not to repeat what he said, but rather provide my perspective, which comes from what I have learned about Information and Knowledge Management over the past years. ECM and Documentum are very critical components in moving that IKM vision from the PowerPoint stage into reality – in our case, an experimentation platform that lets us turn our ideas for improving the “business” of staff work in a large military HQ into something people can try, learn from and be inspired by. Also, this turned out to be a long blog post, which calls for a summary on top:

The Executive Summary (or message to EMC IIG) of this blog post:

  • Good name change, but make sure You live up to your name.
  • A greater degree of agility is very much needed, but do not simplify the platform so much that implementing an ECM strategy becomes impossible.
  • Case Management is not the umbrella term; it is just one of many solutions on top of Documentum xCP.
  • The whole web has gone Social Media and Rich Media. The Enterprise is next. Develop what You have and stay relevant in the 2010s!
  • Be more precise when it comes to the term “collaboration”. There is a whole spectrum to support here.
  • Be more bold and tell people that Documentum offers a unique architectural approach to information management – stop comparing clients.
  • Tell people that enabling Rich Media, Case Management, E2.0 and (Team) Collaboration on one platform is both important and possible.
  • I am repeating myself here: You want to sell storage, right? Make sure Video Management is really good in Documentum!

The name change

Before I start, I just need to reflect on the name change from Content Management and Archiving to Information Intelligence Group (IIG). I agree with Pie… the name had to be changed to make it more relevant in 2010, and a focus on information (as in information management, which is more than storage and ILM) is the right way to go. The intelligence part is of course a bit fun given my own profession, but it does imply doing smart things with information, and that should include everything from building context with Enterprise 2.0 features to advanced content and information analytics. You have the repository to store all of that – now make sure you continue to invest in analytics engines to generate structure, and in visualisation toolkits to make use of all the metadata and audit trails. Maybe do something with TIBCO Spotfire.

Documentum xCP – lowering the threshold and creating a more agile platform

Great. Documentum needs to be easier to deploy, configure and monitor. That is needed to get new customers on board more easily and to let existing ones do smarter things with it in less time. However, it is easy to fall into the trap of simplifying things too much here. To me there is nothing simple about implementing Enterprise Content Management (ECM) as a concept and as a method in an organisation. One major problem with SharePoint and other solutions is that they are way too easy to install, so people are actually fooled into skipping the THINKING part of implementing ECM and believe it is just “next-next-finish”. All ECM systems need to be configured and adapted to fit the business needs of the organisation; without that they will fail. xCP can offer a way to do that vital configuration (preceded by THINKING) a lot more easily, and also more often. We often stress how important it is that the technical configuration follows any changes in Standard Operating Procedures (SOP) as closely as possible. If generals want to change the way they work and the software does not support it, they will move away from using the software. Agility is the key.

In our vision, the data model needs to be much more agile. Value lists need to be updated often – sometimes based on ad hoc folksonomy tagging; monitoring the use of metadata and tags will drive that. Attributes, or even object types, need to be updated more often. Content needs to be ingested quickly, with structure provided later on (think XML Store with new schemas here). xCP is therefore a welcome thing, but make sure it does not compromise the core of what makes Documentum unique today.

The whole Case Management thing

Probably the thing most of us reacted against in the Mark Lewis keynote was the notion that ECM people have in reality just been doing Case Management all along. I recently spent some time reflecting on that in another blog post here, called “Can BPM meet Enterprise 2.0 over Adaptive Case Management?”. There is clearly a continuum between supporting very formal process flows and very ad hoc, knowledge-worker-style work. The two clearly seem different, and while they may well meet over Adaptive Case Management, to me it makes no sense to have that term cover the whole spectrum – even for EMC Marketing 🙂

I immediately noticed that public sector investigative work is often used as an example of Case Management. Case Management as done by law enforcement agencies is fundamentally different from work done by intelligence agencies, because in case-based police investigations there is usually a legal requirement NOT to share information between cases unless authorised by managers. This is not the case (!) for all Case Management applications, but from a cultural perspective it is important that police-style Case Management is not the line of business used as an example of information sharing. The underlying concept is even at odds with any unified enterprise content management strategy where information should be shared. That is why workgroup-oriented tools such as i2 Analyst’s Workstation have become so popular there.

The point here is that it is important not to disable sharing at the architectural level, because, again, what constitutes a good ECM system is that content can be managed in a unified way. Don’t be fooled by requirements to the contrary – use the powerful security model to make sharing possible. Then law enforcement agencies can use it as well. However, there must be more to ECM than Case Management – as Word of Pie suggests, it is just ONE of many solutions on top of the Documentum xCP platform, a platform agile enough to quickly build advanced ECM solutions on top of.

Collaboration vs Sharing and E2.0

So, “collaboration” is used everywhere now, but its real meaning varies a bit. First, there are two collaboration modes:

  • Synchronous (real-time)
  • Asynchronous (non-real-time – “leave info and pick it up later”)

Obviously, neither Documentum nor SharePoint is in the real-time part of the business. For that you need Lotus Sametime, Office Communications Server, Adobe Connect Pro or similar products. However, Google Wave provides a bit of confusion here, since it integrates instant messaging and collaborative document editing/writing.

However, I am a bit bothered by the casual “collaboration tool” label that anything like SharePoint, and for that matter eRoom, is getting. To break this down further, I believe there is a directness factor in collaboration. Team collaboration has a lot of directness: you collaborate on a given task with colleagues. That is not the same as many of the Social Media/Enterprise 2.0 features, which do not have a clear recipient of the thing you are sharing. And sharing is the key word, since you are basically providing a piece of information in case anyone wants or needs it. That is fundamentally different from sending an email to project members or uploading the latest revision to the project’s space. Andrew McAfee has written about this concept and uses a bullseye representing strong and weak ties to illustrate the effect.

My point is that it is important that tools for team collaboration can, from an information architecture standpoint, become part of the weaker, more indirect sharing concept. That is the vehicle for utilising the Enterprise 2.0 effect in a large enterprise. Otherwise we have just created another set of stove-pipes, or bubbles of information restricted to team members. I am not saying that all information should be this transparent, but I will argue that, based on a “responsibility to provide” concept (see the US Intelligence Community Information Sharing Policy), restricting the sharing of information should be the exception – not the norm.

Sure, as Word of Pie points out in his article “CenterStage, the Latest ex-Collaboration Tool from EMC”, there are definitely things missing from the current Centerstage release compared to both SharePoint and EMC’s old tool eRoom. However, as Andrew Goodale points out in the comments, I also think it is a bit unfair, because both eRoom and at least previous versions of SharePoint (which many are still using) actually lack all those important social media features that serve to lower the threshold and increase participation by users. They also provide critical new context around the information objects that was not available before in DAM, Webtop or Taskspace. Centerstage also provides ways to consume them, in terms of activity streams, RSS feeds and faceted search. Remember that Centerstage is the only way to surface those facets from Documentum Search Server today.

So, I am also a bit disappointed that things are missing in Centerstage that should be there, and I really want to stress the importance of putting resources into that development. Those features are critical for all serious implementations of an ECM strategy, and the power of Documentum is that they all sit in the same repository architecture, with a service layer to access them. Maybe partner with Socialcast to provide a best-practice implementation of a more extensive profile page and microblogging. Choose a partner for instant messaging in order to connect the real-time part of collaboration to the platform. And again, use your experience from records management and retention policies to make those real-time collaboration activities saved and managed in the repository.

Be bold enough to say you are a SharePoint alternative – but for the right reasons

I’m not an IT person; I came into this business with a vision of changing the way a military HQ handles information, so I see Enterprise Content Management more as a concept than as a technology platform. However, when I have tried to execute our vision it has become very clear that there is a difference between technology vendors, and I like to think that the difference comes from the internal culture, experience and vision of each company. It is the “why” behind why the platform looks the way it does and has the features it has. So as long as you are not building everything from scratch yourself, it actually matters a lot which company you choose to deliver the platform that will make your ECM vision happen. That means there IS a difference between Documentum and SharePoint in the way the platforms work, and we need to be able to talk about that. However, what I see now is that most people focus on the client side and embrace whatever is a popular collaboration tool. Note that I say tool – not platform. The focus on the client side reduces the requirement to basically a digital space to share some documents in. But the differentiator is not whether Centerstage or SharePoint meets that requirement – both do. The differentiator is whether you have a conceptual vision of how to manage the sum of all information an organisation has, and to what degree those concepts can be implemented in technology. That is where the Documentum platform differs from other vendors’, and why it is different from SharePoint. SharePoint is sometimes a little too easy to get started with, which unfortunately means there is no ECM strategy behind the implementation; and when the organisation has thousands of SharePoint sites (silos) after a year or so, that is when the choice of platform really starts to matter.

This week at EMC World has been a great one as usual, and there is no shortage of brilliant technical skills and feature development in the platform. What bothers me, and some other passionate ECM/Documentum people, is the message coming out of the executive level at IIG. In the end, that is where the strategic resource decisions are made and where the marketing message is constructed. I think there is now a lot more to do at the vision and marketing level than on the platform itself. The hard part seems to be to be proud of what the platform is today, to realise its potential to remain the most capable and advanced on the market, and to use that to stay relevant in the many applications of ECM – not just Case Management.

Rich Media – A lot of content to manage and storage to sell

One of the strong points of Documentum is that it can manage ALL kinds of content in a good way, which of course includes rich media assets such as photos, videos and audio files. Don’t look upon this as some specialised market only needed by traditionally “creative” industries. This is something everybody needs now. All companies (and military units, for that matter) have an abundance of digital still and video cameras, producing a massive amount of content that needs to be managed just like all the rest. There is a need for platform technologies that actually “understand” that content and can extract metadata from it, so the content can be navigated and found easily. It is also important to assist users in repurposing this content, so it can be displayed easily without consuming all the bandwidth, and easily included in presentations and other documents. This is also very relevant from a training and learning perspective, where screencams and recorded presentations have so much potential. It does not have to be a full Learning Management System, but at least an easy way to provide the material. Maybe have a look at your dear friend Cisco and their Show and Share application. Oh, it is marketed as a Social Video System – the connection to Centerstage (and not just MediaWorkSpace) is a bit too obvious. Make sure you can provide Flickr and YouTube for the enterprise real soon. People will love it. Again, on one very capable platform.

MediaWorkSpace is a really cool application now. Even if it does not have all the features of DAM yet (either), it is such a sexy interface on Documentum. The new capabilities for handling presentations and video are just great. Be sure to look more at Apple iPhoto and learn how to leverage (and create) metadata to support management of content based on locations, people and events – a piece of cake on top of a Documentum repository. Right now it is a bit stuck in the Cabinet/Folder hierarchy as the main browsing interface.

Summary

I agree with Word of Pie that there is a lack of vision – an engaging one that we can all buy into and sell back home to our management. In my project we seem to have such a vision, and for us Documentum is a key part of it. I just wish EMC IIG would share it to a greater degree. Judging from the responses back home in Sweden and here at EMC World, people seem to both want and like it (have a look at my EMC World presentation and see what you think). We can do seriously cool and fun stuff that will make management of content so much more efficient, which should be of critical importance for every organisation today. In the military, at least, one thing is for sure: we won’t get more people. We really have to work smarter, and that is what a vision like this provides a roadmap towards.

So be proud of what you do best, EMC IIG, and make sure to deliver INTEGRATED solutions on top of it. For those who care, that will make a world of difference in the long run, and will gather looks of envy from those who did not get it.

With Jamie Pappas in the Blogger’s Lounge at EMC World 2010

The Blogger’s Lounge is a great watering hole: you can get a really good latte, but of course also sit down in nice chairs and sofas, with power outlets in the floor, to blog and tweet about the experiences at EMC World 2010 in Boston. Today I stopped by in the morning to have my photo taken with Jamie Pappas, who is Enterprise 2.0 & Social Media Strategist, Evangelist & Community Manager at EMC. Be sure to visit her blog and follow her on Twitter. My dear Canon EOS 5D managed to capture the nice lighting in the lounge, I think.