Monday, October 4, 2010

A Crisis Information Repository in the cloud

Many years ago I was a program manager for a technology called Microsoft Repository. It was my first introduction to the world of metadata. I saw how important it was for companies to create repositories of the enterprise data available to them, and how key those repositories were to enhancing business operations through technologies such as data warehouses and business intelligence.

When working in a crisis, however, we seem to create multiple repositories of crisis information on multiple platforms. What makes it even worse is that each organization creates its own repository for the data it gathers. Some of these repositories are formal and collect all the different metadata about the crisis data, but others are simply file shares where data files get copied.

Attempts have been made in the past to create some repositories. This is especially true in the GIS world, where solutions have been created to store the various GIS products and collect the metadata about each product. But often these solutions have been run by individual organizations, or have been restricted to showing what data is available without providing access to the data itself.

The other thing we really want to do is collect as much as possible of the baseline data we need before a disaster strikes.

The case of Haiti also teaches us that these datasets cannot sit only in the disaster-affected countries themselves, because they may either get destroyed during the disaster or be hard to get to in the chaos that follows it.

I would like to see us move towards a crisis information repository that lives in the cloud. The cost of storage in large-scale data centers has gone down drastically, and that gives us an opportunity to achieve economies of scale. By establishing a single collaborative Crisis Information Repository that everyone can contribute to and retrieve data from, we can simplify the life of crisis information managers in times of disaster by giving them a single place to go. Just as data.gov is a single source of open, available government data in the US, we need a single place everyone can go to retrieve data. And when they create new data, they can contribute it to that same repository.

I don’t know what it takes to make this kind of effort a reality, but mainly I think it needs a shift in mindset within the organizations, towards wanting to share and collaborate in this way.

If you feel strongly about this, let's work together and figure out a way to make this a reality.

10 comments:

  1. Generally a very good idea Gisli. Beyond the organizational mindset challenge, we will need to:

    1) Figure out how to allow people "on-the-ground" to work in the system when they have a very poor internet connection (in Haiti, OCHA had a terrible connection at the beginning and many organizations had little to no connection at all)

    2) Figure out how to get information out to the affected - remember, it is very unlikely that they will have an internet connection, so they will not be able to 'access the cloud'.

    Andrej

  2. 1) Yes, there needs to be an on-the-ground approach to that cloud. What I like is the model we were proposing at MS for OneResponse: a quickly deployable mini-"datacenter"/"cloud" that could be moved into your local network on the ground and would then take care of the synchronization with the rest of the world.

    2) Give NGOs on the ground and the government itself the ability to tap into the "cloud in a box" described in #1 through internet cafes, etc. And of course have a million memory sticks to give away :-)

  3. But I think it is not a technology problem, but rather, as you put it, an organizational mindset problem...

  4. The problem that we wrestled with when discussing this 6-7 years ago is that there's a sovereignty issue. Governments and national organisations in general are not always comfortable with data being stored a) by a third party and b) outside the country. Now we might feel that this view has been overtaken by technology, but it's neither possible nor desirable to discount it if we want to work with those organisations. Data.gov is of course owned by the US government.

    There are also more basic problems of connectivity, and even more basic problems of interoperable formats - why should an agency change its spreadsheet fields (and therefore its survey methodology) just to upload it to a collective site? Who will use such a site, and how, since most organisations struggle to turn their internal data into meaningful decisions, let alone data from other organisations? You have to look beyond "collaboration" as something good in itself and ask: what is it good for in this particular context, particularly given the transaction costs that collaboration implies for any organisation?

  5. Good points, Paul. But we should not give up before we have even made the attempt. Of course, for anything like this to succeed you need buy-in and support from the donor community, so that they push the actors to share their data.

  6. There will always be a shortage of resources in the humanitarian sector, and every initiative has a cost in terms of resources. We cannot pursue every initiative, and therefore before we start any initiative we need to ask a) how much it will cost (not just in terms of money), b) whether that cost is justified in terms of what the initiative will deliver to those paying the cost, and c) whether the initiative stands a reasonable chance of success in the first place.

    Depending on the answers to those questions, it may be entirely sensible to give up particular initiatives before we waste resources in making the attempt. We've got the space here on your blog, so it may be a good place to start thrashing out the answers to those questions.

  7. Good points again Paul and I am glad you are not killing my ideas right away :-)

    Happy to use this as a forum for such a discussion.

  8. Perhaps the UN is a good place for this?
    A few years ago I tried to propose a situational awareness tool that was fed by all nations. The advantage of full disclosure and participation would far outweigh any advantage from keeping data proprietary. I thought it should be three-dimensional in order to produce the most profound experience and ensure uptake, besides the fact that it creates a much better understanding of a situation.
    The drawback of a UN repository is getting data access out. It is an institution and has organizational limitations not experienced by things like OpenStreetMap.
    I agree that organizational mindset is the barrier, but at the same time, looking at the sheer volume of peripheral and ad-hoc groups and unrelated virtual volunteers, surely the mandate exists for open, free and collected data.
    In terms of delivery, yes, it is true that one of the most vulnerable networks is the internet... however, if this repository existed, getting a quick dump onto hardened storage drives would be basically trivial, and the data could be flown in on the same pallet as water and antibiotics.
    I think we have reached the point where data has reached equal importance because of the short- and long-term on-the-ground advantages gained from utilizing appropriate technology.

  9. Quite right, Gisli, that data access 'on the day' in an emergency is often problematic, e.g. in Haiti, where the national mapping agency was devastated. (Thank God for the OpenStreetMap mappers!)

    We (MapAction) have just been doing some work with OCHA in Nepal to secure access to geodata in a sudden-onset emergency. OCHA is doing a great job, in Asia Pacific at least, on data preparedness.

    BUT we're left thinking that the long term solution must be to help governments to be the custodians of their own comprehensive data sets for all aspects of DRR. Not sure if a 'global' project will work towards or against this. As Paul says, it needs plenty of thought.

    Cheers, Nigel.

  10. When I was working as IMO in Multan, one of the OCHA field offices in Pakistan, during the Monsoon Floods of 2010, the WASH cluster IMO requested a repository space on the OCHA Pakistan website (pakresponse.info) where every cluster could upload all their common datasets, assessments, GIS-related data (shape files), contact lists, data on damaged facilities etc.

    The idea of this repository was not to share such data with the humanitarian community but to share it among the cluster IMOs, to strengthen data/information coordination among the clusters and avoid sending documents back and forth by email.

    We really liked this idea of having a central repository. Whoever (cluster IM) uploads a document also adds the document metadata and a contact person's details for follow-up. This way we avoid digging through emails and contacting each other every time for the same document, and can instead access it online whenever it is needed.

    This functionality is now available on the pakresponse.info website, and I hope we can have something similar in coming emergencies, especially so that the clusters and other partner organizations involved in information management have access to such data. But the internet connection is always the main challenge, as we know.
