11 thoughts on “SharePoint metadata design principles”

  1. This post crystalizes (and very clearly articulates) a line of reasoning that has been circulating in my mind for a while. My experience with clients mirrors Carsten’s, and has led me to pare back the use of metadata.

    Just this week I had the opportunity to hear USAF Lt. Col. David Sanchez speak at SHARE 2012 in Atlanta about ‘weaponized SharePoint’. His metadata involves life & death decision making and protection of content vital to national security. The gist of his argument was that the metadata is crucial to the operation of the system, and that people are totally unreliable.

    His solution is to leverage a lot of ‘out of the box’ policy and workflow functionality, along with concept classification tools that automate the assignment of content types and associated metadata in ways that ensure compliance when combined with workflows.

    This work is complex in his scenario due to the nature of the multiple dimensions of security and audiences that he has to deal with.

    I think that we in the SharePoint community need to think hard about the ideas laid out here and validate (or refute) them, leading to an adjustment in the way we design and build our SharePoint solutions.


  2. I completely agree, Ruven. Carsten does a great job describing the intersection between metadata needs and end-user willingness. In the end, we have to focus on the metadata that will solve our users’ pain. If we can do that, we have a hope of getting them to embrace metadata population. If not, we’re sunk before we even begin.

  3. Hi Carsten…

    As I read this I was thinking about one of my current clients. They are the project and contract management practice department. They are accountable for improving the quality of how projects are managed and the contracts that underpin them. This is the world of policies, procedures, knowledge management, changing prevailing attitudes towards project delivery.

    But they do not run projects themselves, and the organisation has many different project delivery models and contract types, and is geographically dispersed across all of Western Australia. To put it more bluntly, this is essentially a giant community of practice scenario.

    Now in this scenario, when I apply your model, search tends to be an equal (or bigger) bubble than business process. Granted, the vast majority of scenarios are more like the situation you describe here: more transactional and process focused, because ultimately it’s about getting shit done better than before.

    So this led me to (re)thinking about my facets of collaboration model (http://www.cleverworkarounds.com/2011/01/19/the-facets-of-collaboration-part-2enter-the-matrix/). That was an attempt to provide a better framing for collaborative scenarios, and if I were to put a label on the example you described here, I would argue that it was primarily a task-based scenario, incorporating transactional as well as knowledge work. But the scenario I describe is on the trait side of the fence.

    So I wonder how your model would work with the facets model. Imagine “sizing” each bubble according to each quadrant?



  4. Hi Paul,

    I will take a look at your blog post; sounds like there are interesting connections to be made.

    For me, the point was mainly to foreground the collection of metadata to support day-to-day business processes… and ideally, to try and collect _all_ metadata in this bucket, as I see the other buckets as a “hard sell” to users. Of course, how hard of a sell depends on the sophistication of your users. My point was that if we could fit our entire minimal term set into the “magic intersection” we would by definition no longer have to justify metadata to users at all because its value would be self-evident.

    I realize of course that this is an ideal state and no project is ever likely to get there entirely. But that’s what I’d like to try and strive for because it solves several problems with a single approach.


  5. I agree with Ruven that this is a great way to think about metadata. I am firmly in the “just enough to do the job” camp of creating and assigning metadata. I think your three business purposes for metadata are spot on. Whether you are a power user or a consultant, being thoughtful about metadata is going to result in a better outcome. I think you typically get about 15 seconds or less of “metadata patience” when you ask people to add content to SharePoint in a collaboration scenario so you better have a really good reason for every metadata value. In a publishing scenario (for example, HR publishing benefits information on the intranet), you can ask for more time to classify content and, if the metadata choices meet your criteria, you can be a little more flexible. As with pretty much all things in SharePoint, just because you CAN (create a gazillion possible metadata columns), doesn’t mean you SHOULD!

  6. Looking for a freely available term set can be a good starting point. But the point I’m making is that, whether you’re constructing, downloading or buying one, off-the-shelf ontologies are never a match for a specific organization’s needs because those needs are specific to the business objectives, software implementation, user behaviour, etc. Which is why I’m advocating for this “ground-up” model of creating a minimal term set that’s anchored in people’s actual day-to-day work rather than an outside classification.

  7. An excellent description of the problem and some thoughtful and useful comments. I would like to throw a wrench into all that by heading out on a tangent (a “talent” I seem to be genetically predisposed to).

    At what point do we stop asking users to manage their own information (something they are not good at and have no interest in doing) and hire someone qualified to do it for them? (oops, I think my bias is showing…)

    Yes, the business process has to be aimed at, and completed by, the business users. But as you point out, it is a hard (impossible?) sell to get them to give a damn about search or records management. So rather than rely on them to supply that metadata (which is vital to the business even if nobody recognizes it), perhaps someone else has to step in to add the necessary information. Maybe this is what Ruven is suggesting could be done via workflows? Inferring metadata values based on other metadata values would be great, but as a colleague of mine keeps pointing out: “Nobody has invented ‘ESP.exe’ yet.”

    As you can tell, I am no expert and have no solutions to offer, just some comments to “stir the pot”.

  8. Some organizations in fact do this. I once interviewed with a “big four” consulting firm for a position in their international knowledge management team, and they actually have a “knowledge harvesting team” (could never quite figure out if that sounded more like farming or organ harvesting, but neither association made me feel like “wow, that sounds like a fun job!”).

    I can certainly see some validity to the approach (if you want it done right, just do it yourself), but it is expensive and labour-intensive. It’s also not scalable at all. It results in creating another chain of command (the chain of harvesting?) that is required to identify, scrub, nominate, curate, bless and publish. So while it would possibly work, I feel philosophically (and economically) opposed to not letting people manage their own information.

    Instead, I continue to toil away at figuring out how to make better systems and user experiences to encourage people to do it themselves. There are, of course, some indications out there in the world that certain tactics can result in taking ownership for managing your own information: hash tags on Twitter are a good example. So there’s hope.

  9. An interesting article that reinforces and very clearly articulates views that I too had developed in the course of attempting to specify requirements for a new legal document management system.

    Users like folders; they understand folders; and they don’t like being required to enter metadata that they, themselves, don’t perceive to be of value. Somewhere I found a quote along the lines that, to willingly enter metadata (even tags), users have to accept a social contract based on reciprocity. Absent obvious reciprocity, the social contract will break unless it is punitively enforced by management; and that never happens because management is more focussed on productivity today than tomorrow. This may even be a correct view; there is a very good chance in legal document management that the great majority of documents will never be retrieved, so the cost in time and aggravation of correctly tagging every document may not in fact be justified. Users get shot for not doing today’s work; they rarely get shot for making somebody else’s future work more difficult. This means that a reasonably coherent folder structure in which each document is initially filed in exactly one place is actually the optimal (lowest cost) solution for the majority of users and the majority of documents.

    The question seems, therefore, to be how best to exploit the short term (day-by-day) utility of a simple folder tree without losing the long term (yearly or longer) flexibility of a metadata-driven system. The answer I propose is that every folder should be a “smart” folder, populated by querying tags/labels/categories (pick your term). Filing a document in that folder should assign the folder’s and all its parent folders’ labels to the document. Moving a document from one folder to another should delete and reassign the document’s labels; copying a document should merge the two sets of labels. All other metadata should follow your principle #2.

    This doesn’t, of course, obviate the need to create a simple ontology of categories and to manage additions to that ontology as business needs dictate (e.g. new projects, new clients); and clearly the queries embodied in the smart folder must be restricted to those that can be expressed solely in terms of AND conjunctions. NOT and OR don’t give unique values for each category of label to be applied to the document. However, it seems (at least to me) that this is a reasonable compromise; certainly it’s no worse than a typical arrangement of folders on a shared drive. Because all folders would be “smart” folders, it has the distinct advantage that folder hierarchies can be readily adjusted to meet differing business requirements – do I want to search first by client or by project or by work type or what – and can be tuned to produce rapidly navigable structures. These are (IMHO) the principal benefits of label-driven classification.

    The question I have, not being a SharePoint programmer, is: can it be done?
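
The smart-folder rules proposed in comment 9 can at least be sketched in plain Python, independent of SharePoint. This is a minimal illustration under stated assumptions, not a SharePoint implementation: the `Folder` and `Document` classes, the `category:value` label format, and every function name below are hypothetical. It shows filing, moving and copying as label operations, treats a smart folder as an AND query over labels, and derives a folder path from whichever category order is wanted.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical model: a label is a "category:value" string, e.g. "client:Acme".


@dataclass
class Folder:
    label: str
    parent: Optional["Folder"] = None

    def labels(self) -> set:
        """Return this folder's label plus the labels of all parent folders."""
        found = set()
        node = self
        while node is not None:
            found.add(node.label)
            node = node.parent
        return found


@dataclass
class Document:
    title: str
    labels: set = field(default_factory=set)


def file_document(doc, folder):
    """Filing assigns the folder's and all parent folders' labels."""
    doc.labels |= folder.labels()


def move_document(doc, source, target):
    """Moving deletes the source folder's labels and assigns the target's."""
    doc.labels = (doc.labels - source.labels()) | target.labels()


def copy_document(doc, target):
    """Copying merges the original's labels with the target folder's labels."""
    return Document(doc.title, doc.labels | target.labels())


def appears_in(doc, folder):
    """AND-only query: the document must carry every label on the folder path."""
    return folder.labels() <= doc.labels


def folder_path(doc, category_order):
    """Build a path from a chosen category order.

    Assumes the document has at most one value per category.
    """
    by_category = dict(label.split(":", 1) for label in doc.labels)
    return "/".join(by_category[c] for c in category_order if c in by_category)


acme = Folder("client:Acme")
merger = Folder("project:Merger", parent=acme)

contract = Document("contract.docx")
file_document(contract, merger)
print(sorted(contract.labels))                       # ['client:Acme', 'project:Merger']
print(folder_path(contract, ["project", "client"]))  # Merger/Acme
```

Because the hierarchy is derived from labels rather than stored, switching from client-first to project-first navigation only changes the `category_order` argument. Note that a copied document can end up with two values for one category (for example two clients), in which case it simply appears in both smart folders and `folder_path` no longer applies.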
