How Does It Feel?
Think about data you interact with on a daily basis, perhaps as part of your job or a hobby. How do you feel when you're using it? Do you get tense at the thought of having to look something up? Do you have actual physical sensations of fear, of anxiety when it comes to understanding and using your own data? It doesn't have to be that way, and it shouldn't be that way. Your data should be beautiful, and using it should be a satisfying and familiar experience.
Find Your Fluency
Things like games, sports, crafting and cooking give us the opportunity to do well and develop skill at something with a scope we understand. There is a special kind of experience in the way it feels when we know what we're doing, when we're making a trusted recipe or playing our favorite video game or sport. It’s a feeling of fluency and capability that we get when we know the rules of the game, what's going on around us and what we can do about it.
You have your data in the first place because it's important to you. You wouldn't be keeping track of it if it were noise or nonsense, so it only makes sense for you to want to feel that fluency and familiarity, to enjoy interacting with the world of information.
Our relationship with data, both individually and as a society, has been haphazard from the beginning. Ad hoc and improvised solutions have been used to tie together new technologies, but years of examples and experience about what we should and shouldn't do have now led to new understanding and new technologies that make fluency, even mastery, possible and practical.
The Ground Rules
There are important principles of responsibility and accountability we must navigate, but rather than hinder us, they should help point us in a clear direction.
We are responsible for the data we collect and use. It is not acceptable for a business to have data it found in a bag on the side of the road. Once upon a time, it was hard to keep track of data and its comings and goings. That time has passed. We are now accountable for what we know and how we know it. To whatever extent we're able to keep track of that, people and organizations have a fundamental right to some say in what is done with what is known about them (see the “right to be forgotten”), and if an enterprise can't demonstrate that its data was obtained in a legal or ethical way, it is taking on liability by using it.
Individuals will have ways to demand removal of their personal information, as will governments and other regulatory agencies, and if “toxic” data is inextricably woven into your data set, then you have to throw out the baby and the bathwater. But if you've kept track of where the data came from and when you got it, you'll be able to unspool the effects of one toxic record or a million records based on complex and dynamic selection criteria. There's a whole realm of ability that comes when you keep track of how you know what you know, where the data came from, and when you got it.
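As a rough illustration of what keeping track of where data came from and when you got it can look like, here is a minimal Python sketch. Everything in it is hypothetical: the Record fields, the source labels, the dates, and the remove_where helper are made up for the example and are not taken from any particular tool. The point is only that once each record carries its source and acquisition date, removing a problematic subset becomes a simple selection.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    value: str       # the piece of data itself
    source: str      # where the data came from (hypothetical labels)
    acquired: date   # when we got it


def remove_where(records, is_toxic):
    """Split records into (kept, removed) using a selection predicate."""
    kept = [r for r in records if not is_toxic(r)]
    removed = [r for r in records if is_toxic(r)]
    return kept, removed


# Made-up sample data, for illustration only.
records = [
    Record("alice@example.com", "newsletter_signup", date(2023, 3, 1)),
    Record("bob@example.com", "purchased_list", date(2022, 11, 15)),
    Record("carol@example.com", "purchased_list", date(2024, 1, 9)),
]

# Selection criteria can combine source and time: here, everything that
# came from the purchased list before the start of 2023.
kept, removed = remove_where(
    records,
    lambda r: r.source == "purchased_list" and r.acquired < date(2023, 1, 1),
)
```

The same predicate could just as easily select one record or a million. Without the source and acquired fields, there would be nothing to select on.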
Beginnings are Hard
One of the consequences of this early stage of data development is that consultants, vendors and companies build data structures in different ways. For both practical and commercial reasons, your data is often built into something closed, something proprietary, even something you have to pay a license fee to keep access to. This leads to some dire consequences when companies need to change and evolve.
The things that companies should have the most intimate relationship with and control over become something they have to pay to use, to ask new questions about, or to build into new capabilities. The most important benefits of well-organized data, flexibility and vision, are discouraged by the price tag of experimentation and implementation. Trying and building new things costs a lot of money if you're not the one doing it, and you will likely have to pay a fortune to move your data someplace better, cheaper, or faster, and then pay again to build and start all over. All these things discourage innovation, make organizations harder to maneuver, and can make evolution and improvement prohibitively difficult and expensive.
Principled Data Management
If all this has you feeling lost and wishing for a better way forward, there are some principles that can help you keep your wits about you. When taken together, they (and a few others) go a long way toward describing a new way.
Own your data. Before the word ‘data’ triggers an anxious sense of disorienting noise, think about what it is: information about things that matter to you, things you have found and done and learned. You should be able to access it, read it and interpret it for yourself, because it's yours. It's you.
An application that collects everything for you in an inscrutable and inaccessible array of files and folders is not giving you ownership of your data. Your data should exist in clear and comprehensible structures, described by language, related across many dimensions in ways that make sense to you.
Durable knowledge. The difficult part of building the knowledge of your enterprise into something beautiful is in the questions you have to ask and answer about what things are and what they mean. And the fruits of that conversation are some of the most valuable things your enterprise has. Every future project will be made easier by answering them, so it’s important to own that knowledge for yourself in a way that remains useful, that is used and improved, and that you can bring with you as technology inevitably changes. When it does, you will already have a head start.
The same principle applies to the logic and processes that drive your organization. If you need something simple, you can build it in a day. If you need something complicated, you can build it in a couple of days, because you already know what you know, how to measure it, and what you can do with it. Building a new tool or capability doesn't change any of that, and doesn't require you to redefine it, because you've already had that conversation. The process becomes a quick back and forth, rather than a prolonged and expensive negotiation.
These fundamental questions and answers illustrate and create durable knowledge about your enterprise and your organization. And these structures, these documents and data sets that you're building, will be useful to and usable by applications and developers for whatever your needs might be, while remaining yours. A new partner or a new vendor can build almost exactly what you need on the first try, without dozens of hours of discovery and forensics, and then hundreds of hours of discussion about names and concepts and ontologies. Once you have built this knowledge, it remains, and it is yours.
Legibility. The data you have, the data you own, that is yours, should be in a format that is easily read by both human and machine intelligences. In the past, we've been hindered by the need for strict standards in order to make our information comprehensible to machines. That is no longer the limiting factor: nearly any sensible and consistent organization scheme can now be interpreted by machine intelligences and intelligent systems.
In the past, we've sacrificed human sensibility in order to make our information usable and programmable. Now, the machine learning side of the balance is so capable and adaptable that we can pay more attention to maintaining the human sensibility of our knowledge graphs and data structures. Intelligent systems can help us find places where we might have misunderstandings or missed connections, and can bring on board new data sources using schemes that are clear and easy to read and maintain.
These sensible, legible structures also make it easy for us to integrate with any new tool or methodology we might come across. Doing things with data involves being able to say to something external, this is the information I want you to act on. This is what I want you to know, so that you can do things for me with it. Imagine if that were as easy as going to the spice rack or the tackle box, or the golf bag. There's still a great deal of nuance and flexibility and creativity. But you know what your tools are. You know how they perform, you know what their shortcomings might be. And often those things you know correspond directly to you and your experience, and might be conditional on multiple different factors and even tools.
The graph. Those questions and answers, and the data connections they enable, define the terms of a model, a “graph” that connects the things you know and makes it possible to find them simply by asking. It weaves together what you know about your enterprise into a whole, into a holograph. In this graph, things exist as you understand them, with properties that are meaningful to you. That technicolor mass of raw data is still there, but the graph raises the level of intelligence by applying concepts you know, honoring the connections you see and suggesting new ones.
Everything you know is connected by you. Your data may come from hundreds of different places, but what unifies it is you. So this holograph brings together the things you know around you and your understanding of them, and the way you think about them and talk about them and use them. It is more than just a map, though. It is built from your data, even as it translates raw data into information and knowledge. This graph will be the defining structure of the next stage of data development, by making your experience of data not disorienting, but centered, not confusing, but fluent.
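For readers who like to see an idea in miniature, here is one very small way to picture such a graph in Python. This is only a sketch under simple assumptions: facts are stored as (subject, relation, object) triples, and the names, relations, and the ask helper are all invented for the example. Real graph tools offer far more, but the shape of the idea is the same: things you know, connected in terms you understand, found by asking.

```python
# Each fact is a (subject, relation, object) triple.
# All names and relations here are made up for illustration.
facts = [
    ("Order 1001", "placed_by", "Dana"),
    ("Order 1001", "contains", "Walnut desk"),
    ("Walnut desk", "built_in", "Wood shop"),
    ("Dana", "located_in", "Portland"),
]


def ask(facts, subject=None, relation=None, obj=None):
    """Return every fact matching the parts of the question that are filled in."""
    return [
        (s, r, o)
        for (s, r, o) in facts
        if subject in (None, s) and relation in (None, r) and obj in (None, o)
    ]


# "What do we know about Order 1001?"
about_order = ask(facts, subject="Order 1001")

# "Who placed which orders?"
who_placed = ask(facts, relation="placed_by")
```

The facts may have arrived from different systems, but once they share your vocabulary, a question about an order and a question about a person are asked the same way.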
How It Should Be
Working with your data should feel like going to the golf bag or the tackle box, working in the kitchen or the wood shop. It should be something you can care about and know you can do well. The only impediment to having that kind of experience with your enterprise data is the decision to do it. There are already plenty of tools to help, but the essential thing to remember is that you won't be beholden to any of them.
This new thing you're creating, the graph, is yours. It's what you know and how you know it. If you put it into play with the best tool for the job this week, and six months later a new best tool takes the stage and dazzles, moving on to the next will be like changing horses, not like building a new car.
Go back and take a moment to remember how you feel when interacting with your data. Now think about how it feels when you are doing something you love and are comfortable with. There is beauty and joy in that experience, like a graceful brush stroke or a carved line that points toward something new. Imagine if you could feel that way about your data… because you can.