Data view

From OIAr Archive 2013

When we model infrastructure, we require a view on data to help us model the right properties of the infrastructure. OIAm uses the following view on data.

Data services

In the simplest view, infrastructure can perform three different actions for its clients, regardless of whether the client is a user, an IT system, or another infrastructure facility.

  • It can transport data: infrastructure accepts data at a certain location and point in time, and delivers that same data at a different location at (almost) the next moment;
  • It can retain data: infrastructure accepts data at a certain location and point in time, and returns that same data at the same location on request at a future point in time;
  • It can process data: infrastructure accepts instructions and/or data at a certain location and point in time, and manipulates the data according to specified sets of rules/algorithms/programs. This processing includes taking decisions, such as whether or not to send a chunk of data on for transport to another location. Note that the infrastructure has obligations both to its users and to the organization, but in case of conflicting requirements, the latter will override the former.

Since transport and retention are relatively straightforward functions, the majority of infrastructure functions fall into the third category.
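The three data services can be illustrated with a small sketch. This is a toy model only, not part of OIAm: the class `Infrastructure`, its methods, and the in-memory dictionaries are hypothetical names chosen for illustration, and locations and keys are simplified to plain strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Infrastructure:
    """Toy model of the three data services (all names are hypothetical)."""
    store: Dict[str, bytes] = field(default_factory=dict)
    delivered: Dict[str, List[bytes]] = field(default_factory=dict)

    def transport(self, data: bytes, destination: str) -> None:
        # Transport: the same data appears at a different location.
        self.delivered.setdefault(destination, []).append(data)

    def retain(self, key: str, data: bytes) -> None:
        # Retain, step 1: accept the data now.
        self.store[key] = data

    def retrieve(self, key: str) -> bytes:
        # Retain, step 2: return the same data later, on request.
        return self.store[key]

    def process(self, data: bytes, rule: Callable[[bytes], bool],
                destination: str) -> bool:
        # Process: apply a rule and decide whether to send the data on.
        if rule(data):
            self.transport(data, destination)
            return True
        return False
```

In this sketch, `process` shows the kind of decision mentioned above: a rule determines whether a chunk of data is passed on for transport or not.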

Data classification

OIAm differentiates between the following four data classes:

  1. In IT hardware, all information is basically constructed of bits, bytes or similar data elements. OIAm calls this “data at the hardware level” or “raw bits”. At this lowest level, infrastructure has no notion of the relation between one data element and the next.
  2. Infrastructure is capable of handling data in coherent chunks that are larger than the minute pieces that are handled at the hardware level. The infrastructure still has no real notion of the content of a chunk of data, other than that it's a specific sequence of bits/bytes, handed over by a client with a certain set of properties. This is called “loosely structured data”.
  3. Furthermore, when information can be represented by a set of data chunks with a specific internal cohesion, then infrastructure can manipulate the data in the set according to its knowledge of the internal cohesion. The prevalent form of such a data set is the database. OIAm calls this type of data “strictly structured data”.
  4. Data can also appear in a special form: when a sequence of data elements must arrive at a location over time, then OIAm calls this sequence a “stream”. The time dependence of a stream implies that it concerns data in transport. It also strongly suggests that a data element that arrives “late” becomes irrelevant and may be discarded. Examples of streams are digital transmissions of voice and video data, but also the connection from a game server to a game client.
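The idea that a late stream element may be discarded can be sketched as follows. This is a minimal illustration under stated assumptions, not an OIAm definition: the type `StreamElement`, its `deadline` and `arrival` fields (both in seconds), and the function `deliver_on_time` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class StreamElement:
    payload: bytes
    deadline: float  # latest useful arrival time, in seconds (assumed unit)
    arrival: float   # actual arrival time, in seconds (assumed unit)


def deliver_on_time(elements: Iterable[StreamElement]) -> List[bytes]:
    """Keep elements that arrived by their deadline; discard the late ones."""
    return [e.payload for e in elements if e.arrival <= e.deadline]
```

For example, a video frame due at 0.04 s that arrives at 0.09 s would be dropped by `deliver_on_time`, while frames arriving on or before their deadline are passed on in their original order.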