Semantic MediaWiki and related extensions
datamodel

A "DataItem" (https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/docs/architecture/datamodel.dataitem.md) represents the system's perspective on data: it is the form in which data is exchanged with the database so that it can be managed, stored, and queried.

The DataItem class and its subclasses are the basic building blocks of all Semantic MediaWiki data elements. Their purpose is to provide a unified interface for all "semantic entities" that SMW deals with, e.g., numbers, dates, geo coordinates, wiki pages, and properties. It might be surprising that not only values but also subjects and properties are represented this way, and that each also has a user-facing "`DataValue`" class (https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/docs/architecture/datamodel.datavalue.md). This makes sense since wiki pages can be both subjects and values, and since properties have many similarities with wiki pages (in particular the associated articles).

Characteristics

Objects of class "DataItem" (https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/docs/architecture/datamodel.dataitem.md) represent a very simple piece of data. A dataitem is similar to a primitive type in PHP (e.g. a PHP string or number): its identity is determined by its contents and nothing else. Dataitems should thus be thought of as "primitive values" that are merely a bit more elaborate than the primitive types in PHP. Their main characteristics include immutability and a fixed set of types, both discussed below.

Being immutable is essential for dataitems to behave like simple values. It imposes a restriction on programmers, but it also simplifies programming a lot since one does not have to be concerned about dataitems being changed by code that happens to have a reference to them.
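The two properties described above, identity by contents and immutability, can be illustrated with a minimal sketch. It is written in Python rather than PHP, and `NumberItem` is a hypothetical stand-in, not an SMW class.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class NumberItem:
    """Hypothetical stand-in for a numeric dataitem; not an SMW class."""
    value: float


a = NumberItem(3.5)
b = NumberItem(3.5)

# Identity is determined by contents: two separately created items
# compare equal and produce the same hash.
same_contents = (a == b) and (hash(a) == hash(b))

# Attempting to change an item after creation raises an error,
# so code holding a reference cannot alter it behind our back.
try:
    a.value = 4.0
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True
```

Because such objects never change, they can be shared freely between callers and used as dictionary keys without defensive copying.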

DataItem types

The available kinds of dataitems correspond to subclasses of "DataItem" (https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/docs/architecture/datamodel.dataitem.md). For convenience, each kind of dataitem is also associated with a PHP constant called its "DIType". For example, instead of using a nested if-then-else statement with many instanceof checks, one can use a switch over this DIType to handle the different cases. The available dataitems cover, among others, numbers, dates, geo coordinates, wiki pages, properties, and containers.
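The idea of dispatching on a type constant instead of chaining class checks can be sketched as follows. This is a Python illustration under stated assumptions: the `DIType` members, the `Item` class, and the handler table are all hypothetical, and a dictionary lookup stands in for the PHP switch statement.

```python
from dataclasses import dataclass
from enum import Enum


class DIType(Enum):
    """Hypothetical type constants; SMW defines its own in PHP."""
    NUMBER = "number"
    BLOB = "blob"
    BOOLEAN = "boolean"


@dataclass(frozen=True)
class Item:
    di_type: DIType
    content: object


# One handler per kind of dataitem, selected by the type constant
# rather than by a chain of isinstance checks.
HANDLERS = {
    DIType.NUMBER: lambda item: f"number {item.content}",
    DIType.BLOB: lambda item: f"text of length {len(item.content)}",
    DIType.BOOLEAN: lambda item: "yes" if item.content else "no",
}


def describe(item: Item) -> str:
    handler = HANDLERS.get(item.di_type)
    if handler is None:
        raise ValueError(f"unsupported dataitem type: {item.di_type}")
    return handler(item)
```

Adding support for another kind of dataitem then means adding one entry to the table, without touching the existing branches.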

Type restriction

The restriction to these types of dataitem may at first look like a major limitation, since it means that SMW can only represent limited forms of data. For example, there is no dataitem for storing the structure of chemical formulae. Does this mean that SMW can never handle such data? No, because the existing dataitems can keep all required information (for example, by representing chemical formulae as strings). The task of interpreting this basic data as a chemical formula is left to higher levels that deal with user input and output using a "DataValue" (https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/docs/architecture/datamodel.datavalue.md).
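As a rough illustration of this division of labour, the sketch below stores a formula as plain text and leaves the interpretation to a separate, higher-level function. It is written in Python; `BlobItem` and `element_counts` are hypothetical names, and the parser handles only flat formulae without brackets or charges.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobItem:
    """Hypothetical string dataitem: stores text, knows nothing about chemistry."""
    text: str


def element_counts(item: BlobItem) -> dict:
    """Higher-level interpretation of the stored string as a simple formula."""
    counts = {}
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", item.text):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


glucose = BlobItem("C6H12O6")
water = BlobItem("H2O")
```

The storage layer only ever sees a string, so a new domain-specific type needs no new kind of dataitem.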

Container type

There is one kind of dataitem, the DIContainer, that represents "values" that consist of many facts (subject-property-value triples); almost all complex forms of data that SMW does not have a dataitem for could be accurately represented in this format. This type uses "SemanticData" (https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/docs/architecture/datamodel.semanticdata.md) as its object representation.
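A container value of this kind can be pictured as a small bundle of facts about one subject. The following Python sketch is only an illustration: `ContainerItem`, its fields, and the example address are hypothetical and do not mirror the actual DIContainer or SemanticData classes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ContainerItem:
    """Hypothetical container: one subject plus its (property, value) facts."""
    subject: str
    facts: Tuple[Tuple[str, object], ...]

    def triples(self):
        # Expand the stored facts into subject-property-value triples.
        return [(self.subject, prop, value) for prop, value in self.facts]

    def values_of(self, prop: str) -> list:
        return [value for name, value in self.facts if name == prop]


address = ContainerItem(
    subject="Address#1",
    facts=(
        ("Street", "Main Street 1"),
        ("City", "Springfield"),
        ("Postal code", "12345"),
    ),
)
```

The whole bundle behaves as a single value, so it can be attached to a property in the same way as a number or a string.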

Technical notes

Creating dataitems is very easy: just call the constructor of the dataitem with the required values. Note that dataitems are strict about data quality: they are not meant to show the error tolerance of the SMW user interface. For a programmer, it is more useful to see a clear error than to have SMW use some "repaired" or partly "guessed" value when a problem occurs. When trying to create a dataitem from illegal data (e.g. trying to make a wiki page for an empty page title), an exception is thrown. Dataitems usually implement only basic data validation, to avoid complex computations. If strict validation of, say, a URI string is needed, you have to implement your own methods for it.
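The strict-constructor behaviour described above might look like the following Python sketch. `WikiPageItem`, `DataItemError`, and `try_create` are hypothetical names chosen for illustration; SMW's own classes and exception types are written in PHP and are not reproduced here.

```python
from dataclasses import dataclass


class DataItemError(ValueError):
    """Hypothetical exception raised for illegal dataitem contents."""


@dataclass(frozen=True)
class WikiPageItem:
    """Hypothetical wiki page dataitem with basic validation only."""
    title: str
    namespace: int = 0

    def __post_init__(self):
        # Reject obviously illegal input instead of repairing or guessing.
        if not self.title.strip():
            raise DataItemError("page title must not be empty")


def try_create(title: str):
    """Return the new item, or the error message if creation fails."""
    try:
        return WikiPageItem(title)
    except DataItemError as error:
        return str(error)
```

Callers that accept user input are expected to catch the error and report it, rather than rely on the dataitem to fix the input.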

Every "DataItem" (https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/docs/architecture/datamodel.dataitem.md) implements a standard interface that allows useful operations like serialization and deserialization (the latter being a second way to create dataitems, from serialized strings). Dataitems can also generate a string hash code so that their contents can be compared efficiently. Each dataitem also implements basic get methods to access its data, and sometimes other helper methods that are useful for the given kind of data.
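A minimal sketch of such an interface, in Python, is given below. `GeoItem` and its method names are hypothetical, and the hash is assumed here to be a SHA-1 digest of the serialization; SMW may derive its hash strings differently.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoItem:
    """Hypothetical geo coordinate dataitem; not an SMW class."""
    latitude: float
    longitude: float

    def serialize(self) -> str:
        # repr() of a float round-trips exactly through float().
        return f"{self.latitude!r},{self.longitude!r}"

    @classmethod
    def deserialize(cls, text: str) -> "GeoItem":
        latitude, longitude = text.split(",")
        return cls(float(latitude), float(longitude))

    def hash_string(self) -> str:
        # Assumption: the hash is derived from the serialized form.
        return hashlib.sha1(self.serialize().encode("utf-8")).hexdigest()


berlin = GeoItem(52.5, 13.4)
restored = GeoItem.deserialize(berlin.serialize())
```

Comparing two hash strings is then enough to decide whether two items carry the same contents.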

The important thing is to keep dataitems lean: they are simple data containers, and complex parsing or formatting functions are implemented elsewhere.

