Schema for a generic data distribution record

This schema is centered on the Distribution class for describing concrete data distributions, such as an individual file, an archive of files, or a directory of files.

The schema builds on the elements and principles of the Thing and Provenance schemas, and extends them with elements from the DCAT vocabulary.

Through the joint set of included concepts and properties this schema supports the description of

  • data versions and composition
  • data access methods
  • data access rights and policies
  • related resources, including topics, data types/formats
  • provenance of data and related entities

Importantly, all this information can be represented using the Distribution class as a structural container. Hence this schema is particularly suitable for systems that (only) support attaching metadata to data objects.
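As a minimal sketch of the "structural container" idea, the following Python snippet builds a plain record whose keys are slot names of the Distribution class (id, byte_size, media_type, checksum, has_part). The identifiers under the ex: prefix and the helper name make_distribution_record are hypothetical, and using the creator slot of a Checksum to name the digest algorithm is an assumption of this sketch, not something this page confirms.

```python
import hashlib
import json


def make_distribution_record(dist_id: str, content: bytes, media_type: str) -> dict:
    """Build a plain-dict record keyed by slot names of the Distribution class."""
    digest = hashlib.sha256(content).hexdigest()
    return {
        "id": dist_id,
        "byte_size": len(content),
        "media_type": media_type,
        # Checksum slots: `notation` carries the hex digest (HexBinary).
        # Naming the algorithm via `creator` is an assumption of this sketch.
        "checksum": [{"creator": "ex:sha256", "notation": digest}],
        # A directory or archive would list its member distributions here.
        "has_part": [],
    }


# Hypothetical identifier and content, for illustration only.
record = make_distribution_record("ex:dist/readme", b"hello\n", "text/plain")
print(json.dumps(record, indent=2))
```

All metadata about the file lives inside the one record, so a system that can only attach a single metadata object to a data object can still carry it.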

For more information, see the general documentation, and concrete examples on the documentation pages of individual classes. Some noteworthy examples are

The schema is available as

URI: https://concepts.datalad.org/s/distribution/unreleased

Name: distribution-schema

Schema Diagram

erDiagram Distribution { uriList access_url NonNegativeInteger byte_size W3CISO8601 date_modified W3CISO8601 date_published uriList download_url uriorcurie format string media_type uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Resource { W3CISO8601 date_modified W3CISO8601 date_published stringList keyword uri landing_page string version uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } LicenseDocument { string license_text uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } DistributionPart { string name uriorcurie object } DataService { string download_url_template uri endpoint_description uri endpoint_url W3CISO8601 date_modified W3CISO8601 date_published stringList keyword uri landing_page string version uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } QualifiedAccess { } ThingMixin { uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } ValueSpecificationMixin { uriorcurie range string value } AttributeSpecification { uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type uriorcurie range string value } Property { uriorcurie 
id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Statement { uriorcurie object } Thing { uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } ValueSpecification { uriorcurie range string value uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Annotation { uriorcurie annotation_tag string annotation_value } Identifier { uriorcurie creator string notation NodeUriOrCurie schema_type } IssuedIdentifier { string schema_agency uriorcurie creator string notation NodeUriOrCurie schema_type } ComputedIdentifier { uriorcurie creator string notation NodeUriOrCurie schema_type } Checksum { uriorcurie creator HexBinary notation NodeUriOrCurie schema_type } DOI { string schema_agency uriorcurie creator string notation NodeUriOrCurie schema_type } Role { uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Relationship { uriorcurie object } Location { uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } InstantaneousEvent { W3CISO8601 at_time uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Agent { uriorcurie id uriorcurieList 
broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Activity { W3CISO8601 ended_at W3CISO8601 started_at uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Entity { uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } SoftwareAgent { uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Person { stringList additional_names string family_name string given_name string honorific_name_prefix string honorific_name_suffix string formatted_name uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Organization { string name string short_name uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Project { string short_name string title W3CISO8601 ended_at W3CISO8601 started_at uriorcurie id uriorcurieList broad_mappings uriorcurieList close_mappings string description uriorcurieList exact_mappings uriorcurieList narrow_mappings uriorcurieList related_mappings NodeUriOrCurie schema_type } Distribution ||--}o DataService : "access_service" Distribution ||--}o Checksum : "checksum" Distribution ||--}o Distribution : "has_part" Distribution ||--|o 
Resource : "is_distribution_of" Distribution ||--|o LicenseDocument : "license" Distribution ||--}o QualifiedAccess : "qualified_access" Distribution ||--}o DistributionPart : "qualified_part" Distribution ||--}o Identifier : "identifiers" Distribution ||--}o Relationship : "qualified_relations" Distribution ||--}o Agent : "attributed_to" Distribution ||--}o Entity : "derived_from" Distribution ||--}o Activity : "generated_by" Distribution ||--}o Thing : "relations" Distribution ||--}o Annotation : "annotations" Distribution ||--}o AttributeSpecification : "attributes" Distribution ||--}o Statement : "characterized_by" Resource ||--|o Agent : "contact_point" Resource ||--|o Resource : "is_part_of" Resource ||--|o Resource : "is_version_of" Resource ||--}o Identifier : "identifiers" Resource ||--}o Relationship : "qualified_relations" Resource ||--}o Agent : "attributed_to" Resource ||--}o Entity : "derived_from" Resource ||--}o Activity : "generated_by" Resource ||--}o Thing : "relations" Resource ||--}o Annotation : "annotations" Resource ||--}o AttributeSpecification : "attributes" Resource ||--}o Statement : "characterized_by" LicenseDocument ||--}o Identifier : "identifiers" LicenseDocument ||--}o Relationship : "qualified_relations" LicenseDocument ||--}o Agent : "attributed_to" LicenseDocument ||--}o Entity : "derived_from" LicenseDocument ||--}o Activity : "generated_by" LicenseDocument ||--}o Thing : "relations" LicenseDocument ||--}o Annotation : "annotations" LicenseDocument ||--}o AttributeSpecification : "attributes" LicenseDocument ||--}o Statement : "characterized_by" DataService ||--|o Agent : "contact_point" DataService ||--|o Resource : "is_part_of" DataService ||--|o Resource : "is_version_of" DataService ||--}o Identifier : "identifiers" DataService ||--}o Relationship : "qualified_relations" DataService ||--}o Agent : "attributed_to" DataService ||--}o Entity : "derived_from" DataService ||--}o Activity : "generated_by" DataService ||--}o Thing 
: "relations" DataService ||--}o Annotation : "annotations" DataService ||--}o AttributeSpecification : "attributes" DataService ||--}o Statement : "characterized_by" QualifiedAccess ||--}o DataService : "access_service" ThingMixin ||--}o Annotation : "annotations" ThingMixin ||--}o AttributeSpecification : "attributes" ThingMixin ||--}o Statement : "characterized_by" AttributeSpecification ||--|| Property : "predicate" AttributeSpecification ||--}o Annotation : "annotations" AttributeSpecification ||--}o AttributeSpecification : "attributes" AttributeSpecification ||--}o Statement : "characterized_by" Property ||--}o Thing : "relations" Property ||--}o Annotation : "annotations" Property ||--}o AttributeSpecification : "attributes" Property ||--}o Statement : "characterized_by" Statement ||--|| Property : "predicate" Thing ||--}o Thing : "relations" Thing ||--}o Annotation : "annotations" Thing ||--}o AttributeSpecification : "attributes" Thing ||--}o Statement : "characterized_by" ValueSpecification ||--}o Thing : "relations" ValueSpecification ||--}o Annotation : "annotations" ValueSpecification ||--}o AttributeSpecification : "attributes" ValueSpecification ||--}o Statement : "characterized_by" Role ||--}o Thing : "relations" Role ||--}o Annotation : "annotations" Role ||--}o AttributeSpecification : "attributes" Role ||--}o Statement : "characterized_by" Relationship ||--}| Role : "roles" Location ||--}o Identifier : "identifiers" Location ||--}o Relationship : "qualified_relations" Location ||--}o Thing : "relations" Location ||--}o Annotation : "annotations" Location ||--}o AttributeSpecification : "attributes" Location ||--}o Statement : "characterized_by" InstantaneousEvent ||--}o Identifier : "identifiers" InstantaneousEvent ||--}o Relationship : "qualified_relations" InstantaneousEvent ||--}o Thing : "relations" InstantaneousEvent ||--}o Annotation : "annotations" InstantaneousEvent ||--}o AttributeSpecification : "attributes" InstantaneousEvent ||--}o 
Statement : "characterized_by" Agent ||--}o Agent : "acted_on_behalf_of" Agent ||--|o Location : "at_location" Agent ||--}o Identifier : "identifiers" Agent ||--}o Relationship : "qualified_relations" Agent ||--}o Thing : "relations" Agent ||--}o Annotation : "annotations" Agent ||--}o AttributeSpecification : "attributes" Agent ||--}o Statement : "characterized_by" Activity ||--|o Location : "at_location" Activity ||--}o Identifier : "identifiers" Activity ||--}o Relationship : "qualified_relations" Activity ||--}o Agent : "associated_with" Activity ||--}o Activity : "informed_by" Activity ||--}o Thing : "relations" Activity ||--}o Annotation : "annotations" Activity ||--}o AttributeSpecification : "attributes" Activity ||--}o Statement : "characterized_by" Entity ||--}o Identifier : "identifiers" Entity ||--}o Relationship : "qualified_relations" Entity ||--}o Agent : "attributed_to" Entity ||--}o Entity : "derived_from" Entity ||--}o Activity : "generated_by" Entity ||--}o Thing : "relations" Entity ||--}o Annotation : "annotations" Entity ||--}o AttributeSpecification : "attributes" Entity ||--}o Statement : "characterized_by" SoftwareAgent ||--}o Agent : "acted_on_behalf_of" SoftwareAgent ||--|o Location : "at_location" SoftwareAgent ||--}o Identifier : "identifiers" SoftwareAgent ||--}o Relationship : "qualified_relations" SoftwareAgent ||--}o Thing : "relations" SoftwareAgent ||--}o Annotation : "annotations" SoftwareAgent ||--}o AttributeSpecification : "attributes" SoftwareAgent ||--}o Statement : "characterized_by" Person ||--}o Agent : "acted_on_behalf_of" Person ||--|o Location : "at_location" Person ||--}o Identifier : "identifiers" Person ||--}o Relationship : "qualified_relations" Person ||--}o Thing : "relations" Person ||--}o Annotation : "annotations" Person ||--}o AttributeSpecification : "attributes" Person ||--}o Statement : "characterized_by" Organization ||--}o Agent : "acted_on_behalf_of" Organization ||--|o Location : "at_location" 
Organization ||--}o Identifier : "identifiers" Organization ||--}o Relationship : "qualified_relations" Organization ||--}o Thing : "relations" Organization ||--}o Annotation : "annotations" Organization ||--}o AttributeSpecification : "attributes" Organization ||--}o Statement : "characterized_by" Project ||--|o Location : "at_location" Project ||--}o Identifier : "identifiers" Project ||--}o Relationship : "qualified_relations" Project ||--}o Agent : "associated_with" Project ||--}o Activity : "informed_by" Project ||--}o Thing : "relations" Project ||--}o Annotation : "annotations" Project ||--}o AttributeSpecification : "attributes" Project ||--}o Statement : "characterized_by"

Classes

Class Description
Annotation A tag/value pair with the semantics of OWL Annotation.
AttributeSpecification An attribute is conceptually a thing, but it requires no dedicated identifier (id). Instead, it is linked to a Thing via its attributes slot and declares a predicate on the nature of the relationship.
DistributionPart An association class for attaching additional information to a hasPart relationship.
Identifier An identifier is a label that uniquely identifies an item in a particular context. Some identifiers are globally unique. All identifiers are unique within their individual scope.
        ComputedIdentifier An identifier that has been derived from information on the identified entity.
                Checksum A Checksum is a value that allows the integrity of the contents of a file to be checked. Even small changes to the content of the file will change its checksum. This class allows the results of a variety of checksum and cryptographic message digest algorithms to be represented.
        IssuedIdentifier An identifier that was issued by a particular agent with a notation that has no (or an undefined) relation to the nature of the identified entity.
                DOI Digital Object Identifier (DOI; ISO 26324), an identifier system governed by the DOI Foundation, where individual identifiers are issued by one of several registration agencies.
QualifiedAccess An association class for attaching additional information to an access_service relationship between a dcat:Distribution and a dcat:DataService.
Relationship An association class for characterizing the relation between two things with the role(s) the object had with respect to the subject. A relationship is always between two things only, but can be annotated with multiple roles (for example, a person having an author role with respect to a dataset while also being its legally responsible contact).
Statement An RDF statement that links a predicate (a Property) with an object (a Thing) to the subject to form a triple. A Statement is used to qualify a relation to a Thing referenced by its identifier. For specifying a qualified relation to an attribute that has no dedicated identifier, use an AttributeSpecification.
Thing The most basic, identifiable item. In addition to the slots that are common between a Thing and an AttributeSpecification (see ThingMixin), two additional slots are provided. The id slot takes the required identifier for a Thing. The relations slot allows for the inline specification of other Thing instances. Such a relation is unqualified (and symmetric), and should be further characterized via a Statement (see characterized_by). From a schema perspective, the relations slot allows for building self-contained, structured documents (e.g., a JSON object) with arbitrarily complex information on a Thing.
        Activity An activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.
                Project A collective endeavour of some kind. Typically it is a planned process that is undertaken or attempted to meet some requirement, or to achieve a particular goal.
        Agent Something that bears some form of responsibility for an activity taking place, for the existence of an entity, or for another agent's activity.
                Organization A social or legal institution such as a company, a society, or a university.
                Person Person agents are people, alive, dead, or fictional.
                SoftwareAgent Running software.
        Entity A physical, digital, conceptual, or other kind of thing with some fixed aspects; entities may be real or imaginary.
                Distribution A specific representation of data, which may come in the form of a single file, or an archive or directory of many files. A distribution may be standalone or part of a dataset.
                LicenseDocument A legal document giving official permission to do something with a resource.
                Resource Resource published or curated by a single agent.
                        DataService A collection of operations that provides access to one or more distributions or data processing functions.
        InstantaneousEvent A moment of a transition from one particular state of the world to another.
        Location A location can be an identifiable geographic place (ISO 19112), but it can also be a non-geographic place such as a directory, row, or column. As such, there are numerous ways in which location can be expressed, such as by a coordinate, address, landmark, and so forth.
        Property An RDF property, a Thing used to define a predicate, for example in a Statement.
        Role A role is the function of a resource or agent with respect to a subject, in the context of resource attribution or relationships.
        ValueSpecification A Thing that is a value of some kind. This class can be used to describe an outcome of a measurement, a factual value or constant, or other qualitative or quantitative information with an associated identifier. If no identifier is available, an AttributeSpecification can be used within the context of an associated Thing (attributes).
ThingMixin Mix-in with the common interface of Thing and AttributeSpecification. This interface enables type specifications (rdf:type) for things and attributes via a type designator slot to indicate specialized schema classes for validation where a slot's range is too generic. A thing or attribute can be further described with statements on qualified relations to other things (characterized_by), or inline attributes (attributes). A set of mappings slots enables alignment with arbitrary external schemas and terminologies.
ValueSpecificationMixin Mix-in for a (structured) value specification. Two slots are provided to define a (literal) value (value) and its type (range).
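The class descriptions above distinguish two ways of describing a subject: a Statement (predicate plus object) when the target is a Thing with an identifier, and an AttributeSpecification (predicate plus range and value) when it has none. A rough Python sketch of that decision, using plain dicts, could look as follows. The helper name add_description and all ex: identifiers are hypothetical.

```python
def add_description(subject: dict, predicate: str, target: dict) -> dict:
    """Attach `target` to `subject` as a Statement or an AttributeSpecification."""
    if "id" in target:
        # The target is an identifiable Thing: reference it via a Statement
        # stored in the subject's `characterized_by` slot.
        subject.setdefault("characterized_by", []).append(
            {"predicate": predicate, "object": target["id"]}
        )
    else:
        # No identifier available: describe the value inline as an attribute.
        attribute = {"predicate": predicate, "value": target["value"]}
        if "range" in target:
            attribute["range"] = target["range"]
        subject.setdefault("attributes", []).append(attribute)
    return subject


# Hypothetical identifiers, for illustration only.
dist = {"id": "ex:dist/table"}
add_description(dist, "ex:conformsTo", {"id": "ex:standard/csv"})
add_description(dist, "ex:rowCount", {"range": "ex:integer", "value": "42"})
print(dist)
```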

Slots

Slot Description
about A relation of an information artifact to the subject, such as a URL identifyi...
access_service A data service that gives access to a distribution
access_url URL that gives access to the subject
acted_on_behalf_of Assign the authority and responsibility for carrying out a specific activity ...
additional_names Additional name(s) associated with the subject, such as one or more middle na...
address Physical address of the subject, such as a postal address, a bibliographic lo...
affiliation An organization that an agent is affiliated with
annotation_tag A tag identifying an annotation
annotation_value The actual annotation
annotations A record of properties of the metadata record on a subject, a collection of t...
associated_with An activity association is an assignment of responsibility to an agent for an...
at_location Associate the subject with a location
at_time Time at which an instantaneous event takes place or took place
attributed_to Attribution is the ascribing of an entity to an agent
attributes Declares a relation that associates a Thing (or another attribute) with an ...
broad_mappings A list of terms from different schemas or terminology systems that have broad...
byte_size The size of a distribution in bytes
characterized_by Qualifies relationships between a subject Thing and an object Thing with ...
checksum The checksum property provides a mechanism that can be used to verify that th...
close_mappings A list of terms from different schemas or terminology systems that have close...
conforms_to An established standard to which the subject conforms
contact_point Relevant contact information for the subject
creator An agent responsible for making an entity
date_modified Timepoint at which the subject was (last) changed, updated or modified
date_published Timepoint at which the subject was (last) published
derived_from Derivation is a transformation of an entity into another, an update of an ent...
description A free-text account of the subject
distribution An available distribution of a resource
download_url URL that gives direct access to the subject in the form of a downloadable fil...
download_url_template A URL template with placeholders enclosed in braces ({example})
email Email address associated with an entity
ended_at End is when an activity is deemed to have been ended by some trigger
endpoint_description A description of the services available via the end-points, including their o...
endpoint_url The root location or primary endpoint of a service (a Web-resolvable IRI)
exact_mappings A list of terms from different schemas or terminology systems that have ident...
family_name The (inherited) family name of the subject
format The file format of a distribution
formatted_name A formatted text corresponding to the name of the subject
generated_by Generation is the completion of production of a new entity by an activity
given_name The given (non-inherited) name of the subject
has_part A related resource that is included either physically or logically in the des...
honorific_name_prefix The honorific prefix(es) of the subject's name
honorific_name_suffix The honorific suffix(es) of the subject's name
id Persistent and globally unique identifier of a Thing
identifiers An unambiguous reference to the subject within a given context
informed_by Communication is the exchange of an entity by two activities, one activity us...
is_distribution_of Inverse property of dcat:distribution
is_part_of A related resource that is included either physically or logically in the des...
is_version_of A related resource of which the described resource is a version
keyword One or more keywords or tags describing the resource
landing_page A Web page that can be navigated to in a Web browser to gain access to a reso...
license A legal document under which the resource is made available
license_text A copy of the actual text of a license reference, file or snippet that is ass...
mappings A list of terms from different schemas or terminology systems that have compa...
media_type The media type of a distribution as defined by IANA
name Name of the subject
narrow_mappings A list of terms from different schemas or terminology systems that have narro...
notation String of characters such as "T58:5" or "30:4833" used to uniquely identify a...
object Reference to a Thing within a Statement
predicate Reference to a Property within a Statement
qualified_access Link to a description of an access_service relationship with `dcat:DataServi...
qualified_part Qualifies a hasPart relationship with another entity
qualified_relations Characterizes the relationship or role of an entity with respect to the subje...
range Declares that the value of a Thing or AttributeSpecification are instance...
related_mappings A list of terms from different schemas or terminology systems that have relat...
relations Declares an unqualified relation of the subject Thing to another Thing
roles Describes the function of an entity or agent (object) within the scope of a `...
same_as Declares that the subject and an object are equal
schema_agency Name of the agency that issued an identifier
schema_type State that the subject is an instance of a particular schema class
short_name A shortened name for the subject
started_at Start is when an activity is deemed to have been started by some trigger
title A summary description of the subject
value Value of a thing
version Version indicator (name or identifier) of a resource
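The download_url_template slot is described as a URL template with placeholders enclosed in braces. The sketch below shows one plausible way to expand such a template in Python. This page does not state the exact template syntax or which placeholder names a DataService uses, so the template, the placeholder names (name, version), and the percent-encoding of values are all assumptions.

```python
from urllib.parse import quote


def expand_template(template: str, **values: str) -> str:
    """Fill {placeholder} fields of a URL template with percent-encoded values."""
    encoded = {key: quote(value, safe="") for key, value in values.items()}
    # str.format raises KeyError if the template names a placeholder
    # for which no value was supplied.
    return template.format(**encoded)


# Hypothetical template and values, for illustration only.
template = "https://example.org/files/{name}?v={version}"
url = expand_template(template, name="my file.txt", version="1.0")
print(url)  # → https://example.org/files/my%20file.txt?v=1.0
```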

Enumerations

Enumeration Description

Types

Type Description
Boolean A binary (true or false) value
Curie a compact URI
Date a date (year, month and day) in an idealized calendar
DateOrDatetime Either a date or a datetime
Datetime The combination of a date and time
Decimal A real number with arbitrary precision that conforms to the xsd:decimal speci...
Double A real number that conforms to the xsd:double specification
EmailAddress RFC 5322 compliant email address
Float A real number that conforms to the xsd:float specification
HexBinary hex-encoded binary data
Integer An integer
Jsonpath A string encoding a JSON Path
Jsonpointer A string encoding a JSON Pointer
Ncname Prefix part of CURIE
Nodeidentifier A URI, CURIE or BNODE that represents a node in a model
NodeUriOrCurie A type referencing a graph node
NonNegativeInteger A non-negative integer
Objectidentifier A URI or CURIE that represents an object in the model
Sparqlpath A string encoding a SPARQL Property Path
String A character string
Time A time object represents a (local) time of day, independent of any particular...
Uri a complete URI
Uriorcurie a URI or a CURIE
W3CISO8601 W3C variant/subset of ISO 8601 for specifying date(times)
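To give a feel for what some of these types constrain, here is a rough Python sketch of structural checks for HexBinary, NonNegativeInteger, and W3CISO8601 values. The regular expressions are assumptions written for this example and are not the patterns used by the schema itself. The date pattern follows the granularities of the W3C date and time note (year, year-month, full date, or a datetime with a time zone designator) and does not range-check months, days, or hours.

```python
import re

# Hex-encoded binary data: an even number of hexadecimal digits.
HEX_BINARY = re.compile(r"^(?:[0-9a-fA-F]{2})*$")

# Structural check only: YYYY, YYYY-MM, YYYY-MM-DD, or a datetime
# that ends in "Z" or a +hh:mm / -hh:mm offset.
W3C_DATETIME = re.compile(
    r"^\d{4}(?:-\d{2}(?:-\d{2}"
    r"(?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?(?:Z|[+-]\d{2}:\d{2}))?)?)?$"
)


def is_hex_binary(value: str) -> bool:
    return bool(HEX_BINARY.match(value))


def is_non_negative_integer(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def is_w3c_datetime(value: str) -> bool:
    return bool(W3C_DATETIME.match(value))


print(is_hex_binary("deadBEEF"))             # → True
print(is_non_negative_integer(-1))           # → False
print(is_w3c_datetime("2024-05-17T10:30Z"))  # → True
```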

Subsets

Subset Description