Components.js: a semantic dependency injection framework for addressable and discoverable software configurations

  • 1IDLab, Department of Electronics and Information Systems, Ghent University – imec
  • 2Enterprise Information Systems Department, University of Bonn


In empirical software engineering, replications provide knowledge on what results hold under which conditions. Often, these ought to be exact: the experimental procedures and their accompanying software are matched as closely as possible. Because different algorithms, or implementations thereof, need to be easily swappable in a transparent manner, a dependency injection design benefits experimental software. Various wirings of independent components can be created, tested, and compared—even through means of static program analysis. Within the Semantic Web, the opportunity exists to move beyond the local scope of existing dependency injection frameworks and facilitate exact replication on the Web with addressable, dereferenceable, and unambiguous software configurations. Therefore, we introduce Components.js, a semantic dependency injection framework for JavaScript, that (i) describes software components using an Object-Oriented Components and Object Mapping vocabulary, (ii) automatically instantiates experimental configurations using linking and dereferencing, and (iii) is complementary to the modular programming of package managers. This article presents the framework and its application, and includes a proof of concept for the discovery of experimental software configurations on the Web. All 700,000+ Node Package Manager libraries were described as more than 300 million RDF triples, which interlink different modules. Thereby, a set of queries provides insights such as which experiments use the same algorithm, or which different implementations exist of the same function. Components.js enables research articles to complete the provenance chain of experimental results. This ultimately brings faster and more accurate reproductions of experiments, and facilitates the evaluation of new research contributions.

Keywords Dependency injection; Linked Data; Semantic Web; RDF; JavaScript; Reproducibility; Artificial intelligence


Among the many fields encompassed by the Semantic Web domain, empirical software engineering [1][2] is undeniably prominent. Here, research concerns itself with the empirical observation of software engineering artifacts and the empirical validation of software engineering theories and assumptions [3], thus relieving tension between curiosity-driven science and utility-driven engineering. Evidently, this includes developing software in a way that improves reporting, i.e., supporting a systematic, standardized presentation of empirical research in publications [4], and conducting controlled experiments, i.e., testing hypotheses where one or more independent variables (treatment) are manipulated to measure their effect on one or more dependent variables (e.g., execution time) [5]. Experimental software therefore preferably supports the exact replication of experimental procedures [6], which keep the conditions of the experiment dependent (i.e., all remain the same or very similar) or independent (i.e., one or more major aspects are deliberately varied).

Rather than obscuring experimental software in monolithic, non-transparent packages—often referred to in an ambiguous way by only name or version number—different algorithms and implementations thereof need to be easily swappable in a transparent manner. The latter is embodied by the Dependency Injection [7] pattern, where instead of custom code, a generic framework—the assembler—determines the flow of control and calls upon individual software components when needed. Such components are globs of software that are intended to be used, without change, by an application that is out of the control of the writers of the component [7], and can be defined and injected independently. An external configuration document specifies the wiring of these components during the configuration phase, which is used by the assembler to perform the actual instantiation during the injection phase. With the Semantic Web in mind, these configurations could move beyond their local scope, and also improve in reporting to help finding the right experiment, understanding how it is conducted, and assess the validity of its results [4].
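The pattern can be sketched in a few lines of JavaScript. The following is a minimal, hypothetical illustration (not the Components.js API): an assembler reads an external configuration and wires components together, with "@"-prefixed strings acting as references to other instances.

```javascript
// Minimal dependency-injection sketch (hypothetical, not the Components.js API):
// the assembler owns the flow of control and wires components from a config.
class ConsoleLogger {
  log(msg) { return `[console] ${msg}`; }
}

class Service {
  constructor({ logger }) { this.logger = logger; }
  run() { return this.logger.log("running"); }
}

// The registry maps component names to constructors.
const registry = { ConsoleLogger, Service };

// The assembler instantiates components from an external configuration,
// resolving references to other instances before construction.
function assemble(config, registry) {
  const instances = {};
  for (const { id, type, params = {} } of config) {
    const resolved = {};
    for (const [key, value] of Object.entries(params)) {
      // Values prefixed with "@" refer to previously created instances.
      resolved[key] = typeof value === "string" && value.startsWith("@")
        ? instances[value.slice(1)]
        : value;
    }
    instances[id] = new registry[type](resolved);
  }
  return instances;
}

const instances = assemble([
  { id: "logger", type: "ConsoleLogger" },
  { id: "service", type: "Service", params: { logger: "@logger" } },
], registry);
```

Swapping `ConsoleLogger` for another implementation only requires editing the configuration, not the code of `Service`—which is precisely what makes alternative experimental wirings easy to create and compare.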

To this end, we present Components.js, a semantic dependency injection framework for JavaScript applications that makes software configurations addressable and discoverable, hence surpassing existing dependency injection frameworks. The framework is open-source, available on npm, and its complete documentation can be found at https:// Furthermore, it is being actively used in tools such as the Linked Data Fragments server [8] and Comunica [9]. Within Components.js, software configurations and modules are described as Linked Data using the Object-Oriented Components vocabulary [10] and the introduced Object Mapping vocabulary [10]. By publishing such descriptions, the composition of experimental software (and parts thereof) can be unambiguously identified by IRIs and retrieved through dereferencing. Components.js automatically instantiates such software configurations, including resolving the necessary dependencies, and is fully compatible with the modular programming approach. In total, this entails the following benefits: (i) extended reporting of experiments in research articles by IRI; (ii) improved transparency and replication of experimental software; (iii) facilitation of static program analysis through the use of external, semantic configuration files; (iv) a joint data space of research articles and experimental software, enabling discoverable and queryable links between research and implementations.

Finally, we include a proof of concept based on the Node.js package manager npm. An RDF-based description was generated for 700,000+ packages. Thereby, we demonstrate the description of an existing application and its available components (available as modules), the automated instantiation of such a configuration, and the discoverability with a set of insightful queries. Note that, although this is a JavaScript implementation, the principles are generalizable, can be implemented in other languages, or can improve cross-language replication of software.

A Semantic Dependency Injection framework

This section introduces the dependency injection framework Components.js. It can instantiate JavaScript components based on a declarative configuration. These configurations are semantic by default (i.e., described in RDF using a set of vocabularies), but can also be non-semantic (i.e., using direct references to JavaScript classes).

In this respect, Components.js distinguishes between three main concepts:

  • Module: a software package containing zero or more components. This is equivalent to a Node module or npm package.

  • Component: a class that can be instantiated by creating a new instance of that type with zero or more parameter values. Parameters are defined by the class and its superclasses.

  • Component Configuration: a semantic representation of an instantiation of a component into an instance based on parameters.

All concepts are described in the programming language independent Object-Oriented Components vocabulary (OO) [10]. In the following, we first explain how to describe modules and components using this vocabulary. Then, we introduce an additional Object Mapping vocabulary to describe parameter order in constructors. Finally, we explain how component configuration files are created, which describe the application wiring, and how the framework can instantiate this file to compose a working application.

Describing modules and components

The Object-Oriented Components vocabulary reuses Fowler’s definition of a software component [7] as a “glob” of software, which provides operations that can be used by other components. The instantiation of such components requires certain parameters, analogous to constructor arguments in object-oriented programming. Object orientation is interpreted in the broad sense here: only classes, objects, and constructor parameters are considered. An overview is given in Fig. 1.

[Object-Oriented Components vocabulary diagram]

Fig. 1: Classes and properties in the Object-Oriented Components vocabulary, using the prefix oo.

Listing 1 illustrates the basic definition of a module MyModule, which is indicated by the type oo:Module. Note that we added a prefix ex to shorten the URIs. Additional metadata is added with the Description of a Project (DOAP) vocabulary, e.g., doap:name.

A module is considered a collection of components. Within object-oriented languages, this can correspond to, for example, a software library or an application. A component is typed as oo:Component, which is a subclass of rdfs:Class. The parameters to construct the component can therefore be defined as an rdfs:Property on a component.

PREFIX oo: <>
PREFIX om: <>
PREFIX doap: <>
PREFIX rdfs: <>
PREFIX ex: <>

ex:MyModule a oo:Module;
            doap:name "my-module".

Listing 1: A description of a module ex:MyModule.

ex:MyModule a oo:Module;
  doap:name "my-module";
  oo:component ex:MyModule/MyComponent.

ex:MyModule/MyComponent a oo:Class;
  oo:componentPath "MyComponent";
  oo:parameter ex:MyModule/MyComponent#name;
  oo:constructorArguments ( ex:MyModule/MyComponent#name ).

ex:MyModule/MyComponent#name a oo:Parameter;
  rdfs:comment "A name";
  rdfs:range xsd:string;
  oo:uniqueValue true.

Listing 2: The component ex:MyModule/MyComponent is described as part of the module ex:MyModule.

For example, Listing 2 shows how to add the component ex:MyModule/MyComponent to the module MyModule via oo:component. The type oo:Class is one of several defined subclasses of oo:Component, and indicates that the component is instantiatable based on parameters. Each component can refer to its path within a module using the oo:componentPath predicate, which can for instance be the package name in npm. The resulting description can be included in the module (e.g., as a JSON-LD file), or can be created and referred to externally. Afterwards, it can be reused by multiple dependents.

The parameters that are used to instantiate an oo:Class are of type oo:Parameter. An oo:Parameter is a subclass of rdfs:Property, which simplifies its usage as an RDF property. oo:defaultValue allows parameters to have a default value when no other values have been provided: upon instantiation (Subsection 3.3), a closed world will be assumed. The oo:uniqueValue predicate is a flag that can be set to indicate whether or not the parameter can only have a single value.
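The closed-world treatment of parameter values can be sketched as follows. The function `resolveParameter` and its field names (`unique`, `defaultValue`) are hypothetical stand-ins mirroring the behavior of oo:uniqueValue and oo:defaultValue, not part of the framework's API.

```javascript
// Sketch of closed-world parameter resolution: collect the values supplied in a
// configuration, fall back to the default (oo:defaultValue), and enforce
// single-valuedness (oo:uniqueValue). Hypothetical helper for illustration.
function resolveParameter(param, suppliedValues) {
  let values = suppliedValues;
  // Closed world: if no values were provided, the default (if any) applies.
  if (values.length === 0 && param.defaultValue !== undefined) {
    values = [param.defaultValue];
  }
  if (param.unique) {
    if (values.length > 1) {
      throw new Error(`Parameter ${param.id} allows only a single value`);
    }
    return values[0]; // a unique parameter yields a scalar
  }
  return values; // otherwise all values are kept as a list
}

// Example: a unique string parameter with a default value.
const nameParam = { id: "ex:MyModule/MyComponent#name", unique: true, defaultValue: "anonymous" };
const given = resolveParameter(nameParam, ["John"]);
const fallback = resolveParameter(nameParam, []);
```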

PREFIX oo: <>
PREFIX rdfs: <>
PREFIX ldfs: <>

  a oo:Module;
  oo:component ldfs:Server:Tpf, ldfs:Datasource:Hdt, ldfs:Datasource:Sparql.
ldfs:Server:Tpf a oo:Class;
  oo:parameter ldfs:datasource, ldfs:port.
ldfs:Datasource a oo:AbstractClass;
  oo:parameter ldfs:Datasource:title.
ldfs:Datasource:Hdt a oo:Class;
  rdfs:subClassOf ldfs:Datasource;
  oo:parameter ldfs:Datasource:Hdt:file.
ldfs:Datasource:Sparql a oo:Class;
  rdfs:subClassOf ldfs:Datasource;
  oo:parameter ldfs:Datasource:Sparql:endpoint.

ldfs:datasource                 a oo:Parameter; rdfs:range ldfs:Datasource.
ldfs:port                       a oo:Parameter; rdfs:range xsd:integer.
ldfs:Datasource:title           a oo:Parameter; rdfs:range xsd:string.
ldfs:Datasource:Hdt:file        a oo:Parameter; rdfs:range ldfs:HdtFile.
ldfs:Datasource:Sparql:endpoint a oo:Parameter; rdfs:range ldfs:SparqlEndpoint.

Listing 3: The LDF server module contains, among others, an HDT-based and a SPARQL-based datasource component, which both extend from the abstract datasource component. The HDT and SPARQL datasources are classes that both inherit the title parameter from the abstract datasource. The HDT datasource takes an HDT file as parameter. The SPARQL datasource takes a SPARQL endpoint IRI as parameter.

Listing 3 shows a simplified example of the Linked Data Fragments (LDF) server npm module. It exposes several components such as an HDT and SPARQL datasource and a TPF server, each of which can take multiple parameters. These are provided with a unique identifier and definition, such that the software configuration can receive a semantic interpretation.

Although the examples in this article are presented in Turtle syntax, Components.js encourages the use of JSON-LD for compatibility with JSON and the use of shortcuts. A general context is defined for the Object-Oriented Components vocabulary, which is available at https:// The dereferenceable URI of your module is defined by @id, and requireName refers to the package (as defined in npm’s package.json file).

Describing object mappings

The constructor injection described above works out of the box with single-argument constructors that accept a map, as is quite common in JavaScript. Components.js then creates a map with key/value pairs with the property IRIs and corresponding objects of all triples with the instance as subject. This map is then passed to the constructor, which reads its settings from the map. Depending on a flag, the keys and values are either full IRIs or abbreviated JSON-LD strings.
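A component written for this default injection style might look as follows (a hypothetical example class; Components.js itself builds the argument map from the configuration triples, whereas here it is constructed by hand to show the calling convention):

```javascript
// A component written for the default injection style: a single constructor
// argument carrying all parameter values as key/value pairs.
// (Hypothetical example class; keys shown abbreviated, as with JSON-LD terms.)
class Greeter {
  constructor(args) {
    // The injected map contains one entry per parameter of the component.
    this.name = args.name;
    this.greeting = args.greeting;
  }
  greet() { return `${this.greeting}, ${this.name}!`; }
}

// The assembler would derive this map from the triples that have the
// instance as subject; we pass it directly for illustration.
const greeter = new Greeter({ name: "John", greeting: "Hello" });
```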

New libraries that use Components.js can be designed for such single-parameter constructors. For all other constructor types, a mapping mechanism is needed between the RDF properties and the concrete parameter order of the constructor. To this end, we introduce the Object Mapping vocabulary. Fig. 2 shows an overview of all its classes and predicates.

[Object Mapping vocabulary diagram]

Fig. 2: Overview of the classes and properties in the Object Mapping vocabulary, using the prefix om.

The vocabulary introduces the object mapping and the array mapping. An object map can have several object mapping entries, where each entry has a field name and a field value. An array map can have several array mapping entries, where each entry only has a value. Together, they can express all ways in which the flat object from the RDF description maps to an ordered list of simple or complex constructor parameters.

ldfs:Server:Tpf oo:constructorArguments ([ om:field
  [ om:fieldName "datasources"; om:fieldValue
    [ om:fieldName ldfs:Datasource:title; om:fieldValue rdf:object ] ],
  [ om:fieldName "port"; om:fieldValue ldfs:port ]
]).
ldfs:Datasource:Hdt oo:constructorArguments ([ om:field
  [ om:fieldName "title"; om:fieldValue ldfs:Datasource:title ],
  [ om:fieldName "file";  om:fieldValue ldfs:Datasource:Hdt:file ]
]).
ldfs:Datasource:Sparql oo:constructorArguments ([ om:field
  [ om:fieldName "title";    om:fieldValue ldfs:Datasource:title ],
  [ om:fieldName "endpoint"; om:fieldValue ldfs:Datasource:Sparql:endpoint ]
]).

Listing 4: The HDT and SPARQL-based datasource constructors both take a custom object as argument for the constructor. The entries of this object are mapped from the parameter values using this mapping. The TPF server constructor similarly requires a custom object, where the datasources entry points to an object that is a mapping from titles to datasources.

Listing 4 shows the mapping of the LDF component parameters to the constructor implementation. This description complements the component definitions from Listing 3 as it provides an implementation view on the component constructors. Like the component definitions, a mapping is only necessary once per module and can be reused across dependents.
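The effect of such a mapping can be sketched in plain JavaScript: a flat set of parameter values is reshaped into the keyed constructor argument the implementation expects. The helper below is hypothetical and mirrors only the om:fieldName/om:fieldValue entries of Listing 4.

```javascript
// Sketch of what an object mapping achieves: reshaping flat parameter values
// into the keyed constructor argument the implementation expects.
// (Hypothetical helper; entries mirror om:fieldName/om:fieldValue.)
function applyObjectMapping(mapping, parameterValues) {
  const arg = {};
  for (const { fieldName, fieldValue } of mapping) {
    // fieldName is the constructor-side key; fieldValue identifies the parameter.
    arg[fieldName] = parameterValues[fieldValue];
  }
  return arg;
}

// Mapping for the HDT datasource constructor from Listing 4.
const hdtMapping = [
  { fieldName: "title", fieldValue: "ldfs:Datasource:title" },
  { fieldName: "file", fieldValue: "ldfs:Datasource:Hdt:file" },
];

const constructorArg = applyObjectMapping(hdtMapping, {
  "ldfs:Datasource:title": "A DBpedia 2016 datasource",
  "ldfs:Datasource:Hdt:file": "dbpedia-2016.hdt",
});
```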

Describing and instantiating a component configuration

Once modules and components are described, a component configuration can wire a JavaScript application by declaring specific instances. An instance of the component from Listing 2, with the parameter ex:MyModule/MyComponent#name set to John, is given in Listing 5. The type of our instance ex:myInstance is simply the component that must be instantiated, in this case ex:MyModule/MyComponent. All instantiations of oo:Class components are also of type oo:Instance. The parameters that were defined by the component can now be used as keys in the configuration file.

PREFIX ex: <>

ex:myInstance a ex:MyModule/MyComponent;
  ex:MyModule/MyComponent#name "John".

Listing 5: A component configuration file that describes the instantiation of ex:MyModule/MyComponent with the parameter ex:MyModule/MyComponent#name set to John.

An oo:Class can also be an oo:AbstractClass, which does not allow this component type to be instantiated directly. Abstract components can be used to define a set of shared parameters in a common ancestor. Conforming to the RDF semantics, components can have multiple ancestors, which are indicated using the rdfs:subClassOf predicate.
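On the implementation side, this hierarchy corresponds to ordinary class inheritance. The following hypothetical classes mirror the components of Listing 3: the abstract datasource declares the shared title parameter and refuses direct instantiation, as oo:AbstractClass prescribes.

```javascript
// Implementation-side counterpart of the component hierarchy in Listing 3.
// (Hypothetical classes; the abstract datasource mirrors oo:AbstractClass.)
class Datasource {
  constructor({ title }) {
    if (new.target === Datasource) {
      // An abstract component type cannot be instantiated directly.
      throw new Error("Datasource is abstract");
    }
    this.title = title;
  }
}

class HdtDatasource extends Datasource {
  constructor({ title, file }) {
    super({ title }); // title is inherited from the abstract ancestor
    this.file = file;
  }
}

const ds = new HdtDatasource({ title: "A DBpedia 2016 datasource", file: "dbpedia.hdt" });
```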

const Loader = require('componentsjs').Loader;

const loader = new Loader();
const myServer = await loader.instantiate('');

Listing 6: First, a new component loader is created, after which the component definitions are registered. Finally, a declarative component instantiation is supplied by providing the component IRI.

Listing 6 shows the initial steps for using the framework. The framework provides a Loader class that acts as an assembler. This Loader performs constructor injection: it dynamically calls the constructor of the component and passes the configured parameters in a single object argument. Behind the scenes, a module description is registered, which can be retrieved by automatically scanning npm modules, parsing a downloaded RDF document via URL, or reading a raw triple stream. At the time of writing, the parser accepts RDF documents serialized as JSON-LD, Turtle, TriG, N-Triples, or N-Quads. Finally, the Loader instantiates one or more components by invoking a component configuration. Listing 7 depicts the configuration file for a Linked Data Fragments server application, which is identified by http://

PREFIX ldfs: <>

:myServer a ldfs:Server:Tpf;
  ldfs:datasource :myHdtDatasource, :mySparqlDatasource.
:myHdtDatasource a ldfs:Datasource:Hdt;
  ldfs:Datasource:title "A DBpedia 2016 datasource";
  ldfs:Datasource:Hdt:file <>.
:mySparqlDatasource a ldfs:Datasource:Sparql;
  ldfs:Datasource:title "A SPARQL-based DBpedia 2016 datasource";
  ldfs:Datasource:Sparql:endpoint <>.

Listing 7: :myServer is a TPF server which will be loaded with an HDT-based and a SPARQL-based datasource.

Note that, while Linked Data has an open-world assumption, our dependency injector operates in the closed-world environment of OOP. Hence, we assume that all the necessary constructor arguments are included in the configuration and are available to the loader, as this is required for features such as default arguments.

Proof of Concept

To demonstrate the merits of a semantic dependency injection framework for JavaScript, we present a proof of concept using the Node Package Manager (npm). npm is a large collection of modules, holding over 700,000 JavaScript libraries, each with their own features and requirements. Such a package contains the description of the project together with all its versions. Using the terminology from Section 3, a specific version of an npm package is considered a module, which contains the specific dependencies and a link to the actual implementation.

npm stores the metadata of every package in a CouchDB instance, which includes the information added by the package developer in the package.json file, and additional metadata added by the npm publishing process. An example of a JSON representation of the N3.js npm package can be found at https://, which contains all the general descriptions that apply to all packages in this module, such as the name, homepage, and description. To uniquely identify software components and, more importantly, interlink them, we added a JSON-LD context to the JSON metadata provided by the npm registry, and published the resulting RDF on a server. This context is available at https:// and maps most of the npm tags to corresponding RDF predicates, leaving these tags unchanged in the JSON-LD representation.

For data fields that could not be mapped by using just the JSON-LD context, such as concatenating values to create an IRI, we modified some of the input JSON before exporting it to JSON-LD. The limitations of context mapping necessitated some other changes, the most important one relating to the specific versions of the package. This can be seen by retrieving https:/​/​ with an Accept: application/ld+json header. In this case, the package contains links to its corresponding modules, providing semantic connections between them. Additionally, some tags were added to provide identifiers and a link to the original repository.
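The kind of transformation described here can be sketched as follows. The helper and its IRI scheme are illustrative only; the predicate choices follow Listing 8, while the real pipeline applies a JSON-LD context rather than ad-hoc string handling.

```javascript
// Sketch of mapping npm package metadata to RDF triples, in the spirit of the
// JSON-LD context described above. (Hypothetical helper; the predicate choices
// follow Listing 8, and the prefixed subject IRI is illustrative.)
function packageToTriples(pkg) {
  const subject = `npm:${pkg.name}`;
  const triples = [
    [subject, "doap:name", JSON.stringify(pkg.name)],
    [subject, "dcterms:abstract", JSON.stringify(pkg.description)],
  ];
  // Keywords become one dcterms:subject triple each.
  for (const keyword of pkg.keywords || []) {
    triples.push([subject, "dcterms:subject", JSON.stringify(keyword)]);
  }
  // Some values cannot be produced by a context alone (e.g., building an IRI
  // out of plain strings); those are rewritten before the JSON-LD export.
  if (pkg.homepage) {
    triples.push([subject, "doap:homepage", `<${pkg.homepage}>`]);
  }
  return triples.map(([s, p, o]) => `${s} ${p} ${o}.`);
}

const triples = packageToTriples({
  name: "n3",
  description: "Lightning fast, asynchronous, streaming...",
  keywords: ["turtle", "rdf"],
  homepage: "https://github.com/rdfjs/N3.js",
});
```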

Since JSON-LD is an RDF representation, it can easily be converted to other syntaxes, of which several are supported by our server, such as Turtle and N-Triples. These can be retrieved by sending the corresponding Accept headers. An example of some of the data generated this way can be seen in Listing 8.

npm:n3 a doap:Project;
  dcterms:abstract "Lightning fast, asynchronous, streaming...";
  dcterms:subject "turtle", "rdf", "n3", "streaming", "asynchronous";
  spdx:licenseDeclared <>;
  doap:bug-database <>;
  doap:homepage <>;
  doap:name "n3";
  owl:sameAs "";
  foaf:maker users:rubenverborgh.
users:rubenverborgh foaf:name "Ruben Verborgh".

Listing 8: This listing shows a partial representation of https:/​/​ in the Turtle syntax. Prefixes omitted for brevity.

  doap:revision "0.10.0";
  foaf:maker users:rubenverborgh;

Listing 9: This listing shows a partial representation of https:/​/​ in the Turtle syntax. Prefixes omitted for brevity.

Continuing with the examples shown above, a module of version 0.10.0 of the N3 bundle can be found at https:/​/​, while the IRI in our namespace is https:/​/​ Similarly, many of the tags are mapped by the context, while other tags had to be modified to provide more relevant triples. An example of some of the data generated for this module can be seen in Listing 9.

An important part of an npm package description are the dependencies and their semantic versions. For example, N3 0.10.0 has a dependency on async ^2.0.1, where ^2.0.1 is a semantic version that corresponds to any version number of async with a major version of 2. As can be seen in the JSON-LD, this async dependency is converted to https://, with %5E being the URL-encoded character ^. If accessed, the server detects the highest matching version number and redirects to that module. Additionally, the body of the redirect contains the relevant metadata describing this, which in this case results in the following triple (prefixed for clarity):

async:%5E2.0.1 npm:maxSatisfying async:2.4.0.
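The resolution behind npm:maxSatisfying can be sketched with a hand-rolled caret matcher. This is for illustration only: npm itself relies on the semver library, and the simplification below ignores the special caret semantics of 0.x versions.

```javascript
// Sketch of how a semantic version range such as ^2.0.1 resolves to the
// highest matching concrete version (npm:maxSatisfying). Hand-rolled caret
// matcher for illustration; real npm uses the semver library, and 0.x caret
// ranges behave differently than shown here.
function maxSatisfyingCaret(versions, range) {
  const [major, minor, patch] = range.replace(/^\^/, "").split(".").map(Number);
  const matching = versions
    .map((v) => v.split(".").map(Number))
    // ^x.y.z accepts the same major version, at or above y.z.
    .filter(([M, m, p]) =>
      M === major && (m > minor || (m === minor && p >= patch)));
  if (matching.length === 0) return null;
  // Pick the highest version by comparing components in order.
  matching.sort((a, b) => a[0] - b[0] || a[1] - b[1] || a[2] - b[2]);
  return matching[matching.length - 1].join(".");
}

const resolved = maxSatisfyingCaret(["1.9.0", "2.0.1", "2.4.0", "3.0.0"], "^2.0.1");
```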

Additionally, to properly describe which modules are being used on a machine, we created a tool that outputs the actual dependencies used by a specific package installation in RDF. This way the exact installation that was used can be described, without having to rely on the interpretation of semantic versions which can change over time.
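Such a tool essentially walks the resolved dependency tree and emits one triple per concrete version. The sketch below is hypothetical: it operates on an in-memory tree instead of node_modules, and the IRI scheme is illustrative (the npm:dependency predicate follows Listing 10).

```javascript
// Sketch of describing an actual installation in RDF: walk a resolved
// dependency tree and emit one triple per concrete (not semantic) version.
// (Hypothetical helper; the IRI scheme is illustrative.)
function installationToTriples(pkg, triples = []) {
  const subject = `<https://example.org/npm/${pkg.name}/${pkg.version}>`;
  for (const dep of pkg.dependencies || []) {
    const object = `<https://example.org/npm/${dep.name}/${dep.version}>`;
    // The exact installed version is recorded, so the description stays valid
    // even when the semantic version range would later resolve differently.
    triples.push(`${subject} npm:dependency ${object}.`);
    installationToTriples(dep, triples); // recurse into transitive dependencies
  }
  return triples;
}

const triples = installationToTriples({
  name: "n3", version: "0.10.0",
  dependencies: [{ name: "async", version: "2.4.0", dependencies: [
    { name: "lodash", version: "4.17.4" },
  ] }],
});
```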

The semantic description of software metadata provides a useful platform for simplifying tasks that otherwise require a lot of manual work, such as discovering license incompatibilities between projects, which is now possible with a SPARQL query. All 700,000+ npm packages produced 300,000,000+ triples, which we publish through a Triple Pattern Fragments [8] interface. These are located at https://, together with subject pages for each bundle, module, and user. The triples are collected and republished daily to stay up-to-date with the available information on npm.

SELECT DISTINCT ?project ?projectName ?description WHERE {
  <> doap:release ?version.
  ?dependingversion npm:dependency ?version.
  ?project doap:release ?dependingversion.
  ?project doap:name ?projectName.
  ?project dc:abstract ?description.
}

Listing 10: SPARQL query to discover all packages that depend on a given package.

SELECT ?name ?mail WHERE {
  <> doap:maintainer ?iauthor.
  ?author owl:sameAs ?iauthor.
  ?author foaf:name ?name.
  ?author foaf:mbox ?mail.
}

Listing 11: SPARQL query to discover the author of a package.

Queries are executed using a Triple Pattern Fragments browser client, which can provide insights that the Web was intended to give [21]. Examples are given in Listing 10 and Listing 11, which answer the questions Where is this module used? and Who wrote this code?.

As a complete example, the documentation of Components.js has been made self-instantiatable using its own framework and is available at http:/​/​


Discussion

The scientific process preaches standing on the shoulders of giants. This entails continuing existing work to derive new work, but also enabling others to build upon our work. An essential aspect of this process is the reproducibility of experimental results. This concept also applies to Web research, but the Web itself is also an ideal platform to improve the scientific process.

Unfortunately, a large number of computer science articles report software only by name or version number. This information is insufficient for readers to understand which exact version of the software, which versions of its dependencies, and which detailed configuration of the software’s components has obtained the reported results. Therefore, potential users do not necessarily obtain the correct software installation that will behave according to the article’s conclusions. Moreover, other researchers might fail in reproducing the same results because of differences in any such aspects.

Components.js deliberately adds semantics and Linked Data to dependency injection in order to accurately refer to the exact software configurations used in an article, and to automatically instantiate those configurations, thereby improving the replication of experimental software. Furthermore, it facilitates independently replacing and evaluating components, and publishing these configurations in turn.

[description diagram]

Fig. 3: A research article is based on result data, which are the outcomes of an experiment. The experiment in turn also has (multiple) provenance chains, and this article provides mechanisms to describe software configurations and software modules.

Claerbout’s Principle [22] explains that an article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures. This stresses the importance of reproducibility, and essentially mandates a detailed description of the executed experiment, all of the involved artifacts and actors, and the processing of the retrieved data. In essence, this entails the complete chain of provenance illustrated in Fig. 3, that links the research article to the data and the experiment that generates it, as well as all aspects surrounding that experiment.

Through this work, we make it easier to build sustainable research platforms, which helps pave the stairs to the shoulders of giants. The Linked Data Fragments server [8], for instance, is a reusable research platform that uses Components.js to achieve modularity. Furthermore, the Comunica [9] research platform for Linked Data querying also extensively uses Components.js to achieve its high modularity and flexibility. Components.js enables tools such as the LDF server and Comunica to be compatible with multiple APIs, support multiple features, and use multiple, interchangeable algorithms. Only one “core version” should exist, while many alternative configurations can co-exist. Support for different APIs and algorithms are simply pluggable components that are referred to within a configuration. Since components and configurations are identified by an IRI, they can exist anywhere on the Web. Based on an IRI, the injection framework can therefore instantiate software, and wire its dependent components together. The power of the Web is thereby leveraged, simplifying the replication of existing experiments and the creation of new ones. For example, the following config file allows a Comunica query engine to be recreated: https:/​/​


Conclusions

In this article, we introduced Components.js, a dependency injection framework that (i) interprets semantically described software components and their configuration, and thereby (ii) automatically instantiates experimental JavaScript applications. Semantic dependency injection brings the merits of Linked Data to empirical software engineering, enabling experimental software setups to be more transparent, flexible, and unambiguously citable. This enables joint discovery of experimental software and research articles by means of querying. Furthermore, experiment reporting can be extended with an IRI to the software configuration. The use of semantic configuration files can also facilitate more advanced static program analysis.

In future work, we aim to make the creation of semantic component files more developer-friendly. A smart editor can automatically parse source code and derive the appropriate semantic description on how components can be instantiated using which parameters. Additionally, these semantic component definition files provide an interesting platform for validating software dependency relations. Reasoning could for instance be done on parameter restrictions to check whether or not different bundle versions will break certain component invocations.