v0.1.5 Release Notes

This release focuses on several of BlueGraph’s dependency issues. In particular, it

  • adds more dependency constraints for the library;

  • adds pinned versions for the cord19kg.apps part (in order to make it reproducible);

  • moves gensim (together with its dependency tensorflow) to optional dependencies, which makes installing the basic library faster. The optional dependencies can be installed with extras, for example, pip install bluegraph[gensim], pip install bluegraph[stellargraph], pip install bluegraph[dev] or pip install bluegraph[all].

It also implements several bugfixes and features described below.

Backend support


  • Bugfixes in StellarGraph-based embeddings.

  • Inductive node embedding models (e.g. attri2vec or GraphSAGE) now support biased random walks.

Graph preprocessing with BlueGraph

Co-occurrence generation

Support for multi-set co-occurrence is added. Consider the following example:

import pandas as pd

from bluegraph import PandasPGFrame
from bluegraph.preprocess import CooccurrenceGenerator

graph = PandasPGFrame()
graph.add_nodes(["node1", "node2"])
# Attach the multi-set property "factor" to both nodes
graph.add_node_properties(
    pd.DataFrame([
        ["node1", ["a", "a", "b", "b", "c", "c"]],
        ["node2", ["a", "b", "b", "c", "c", "c"]]
    ], columns=["@id", "factor"]))

We want to generate co-occurrence edges for the given nodes using their property factor. Note that the property values are lists with some elements occurring multiple times (they are multi-sets).

generator = CooccurrenceGenerator(graph)
edges = generator.generate_from_nodes(
    "factor", compute_statistics=["frequency"])

The multi-set of common factors for the two nodes is the following:

>>> edges[["common_factors"]].to_dict()
{'common_factors': {('node1', 'node2'): ['a', 'b', 'b', 'c', 'c']}}

Therefore, the total co-occurrence frequency of node1 and node2 is 5:

>>> edges[["frequency"]].to_dict()
{'frequency': {('node1', 'node2'): 5}}
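
The arithmetic behind these two results can be checked independently of BlueGraph. The following is a minimal sketch using only the Python standard library; it is not BlueGraph's implementation, and the helper name multiset_intersection is hypothetical. It assumes that the common factors are the multi-set intersection of the two property values, where each element is kept as many times as it occurs in both lists.

```python
from collections import Counter

# Property values copied from the example above.
node1_factors = ["a", "a", "b", "b", "c", "c"]
node2_factors = ["a", "b", "b", "c", "c", "c"]


def multiset_intersection(left, right):
    """Return the elements common to both lists, keeping multiplicities.

    Hypothetical helper for illustration only (not part of BlueGraph).
    """
    # Counter & Counter keeps the minimum count of each element.
    common = Counter(left) & Counter(right)
    return sorted(common.elements())


common_factors = multiset_intersection(node1_factors, node2_factors)
# The co-occurrence frequency is the size of the multi-set intersection.
frequency = len(common_factors)

print(common_factors)  # ['a', 'b', 'b', 'c', 'c']
print(frequency)       # 5
```

Here "a" occurs twice for node1 but only once for node2, so it is counted once, while "b" and "c" are each counted twice, which gives 1 + 2 + 2 = 5.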



This release

  • fixes nexus-forge mappings to local files (in services.embedder);

  • fixes the Docker image to fetch nexusforge from the source.