toolkit.model.37.0.0.source-code.DependencyGraph.kt Maven / Gradle / Ivy
Part of the OSS Review Toolkit (ORT), a suite to automate software compliance checks.
/*
* Copyright (C) 2021 The ORT Project Authors (see )
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-License-Identifier: Apache-2.0
* License-Filename: LICENSE
*/
package org.ossreviewtoolkit.model
import com.fasterxml.jackson.annotation.JsonIgnore
import com.fasterxml.jackson.annotation.JsonInclude
import com.fasterxml.jackson.databind.annotation.JsonSerialize
import java.util.SortedSet
import org.ossreviewtoolkit.model.utils.DependencyGraphEdgeSortedSetConverter
import org.ossreviewtoolkit.model.utils.DependencyReferenceSortedSetConverter
import org.ossreviewtoolkit.model.utils.PackageLinkageValueFilter
/**
* Type alias for a [Map] that associates a [DependencyGraphNode] with the nodes representing its dependencies.
*/
typealias NodeDependencies = Map<DependencyGraphNode, List<DependencyGraphNode>>
/**
* A data class that represents the graph of dependencies of a project.
*
* This class holds information about a project's scopes and their dependencies in a format that minimizes the
* consumption of memory. In projects with many scopes there is often a high degree of duplication in the
* dependencies of the scopes. To avoid this, this class aims to share as many parts of the dependency graph as
* possible between the different scopes. Ideally, there is only a single dependency graph containing the dependencies
* used by all scopes. This is not always possible due to inconsistencies in dependency relations, like a package
* using different dependencies in different scopes. Then the dependency graph is split into multiple fragments, and
* each fragment has a consistent view on the dependencies it contains.
*
* When constructing a dependency graph the dependencies are organized as a connected structure of
* [DependencyReference] objects in memory. Originally, the serialization format of a graph was based on this
* structure, but that turned out to be not ideal: During serialization, sub graphs referenced from multiple nodes
* (e.g. libraries with transitive dependencies referenced from multiple projects) get duplicated, which can cause a
* significant amount of redundancy. Therefore, the data representation has been changed again to a form that can be
* serialized without introducing redundancy. It consists of the following elements:
*
* - *packages*: A list with the coordinates of all the packages (free of duplication) that are referenced by the
* graph. This allows extracting the packages directly, but also has the advantage that the package coordinates do
* not have to be repeated over and over: All the references to packages are expressed by indices into this list.
* - *nodes*: An ordered list with the nodes of the dependency graph. A single node represents a package, and
* therefore has a reference into the list with package coordinates. It can, however, happen that packages occur
* multiple times in the graph if they are in different subtrees with different sets of transitive dependencies.
* Then there are multiple nodes for the packages affected, and a *fragmentIndex* is used to identify them uniquely.
* Nodes also store information about issues of a package and their linkage.
* - *edges*: Here the structure of the graph comes in. Each edge connects two nodes and represents a directed
* *depends-on* relationship. The nodes are referenced by numeric indices into the list of *nodes*.
* - *scopes*: This is a map that associates the scopes used by projects with their direct dependencies. A single
* dependency graph contains the dependencies of all the projects processed by a specific package manager.
* Therefore, the keys of this map are scope names qualified by the coordinates of a project, which makes them
* unique. The values are references to the nodes in the graph that correspond to the packages the scopes depend on
* directly.
*
* So to navigate this structure, start with a *scope* and gather the references to its direct dependency *nodes*.
* Then, by following the *edges* starting from these *nodes*, the set of transitive dependencies can be determined.
* The numeric indices can be resolved via the *packages* list.
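*
* As a minimal sketch of this navigation (the sample graph, the coordinates, and the scope name are made up for
* illustration, and the [Identifier] constructor parameters are assumed to be `type`, `namespace`, `name`, and
* `version`), the packages reachable from a scope could be collected roughly like this:
*
* ```kotlin
* val graph = DependencyGraph(
*     packages = listOf(
*         Identifier(type = "Maven", namespace = "org.example", name = "lib-a", version = "1.0"),
*         Identifier(type = "Maven", namespace = "org.example", name = "lib-b", version = "2.0")
*     ),
*     scopes = mapOf("org.example:app:1.0:compile" to listOf(RootDependencyIndex(root = 0))),
*     nodes = listOf(DependencyGraphNode(pkg = 0), DependencyGraphNode(pkg = 1)),
*     edges = setOf(DependencyGraphEdge(from = 0, to = 1))
* )
*
* // Start with a scope and look up the nodes of its direct dependencies by package and fragment index.
* val roots = graph.scopes.getValue("org.example:app:1.0:compile").map { index ->
*     graph.nodes.first { it.pkg == index.root && it.fragment == index.fragment }
* }
*
* // Follow the edges via the precomputed dependencies mapping to find all transitively reachable nodes.
* val visited = mutableSetOf<DependencyGraphNode>()
* val queue = ArrayDeque(roots)
* while (queue.isNotEmpty()) {
*     val node = queue.removeFirst()
*     if (visited.add(node)) queue += graph.dependencies[node].orEmpty()
* }
*
* // Resolve the numeric package indices via the packages list; this yields lib-a and lib-b here.
* val reachable = visited.map { graph.packages[it.pkg] }
* ```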
*/
@JsonInclude(JsonInclude.Include.NON_DEFAULT)
data class DependencyGraph(
/**
* A list with the identifiers of the packages that appear in the dependency graph. This list is used to resolve
* the numeric indices contained in the [DependencyGraphNode] objects.
*/
val packages: List<Identifier> = emptyList(),
/**
* Stores the dependency graph as a list of root nodes for the direct dependencies referenced by scopes. Starting
* with these nodes, the whole graph can be traversed. The nodes are constructed from the direct dependencies
* declared by scopes that cannot be reached via other paths in the dependency graph. Note that this property
* exists for backwards compatibility only; it is replaced by the lists of nodes and edges.
*/
val scopeRoots: SortedSet<DependencyReference> = sortedSetOf(),
/**
* A mapping from scope names to the direct dependencies of the scopes. Based on this information, the set of
* [Scope]s of a project can be constructed from the serialized form.
*/
val scopes: Map<String, List<RootDependencyIndex>> = emptyMap(),
/**
* A list with the nodes of this dependency graph. Nodes correspond to packages, but in contrast to the [packages]
* list, there can be multiple nodes for a single package. The order of nodes in this list is relevant; the
* edges of the graph reference their nodes by numeric indices.
*/
val nodes: List<DependencyGraphNode> = emptyList(),
/**
* A set with the edges of this dependency graph. By traversing the edges, the dependencies of packages can be
* determined.
*/
@JsonSerialize(converter = DependencyGraphEdgeSortedSetConverter::class)
val edges: Set<DependencyGraphEdge> = emptySet()
) {
companion object {
/**
* A comparator for [DependencyReference] objects. Note that the concrete order does not really matter; it
* just has to be well-defined.
*/
val DEPENDENCY_REFERENCE_COMPARATOR = compareBy<DependencyReference>({ it.pkg }, { it.fragment })
/**
* Return a name for the given [scope][scopeName] that is qualified with parts of the identifier of the given
* [project]. This is used to ensure that the scope names are unique when constructing a dependency graph from
* multiple projects.
*/
fun qualifyScope(project: Project, scopeName: String): String = qualifyScope(project.id, scopeName)
/**
* Return a name for the given [scope][scopeName] that is qualified with parts of the given [projectId]. This
* is used to ensure that the scope names are unique when constructing a dependency graph from multiple
* projects.
*/
fun qualifyScope(projectId: Identifier, scopeName: String): String =
"${projectId.namespace}:${projectId.name}:${projectId.version}:$scopeName"
/**
* Extract the plain (un-qualified) scope name from the given qualified [scopeName]. If the passed in
* [scopeName] is not qualified, return it unchanged.
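*
* As a hedged illustration with a made-up project identifier (assuming the [Identifier] constructor parameters
* `type`, `namespace`, `name`, and `version`), a scope name that itself contains the separator survives a round
* trip:
*
* ```kotlin
* val id = Identifier(type = "Gradle", namespace = "org.example", name = "app", version = "1.0")
* val qualified = DependencyGraph.qualifyScope(id, "compile:classpath")
*
* check(qualified == "org.example:app:1.0:compile:classpath")
* check(DependencyGraph.unqualifyScope(qualified) == "compile:classpath")
*
* // A name with fewer than three separators is treated as not qualified and is returned unchanged.
* check(DependencyGraph.unqualifyScope("compile") == "compile")
* ```
*
* Note that an unqualified name containing three or more separators cannot be told apart from a qualified one.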
*/
fun unqualifyScope(scopeName: String): String =
// To handle the case that the scope contains the separator character, cut off the parts for the
// namespace, the name, and the version.
scopeName.split(':', limit = 4).getOrElse(3) { scopeName }
}
/**
* A mapping that allows fast access to the dependencies of a node in this graph.
*/
@get:JsonIgnore
val dependencies: NodeDependencies by lazy { constructNodeDependencies() }
/**
* Stores a mapping from dependency indices to [PackageReference] objects. This is needed when converting the
* data of this object to the classical layout of dependency information. The structure is created once and then
* used to convert parts of this graph.
*/
private val referenceMapping: Map<String, PackageReference> by lazy { constructReferenceMapping() }
/**
* Transform the data stored in this object to the classical layout of dependency information, which is a set of
* [Scope]s referencing the packages they depend on.
*/
fun createScopes(): Set<Scope> = createScopesFor(scopes, unqualify = true)
/**
* Transform a subset of the data stored in this object to the classical layout of dependency information. This is
* analogous to [createScopes], but only the provided [scopeNames] are taken into account. If [unqualify] is
* *true*, remove qualifiers from scope names before constructing the [Scope]s.
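*
* A minimal usage sketch, assuming a `graph` whose [scopes] contain the made-up key `org.example:app:1.0:compile`.
* Note that [scopeNames] are matched against the qualified keys of [scopes]:
*
* ```kotlin
* // With the default unqualify = true, the resulting Scope is created with the plain name "compile".
* val compileScopes = graph.createScopes(setOf("org.example:app:1.0:compile"))
*
* // Passing unqualify = false keeps the qualified name in the resulting Scope.
* val qualifiedScopes = graph.createScopes(setOf("org.example:app:1.0:compile"), unqualify = false)
* ```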
*/
fun createScopes(scopeNames: Set<String>, unqualify: Boolean = true): Set<Scope> =
createScopesFor(scopes.filterKeys { it in scopeNames }, unqualify)
/**
* Convert the given [map] with scope information to a set of [Scope]s. [Optionally][unqualify] remove qualifiers
* from scope names.
*/
private fun createScopesFor(map: Map<String, List<RootDependencyIndex>>, unqualify: Boolean): Set<Scope> =
map.mapTo(mutableSetOf()) { entry ->
val dependencies = entry.value.mapTo(mutableSetOf()) { index ->
referenceMapping[index.toKey()] ?: error("Could not resolve dependency index $index.")
}
val scopeName = if (unqualify) unqualifyScope(entry.key) else entry.key
Scope(scopeName, dependencies)
}
/**
* Construct a mapping from dependency indices to [PackageReference] objects. Based on this mapping, the
* structure with [Scope]s can be generated.
*/
private fun constructReferenceMapping(): Map<String, PackageReference> {
val refMapping = mutableMapOf<String, PackageReference>()
val allNodes = nodes.takeUnless { it.isEmpty() } ?: scopeRoots.map(DependencyReference::toGraphNode)
allNodes.forEach { constructReferenceTree(it, refMapping) }
return refMapping
}
/**
* Construct the tree with [PackageReference]s by navigating the dependency graph starting with [node] and
* populate the given [refMapping].
*/
private fun constructReferenceTree(
node: DependencyGraphNode,
refMapping: MutableMap<String, PackageReference>
): PackageReference {
val indexKey = RootDependencyIndex.generateKey(node.pkg, node.fragment)
return refMapping.getOrPut(indexKey) {
val refDependencies = dependencies[node].orEmpty().mapTo(mutableSetOf()) {
constructReferenceTree(it, refMapping)
}
PackageReference(
id = packages[node.pkg],
dependencies = refDependencies,
linkage = node.linkage,
issues = node.issues
)
}
}
/**
* Construct a mapping that allows fast navigation from a graph node to its dependencies.
*/
private fun constructNodeDependencies(): NodeDependencies =
when {
nodes.isNotEmpty() -> constructNodeDependenciesFromGraph(nodes, edges)
else -> constructNodeDependenciesFromScopeRoots(scopeRoots)
}
/**
* Return a map of all de-duplicated [Issue]s associated by [Identifier].
*/
fun collectIssues(): Map<Identifier, Set<Issue>> {
val collectedIssues = mutableMapOf<Identifier, MutableSet<Issue>>()
fun addIssues(pkg: Int, issues: Collection<Issue>) {
if (issues.isNotEmpty()) {
collectedIssues.getOrPut(packages[pkg]) { mutableSetOf() } += issues
}
}
fun addIssues(ref: DependencyReference) {
addIssues(ref.pkg, ref.issues)
ref.dependencies.forEach { addIssues(it) }
}
for (ref in scopeRoots) {
addIssues(ref)
}
nodes.forEach { node ->
addIssues(node.pkg, node.issues)
}
return collectedIssues
}
}
/**
* A data class representing the index of a root dependency of a scope.
*
* Instances of this class are used to reference the direct dependencies of scopes in the shared dependency graph.
* These dependencies form the roots of the dependency trees spawned by scopes.
*/
data class RootDependencyIndex(
/**
* The index of the root dependency referenced by this object. Each package acting as a dependency is assigned a
* unique index. Storing an index rather than an identifier reduces the amount of memory consumed by this
* representation.
*/
val root: Int,
/**
* The index of the fragment of the dependency graph this reference points to. This is used to distinguish between
* packages that occur multiple times in the dependency graph with different dependencies.
*/
@JsonInclude(JsonInclude.Include.NON_DEFAULT)
val fragment: Int = 0
) {
companion object {
/**
* Generate a string representation for the given [root] and [fragment] that is used internally as key
* in maps.
*/
fun generateKey(root: Int, fragment: Int): String = "$root.$fragment"
}
/**
* Generate a string key to represent this index.
*/
fun toKey(): String = generateKey(root, fragment)
}
/**
* A class to model a tree-like structure to represent the dependencies of a project.
*
* Instances of this class are used to store the relations between dependencies in fragments of dependency trees in an
* Analyzer result. The main purpose of this class is to define an efficient serialization format, which avoids
* redundancy as far as possible. Therefore, dependencies are represented by numeric indices into an external table.
* As a dependency can occur multiple times in the dependency graph with different transitive dependencies, the class
* defines another index to distinguish these cases.
*
* Note: This is intentionally not a data class. Equality is tested via references and not via the values contained.
*/
@JsonInclude(JsonInclude.Include.NON_DEFAULT)
class DependencyReference(
/**
* Stores the numeric index of the package dependency referenced by this object. The package behind this index can
* be resolved by evaluating the list of identifiers stored in [DependencyGraph] at this index.
*/
val pkg: Int,
/**
* Stores the index of the fragment in the dependency graph where the referenced dependency is contained. This is
* needed to uniquely identify the target if the dependency occurs multiple times in the graph.
*/
val fragment: Int = 0,
/**
* A set with the references to the dependencies of this dependency. That way a tree-like structure is established.
*/
@JsonSerialize(contentConverter = DependencyReferenceSortedSetConverter::class)
val dependencies: Set<DependencyReference> = emptySet(),
/**
* The type of linkage used for the referred package from its dependent package. As most of ORT's supported
* package managers / languages only support dynamic linking or at least default to it, also use that as the
* default value here to not blow up ORT result files.
*/
@JsonInclude(value = JsonInclude.Include.CUSTOM, valueFilter = PackageLinkageValueFilter::class)
val linkage: PackageLinkage = PackageLinkage.DYNAMIC,
/**
* A list of [Issue]s that occurred while handling this dependency.
*/
val issues: List<Issue> = emptyList()
) : Comparable<DependencyReference> {
/**
* Define an order on [DependencyReference] instances. Instances are ordered by package index, then by fragment index.
*/
override fun compareTo(other: DependencyReference): Int =
if (pkg != other.pkg) {
pkg - other.pkg
} else {
fragment - other.fragment
}
}
/**
* A data class representing a node in the dependency graph.
*
* A node corresponds to a package, which is referenced by a numeric index. A package may, however, occur multiple
* times in the dependency graph with different transitive dependencies. In this case, different fragment indices are
* used to distinguish between these occurrences.
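*
* As an illustrative sketch with made-up indices, a package at index 0 that occurs with two different sets of
* transitive dependencies is represented by two nodes that differ only in their fragment index:
*
* ```kotlin
* val nodes = listOf(
*     DependencyGraphNode(pkg = 0, fragment = 0),
*     DependencyGraphNode(pkg = 0, fragment = 1),
*     DependencyGraphNode(pkg = 1),
*     DependencyGraphNode(pkg = 2)
* )
*
* // Edges refer to positions in the list of nodes, so each occurrence of package 0 has its own dependency.
* val edges = setOf(DependencyGraphEdge(from = 0, to = 2), DependencyGraphEdge(from = 1, to = 3))
* ```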
*/
@JsonInclude(JsonInclude.Include.NON_DEFAULT)
data class DependencyGraphNode(
/**
* Stores the numeric index of the package dependency referenced by this object. The package behind this index can
* be resolved by evaluating the list of identifiers stored in [DependencyGraph] at this index.
*/
val pkg: Int,
/**
* Stores the index of the fragment in the dependency graph where the referenced dependency is contained. This is
* needed to uniquely identify the target if the dependency occurs multiple times in the graph.
*/
val fragment: Int = 0,
/**
* The type of linkage used for the referred package from its dependent package. As most of ORT's supported
* package managers / languages only support dynamic linking or at least default to it, also use that as the
* default value here to not blow up ORT result files.
*/
@JsonInclude(value = JsonInclude.Include.CUSTOM, valueFilter = PackageLinkageValueFilter::class)
val linkage: PackageLinkage = PackageLinkage.DYNAMIC,
/**
* A list of [Issue]s that occurred while handling this dependency.
*/
val issues: List<Issue> = emptyList()
)
/**
* A data class representing an edge in the dependency graph.
*
* An edge corresponds to a directed depends-on relationship between two packages. The packages are identified by the
* numeric indices into the list of nodes.
*/
data class DependencyGraphEdge(
/** The index of the source node of this edge. */
val from: Int,
/** The index of the destination node of this edge. */
val to: Int
)
/**
* Convert this [DependencyReference] to a [DependencyGraphNode].
*/
private fun DependencyReference.toGraphNode() = DependencyGraphNode(pkg, fragment, linkage, issues)
/**
* Construct a mapping of dependencies based on the given [roots].
*/
private fun constructNodeDependenciesFromScopeRoots(roots: Set<DependencyReference>): NodeDependencies {
val mapping = mutableMapOf<DependencyGraphNode, List<DependencyGraphNode>>()
fun construct(refs: Set<DependencyReference>) {
refs.forEach { ref ->
val node = ref.toGraphNode()
if (node !in mapping) {
mapping[node] = ref.dependencies.map(DependencyReference::toGraphNode)
construct(ref.dependencies)
}
}
}
construct(roots)
return mapping
}
/**
* Construct a mapping of dependencies based on the lists of graph [nodes] and [edges].
*/
private fun constructNodeDependenciesFromGraph(
nodes: List<DependencyGraphNode>,
edges: Set<DependencyGraphEdge>
): NodeDependencies {
val mapping = mutableMapOf<DependencyGraphNode, MutableList<DependencyGraphNode>>()
edges.forEach { edge ->
val srcNode = nodes[edge.from]
val dstNode = nodes[edge.to]
val dependencies = mapping.getOrPut(srcNode) { mutableListOf() }
dependencies += dstNode
}
// Add entries for nodes without dependencies.
nodes.filter { it !in mapping }.forEach { mapping[it] = mutableListOf() }
return mapping
}