-->

2020-06-11

Proposing authorization for Gremlin Server

Introduction

Gremlin Server is an important part of the Apache TinkerPop ecosystem because it supports various ways of accessing graph systems. In particular, Gremlin Server offers remote access over WebSocket connections to client applications based on a Gremlin Language Variant (GLV). In addition, Gremlin Server supports various security mechanisms such as authentication, read-only access, subgraph access, timeouts and sandboxing.

While this sounds like a comprehensive suite of features, there is a catch: it is not possible to discriminate between authenticated users in providing or limiting access. In other words, Gremlin Server does not provide authorization.

Interest in authorization solutions has recently become apparent in the TinkerPop-related user forums:

Some authentication and authorization details

The definitions of authentication (is this user who they claim to be?) and authorization (what is this user allowed to do?) are generally known, but the following characteristics of the underlying processes are not always immediately apparent:
  1. Authentication is a one-off procedure when logging in to an application, while authorization is required at the start of each new usage context (e.g. a database query or selecting some feature of an application).
  2. Authorization is implicit in an application that has a private store of authentication credentials for a limited set of users (like Gremlin Server's SimpleAuthenticator). In this case, authorization is binary: a user who can log in is allowed to run every possible operation inside the application, and everyone else is allowed none.
  3. Managing authorizations for each instance of an application in its own dedicated store is time-consuming and error-prone. Scalable authorization solutions manage accessible resources and allowed operations per user in a centralized store (e.g. LDAP, Apache Ranger, Active Directory).
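
The first point can be made concrete with a minimal Python sketch. The stores and function names below are hypothetical and only illustrate the difference in timing: `authenticate` runs once per session, `authorize` runs for every request.

```python
# Hypothetical in-memory stores; a scalable setup would consult a central
# store such as LDAP, Apache Ranger or Active Directory instead.
CREDENTIALS = {"alice": "s3cret"}
PERMISSIONS = {"alice": {"read"}}


def authenticate(user, password):
    """One-off check at login: is this the user they claim to be?"""
    if CREDENTIALS.get(user) != password:
        raise PermissionError("authentication failed")
    return user  # stands in for an authenticated session


def authorize(user, operation):
    """Per-request check: is this user allowed to run this operation?"""
    return operation in PERMISSIONS.get(user, set())


user = authenticate("alice", "s3cret")   # happens once per session
for operation in ("read", "write"):      # happens for every request
    print(operation, authorize(user, operation))
```

Because `authorize` looks up the permissions on every call, a revoked permission takes effect on the next request without requiring a new login.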

What to authorize

Before zooming in on specific characteristics of Gremlin Server, it is useful to consider the authorization needs of graph applications in general. As with relational databases, access to data can be granted at a number of levels. Because authorization schemes in SQL are more developed than in graph systems (the SQL language standard even has the GRANT/REVOKE keywords), we will pursue the analogy a bit further in the table below.

| Level | SQL RDBMS | Graph system |
|---|---|---|
| database | GRANT CONNECT ON moviedb TO users; | access to subset of graphs |
| table | GRANT SELECT ON titles TO users; | access to subset of vertices and edges, based on their labels |
| column | GRANT SELECT ON titles.director TO users; | access to subset of vertex and edge properties, based on their property keys |
| row | CREATE POLICY age_policy FOR SELECT ON titles TO child_users USING (age_limit < 18); | access to vertices and edges in a graph, based on a predicate on property values |

Some further comments on this table:
  • The row-level security construct is specific to PostgreSQL, but other RDBMSs have similar constructs.
  • Apart from the CONNECT operation at the database level and the SELECT operation at the table and row level, various other operations exist, e.g. for creation/insertion, deletion, etc.
  • The principle of using ACL-like (meta)properties on an object (vertex, edge, property) for column and row level security is already mentioned in the Apache TinkerPop reference documentation.
  • The fact that RDBMSs can be used to implement graph systems suggests that these authorization types should suffice for a graph system. Of course, there may be some additional permissions, such as the number of results a query may return or the maximum duration of a query.
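
To illustrate how the four levels could translate to a graph system, here is a minimal Python sketch. The `Grant` class, the `visible` function and the dictionary layout of an element are hypothetical and not part of TinkerPop; they only mirror the graph, label, property key and predicate levels of the analogy.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Grant:
    """Hypothetical permission record; None means 'no restriction'."""
    graph: str                                           # database level
    labels: Optional[frozenset] = None                   # table level
    property_keys: Optional[frozenset] = None            # column level
    predicate: Optional[Callable[[dict], bool]] = None   # row level


def visible(grant, graph, element):
    """Return the part of `element` the grant exposes, or None if hidden."""
    if graph != grant.graph:
        return None
    if grant.labels is not None and element["label"] not in grant.labels:
        return None
    properties = element["properties"]
    if grant.predicate is not None and not grant.predicate(properties):
        return None
    if grant.property_keys is not None:
        properties = {k: v for k, v in properties.items()
                      if k in grant.property_keys}
    return {"label": element["label"], "properties": properties}


child_grant = Grant(
    graph="moviedb",
    labels=frozenset({"title"}),
    property_keys=frozenset({"name", "director"}),
    predicate=lambda p: "age_limit" in p and p["age_limit"] < 18,
)

family_film = {"label": "title",
               "properties": {"name": "A", "director": "B", "age_limit": 6}}
adult_film = {"label": "title",
              "properties": {"name": "C", "director": "D", "age_limit": 18}}

print(visible(child_grant, "moviedb", family_film))
print(visible(child_grant, "moviedb", adult_film))
```

The row-level predicate is evaluated on the full set of properties before the column-level filter is applied, so a grant can restrict on a property that it does not expose.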

Impact on Gremlin Server

Gremlin Server already has some functionality in place that can be leveraged to implement authorization. Gremlin Server exposes a graph to GLV clients in the form of a GraphTraversalSource. A GraphTraversalSource can have so-called strategies applied to it, i.e. automatically applied traversal steps, some of which can limit access to the graph. Of particular interest in this respect are the ReadOnlyStrategy and the SubgraphStrategy. The ReadOnlyStrategy raises an exception when a traversal intends to mutate the underlying graph. The SubgraphStrategy adds filter steps to a traversal, using predicates on the vertex and edge labels and vertex properties. Custom strategies for time limits or result size limits could easily be constructed.

The Gremlin Server admin can configure different GraphTraversalSource instances for different groups of users and apply suitable strategies per user group. An authorization module can then easily check whether the granted permissions are consistent with the GraphTraversalSource a user requests to access.
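
The check described above can be sketched in a few lines of Python. The configuration layout, the traversal source names and the group names are assumptions made for illustration; only ReadOnlyStrategy and SubgraphStrategy are names taken from TinkerPop.

```python
# Hypothetical admin configuration: which strategies were applied to each
# GraphTraversalSource name and which user groups may request it.
TRAVERSAL_SOURCES = {
    "g": {"strategies": [], "groups": {"admins"}},
    "g_readonly": {"strategies": ["ReadOnlyStrategy"],
                   "groups": {"admins", "analysts"}},
    "g_movies": {"strategies": ["ReadOnlyStrategy", "SubgraphStrategy"],
                 "groups": {"analysts", "guests"}},
}


def may_access(user_groups, source_name):
    """True if any of the user's groups was granted the requested source."""
    source = TRAVERSAL_SOURCES.get(source_name)
    return source is not None and bool(set(user_groups) & source["groups"])


print(may_access({"analysts"}, "g_readonly"))  # analysts may read
print(may_access({"analysts"}, "g"))           # but not use the open source
```

In this sketch the authorization module never inspects the data itself: the restriction lives in the strategies of the traversal source, and the module only decides which source a user may bind to.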

Apart from the bytecode requests sent by GLVs over WebSockets, Gremlin Server accepts older-style string-based requests, which are run as regular Groovy scripts. This access mode does not lend itself to implementing authorization policies based on the ReadOnlyStrategy and SubgraphStrategy, because a Groovy script running on Gremlin Server can simply access the underlying graph with a "g.getGraph()" call. While bytecode requests do not suffer from this link between the TinkerPop APIs, they could still circumvent the authorization policies by using lambdas or by simply removing the controlling strategies. This implies that authorization in Gremlin Server requires some bytecode inspection to prevent usage of these mechanisms.
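
A minimal Python sketch of such an inspection is given below. It assumes a simplified, dictionary-based representation of a bytecode request, loosely modeled on the GraphSON serialization (lists of source and step instructions, lambdas tagged with "g:Lambda"); the function names are hypothetical and this is not the Gremlin Server implementation.

```python
def contains_lambda(value):
    """Recursively look for a lambda anywhere in the instruction arguments."""
    if isinstance(value, dict):
        if value.get("@type") == "g:Lambda":
            return True
        return any(contains_lambda(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(contains_lambda(v) for v in value)
    return False


def find_violations(bytecode):
    """List the mechanisms in a request that could bypass the strategies."""
    violations = []
    source_instructions = bytecode.get("source", [])
    if any(i and i[0] == "withoutStrategies" for i in source_instructions):
        violations.append("removes strategies")
    if contains_lambda(bytecode):
        violations.append("uses a lambda")
    return violations


# Assumed, simplified request shapes for illustration only.
safe = {"source": [],
        "step": [["V"], ["hasLabel", "person"], ["values", "name"]]}
unsafe = {
    "source": [["withoutStrategies", "ReadOnlyStrategy"]],
    "step": [["V"],
             ["filter", {"@type": "g:Lambda",
                         "@value": {"script": "it.get().label() == 'person'",
                                    "language": "gremlin-groovy"}}]],
}

print(find_violations(safe))
print(find_violations(unsafe))
```

The search is recursive because lambdas can hide in nested child traversals, not only in the top-level step arguments.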

Proposal

Introducing authorization to Gremlin Server need not, and probably should not, be a big-bang operation. The current proposal has a minimal first step in its scope, which includes:
  • Open up Gremlin Server to authorization plugins in a way analogous to the authentication plugin mechanism. This implies that a suitable error message is returned to the user for requests that do not get authorized. Note that the presence of an authorization plugin would be optional, so the mechanism would be non-breaking for existing applications.
  • The authorization plugin would receive the authenticated user, as ascertained by the authentication plugin, as well as the user request in the form of either a bytecode request or a string-based request. Typically, an authorization plugin would use the user request (traversal) to derive some relevant features, like the GraphTraversalSource accessed and the application of lambdas, OLAP, strategies, etc. It is a matter of taste whether the plugin API should provide only the raw request or also the derived features that are of interest to multiple plugins.
  • To illustrate the use of the authorization plugin mechanism, a sample WhiteListAuthorizer is provided that allows for configuring a limited authorization policy through a YAML file. A typical use case is a data science team that wants to directly query the graph store of a large graph application that only has authorization functionality on its GUI.
So, this first step supports an initial use case and lowers the barrier for more ambitious integrations with larger stores for authorization data, like an LDAP store, Apache Ranger and Active Directory.
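
The plugin mechanism of this first step could look roughly as follows. This is a Python sketch under stated assumptions, not the Gremlin Server API: the `Authorizer` base class, `handle_request`, the request fields and the policy keys are all hypothetical, and the policy dictionary stands in for the content of a parsed YAML file.

```python
class AuthorizationError(Exception):
    """Raised by a plugin when a request is not authorized."""


class Authorizer:
    """Hypothetical plugin interface: raise AuthorizationError to deny."""

    def authorize(self, user, request):
        raise NotImplementedError


class WhiteListAuthorizer(Authorizer):
    """Allows only users listed in the policy (assumed layout)."""

    def __init__(self, policy):
        self.policy = policy

    def authorize(self, user, request):
        sources = self.policy.get("traversal_sources", {})
        if user not in sources.get(request["source"], []):
            raise AuthorizationError(
                f"{user} may not access {request['source']}")
        if request.get("uses_lambda") and user not in self.policy.get("lambdas", []):
            raise AuthorizationError(f"{user} may not use lambdas")


def handle_request(user, request, authorizer=None):
    """The plugin is optional: without one, every request is executed."""
    if authorizer is not None:
        try:
            authorizer.authorize(user, request)
        except AuthorizationError as error:
            return f"error: {error}"
    return f"ok: executed on {request['source']}"


policy = {"traversal_sources": {"g_readonly": ["alice", "bob"]},
          "lambdas": ["alice"]}
authorizer = WhiteListAuthorizer(policy)

print(handle_request("bob", {"source": "g_readonly"}, authorizer))
print(handle_request("bob", {"source": "g_readonly", "uses_lambda": True}, authorizer))
print(handle_request("carol", {"source": "g_readonly"}))
```

Here the derived features (the traversal source and the use of lambdas) are passed in as part of the request, which corresponds to the option of exposing derived features on the plugin API rather than only the raw request.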

Readers wanting to follow the fate of this proposal can do so by visiting:

[Edited] A pull request implementing this proposal and additional comments from the TinkerPop community was merged on December 24, 2020.