How pg_graphql works

2024-01-24

14 minute read

Supabase’s GraphQL API is powered by pg_graphql. In this post, we will look at the internal workings of pg_graphql. Since it is an extension written in the Rust programming language, familiar with Rust will help - although it’s not a requirement to understand this post.

This article will give you a deeper understanding of pg_graphql , helping you to:

  • Make design decisions about how to use GraphQL in your application.
  • Understand how a GraphQL query is processed internally by pg_graphql.
  • Understand, or even fix, simple bugs and implement beginner-friendly features in pg_graphql.

What is pg_graphql?

pg_graphql is a Postgres extension that reads the SQL schema in a database and exposes it as a GraphQL schema. The GraphQL interface is made available through a SQL function graphql.resolve(...) which allows any programming language to use GraphQL without any additional servers, processes, or libraries. It is also possible to call the graphql.resolve(...) function from PostgREST, or any other HTTP proxy, to safely expose the GraphQL API via HTTP/S.

GraphQL request

When a client sends a GraphQL request to pg_graphql it gets a response back. But do you know what happens inside pg_graphql to serve that request? Let’s take a look at the life of a GraphQL request.

The entry point of a request is the graphql.resolve(query, ..) function. This function calls the graphql._internal_resolve(query) function which is written in Rust and where the real magic starts. The first line in that function uses the graphql-parser crate to parse the GraphQL query into a tree structure. Let’s see how that works.

Parsing GraphQL

Parsing converts a query string into an abstract syntax tree (AST). This is a two-step process, first, a tokenizer converts the input string into tokens, and then a parser organizes the tokens into nodes of the AST.

Tokenizer

A tokenizer (aka a lexer) reads the query string and spits out tokens in the language. For example, take a look at the following query:


_10
query {
_10
bookCollection {
_10
edges {
_10
node {
_10
id
_10
}
_10
}
_10
}
_10
}

It will be turned into the following tokens by the tokenizer: query, {, bookCollection, {, edges, {, node, {, id, }, }, } and }. How does the tokenizer know where a token starts and ends? The tokenizer relies on lexical grammar to figure out token boundaries. It looks at the next character in the text to first find the kind of token to expect and then uses the grammar rules for that token to find where it ends. For example, the first character in our example is q which means it must be a Name because its grammar looks like this:

lexical grammar

The grammar tells the tokenizer that if the next character is a Letter or an underscore then it is the start of a Name. And when the tokenizer finds a character that is not a NameContinue it ends the token. In our example, the Name token ends before the first whitespace after query.

Note that the tokenizer’s job is to just produce valid lexical tokens, even if those tokens do not make a valid GraphQL query. For example, the tokenizer will happily produce the tokens }, query, and { for an input string } query {. It is the parser’s job to reject these sequences of tokens as invalid.

There are also tokens that the tokenizer ignores. There’s good reason, for example, to ignore whitespace because it allows you to format your code as you please. But a quirk of the lexical structure of GraphQL is that it also ignores commas. This means, you can stick a comma just about anywhere and the query would still be valid. E.g. the last example can also be rewritten as:


_10
query, {,
_10
bookCollection, {,
_10
edges, {,
_10
node, {,
_10
id,
_10
},
_10
},
_10
},
_10
},

It’s possible you haven’t heard that before. We suggest not abusing the comma; use it thoughtfully to write queries that are easy to read.

Parser

The list of tokens produced by the tokenizer are consumed by the parser to generate the AST. The parsing grammar dictates how the parser makes sense of the tokens. Since pg_graphql needs to execute the query, it expects an ExecutableDocument. So the parser tries to parse an ExecutableDocument which is defined in the grammar like this:

executable document

An ExecutableDocument contains a list of ExecutableDefinitions which is defined like this:

executable definitions

An ExecutableDefinition can be either an OperationDefinition or a FragmentDefinition. Which one should the parser try to parse? Similar to how the tokenizer looks at the next character, the parser can look at the next token to know which definition lies next. The next token in our example is query. Can query appear at the beginning of OperationDefinition? Let’s check its definition:

operation definition

The parser again has two choices since OperationDefinition can either be an OperationType Name(opt) VariableDefinitions(opt) Directives(opt) SelectionSet or a SelectionSet. So which of those two can start with query? The first one starts with OperationType which is defined like this:

operation definition

Et voilà! The query token can start an OperationType, which means the parser now knows that it has to parse the first arm of the OperationDefinition. Which in turn means that it is going to parse an OperationDefinition arm of the ExecutableDefinition. Can’t the other arm of the definition also start with query? No, the language designers designed the grammar to avoid such ambiguities.

Now that the parser knows it will parse OperationType Name(opt) VariableDefinitions(opt) Directives(opt) SelectionSet, it skips the query token and looks at the next token, which is {. A { It can’t appear in the beginning of Name(opt), VariableDefinitions(opt) or Directives(opt), but it can start a SelectionSet:

operation definition

So the parser skips past the { token and then tries to parse a list of Selections. The parser parses the rest of the input string using the same process. It rejects an invalid list of tokens like }, query and { because no grammar rule starts with an }. For our example query, the parser generates the following AST:


_40
Document {
_40
definitions: [Operation(Query(Query {
_40
name: None,
_40
variable_definitions: [],
_40
directives: [],
_40
selection_set: SelectionSet {
_40
items: [Field(Field {
_40
alias: None,
_40
name: "bookCollection",
_40
arguments: [],
_40
directives: [],
_40
selection_set: SelectionSet {
_40
items: [Field(Field {
_40
alias: None,
_40
name: "edges",
_40
arguments: [],
_40
directives: [],
_40
selection_set: SelectionSet {
_40
items: [Field(Field {
_40
alias: None,
_40
name: "node",
_40
arguments: [],
_40
directives: [],
_40
selection_set: SelectionSet {
_40
items: [Field(Field {
_40
alias: None,
_40
name: "id",
_40
arguments: [],
_40
directives: [],
_40
selection_set: SelectionSet { items: [] },
_40
})],
_40
},
_40
})],
_40
},
_40
})],
_40
},
_40
})],
_40
},
_40
}))],
_40
}

An AST in graphql-parser is just Rust structs and enums. For example, the Document is a struct:


_10
#[derive(Debug, Clone, PartialEq)]
_10
pub struct Document<'a, T: Text<'a>> {
_10
pub definitions: Vec<Definition<'a, T>>,
_10
}

A Definition is an enum:


_10
#[derive(Debug, Clone, PartialEq)]
_10
pub enum Definition<'a, T: Text<'a>> {
_10
Operation(OperationDefinition<'a, T>),
_10
Fragment(FragmentDefinition<'a, T>),
_10
}

However, it is not enough to parse the query into an AST. Why? Similar to how a valid list of tokens produced by the tokenizer doesn’t mean it will produce a valid AST, a valid AST doesn’t mean that the AST can be executed by pg_graphql. So before execution, pg_graphql validates the AST.

Validation

To understand why a valid AST doesn’t mean that the query can be executed take the following example. Here the query produced a valid AST but it is still invalid because there can't be a named operation together with an anonymous operation in a GraphQL query:


_19
{
_19
bookCollection {
_19
edges {
_19
node {
_19
id
_19
}
_19
}
_19
}
_19
}
_19
_19
query getBookNames {
_19
bookCollection {
_19
edges {
_19
node {
_19
name
_19
}
_19
}
_19
}
_19
}

The above was just one example of the kind of validations performed. The GraphQL spec defines many other types of validations like:

pg_graphql performs these validations and returns errors if they fail. For example:

Some of the validations need to know the types in the reflected GraphQL schema. For example, for the field selection validation, the validation code must know which fields are defined on a GraphQL object. This information is contained in the reflected GraphQL schema.

Schema Reflection

pg_graphql builds GraphQL schema by reading information from many system catalogs. The load_sql_context function reads this information into a Context object by running the query in the load_sql_context.sql file. load_sql_context.sql loads information about enums, composite and non-composite types, foreign keys, schemas, tables and functions.

It might look like the context is loaded for each GraphQL query, but this function is memoized for performance by the cached attribute. This means it will only be called again if its input argument _config of type Config changes from the last time it was called.Config has three parts:

  1. Current search path. It is a list of schemas that are searched in order when looking up a database object.
  2. Current role. It is the role under which any SQL statements will be executed.
  3. Current schema version. It is the current version of the database’s schema. The version is updated by a couple of triggers that detect changes in schema.

It makes sense to reload the schema if any Config fields change because they can potentially alter the results of running transpiled SQL statements.

The Context object is wrapped in a __Schema object which is used to not only serve introspection queries but also provide information to run validations and for transpilation to SQL. For example, take a look at how tables are converted into<table_name>Collections. This code adds Collection objects to the QueryType object by iterating over tables and adding a <table_name>Collectionfield. Notice howQueryType's schema field is used extensively throughout this code snippet.

Transpilation and Query Execution

Transpilation is a two-step process. First, builder objects are constructed from the AST and __Schema, and then the builder objects are converted into SQL. A builder object contains all the information needed to produce a SQL query.

For example, when a table is added as a collection object to the QueryType object, its type_ field is set to __Type::Connection(_). This type is then checked in the resolve_selection_set function and converted to an appropriate builder type. A builder implements the QueryEntrypoint trait. QueryEntrypoint has only one required method named to_sql_entrypoint which the ConnectionBuilder implements by calling its to_sql method. The to_sql method contains the meat of the logic to generate SQL.

An important aspect of the SQL generation code is how it calls quote_ident and quote_literal functions to avoid SQL injection. Without them, a caller could potentially send a specially crafted input to execute arbitrary SQL code.

The generated SQL code is then run in the execute method of QueryEntrypoint trait. The transpiled queries return a jsonb object which is deserialized into a pgrx::JsonB object. Since pgrx::JsonB is a wrapper over a serde_json::Value object, it is trivial for the code to unwrap it and return this as a JSON response to the client.

Conclusion

In this post, we looked at how pg_graphql processes a GraphQL request. A request goes through the steps of tokenization, parsing, validation, transpilation, and execution. We looked in detail at the actions pg_graphql performs in each step. This knowledge should equip you to understand how pg_graphql works internally and help you make more informed decisions about how you can better use GraphQL APIs. If you feel ambitious, you can also start contributing to pg_graphql which we always welcome.

More pg_graphql

Share this article

Build in a weekend, scale to millions