Performance

This page explores the performance of Oso across three main axes:

1. In practice. How does Oso perform under typical workloads?

2. Internals and Micro-benchmarks. How is Oso built? What are the micro-benchmarks?

3. Scaling. What is the theoretical complexity of a query?

In Practice

There are two main areas to consider when measuring the performance of Oso queries: the time to evaluate a query relative to a policy, and the time needed to fetch application data.

In a complex policy, the time it takes to run a single query depends on the complexity of the answer. For example, a simple rule that says anyone can “GET” the path “/” will execute in less than 1 ms. On the other hand, rules that use HTTP path mapping, resource lookups, roles, inheritance, etc. can take approximately 1-20 ms. (These numbers are based on queries executing against a local SQLite instance to isolate Oso’s performance from the time to perform database queries.)

The time needed to fetch application data is, of course, dependent on your specific environment and independent of Oso. Aggressive caching can be used to reduce some of the effect of such latencies.

Profiling

Oso does not currently have built-in profiling tools, but this is a high-priority item on our near-term roadmap. Our benchmark suite uses Rust’s statistical profiling package, but is currently better suited to optimizing the implementation than to optimizing a specific policy.

Oso has a default maximum query execution time of 30s. If you hit this maximum, it likely means that you have created an infinite loop in your policy. You can use the Polar debugger to help track down such bugs.

For performance issues caused by slow database queries or too many database queries, we recommend that you address these issues at the data access layer, i.e., in the application. See, for example, our guidance on The “N+1 Problem”.

Internals and Micro-benchmarks

The core of Oso is the Polar virtual machine, which is written in Rust. (For more on the architecture and implementation, see Internals.) A single step of the virtual machine takes approximately 1-2 us, depending on the instruction or goal. Simple operations like comparisons and assignment typically take just a few instructions, whereas more complex operations like pattern matching against an application type or looking up application data need a few more. The debugger can show you the VM instructions remaining to be executed during a query using the goals command.

The current implementation of Oso has not yet been aggressively optimized for performance, but several low-hanging opportunities for optimizations (namely, caches and indices) are on our near-term roadmap. We do ensure that all memory allocated during a query is reclaimed by its end, and our use of Rust ensures that the implementation is not vulnerable to many common classes of memory errors and leaks.

You can check out our current benchmark suite in the repository, along with instructions on how to run it. We would be delighted to accept any example queries that you would like to see profiled; please feel free to email us at engineering@osohq.com.

Scaling

At its core, answering queries against a declarative policy is a depth-first search problem: nodes correspond to rules, and nodes are connected if a rule references another rule in its body.

As a result, the algorithmic complexity of a policy is in theory very large — exponential in the number of rules. However, in practice there shouldn’t be that many distinct paths that need to be taken to make a policy decision. Oso filters out rules that cannot be applied to the inputs early on in the execution. What this means is that if you are hitting a scaling issue, you can make your policies perform better by either by splitting up your rules to limit the number of possibilities, or by adding more specializers to your rule heads.

For example, suppose you have 20 different resources, ResourceA, ResourceB, …, and each has 10 or so allow(actor, action, resource: ResourceA) rules. The performance of evaluating a rule with input of type ResourceA will primarily depend on those 10 specific rules, and not the other 190 rules. In addition, you might consider refactoring this rule to allow(actor, action, resource: ResourceA) if allowResourceA(actor, action, resource). This would mean there are only 20 allow rules to sort through, and for a given resource only one of them will ever need to be evaluated.

The performance of evaluating policies is usually independent of the number of users or resources in the application when fetching data is handled by your application. However, as in any programming system, you need to be on the lookout for linear and super-linear searches. For example, if you have a method user.expenses() that returns a list of the user’s expenses, the check expense in user.expenses() will require O(n) VM instructions, where n is the length of the list. It would be better to replace the linear search with a single comparison, e.g. expense.user_id = user.id. Be especially careful when nesting such rules.

Summary

Oso typically answers simple authorization queries in less than 1 ms, but may take (much) longer depending on the complexity of your rules, the latency of application data access, and algorithmic choices. Some simple solutions such as caching and refactoring may be used to improve performance where needed.

The "N+1 Problem"
The “N+1 Problem” A core part of understanding how Oso will perform under regular workloads is recognizing that Oso applies a search algorithm to evaluate the policy. Since it is common in policies to iterate over members or attributes in order to look for matching information, it can be common to encounter variants of the N+1 problem. For example, given the following policy: has_grandchild_called(grandparent: Person, name) if child in grandparent.children and grandchild in child.children and grandchild.name = name; This can potentially exhibit this N+1 behavior. It will first call the Person.children method on the input grandparent, expecting a list to iterate over with the in operator. This might translate into a DB query by the application. For each child returned from this method, Person.children is again called, which may make another DB query, ultimately resulting in N+1 queries - one for the initial query, and one for each grandchild. The answer to solving this ultimately lies in how your application accesses data. Since this problem is not unique to Oso and authorization queries, there already exist a few patterns for this, such as eager-loading ORMs and dataloaders for GraphQL. Here we will show how these patterns can be leveraged in Oso. Option 1. Implement a lookup method that accepts as input a list. For example: class Person: @classmethod def batch_lookup_children(cls, people: List[Person]): parent_ids = [p.id for p in people] children = db.query( "select id, name from people, children \ where people.id = children.child_id, children.parent_id in ?", parent_ids ) return children has_grandchild_called(grandparent: Person, name) if children = grandparent.children # gets the _list_ of children and grandchild in Person.batch_lookup_children(children) and grandchild.name = name; This has the benefit of being the simplest, and most explicit. But does not leverage any data access abstractions. Any optimization method works fine here, for example if this were a sufficiently common use case, then a grandchildren method and DB index could be added to improve performance. Option 2. Implement some form of dataloader/eager loading in your application. This is the common approach to solve these in ORMs, like Ruby on Rails. The Person model in this case could be configured to eager load children when looking up records. In this case, each child record returned by the grandparent.children method call will already have loaded and locally stored the child.children attribute. For example, in a Django application you might write: has_grandchild_called(grandparent: Person, name) if child in grandparent.children.prefetch_related("children") and grandchild in child.children.all() and grandchild.name = name; Since Oso is able to work directly with native objects, using the existing Django methods to prefetch the grandchildren in this case can be applied directly where it’s used.

Connect with us on Slack

If you have any questions, or just want to talk something through, jump into Slack. An Oso engineer or one of the thousands of developers in the growing community will be happy to help.