Filter Data

When you call authorize(actor, action, resource), Oso evaluates the allow rule(s) you have defined in your policy to determine if actor is allowed to perform action on resource. For example, if jane wants to "edit" a document, Oso may check that jane = document.owner. But what if you need the set of all documents that Jane is allowed to edit? For example, you may want to render them as a list in your application.

One way to answer this question is to take every document in the system and call is_allowed on it. This is inefficient and often impractical. There could be thousands of documents in a database but only three that have the owner "steve". Instead of fetching every document and passing it into Oso, it’s better to ask the database for only the documents that have the owner "steve". Using Oso to filter the data in your data store based on the logic in your policy is what we call “Data Filtering”.
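To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. It does not use Oso at all: the table layout, the sample rows and the is_allowed_sketch helper are assumptions made up for illustration. It only contrasts checking every row in application code with pushing the same condition into the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, owner TEXT)")
# Hypothetical data: 1000 documents, only three of them owned by "steve".
rows = [(i, "steve" if i < 3 else "someone_else") for i in range(1000)]
conn.executemany("INSERT INTO documents VALUES (?, ?)", rows)


def is_allowed_sketch(actor, document_owner):
    # Stand-in for a per-object policy check such as `actor = document.owner`.
    return actor == document_owner


# Approach 1: fetch every row, then check each one in application code.
all_docs = conn.execute("SELECT id, owner FROM documents ORDER BY id").fetchall()
checked = [doc_id for doc_id, owner in all_docs if is_allowed_sketch("steve", owner)]

# Approach 2: let the database apply the same condition.
filtered = [
    doc_id
    for (doc_id,) in conn.execute(
        "SELECT id FROM documents WHERE owner = ? ORDER BY id", ("steve",)
    )
]
```

Both approaches produce the same ids, but the first transfers all 1000 rows to the application while the second transfers only the matching ones.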

ORM Integrations

If you are using one of our ORM adapter libraries like sqlalchemy-oso or django-oso, then data filtering is already built in and you won’t have to worry about integrating it yourself. See the docs for that ORM library instead.

You can use data filtering to enforce authorization on queries made to your data store. Oso will take the logic in the policy and turn it into a query for the authorized data. Examples could include an ORM filter object, an HTTP request or an Elasticsearch query. The query object and the way the logic maps to a query are both user-defined.

Data filtering is initiated through two methods on Oso.

authorized_resources returns a list of all the resources a user is allowed to perform an action on. These are the results of building and executing a query.

authorized_query returns the query object itself. This lets you add additional filters or sorts or any other data to it before executing it.

You must define how to build queries and a few other details when you register classes to enable these methods.

Implementing data filtering

Query Functions

There are three Query functions that must be implemented. These define what a query is for your application, how the logic in the policy maps to them, how to execute them and how to combine two queries.

Build a Query

build_query takes a list of Filters and returns a Query.

Filters are individual pieces of logic that must apply to the data being fetched.

Filters have a kind, a field and a value. Their meaning depends on the kind field.

  • Eq means that the field must be equal to the value.
  • Neq means that the field must not be equal to the value.
  • In means that the field must be equal to one of the values in value. Value will be a list.
  • Nin means that the field must not be equal to any of the values in value. Value will be a list.
  • Contains means that the field must contain the value. This only applies if the field is a list.

The condition described by a Filter applies to the data stored in the attribute field of a resource. The field of a Filter may be None, in which case the condition applies to the resource directly.
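The filter kinds above can be illustrated with a small in-memory sketch. The Filter namedtuple below is a hypothetical stand-in with the same three attributes (kind, field, value), not the class Oso passes to build_query, and the sample repositories are invented. The sketch assumes that every filter in the list must hold at once, which matches how the SQLAlchemy example in this guide chains filter calls.

```python
from collections import namedtuple

# Hypothetical stand-in for the filter objects passed to build_query.
Filter = namedtuple("Filter", ["kind", "field", "value"])


def matches(resource, flt):
    # A field of None means the condition applies to the resource itself.
    actual = resource if flt.field is None else resource[flt.field]
    if flt.kind == "Eq":
        return actual == flt.value
    if flt.kind == "Neq":
        return actual != flt.value
    if flt.kind == "In":
        return actual in flt.value
    if flt.kind == "Nin":
        return actual not in flt.value
    if flt.kind == "Contains":
        return flt.value in actual
    raise ValueError(f"unsupported filter kind: {flt.kind}")


def apply_filters(resources, filters):
    # Keep only the resources for which every filter holds.
    return [r for r in resources if all(matches(r, f) for f in filters)]


# Invented sample data for the sketch.
repos = [
    {"id": "ios", "org_id": "apple", "tags": ["mobile"]},
    {"id": "oso", "org_id": "osohq", "tags": ["auth", "rust"]},
    {"id": "demo", "org_id": "osohq", "tags": ["auth"]},
]

osohq_repos = apply_filters(repos, [Filter("Eq", "org_id", "osohq")])
rust_repos = apply_filters(repos, [Filter("Contains", "tags", "rust")])
```

A real build_query would translate each kind into the equivalent condition of your data store instead of evaluating it in Python.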

Execute a Query

exec_query takes a query and returns a list of the results.

Combine Queries

combine_query takes two queries and returns a new query that returns the union of the other two. For example, if the two queries are SQL queries, combine_query could UNION them. If they were HTTP requests, combine_query could put them in an array, and exec_query could then execute the array of requests and combine the results.
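As a rough sketch of the non-SQL case, a query can be modeled as a list of request descriptions. Everything here is an assumption for illustration: fake_fetch stands in for an HTTP call, DOCUMENTS for the remote data, and make_request_query is a simplified helper that takes plain parameters rather than Oso's filters.

```python
# Hypothetical in-memory "service" standing in for an HTTP API.
DOCUMENTS = [
    {"id": 1, "owner": "steve"},
    {"id": 2, "owner": "leina"},
    {"id": 3, "owner": "steve"},
]


def fake_fetch(params):
    # Stand-in for one HTTP request: return documents matching every param.
    return [d for d in DOCUMENTS if all(d[k] == v for k, v in params.items())]


def make_request_query(params):
    # A query is a list of request descriptions; a fresh query holds one.
    return [params]


def combine_query(q1, q2):
    # The union of two queries is simply both lists of requests.
    return q1 + q2


def exec_query(query):
    # Run every request and merge the results, dropping duplicates by id.
    seen = {}
    for params in query:
        for doc in fake_fetch(params):
            seen.setdefault(doc["id"], doc)
    return list(seen.values())


combined = combine_query(
    make_request_query({"owner": "steve"}),
    make_request_query({"id": 1}),
)
results = exec_query(combined)
```

Deduplicating in exec_query matters because a union should return each resource once even when both sub-queries match it.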

You can define functions that apply to all types with set_data_filtering_query_defaults, or you can pass type-specific ones when you register a class.

Fields

The other thing you have to provide to use data filtering is type information for registered classes. This lets Oso know what the types of an object’s fields are. Oso needs this information to handle specializers and other things in the policy when we don’t have a concrete resource. The fields are a dictionary from field name to type.

Example

In this example we’ll model access to code repositories in a simple Git hosting application.

data_filtering_example_a.py
# We're using sqlalchemy here, but you can use data filtering with any ORM
from sqlalchemy import create_engine
from sqlalchemy.types import String, Boolean, Integer
from sqlalchemy.schema import Column, ForeignKey
from sqlalchemy.orm import sessionmaker, relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Repository(Base):
    __tablename__ = "repos"

    id = Column(String(), primary_key=True)


class User(Base):
    __tablename__ = "users"

    id = Column(String(), primary_key=True)


class RepoRole(Base):
    __tablename__ = "repo_roles"
    id = Column(Integer, primary_key=True)
    user_id = Column(String, ForeignKey("users.id"), nullable=False)
    repo_id = Column(String, ForeignKey("repos.id"), nullable=False)
    user = relationship("User", backref="repo_roles", lazy=True)
    name = Column(String, index=True)


engine = create_engine("sqlite:///:memory:")

Session = sessionmaker(bind=engine)
session = Session()

Base.metadata.create_all(engine)

# Here's some data to work with ...
ios = Repository(id="ios")
oso_repo = Repository(id="oso")
demo_repo = Repository(id="demo")

leina = User(id="leina")
steve = User(id="steve")

role_1 = RepoRole(user_id="leina", repo_id="oso", name="contributor")
role_2 = RepoRole(user_id="leina", repo_id="demo", name="maintainer")

objs = {
    "leina": leina,
    "steve": steve,
    "ios": ios,
    "oso_repo": oso_repo,
    "demo_repo": demo_repo,
    "role_1": role_1,
    "role_2": role_2,
}
for obj in objs.values():
    session.add(obj)
session.commit()

We need to register each class and define its query functions.

data_filtering_example_a.py
# build_query takes a list of filters and returns a query
def build_query(filters):
    query = session.query(Repository)
    for filter in filters:
        assert filter.kind in ["Eq", "In"]
        field = getattr(Repository, filter.field)
        if filter.kind == "Eq":
            query = query.filter(field == filter.value)
        elif filter.kind == "In":
            query = query.filter(field.in_(filter.value))
    return query


# exec_query takes a query and returns a list of resources
def exec_query(query):
    return query.all()


# combine_query takes two queries and returns a new query
def combine_query(q1, q2):
    return q1.union(q2)


from oso import Oso

oso = Oso()

oso.register_class(
    Repository,
    fields={
        "id": str,
    },
    build_query=build_query,
    exec_query=exec_query,
    combine_query=combine_query,
)

oso.register_class(User, fields={"id": str, "repo_roles": list})

Then we can load a policy and query it.

policy_a.polar
actor User {}

resource Repository {
	permissions = ["read", "push", "delete"];
	roles = ["contributor", "maintainer", "admin"];

	"read" if "contributor";
	"push" if "maintainer";
	"delete" if "admin";

	"maintainer" if "admin";
	"contributor" if "maintainer";
}

allow(actor, action, resource) if has_permission(actor, action, resource);

has_role(user: User, role_name: String, repository: Repository) if
	role in user.repo_roles and
	role.name = role_name and
	role.repo_id = repository.id;

data_filtering_example_a.py
# policy_a is assumed to hold the contents of policy_a.polar as a string
oso.load_str(policy_a)
# Verify that the policy works as expected
leina_repos = list(oso.authorized_resources(leina, "read", Repository))
assert leina_repos == [demo_repo, oso_repo]

Relations

Often you need data that is not contained on the object to make authorization decisions. This comes up when the role required to do something is implied by a role on its parent object. For instance, you want to check the organization for a repository but that data isn’t embedded on the repository object. You can add a Relation type to the type definition that states how the other resource is related to this one. Then you can access this field in the policy like any other field and it will fetch the data when it needs it (via the query functions).

Relations are a special type that tells Oso how one Class is related to another. They specify what the related type is and how it’s related.

  • kind is either “one” or “many”. “one” means there is one related object and “many” means there is a list of related objects.
  • other_type is the class of the related objects.
  • my_field is the field on this object that matches other_field.
  • other_field is the field on the other object that matches my_field.

The my_field / other_field relationship is similar to a foreign key. It lets Oso know which fields to match up when building a query for the other type.
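The foreign-key analogy can be sketched in plain Python. RelationSketch below is a hypothetical dataclass that only mirrors the four parameters listed above; it is not Oso's Relation class, and the TABLES dictionary is invented sample data. The resolve helper shows the matching rule: an object is related to every object of other_type whose other_field equals this object's my_field.

```python
from dataclasses import dataclass


@dataclass
class RelationSketch:
    # Hypothetical stand-in mirroring the four Relation parameters.
    kind: str
    other_type: str
    my_field: str
    other_field: str


# Invented sample data, keyed by type name.
TABLES = {
    "Organization": [{"id": "osohq"}, {"id": "apple"}],
    "Repository": [
        {"id": "ios", "org_id": "apple"},
        {"id": "oso", "org_id": "osohq"},
        {"id": "demo", "org_id": "osohq"},
    ],
}


def resolve(obj, relation):
    # Match obj[my_field] against other_field on every object of other_type.
    related = [
        other
        for other in TABLES[relation.other_type]
        if other[relation.other_field] == obj[relation.my_field]
    ]
    if relation.kind == "one":
        return related[0] if related else None
    return related


# A repository has one organization; an organization has many repositories.
repo_to_org = RelationSketch("one", "Organization", "org_id", "id")
org_to_repos = RelationSketch("many", "Repository", "id", "org_id")

oso_org = resolve(TABLES["Repository"][1], repo_to_org)
osohq_repos = resolve(TABLES["Organization"][0], org_to_repos)
```

Note how the "many" direction swaps my_field and other_field relative to the "one" direction.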

Example

This time our data will be a little more complicated in order to model a more sophisticated policy.

data_filtering_example_b.py
from sqlalchemy import create_engine
from sqlalchemy.types import String, Boolean, Integer
from sqlalchemy.schema import Column, ForeignKey
from sqlalchemy.orm import sessionmaker, relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Organization(Base):
    __tablename__ = "orgs"

    id = Column(String(), primary_key=True)


# Repositories belong to Organizations
class Repository(Base):
    __tablename__ = "repos"

    id = Column(String(), primary_key=True)
    org_id = Column(String, ForeignKey("orgs.id"), nullable=False)


class User(Base):
    __tablename__ = "users"

    id = Column(String(), primary_key=True)


class RepoRole(Base):
    __tablename__ = "repo_roles"
    id = Column(Integer, primary_key=True)
    user_id = Column(String, ForeignKey("users.id"), nullable=False)
    repo_id = Column(String, ForeignKey("repos.id"), nullable=False)
    user = relationship("User", backref="repo_roles", lazy=True)
    name = Column(String, index=True)


class OrgRole(Base):
    __tablename__ = "org_roles"
    id = Column(Integer, primary_key=True)
    user_id = Column(String, ForeignKey("users.id"), nullable=False)
    org_id = Column(String, ForeignKey("orgs.id"), nullable=False)
    user = relationship("User", backref="org_roles", lazy=True)
    name = Column(String, index=True)


engine = create_engine("sqlite:///:memory:")

Session = sessionmaker(bind=engine)
session = Session()

Base.metadata.create_all(engine)

# Here's some more test data
osohq = Organization(id="osohq")
apple = Organization(id="apple")

ios = Repository(id="ios", org_id="apple")
oso_repo = Repository(id="oso", org_id="osohq")
demo_repo = Repository(id="demo", org_id="osohq")

leina = User(id="leina")
steve = User(id="steve")

role_1 = OrgRole(user_id="leina", org_id="osohq", name="owner")

objs = {
    "leina": leina,
    "steve": steve,
    "osohq": osohq,
    "apple": apple,
    "ios": ios,
    "oso_repo": oso_repo,
    "demo_repo": demo_repo,
    "role_1": role_1,
}
for obj in objs.values():
    session.add(obj)
session.commit()

We now need query functions for more than one class. Our build_query function depends on the class, but our exec_query and combine_query functions are the same for all types, so we can set them with set_data_filtering_query_defaults.

data_filtering_example_b.py
# The query logic is the same as before, but build_query is now created per class.
def build_query_cls(cls):
    def build_query(filters):
        query = session.query(cls)
        for filter in filters:
            assert filter.kind in ["Eq", "In"]
            field = getattr(cls, filter.field)
            if filter.kind == "Eq":
                query = query.filter(field == filter.value)
            elif filter.kind == "In":
                query = query.filter(field.in_(filter.value))
        return query

    return build_query


def exec_query(query):
    return query.all()


def combine_query(q1, q2):
    return q1.union(q2)


from oso import Oso, Relation

oso = Oso()

# All the combine/exec query functions are the same, so we
# can set defaults.
oso.set_data_filtering_query_defaults(
    exec_query=exec_query, combine_query=combine_query
)

oso.register_class(
    Organization,
    fields={
        "id": str,
    },
    build_query=build_query_cls(Organization),
)

oso.register_class(
    Repository,
    fields={
        "id": str,
        # Here we use a Relation to represent the logical connection between an Organization and a Repository.
        # Note that this only goes in one direction: to access repositories from an organization, we'd have to
        # add a "many" relation on the Organization class.
        "organization": Relation(
            kind="one", other_type="Organization", my_field="org_id", other_field="id"
        ),
    },
    build_query=build_query_cls(Repository),
)

oso.register_class(User, fields={"id": str, "repo_roles": list, "org_roles": list})

policy_b.polar
actor User {}

resource Organization {
	permissions = ["add_member", "read", "delete"];
	roles = ["member", "owner"];

	"add_member" if "owner";
	"delete" if "owner";

	"member" if "owner";
}

# Anyone can read.
allow(_, "read", _org: Organization);

resource Repository {
	permissions = ["read", "push", "delete"];
	roles = ["contributor", "maintainer", "admin"];
	relations = { parent: Organization };

	"read" if "contributor";
	"push" if "maintainer";
	"delete" if "admin";

	"maintainer" if "admin";
	"contributor" if "maintainer";

	"contributor" if "member" on "parent";
	"admin" if "owner" on "parent";
}

has_relation(organization: Organization, "parent", repository: Repository) if
	repository.organization = organization;

has_role(user: User, role_name: String, repository: Repository) if
	role in user.repo_roles and
	role.name = role_name and
	role.repo_id = repository.id;

has_role(user: User, role_name: String, organization: Organization) if
	role in user.org_roles and
	role.name = role_name and
	role.org_id = organization.id;

allow(actor, action, resource) if has_permission(actor, action, resource);

data_filtering_example_b.py
# policy_b is assumed to hold the contents of policy_b.polar as a string
oso.load_str(policy_b)
leina_repos = list(oso.authorized_resources(leina, "read", Repository))
assert leina_repos == [oso_repo, demo_repo]

Evaluation

When Oso evaluates a data filtering method, it uses queries to fetch objects. If multiple types are involved, it makes multiple queries and substitutes in the results where needed. In the above example we are fetching Repositories, but we are basing our fetch on information about their related Organization. To resolve the query, Oso first fetches the relevant Organizations (based in this case on role assignments), and then uses the Relation definition to substitute their ids into the query for Repositories. This is the main reason to use Relations: they let Oso know how different classes are related so that it can resolve data filtering queries.

Relation fields also work when you are not using data filtering methods and are just using authorize or another method where you have an object to pass in. In that case the query functions are still called to get related objects, so if you’re using a Relation to a type, you must define query functions for that type.
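The two-step fetch described above can be pictured with a plain-Python sketch. This is not a description of Oso's internals: the data, fetch_org_ids and fetch_repos are hypothetical and only mirror the sequence of first finding the relevant Organizations and then substituting their ids into an "In"-style condition on the Repository's org_id field.

```python
# Invented sample data mirroring the example above.
ORG_ROLES = [{"user_id": "leina", "org_id": "osohq", "name": "owner"}]
REPOS = [
    {"id": "ios", "org_id": "apple"},
    {"id": "oso", "org_id": "osohq"},
    {"id": "demo", "org_id": "osohq"},
]


def fetch_org_ids(user_id):
    # Step 1: find the organizations on which the user holds a role.
    return [role["org_id"] for role in ORG_ROLES if role["user_id"] == user_id]


def fetch_repos(org_ids):
    # Step 2: substitute those ids into a condition on the related field,
    # comparable to an "In" filter on org_id.
    return [repo for repo in REPOS if repo["org_id"] in org_ids]


leina_repo_ids = [repo["id"] for repo in fetch_repos(fetch_org_ids("leina"))]
steve_repo_ids = [repo["id"] for repo in fetch_repos(fetch_org_ids("steve"))]
```

In a real setup each step would go through the query functions registered for the corresponding class.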

Limitations

There are a few limitations to what you can do while using data filtering. You cannot call any methods on the passed-in resource, and you cannot pass the resource as an argument to any methods. Many cases where you would want to do this are better handled by Relation fields.

Some Polar expressions are not supported. not, cut and forall are not allowed in policies that use data filtering. Numeric comparisons with <, >, <= and >= are not currently supported either.

Relations only support matching on a single field. For example, relating a Student to their classmates with matching school_id and homeroom_id fields isn’t currently possible.
