Welcome to the Winnow documentation

Winnow is a Python package for safely building SQL where clauses from untrusted user input. It’s designed to be expressive, extensible, and fast. Winnow’s inputs look something like this:

{
    "logical_op": "&",
    "filter_clauses": [
        {
            "data_source": "Created",
            "operator": "before",
            "value": "2015-03-01"
        },
        {
            "data_source": "Owner",
            "operator": "any of",
            "value": [
                {"name": "Steven", "id": 23},
                {"name": "Margaret", "id": 41},
                {"name": "Evan", "id": 90}
            ]
        }
    ]
}

And its outputs looks like this:

(
  "WHERE created_date < %s::timestamp AND owner_id = ANY(VALUES (%s),(%s),(%s))",
  ('2015-03-01', 23, 41, 90)
)

Quickstart

Installation

Winnow is available from PyPI (I mean, it’s not right now, but it will be).

# TODO upload to PyPI and update the docs here.

Define your Sources

Your sources will usually be a list of the columns you want to make available for filtering. Each source needs a display name and a list of the value_types it supports. See valuetypes for a list of included value types.

sources = [
    {
        'display_name': 'Order Date',
        'column': 'order_date',
        'value_types': ['absolute_date', 'relative_date'] },
    {
        'display_name': 'Number Scoops',
        'column': 'num_scoops',
        'value_types': ['numeric', 'nullable'] },
    {
        'display_name': 'Flavor',
        'column': 'flavor',
        'value_types': ['collection'],
        'picklist_options': [
            'Mint Chocolate Chip',
            'Cherry Garcia',
            'Chocolate',
            'Cookie Dough',
            'Rocky Road',
            'Rainbow Sherbet',
            'Strawberry',
            'Vanilla',
            'Coffee',
         ]},
]

Create a Filter

Use the sources you defined to build a JSON filter. The value types specified on each source determine which operators are available.

ice_cream_filt = {
    'logical_op': '&',
    'filter_clauses': [
        {'data_source': 'Number Scoops', 'operator': '>=', 'value': '2'},
        {'data_source': 'Flavor', 'operator': 'any of', 'value': [
            'Strawberry',
            'Chocolate',
         ]}
    ]
}

Get a SQL Query

Now initialize a Winnow() instance using your sources, and the name of the table you’re filtering against. Turn your filter into a query.

ice_cream_winnow = Winnow('ice_cream', sources)
query, params = ice_cream_filt.query(ice_cream_filt)
# query => SELECT * FROM ice_cream WHERE ((num_scoops >= %s) AND (flavor IN (%s,%s) ))
# params => (2, 'Strawberry', 'Chocolate')

Value Types and Operators

The existing value types and operators can be easily extended by subclassing Winnow.

numeric

The numeric value type provides operators for >=, <=, >, <, is, and is not.

{
    'data_source': 'Number Scoops',
    'operator': '>=',
    'value': 3,
}

string

Data sources marked a supporting the string value type can use the is, is not, contains, starts with, more than __ words, and fewer than __ words operators.

{
    'data_source': "Scooper's Name",
    'operator': 'more than __ words',
    'value': 3,
},
{
    'data_source': "Scooper's Name",
    'operator': 'contains',
    'value': 'Heidi',
}

collection

To use the collection operators, a data source will usually need to provide a list of picklist_options to the client. It’s fine to include those directly on the data source object:

{
    'display_name': 'Flavor',
    'column': 'flavor',
    'value_types': ['collection'],
    'picklist_options': [
        'Mint Chocolate Chip',
        'Cherry Garcia',
        'Chocolate',
        'Cookie Dough',
        'Rocky Road',
        'Rainbow Sherbet',
        'Strawberry',
        'Vanilla',
        'Coffee',
     ]
}

Collections have access to any of and not any of operators.

{
    'data_source': 'Flavor',
    'operator': 'any of',
    'value': ['Strawberry', 'Chocolate'],
}

Datetime Operators

Datetime operators are broken down into two sets, relative and absolute. Most timestamp sources will want to support both.

absolute_date

Absolute date values are ISO8601 strings, like "2017-03-22T18:14:30". The supported operators are before and after.

{
    'data_source': 'Purchase Date',
    'operator': 'after',
    'value': '2017-03-22T18:14:30',
}

relative_date

Relative date values are also strings, but they’re things like "last_30_days" and "current_month". I’m not very happy with how these are designed, so they will likely change in a future version. Please let me know if you have any advice. Maybe there’s already a standard way to refer to intervals of time that aren’t anchored to a particular day?

{
    'data_source': 'Purchase Date',
    'operator': 'within',
    'value': 'last_7_days',
}

The list of available values is found in relative_dates.py.

Extending Winnow

Adding Operators

Adding value types

relative_to_date_field

This will allow us to say “Expected close Date before <any other date field>”.

historical

Stage was ‘Prospecting’ as of <date>

Adding custom fields

User-specific fields, generated dynamically

Replacing relative date handling

“We want to start our year in February.”

Indices and tables