PostgreSQL offsets and pagination performance

PostgreSQL offsets and pagination performance

September 23, 2014

PostgreSQL is notoriously bad at performing well with high offsets, mainly because it needs to do a full table scan and count all the rows up to the offset. This can be a problem is you want to paginate over a full table without holding all the rows in memory.

On the surface of it PostgreSQL Cursors may seem like a good solution to this problem--rather than executing a whole query at once a cursor encapsulates the query allowing you to read a few rows at a time. However in practice cursors use a transaction behind the scenes and you'll quickly run out of memory.

A good solution is to use an indexed auto-incrementing ID on the table and then paginate based on that.

At Clearbit, we use a UUIDs for identifiers rather than integers, but we have an internal iid column which is a indexed numeric sequence. If you're not using the default integer IDs than you'll need to do something similar and add a column.

The example below is using Sequel, but can be easily ported to your ORM of choice. We're essentially paging over the ordered records using the iid column in a WHERE clause instead of an OFFSET.

batch.rb

def batch_page(page_size = 2_000)
  last_record = order(:iid.asc).first
  last_iid    = last_record && last_record.iid || 0
  last_iid   -= 1

  loop do
    result = order(:iid.asc)
      .where { iid > last_iid }
      .limit(page_size)
      .all

    break if result.empty?
    yield result

    last_iid = result.last.iid
  end

  self
end

The generated SQL query ends up looking like this--it's very fast to run since the iid column is indexed.

SELECT * FROM "people" WHERE ("iid" > 2006) ORDER BY "iid" ASC LIMIT 2000

Company Logo API

Engineeringby Alex MacCaw on January 01, 2021

Clearbit's free Logo API is still available here in 2021 — and still completely free. We never found anything that catered well to company logos. And yet there's a lot of clear use-cases ranging from setting an organization's default image on signup to pulling in logos next to job listings. Clearbit Logo API The API is incredibly simple, taking a company's domain and returning an image. GET https://logo.clearbit.com/:domain Behind the scenes we're using Clearbit's Company API [https://clear

Introducing ultimate parent to Clearbit Enrichment API

Engineeringby Emily Brown on April 16, 2019

We've added a new data attribute to Clearbit's Company Enrichment API: ultimate parent. Bring full context of your accounts' hierarchy into view — so your team can stay up to speed on new acquisitions and know when they're in conversation with the same parent company.

The Standard in B2B Data

Now reinvented with Artificial Intelligence—Clearbit is the first AI Native Data Provider. Enrich your records, reveal buying intent, and connect with your ideal customers.

image-hero