---
title: "How to Build a Secure, Authorized Chatbot Using Oso and Timescale"
published: 2025-05-13T09:19:26.000-04:00
updated: 2025-05-13T09:19:26.000-04:00
excerpt: "Timescale and Oso webinar recap where you'll learn how to build a secure, authorized LLM chatbot using Oso and Timescale Vector. "
tags: AI, PostgreSQL
authors: Jacky Liang
---

> **TimescaleDB is now Tiger Data.**

The rush to integrate large language models (LLMs) into production apps has exposed a common failure mode: without proper authorization in place, they can easily expose sensitive data to the wrong users. Combine that with complex infrastructure (vector databases, sync pipelines, separate stores for embeddings and metadata), and you’re shipping a fragile system that puts user data at risk.

At [Timescale](https://www.timescale.com/) and [Oso](https://www.osohq.com/), we think there’s a better way.

In this webinar, we show how you can build a secure, scalable AI chatbot using Postgres—and _only_ Postgres—by leveraging Timescale’s [pgai library](https://github.com/timescale/pgai) and Oso’s [authorization platform as a service](https://www.osohq.com/cloud/authorization-service).

<iframe width="200" height="113" src="https://www.youtube.com/embed/5GFhqVOM8UE?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" title="How to build a secure, authorized chatbot using Oso and Timescale"></iframe>

Here are the webinar highlights, summarized for you in chapters for easy reference.

(To deploy our sample app, an authorized, secure chatbot built with Oso and pgai, see the [open-source code](https://github.com/jackyliang/timescale-oso-rag-chatbot).)

## Why Most AI Chatbot Demos Fail in Production

\[08:30–11:50\]

**Why do simple chatbots break in production?** Demo chatbots are easy: embed your docs, slap on an OpenAI API key, and you’re done.

But in a real business environment, Bob (the employee) should never see Alice's harsh performance review feedback. Only Alice, her manager, and HR should. Sales shouldn't see engineering tickets.

Without authorization boundaries, your chatbot becomes a data leak waiting to happen.

Many demos fall short because they:

-   Expose _all_ content to _all_ users
-   Ignore org-specific permissions (e.g., team-level access control)
-   Assume static or role-based authorization models
-   Rely on dual data systems (e.g., Postgres + a separate vector DB), creating data synchronization headaches

The fix? Build with authorization and data consistency as first principles.

## Why We Combined Postgres, pgvector, and Oso

\[13:34–17:47\]

We introduced an end-to-end reference stack that solves both the **data synchronization** and **authorization complexity** problems. The solution uses:

-   **Timescale + pgai** for real-time, in-database vector search and updates
-   **Oso Cloud** for relationship-based access controls, enforced natively via PostgreSQL
-   **No glue code** or ETL scripts between systems

The result: you get a secure, performant, and authorized chat system with _zero_ duplicated data.

> 💡 "Chatbot demos are simple. Business-grade AI is hard. We're going to show you how to make the hard easy." — Jacky, Developer Advocate, Timescale

## Real-Time Vector Sync With pgai Vectorizer

\[14:33–20:45\]

Instead of bolting a vector database on top of your existing Postgres database, [pgai Vectorizer](https://github.com/timescale/pgai/blob/main/docs/vectorizer/overview.md) keeps your embeddings **automatically synchronized** with your source data in Postgres.

-   Create vectorizers with a single SQL function call
-   Ingest from S3, Hugging Face, or existing Postgres tables
-   Bring your own embedding model (OpenAI, Nomic, etc.)
-   Chunk and embed documents with configurable rules
-   Never worry about mismatched records again

```sql
SELECT ai.create_vectorizer(
  'blog'::regclass,
  loading => ai.loading_column(column_name => 'content'),
  embedding => ai.embedding_openai(model => 'text-embedding-3-small', dimensions => 768),
  destination => ai.destination_table('blog_embeddings')
);
```

Run your vectorizer worker:

```shell
pgai vectorizer worker -d postgresql://...
```

No extra queues, pipelines, or lambdas needed. Just Python and Postgres.
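Once the vectorizer has populated `blog_embeddings`, retrieval is a plain SQL query. Here is a minimal sketch of a search helper, assuming the table and model names from the example above; pgai's `ai.openai_embed` embeds the query text inside Postgres, and `<=>` is pgvector's cosine-distance operator. Connection handling (e.g., with psycopg) is only sketched in the comment:

```python
# Semantic-search sketch against the pgai-managed embeddings table.
# Assumes the `blog_embeddings` table from the vectorizer example above.

def build_search_query(limit: int = 5) -> str:
    """Build a pgvector cosine-distance search over the embeddings table.

    `ai.openai_embed` embeds the query text inside Postgres, so the
    application never handles raw vectors. The `%s` placeholder is
    filled by the database driver (e.g., psycopg).
    """
    return (
        "SELECT chunk, "
        "embedding <=> ai.openai_embed('text-embedding-3-small', %s) AS distance "
        "FROM blog_embeddings "
        "ORDER BY distance "
        f"LIMIT {int(limit)}"
    )

# Usage (requires a live database; shown for illustration only):
# import psycopg
# with psycopg.connect("postgresql://...") as conn:
#     rows = conn.execute(build_search_query(),
#                         ("How do I create a vectorizer?",)).fetchall()
```

Because the embedding call happens in SQL, the same query works from any driver or language that can talk to Postgres.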

## Authorization That Follows Relationships, Not Just Roles

\[21:43–28:14\]

Many apps rely on [Role-Based Access Control (RBAC)](https://www.osohq.com/docs/modeling-in-polar/role-based-access-control-rbac). But real-world permissions often depend on [relationships](https://www.osohq.com/docs/modeling-in-polar/relationship-based-access-control-rebac):

-   “Bob can view reviews only if he’s the owner of the document”
-   “Diane (HR) can see feedback others can’t”
-   “Support engineers can access sensitive logs only during active shifts”

Oso lets you model this in code:

```polar
resource Folder {
  roles = ["viewer"];
  permissions = ["view"];
  relations = { team: Team };

  "viewer" if "member" on "team";
  "viewer" if global "hr";
  "viewer" if is_public(resource);

  "view" if "viewer";
}
```

Oso also evaluates policies against data already in your Postgres tables using native SQL, so you don't need to sync users, roles, or groups into a second system.
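In the app, the pattern is to ask Oso which resources the user may view and push that answer into the retrieval query as a SQL filter. A hedged sketch: the `oso.list` call in the comment reflects our reading of the `oso-cloud` SDK (check the docs for your version), and the `folder_id` column is a hypothetical link from chunks back to their source folder:

```python
# Sketch: turn an Oso authorization decision into a SQL filter for retrieval.
# The authorized IDs would come from the oso-cloud SDK, roughly:
#   from oso_cloud import Oso, Value
#   oso = Oso(url="https://cloud.osohq.com", api_key="...")
#   ids = oso.list(Value("User", "bob"), "view", "Folder")
# (exact SDK call is an assumption -- consult the Oso Cloud docs)

def filtered_search_query(question: str,
                          authorized_folder_ids: list[str]) -> tuple[str, list]:
    """Restrict the embedding search to folders the user may view.

    `folder_id` is a hypothetical column on the embeddings table;
    your schema will differ.
    """
    if not authorized_folder_ids:
        # User has no access at all: return a query that matches nothing.
        return "SELECT chunk FROM blog_embeddings WHERE false", []
    placeholders = ", ".join(["%s"] * len(authorized_folder_ids))
    sql = (
        "SELECT chunk FROM blog_embeddings "
        f"WHERE folder_id IN ({placeholders}) "
        "ORDER BY embedding <=> ai.openai_embed('text-embedding-3-small', %s) "
        "LIMIT 5"
    )
    return sql, [*authorized_folder_ids, question]
```

Filtering in the `WHERE` clause, before similarity ranking, means unauthorized rows never enter the candidate set at all.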

## Putting It Together: Authorized Retrieval Augmented Generation (RAG)

\[30:44–37:32\]

Here’s how the architecture works:

1.  A user (Bob or Diane) sends a question to the chatbot.
2.  The app queries Oso to determine what data the user is _authorized_ to access.
3.  That filter is converted to a SQL query that joins source + embedding data in Timescale.
4.  Only the authorized context is sent to the LLM (e.g., OpenAI) to generate a final response.

The result: the same chatbot provides personalized, secure answers based on who’s asking—without leaking data or requiring redundant systems.
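The four steps above can be sketched end to end. Everything here is illustrative: the helper names and the `oso`/`db_search`/`llm` clients in the comment are assumptions standing in for the demo app's actual code, not its API:

```python
# End-to-end sketch of authorized RAG: authorization filters first,
# retrieval second, so only authorized context ever reaches the LLM.

def build_prompt(question: str, authorized_chunks: list[str]) -> str:
    """Assemble the LLM prompt from pre-authorized context only.

    Because filtering happened at retrieval time (steps 2-3), nothing
    outside `authorized_chunks` can leak into the model's context window.
    """
    context = "\n---\n".join(authorized_chunks)
    return (
        "Answer using ONLY the context below. If the answer is not in the "
        "context, say you don't have access to that information.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Pipeline (illustrative; oso, db_search, and llm clients not shown):
# ids    = oso.list(user, "view", "Folder")              # step 2: authorization
# chunks = db_search(question, ids)                      # step 3: filtered SQL
# answer = llm.complete(build_prompt(question, chunks))  # step 4: generation
```

The system prompt's "answer only from context" instruction is a second line of defense; the real guarantee comes from the authorization filter applied before retrieval.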

## What You’ll Learn From the Demo

\[29:01–48:00\]

-   How to build a business-grade RAG stack without a separate vector DB
-   How to enforce field-level access control in LLM-based apps
-   How Timescale + pgai + Oso make Postgres the _only_ data system you need
-   Why prompt engineering, chunking, and system prompts matter in retrieval quality
-   How to embed PDF, DOCX, and S3-based documents securely

## Next Steps

We’ve open-sourced the reference app and walkthrough:

-   [Watch the full webinar](https://www.youtube.com/watch?v=5GFhqVOM8UE)
-   [Explore the Timescale pgai docs](https://docs.timescale.com)
-   [Learn more about Oso Cloud](https://www.osohq.com/docs)
-   [Join the Oso community on Slack](https://oso-oss.slack.com/ssb/redirect)

If you’re building AI agents, chat interfaces, or internal copilots—don’t wait to layer in security and data correctness.

Your users will thank you. Your auditors will too.