supabase-community/chatgpt-your-files
pgvector to Prod in 2 hours
☑️ Features
- Interactive Chat Interface: Interact with your documentation, leveraging the capabilities of OpenAI's GPT models and retrieval augmented generation (RAG).
- Login With <3rd Party>: Integrate one-click third-party login with any of our 18 auth providers, or with username/password.
- Document Storage: Securely upload, store, and retrieve user-uploaded documents.
- REST API: Expose a flexible REST API that we'll consume to build the interactive front-end.
- Row-level Security: Secure all of your user data with production-ready row-level security.
🎥 YouTube video
This entire workshop was recorded as a YouTube video. Feel free to watch it here:
https://www.youtube.com/watch?v=ibzlEQmgPPY
📄 Workshop Instructions
Thanks for joining! Let's dive in.
Workshop instructions
Git checkpoints: The workshop is broken down into steps (git tags). There's a step
for every major feature we are building.
Feel free to follow along live with the presenter. When it's time to jump to the
next step, run:
🧱 Pre-reqs
Unix-based OS (if Windows, WSL2)
Docker
Node.js 18+
💿 Sample Data
This repository includes 3 sample markdown files that we'll use to test the app:
./sample-files/roman-empire-1.md
./sample-files/roman-empire-2.md
./sample-files/roman-empire-3.md
🪜 Step-by-step
Jump to a step:
Storage
Documents
Embeddings
Chat
Database Types (Bonus)
You're done!
Step 0 - Setup (Optional)
Use this command to jump to the step-0 checkpoint.
npm i -D [email protected]
Initialize Supabase project.
Create import_map.json with dependencies for our Supabase Edge Functions. We'll
talk more about this in step 2.
{
  "imports": {
    "@supabase/supabase-js": "https://esm.sh/@supabase/[email protected]",
    "openai": "https://esm.sh/[email protected]",
    "common-tags": "https://esm.sh/[email protected]",
    "ai": "https://esm.sh/[email protected]",
    "mdast-util-from-markdown": "https://esm.sh/[email protected]",
    "mdast-util-to-markdown": "https://esm.sh/[email protected]",
    "mdast-util-to-string": "https://esm.sh/[email protected]",
    "unist-builder": "https://esm.sh/[email protected]",
    "mdast": "https://esm.sh/v132/@types/[email protected]/index.d.ts",
    "https://esm.sh/v132/[email protected]/esnext/decode-named-character-reference.mjs": "https://esm.sh/[email protected]?target=deno"
  }
}
Scaffold Frontend
We use shadcn/ui for our UI components.
Initialize shadcn-ui.
Build layouts.
Step 1 - Storage
Use this command to jump to the step-1 checkpoint.
Install dependencies
First install NPM dependencies.
npm i
Setup Supabase stack
When developing a project in Supabase, you can choose to develop locally or
directly on the cloud.
Local
Start a local version of Supabase (runs in Docker).
Store Supabase URL & public anon key in .env.local for Next.js.
NEXT_PUBLIC_SUPABASE_URL=<api-url>
NEXT_PUBLIC_SUPABASE_ANON_KEY=<anon-key>
You can get the project API URL and anonymous key from the API settings page.
await supabase.storage
.from('files')
.upload(`${crypto.randomUUID()}/${selectedFile.name}`, selectedFile);
Improve upload RLS policy
We can improve our previous RLS policy to require a UUID in the uploaded file path.
Step 2 - Documents
Use these commands to jump to the step-2 checkpoint.
Let's create documents and document_sections tables to store our processed files.
Documents ER diagram
We'll use pg_net later to send HTTP requests to our edge functions.
Unlike IVFFlat indexes, HNSW indexes can be created immediately on an empty table.
select vault.create_secret(
'http://api.supabase.internal:8000',
'supabase_url'
);
If you are developing directly on the cloud, open up the SQL Editor and set this to
your Supabase project's API URL:
select vault.create_secret(
'<api-url>',
'supabase_url'
);
You can get the project API URL from the API settings page.
return null;
end;
$$;
Import maps aren't required in Deno, but they can simplify imports and keep
dependency versions consistent. All of our dependencies come from NPM, but since
we're using Deno we fetch them from a CDN like https://esm.sh or
https://cdn.jsdelivr.net.
{
"imports": {
"@std/": "https://deno.land/[email protected]/",
"@supabase/supabase-js": "https://esm.sh/@supabase/[email protected]",
"openai": "https://esm.sh/[email protected]",
"common-tags": "https://esm.sh/[email protected]",
"ai": "https://esm.sh/[email protected]",
"mdast-util-from-markdown": "https://esm.sh/[email protected]",
"mdast-util-to-markdown": "https://esm.sh/[email protected]",
"mdast-util-to-string": "https://esm.sh/[email protected]",
"unist-builder": "https://esm.sh/[email protected]",
"mdast": "https://esm.sh/v132/@types/[email protected]/index.d.ts",
"https://esm.sh/v132/[email protected]/esnext/decode-
named-character-reference.mjs": "https://esm.sh/decode-named-character-
[email protected]?target=deno"
}
}
Note: URL-based imports and import maps aren't a Deno invention. They are a web
standard that Deno follows as closely as possible.
(Optional) If you are using VS Code, you may get prompted to cache your imported
dependencies. You can do this by hitting cmd+shift+p and typing >Deno: Cache
Dependencies.
Create Supabase client and configure it to inherit the original user’s permissions
via the authorization header. This way we can continue to take advantage of our RLS
policies.
if (!authorization) {
return new Response(
JSON.stringify({ error: `No authorization header passed` }),
{
status: 500,
headers: { 'Content-Type': 'application/json' },
}
);
}
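Concretely, inheriting the caller's permissions means passing the request's Authorization header through to the client. A sketch of the option shape involved (matching supabase-js v2; the wrapper function is our own naming):

```typescript
// Options for createClient() from @supabase/supabase-js (v2) that forward
// the caller's Authorization header, so every query runs under the original
// user's JWT and their RLS policies still apply.
export function buildClientOptions(authorization: string) {
  return {
    global: {
      headers: { authorization },
    },
  };
}

// Usage inside the edge function (sketch):
//   const supabase = createClient(supabaseUrl, supabaseAnonKey,
//     buildClientOptions(authorization));
```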
if (!document?.storage_object_path) {
return new Response(
JSON.stringify({ error: 'Failed to find uploaded document' }),
{
status: 500,
headers: { 'Content-Type': 'application/json' },
}
);
}
Use the Supabase client to download the file by storage path.
if (!file) {
return new Response(
JSON.stringify({ error: 'Failed to download storage object' }),
{
status: 500,
headers: { 'Content-Type': 'application/json' },
}
);
}
if (error) {
console.error(error);
return new Response(
JSON.stringify({ error: 'Failed to save document sections' }),
{
status: 500,
headers: { 'Content-Type': 'application/json' },
}
);
}
console.log(
`Saved ${processedMd.sections.length} sections for file '${document.name}'`
);
Return a success response.
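A success response here can be as simple as the sketch below (the exact body shape is our assumption; the repo may return something slightly different):

```typescript
// Minimal JSON success response, mirroring the shape of the error
// responses above.
export function successResponse(body: unknown = { success: true }): Response {
  return new Response(JSON.stringify(body), {
    status: 200,
    headers: { 'Content-Type': 'application/json' },
  });
}
```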
At the top of the component, fetch documents using the useQuery hook:
if (error) {
toast({
variant: 'destructive',
description: 'Failed to fetch documents',
});
throw error;
}
return data;
});
In each document's onClick handler, download the respective file.
if (error) {
toast({
variant: 'destructive',
description: 'Failed to download file. Please try again.',
});
return;
}
window.location.href = data.signedUrl;
Step 3 - Embeddings
Use these commands to jump to the step-3 checkpoint.
return null;
end;
$$;
Add embed trigger to document_sections table
The first argument specifies which column contains the text content to embed.
The second argument specifies the destination column to save the embedding into.
There are also two more optional trigger arguments available:
Just like before, grab the Supabase variables and check for their existence (type
narrowing).
// These are automatically injected
const supabaseUrl = Deno.env.get('SUPABASE_URL');
const supabaseAnonKey = Deno.env.get('SUPABASE_ANON_KEY');
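The existence check narrows each value's type from `string | undefined` to `string`. The edge function does this by returning a 500 Response when a variable is missing; a throwing variant of the same narrowing pattern (our own helper, for illustration) looks like:

```typescript
// After the guard, TypeScript narrows `value` to `string`, so callers
// never have to handle `undefined`.
export function requireEnv(name: string, value: string | undefined): string {
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value; // narrowed to string here
}

// Sketch: const supabaseUrl = requireEnv('SUPABASE_URL', Deno.env.get('SUPABASE_URL'));
```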
if (!authorization) {
return new Response(
JSON.stringify({ error: `No authorization header passed` }),
{
status: 500,
headers: { 'Content-Type': 'application/json' },
}
);
}
if (selectError) {
return new Response(JSON.stringify({ error: selectError }), {
status: 500,
headers: { 'Content-Type': 'application/json' },
});
}
Generate an embedding for each piece of text and update the respective rows.
for (const row of rows) {
const { id, [contentColumn]: content } = row;
if (!content) {
console.error(`No content available in column '${contentColumn}'`);
continue;
}
if (error) {
console.error(
`Failed to save embedding on '${table}' table with id ${id}`
);
}
console.log(
`Generated embedding ${JSON.stringify({
table,
id,
contentColumn,
embeddingColumn,
})}`
);
}
Return a success response.
Step 4 - Chat
Use these commands to jump to the step-4 checkpoint.
Update Frontend
Install dependencies
npm i @xenova/transformers ai
We'll use Transformers.js to perform inference directly in the browser.
<Input
type="text"
autoFocus
placeholder="Send a message"
value={input}
onChange={handleInputChange}
/>
Generate an embedding and submit messages on form submit.
if (!generateEmbedding) {
throw new Error('Unable to generate embeddings');
}
const {
data: { session },
} = await supabase.auth.getSession();
if (!session) {
return;
}
handleSubmit(e, {
options: {
headers: {
authorization: `Bearer ${session.access_token}`,
},
body: {
embedding,
},
},
});
Disable send button until the component is ready.
Note: Our embeddings are normalized, so inner product and cosine similarity are
equivalent in terms of output. Note, though, that pgvector's <=> operator is cosine
distance, not cosine similarity, so for normalized vectors inner product == 1 - cosine distance.
Note: match_threshold is negated because <#> is a negative inner product. See the
pgvector docs for more details on why <#> is negative.
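The relationships in the two notes above are easy to verify numerically: for unit-length vectors, the inner product equals cosine similarity, and pgvector's <#> returns its negation. A small self-contained check (plain arithmetic, no pgvector involved):

```typescript
// Verify: for normalized vectors, dot(a, b) === cosineSimilarity(a, b),
// and cosine distance (what <=> returns) === 1 - dot(a, b).
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function normalize(v: number[]): number[] {
  const len = Math.hypot(...v);
  return v.map((x) => x / len);
}

export function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / (Math.hypot(...a) * Math.hypot(...b));
}

export const a = normalize([1, 2, 3]);
export const b = normalize([2, 0, 1]);

export const innerProduct = dot(a, b);
// pgvector's <#> would return -innerProduct (negative inner product),
// which is why match_threshold gets negated in the query.
export const cosSim = cosineSimilarity(a, b);
export const cosDistance = 1 - cosSim; // what <=> would return
```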
together.ai
fireworks.ai
endpoints.anyscale.com
local models served with Ollama
Whichever provider you choose, you can reuse the code below (which uses the OpenAI
lib) as long as they offer an OpenAI-compatible API (all of the providers listed
above do). We'll discuss how to do this in each step using Ollama, but the same
logic applies to the other providers.
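For example, pointing the OpenAI client at a different base URL is typically all that's needed. A sketch of the configuration involved (http://localhost:11434/v1 is Ollama's OpenAI-compatible endpoint; the model name and placeholder key below are assumptions, since Ollama accepts but ignores the API key):

```typescript
// Configuration you'd pass to `new OpenAI(...)` from the `openai` npm
// package to target an OpenAI-compatible provider instead of OpenAI itself.
export function openAICompatibleConfig(baseURL: string, apiKey = 'ignored') {
  return { baseURL, apiKey };
}

// Sketch:
//   const openai = new OpenAI(openAICompatibleConfig('http://localhost:11434/v1'));
//   const completion = await openai.chat.completions.create({
//     model: 'llama3', // assumption: whichever model you've pulled locally
//     messages,
//     stream: true,
//   });
```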
if (!supabaseUrl || !supabaseAnonKey) {
return new Response(
JSON.stringify({
error: 'Missing environment variables.',
}),
{
status: 500,
headers: { 'Content-Type': 'application/json' },
}
);
}
if (!authorization) {
return new Response(
JSON.stringify({ error: `No authorization header passed` }),
{
status: 500,
headers: { 'Content-Type': 'application/json' },
}
);
}
if (matchError) {
console.error(matchError);
const injectedDocs =
documents && documents.length > 0
? documents.map(({ content }) => content).join('\n\n')
: 'No documents found';
You're only allowed to use the documents below to answer the question.
Documents:
${injectedDocs}
`,
},
...messages,
];
Note: the codeBlock template tag is a convenience function that strips away
indentation in our multiline strings. This allows us to format our code nicely
while preserving the intended indentation.
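To illustrate, here is a tiny dedent tag that mimics what common-tags' codeBlock does (this is our own simplified re-implementation for illustration, not the library's code):

```typescript
// Simplified codeBlock: strip the common leading indentation from every
// line of a template literal and trim blank edges.
export function dedent(strings: TemplateStringsArray, ...values: unknown[]): string {
  const raw = strings.reduce(
    (acc, s, i) => acc + s + (i < values.length ? String(values[i]) : ''),
    ''
  );
  const lines = raw.split('\n');
  // Find the smallest indentation among non-empty lines.
  const indents = lines.filter((l) => l.trim()).map((l) => l.match(/^ */)![0].length);
  const min = indents.length ? Math.min(...indents) : 0;
  return lines.map((l) => l.slice(min)).join('\n').trim();
}

export const prompt = dedent`
  You're an AI assistant.
  Documents:
  docs here
`;
```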
Note: we must also return CORS headers here (or anywhere else we send a response).
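A typical shape for those CORS headers follows the common Supabase edge function pattern (the exact header list below is our assumption; adjust allowed origins and headers to your needs):

```typescript
// Permissive CORS headers, merged into every Response (including errors).
export const corsHeaders = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Headers':
    'authorization, x-client-info, apikey, content-type',
};

// Helper that spreads the CORS headers into a JSON response.
export function withCors(status: number, body: unknown): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { ...corsHeaders, 'Content-Type': 'application/json' },
  });
}
```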
Step 5 - Database Types (Bonus)
Use these commands to jump to the step-5 checkpoint.
In React
Looks like we found a type error in ./app/files/page.tsx! Let's add this check to
the top of the document's click handler (type narrowing).
if (!document.storage_object_path) {
toast({
variant: 'destructive',
description: 'Failed to download file, please try again.',
});
return;
}
You're done!
🎉 Congrats! You've built your own full stack pgvector app in 2 hours.
If you would like to jump directly to the completed app, simply checkout the main
branch:
🚀 Going to prod
If you've been developing the app locally, follow these instructions to deploy your
app to a production Supabase project.
📈 Next steps
Feel free to extend this app in any way you like. Here are some ideas for next
steps:
- Record message history in the database (and generate embeddings on them for RAG memory)
- Support more file formats than just markdown
- Pull in documents from the Notion API
- Restrict chat to user-selected documents
- Perform RAG on images using CLIP embeddings
💬 Feedback and issues
Please file feedback and issues on this repo's issue board.