You picked a vector store before picking a problem
You opened three tabs—Pinecone pricing, a Weaviate Docker compose file, and a blog post about pgvector—and your brain quietly refused to commit. Same. Every option has fans, benchmarks, and a Slack thread where someone says they migrated twice.
The real question is not "which vector database wins." It is which one matches your team, your existing Postgres bill, and how much vector search you actually need at query time versus ingest time.
By the end of this vector database comparison, you will know when Pinecone, Weaviate, or pgvector makes sense for a Node.js or TypeScript RAG backend—and when the boring choice is the correct one.
What vector databases do in a Node.js RAG stack
A RAG pipeline turns documents into embeddings, stores them with metadata, and retrieves the closest vectors when a user asks a question. Your Express, Fastify, or Next.js API calls an embedding model, runs a similarity search, and passes the top chunks to an LLM.
Vector databases specialize in approximate nearest neighbor search at scale. They index high-dimensional floats efficiently so you are not scanning every row on each request. Metadata filters—tenant ID, doc version, product area—narrow the search before or during the vector scan.
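To make "closest vectors" concrete, here is a minimal sketch of the exact, brute-force search that an approximate index stands in for. The `Chunk` shape, the `tenantId` field, and the function names are assumptions for illustration, not part of any vendor SDK.

```typescript
// Exact nearest-neighbor search with a metadata pre-filter.
// An ANN index approximates this ranking without scanning every row.
type Chunk = {
  id: string;
  embedding: number[];
  metadata: { tenantId: string };
};

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function exactSearch(
  chunks: Chunk[],
  query: number[],
  tenantId: string,
  topK: number
): { id: string; score: number }[] {
  return chunks
    .filter((c) => c.metadata.tenantId === tenantId) // narrow by metadata first
    .map((c) => ({ id: c.id, score: cosineSimilarity(c.embedding, query) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, topK);
}
```

This scans every row for the tenant on each call, which is fine for a test fixture and exactly the cost a vector index exists to avoid.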
Postgres with pgvector can do the same job for smaller workloads. Pinecone and Weaviate push that work onto purpose-built engines with hosted or self-managed ops models. The architecture diagram changes less than the on-call rotation does.
Pinecone: managed vectors without running infra
Pinecone is a fully managed vector database. You create an index, upsert vectors with metadata, and query by vector plus optional filters. No shards to tune, no JVM heap to babysit, no replica lag dashboards unless you want them.
Indexes are serverless or pod-based depending on plan and latency targets. Serverless fits spiky dev traffic and early products. Pods give predictable performance when you know your QPS and dimension count.
Pinecone exposes an HTTP API and official SDKs, including one for TypeScript. That fits a Node.js backend where ingest runs in a worker and the API only queries. You trade operational control for speed to production.
How Pinecone fits a TypeScript ingest worker
A typical ingest job embeds batches, then upserts records with stable IDs derived from content hashes. Pinecone expects dense vectors and JSON metadata with size limits on stored text.
```typescript
import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index("docs-prod");

await index.upsert([
  {
    id: chunk.id,
    values: chunk.embedding,
    metadata: {
      tenant_id: chunk.tenantId,
      source_url: chunk.sourceUrl,
      heading_path: chunk.headingPath.join(" > "),
    },
  },
]);
```

Query time stays thin: embed the question, call index.query with a metadata filter on tenant_id, hydrate full chunk text from PostgreSQL if you truncated metadata. That split keeps Pinecone lean and your source of truth in a relational store you already trust.
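The stable, content-derived IDs and the batching mentioned above can be sketched with nothing but Node's built-in crypto module. `chunkId` and `toBatches` are hypothetical helper names, and the 32-character truncation is an assumption to keep IDs short; tune both to your own corpus.

```typescript
import { createHash } from "node:crypto";

// Derive a deterministic ID from tenant, source, and normalized chunk text,
// so re-ingesting unchanged content overwrites the same record.
function chunkId(tenantId: string, sourceUrl: string, text: string): string {
  const normalized = text.trim().replace(/\s+/g, " ");
  return createHash("sha256")
    .update(tenantId)
    .update("\0") // separator so field boundaries cannot blur together
    .update(sourceUrl)
    .update("\0")
    .update(normalized)
    .digest("hex")
    .slice(0, 32);
}

// Split records into fixed-size batches for embedding and upsert calls.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Because the ID depends only on content and location, a re-run of the ingest job is idempotent: unchanged chunks land on the same record instead of creating duplicates.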
Weaviate: schema-first search with vectors baked in
Weaviate is an open-source vector database you can self-host or run on Weaviate Cloud. It models data as classes with properties, vectorizers, and optional cross-references, which makes it closer to a search engine that also does vectors than to a pure vector index.
You define a schema up front: property types, indexing, vectorizer modules, and whether properties participate in BM25 keyword search. Hybrid search, which combines vector similarity with keyword scoring, is a first-class feature, not a bolt-on.
Teams that want semantic search plus traditional filters without gluing Elasticsearch and a vector store often land here. The tradeoff is schema ceremony and operational surface when self-hosting.
GraphQL, modules, and hybrid search in Weaviate
Weaviate exposes GraphQL for queries. You can fetch objects, vectors, and hybrid scores in one request. Modules connect embedding providers or bring-your-own vectors.
```typescript
import weaviate, { ApiKey } from "weaviate-ts-client";

const client = weaviate.client({
  scheme: "https",
  host: process.env.WEAVIATE_HOST!,
  apiKey: new ApiKey(process.env.WEAVIATE_API_KEY!),
});

await client.data.creator()
  .withClassName("DocumentChunk")
  .withProperties({
    text: chunk.text,
    tenantId: chunk.tenantId,
    sourceUrl: chunk.sourceUrl,
  })
  .withVector(chunk.embedding)
  .do();
```

Hybrid queries help when users search with exact product codes, error strings, or API method names where pure embedding similarity wobbles. You configure alpha to weight vector versus keyword contribution. That is valuable for developer docs and support corpora in production RAG systems.
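To build intuition for what alpha does, here is a standalone sketch of alpha-weighted score fusion. It illustrates the idea only: Weaviate runs its own fusion algorithms server-side, and the min-max normalization and the `fuse` helper below are assumptions made for this example.

```typescript
type Scored = { id: string; score: number };

// Min-max normalize scores to [0, 1] so vector and keyword lists are comparable.
function normalize(results: Scored[]): Map<string, number> {
  const out = new Map<string, number>();
  if (results.length === 0) return out;
  const scores = results.map((r) => r.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  for (const r of results) {
    out.set(r.id, max === min ? 1 : (r.score - min) / (max - min));
  }
  return out;
}

// alpha = 1 is pure vector ranking, alpha = 0 is pure keyword ranking.
function fuse(vector: Scored[], keyword: Scored[], alpha: number): Scored[] {
  if (alpha < 0 || alpha > 1) throw new RangeError("alpha must be in [0, 1]");
  const v = normalize(vector);
  const k = normalize(keyword);
  const ids = new Set<string>();
  v.forEach((_score, id) => ids.add(id));
  k.forEach((_score, id) => ids.add(id));
  return Array.from(ids)
    .map((id) => ({
      id,
      // A document missing from one list contributes 0 from that side.
      score: alpha * (v.get(id) ?? 0) + (1 - alpha) * (k.get(id) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```

A query for an exact error string tends to do better with a lower alpha, while a vague natural-language question leans on a higher one. Tune it against real queries rather than guessing.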
pgvector: Postgres extension when you already run Postgres
pgvector adds a vector column type and similarity operators to PostgreSQL. If your app already stores users, billing, and chunk manifests in Postgres, you can store embeddings in the same database without a new network hop or vendor.
Indexes include IVFFlat and HNSW for approximate search. Exact search works for small tables. You write normal SQL with ORDER BY embedding <=> query_vector LIMIT k and combine it with JSONB filters, row-level security, and transactions.
pgvector shines when vector count stays in the low millions per logical index, when your team knows Postgres cold, and when strong consistency with relational data matters more than sub-10ms p99 at billions of vectors.
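As a starting point, the schema and HNSW index can live in a small migration helper. The table and column names follow the document_chunks table used in this article's query example; the ID column types, the 1536 dimension in the usage note, and the index parameters (pgvector's defaults) are assumptions to adjust for your data.

```typescript
// Returns SQL statements for a migration runner or pool.query to execute in order.
function pgvectorSchemaSql(dimension: number): string[] {
  if (!Number.isInteger(dimension) || dimension <= 0) {
    throw new RangeError("dimension must be a positive integer");
  }
  return [
    "CREATE EXTENSION IF NOT EXISTS vector",
    `CREATE TABLE IF NOT EXISTS document_chunks (
       id          text PRIMARY KEY,
       document_id text NOT NULL,
       tenant_id   text NOT NULL,
       text        text NOT NULL,
       source_url  text,
       embedding   vector(${dimension}) NOT NULL
     )`,
    // vector_cosine_ops pairs with the <=> (cosine distance) operator.
    // m and ef_construction are shown at pgvector's default values.
    `CREATE INDEX IF NOT EXISTS document_chunks_embedding_hnsw
       ON document_chunks USING hnsw (embedding vector_cosine_ops)
       WITH (m = 16, ef_construction = 64)`,
  ];
}
```

Calling `pgvectorSchemaSql(1536)` yields three statements. The operator class on the index has to match the operator in your ORDER BY, otherwise Postgres falls back to a sequential scan.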
SQL joins, transactions, and the pgvector tradeoff
The killer feature is the join. Fetch top chunks and join to documents, tenants, or permissions in one query. No dual-write sync between Postgres and an external vector store.
```typescript
import pg from "pg";

const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

// pgvector accepts a '[1,2,3]' text literal, cast to vector in the query.
const embeddingLiteral = `[${queryVector.join(",")}]`;

const { rows } = await pool.query(
  `SELECT c.id, c.text, c.source_url,
          c.embedding <=> $1::vector AS distance
     FROM document_chunks c
     JOIN documents d ON d.id = c.document_id
    WHERE c.tenant_id = $2
      AND d.published = true
    ORDER BY c.embedding <=> $1::vector
    LIMIT 8`,
  [embeddingLiteral, tenantId]
);
```

At scale, you manage vacuum, index build times, and connection pooling yourself. pgvector is not free ops; it is familiar ops. That distinction matters when your platform team is three people and already on-call for Postgres.
Side-by-side vector database comparison
Numbers shift with workload, region, and index settings. This table captures the decision shape most Node.js teams care about when comparing Pinecone, Weaviate, and pgvector for RAG.
| Dimension | Pinecone | Weaviate | pgvector |
|---|---|---|---|
| Hosting model | Fully managed SaaS | Self-host or Weaviate Cloud | Your Postgres (RDS, Cloud SQL, etc.) |
| Best fit scale | Millions to billions of vectors | Millions+, hybrid search heavy | Thousands to low millions per index |
| Query features | Vector + metadata filters | Vector, BM25, hybrid, GraphQL | Vector + full SQL, joins, RLS |
| Schema flexibility | JSON metadata, namespaces within an index | Typed classes and properties | Relational tables you design |
| Ops burden | Low | Medium to high if self-hosted | Low if Postgres already exists |
| TypeScript SDK | Official, mature | Official weaviate-ts-client | pg, Drizzle, Prisma via SQL |
| Typical RAG pattern | Vectors in Pinecone, text in Postgres | Chunks and vectors in Weaviate | Everything in Postgres |
| Migration friction | Export IDs + vectors, re-upsert elsewhere | Backup/restore or dual-write | pg_dump, extension version upgrades |
None of these rows declare a universal winner. They declare fit. A B2B SaaS with strict tenant isolation and existing RDS often starts with pgvector. A team shipping semantic search across messy wiki content may prefer Weaviate hybrid queries. A startup that wants zero vector infra before product-market fit often picks Pinecone.
When Pinecone is the right call
Choose Pinecone when vector search is core to the product and you do not want to hire for index tuning this quarter. Managed uptime, automatic scaling, and predictable SDK ergonomics matter more than squeezing every millisecond from custom hardware.
Multi-region latency requirements and high QPS search also push teams toward Pinecone pods or serverless with regional indexes. You are paying to not become a vector database operator.
Pinecone fits when metadata filters are straightforward JSON predicates and you are fine keeping canonical document text outside the index. Most production RAG backends already do that hydration step.
- You have no Postgres expertise on the team and no mandate to adopt it.
- Vector count will grow past comfortable pgvector single-node limits.
- Ingest is batch-heavy but query latency SLAs are tight.
- You want the fastest path from prototype to paid tier without running clusters.
When Weaviate earns its place
Pick Weaviate when hybrid search is a product requirement, not a nice-to-have. Users will type exact identifiers, log lines, and SKU codes alongside natural language questions. Pure vector similarity alone will frustrate them.
Weaviate also earns its place when you want object-level modeling—cross-references between chunks, authors, and products—in one system. GraphQL can simplify frontend data fetching if your React app queries search directly through a BFF.
Self-hosting Weaviate on Kubernetes is real work: persistence, backups, version upgrades, and module compatibility. Weaviate Cloud reduces that load if budget allows. Budget the ops line honestly before committing.
- Keyword plus semantic search in one API surface.
- Rich property typing and inverted indexes on text fields.
- Open-source requirement with optional managed hosting.
- Teams comfortable operating stateful search infrastructure (Weaviate is written in Go, so think Kubernetes and persistent volumes rather than JVM tuning).
When pgvector is hard to beat
pgvector wins on simplicity when Postgres is already the system of record. Your chunks, permissions, audit logs, and billing rows live there. Adding a vector column avoids a second consistency model and a second on-call rotation.
Early-stage RAG features with tens or hundreds of thousands of vectors per tenant often perform well with HNSW indexes on modern Postgres. You can ship, measure, and defer a dedicated vector store until metrics prove you need one.
Regulated environments sometimes prefer pgvector because data stays inside an approved Postgres boundary. Export controls, encryption at rest, and backup policies you already audited apply to embeddings too.
- Strong transactional requirements between chunks and source records.
- Complex authorization via SQL joins or row-level security.
- Cost sensitivity and existing Postgres capacity headroom.
- Vector counts that fit one well-tuned instance with room to grow.
TypeScript patterns that work across all three
The biggest mistake is letting Pinecone or Weaviate types leak into every route handler. Wrap vector operations behind an interface your RAG service owns. Swap implementations when scale or search requirements change.
A thin vector store interface in Node.js
```typescript
export type VectorRecord = {
  id: string;
  embedding: number[];
  metadata: Record<string, string | number | boolean>;
};

export interface VectorStore {
  upsert(records: VectorRecord[]): Promise<void>;
  query(params: {
    embedding: number[];
    topK: number;
    filter?: Record<string, unknown>;
  }): Promise<{ id: string; score: number }[]>;
  delete(ids: string[]): Promise<void>;
}
```

Implement PineconeStore, WeaviateStore, and PgVectorStore behind that interface. Your ingest worker and chat route depend on VectorStore, not vendor SDK shapes. Integration tests hit an in-memory fake; staging hits the real backend.
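The in-memory fake could look something like this. It is a test double under stated assumptions: filters are treated as exact-match equality on metadata keys, scoring is plain cosine similarity, and the types are repeated so the block stands alone.

```typescript
type VectorRecord = {
  id: string;
  embedding: number[];
  metadata: Record<string, string | number | boolean>;
};
type QueryParams = { embedding: number[]; topK: number; filter?: Record<string, unknown> };
type Hit = { id: string; score: number };

interface VectorStore {
  upsert(records: VectorRecord[]): Promise<void>;
  query(params: QueryParams): Promise<Hit[]>;
  delete(ids: string[]): Promise<void>;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}

class InMemoryVectorStore implements VectorStore {
  private records = new Map<string, VectorRecord>();

  async upsert(records: VectorRecord[]): Promise<void> {
    for (const r of records) this.records.set(r.id, r); // same ID overwrites
  }

  async query(params: QueryParams): Promise<Hit[]> {
    const filter = params.filter ?? {};
    // Assumption: a filter is a flat map of metadata keys to exact values.
    const matches = (r: VectorRecord) =>
      Object.keys(filter).every((key) => r.metadata[key] === filter[key]);
    return Array.from(this.records.values())
      .filter(matches)
      .map((r) => ({ id: r.id, score: cosine(r.embedding, params.embedding) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, params.topK);
  }

  async delete(ids: string[]): Promise<void> {
    for (const id of ids) this.records.delete(id);
  }
}
```

Real backends support richer filter syntax than equality, so each adapter still needs its own tests. The fake only proves that your ingest and chat code respect the interface.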
Keep embedding generation separate too. The same OpenAI, Cohere, or local model output feeds any store. Version embedding models in metadata so re-index jobs are explicit when you change dimensions or providers.
Log retrieval traces: query text hash, top IDs, scores, latency, store name. When someone asks why answers got worse after a migration, you want diffs—not guesses.
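A retrieval trace can be as small as one JSON line per query. The sketch below assumes a `Hit` shape of ID plus score and hashes the query text with SHA-256 so raw user input stays out of logs; `buildTrace` and `traced` are hypothetical names, and the field names are only a suggestion.

```typescript
import { createHash } from "node:crypto";

type Hit = { id: string; score: number };

type RetrievalTrace = {
  queryHash: string; // SHA-256 hex digest of the query text
  store: string; // which backend answered, e.g. "pgvector"
  topIds: string[];
  scores: number[];
  latencyMs: number;
  at: string; // ISO-8601 timestamp
};

function buildTrace(
  queryText: string,
  store: string,
  hits: Hit[],
  latencyMs: number
): RetrievalTrace {
  return {
    queryHash: createHash("sha256").update(queryText).digest("hex"),
    store,
    topIds: hits.map((h) => h.id),
    scores: hits.map((h) => h.score),
    latencyMs,
    at: new Date().toISOString(),
  };
}

// Wrap any retrieval call, time it, and emit one JSON line per query.
async function traced(
  queryText: string,
  store: string,
  run: () => Promise<Hit[]>,
  log: (line: string) => void = console.log
): Promise<Hit[]> {
  const start = Date.now();
  const hits = await run();
  log(JSON.stringify(buildTrace(queryText, store, hits, Date.now() - start)));
  return hits;
}
```

With the same query hash logged before and after a migration, comparing top IDs per hash becomes a simple join over two log files.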
Cost, ops, and the stuff nobody puts in the README
Pinecone bills on storage, read units, and write units—or pod capacity. Spiky reindex jobs can surprise you if you embed an entire corpus twice because chunk overlap changed. Rate-limit ingest and watch the dashboard during migrations.
Weaviate self-hosted costs are EC2, disk, and engineer hours. Cloud pricing bundles some of that. Factor in backup storage and restore drills—search indexes are stateful and painful to rebuild from scratch.
pgvector costs look like Postgres costs: bigger instance, more IOPS, longer backups. Often cheaper at small scale, but you pay in query tuning and index maintenance. Run EXPLAIN ANALYZE on hot retrieval queries the same way you would for billing reports.
Dimension count and distance metric must match at ingest and query time. Cosine versus inner product versus L2 trips teams constantly. Pick one, document it in your ingest config, and enforce it in CI with a schema check.
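One way to enforce that in CI is a tiny config check that runs before any ingest or query code. The `EmbeddingConfig` shape and function names are assumptions for illustration; the operator map reflects pgvector's three distance operators.

```typescript
type Metric = "cosine" | "inner_product" | "l2";

type EmbeddingConfig = { model: string; dimension: number; metric: Metric };

// pgvector distance operators per metric. Note that <#> returns the
// negative inner product, so ascending order still means "closest first".
const PGVECTOR_OPERATOR: Record<Metric, string> = {
  cosine: "<=>",
  inner_product: "<#>",
  l2: "<->",
};

// Compare the config used at ingest with the one used at query time.
function configMismatches(ingest: EmbeddingConfig, query: EmbeddingConfig): string[] {
  const problems: string[] = [];
  if (ingest.model !== query.model) {
    problems.push(`model: ${ingest.model} vs ${query.model}`);
  }
  if (ingest.dimension !== query.dimension) {
    problems.push(`dimension: ${ingest.dimension} vs ${query.dimension}`);
  }
  if (ingest.metric !== query.metric) {
    problems.push(`metric: ${ingest.metric} vs ${query.metric}`);
  }
  return problems;
}

// Guard individual vectors before they reach the store.
function assertDimension(embedding: number[], config: EmbeddingConfig): void {
  if (embedding.length !== config.dimension) {
    throw new Error(
      `expected ${config.dimension} dimensions for ${config.model}, got ${embedding.length}`
    );
  }
}
```

A CI step can fail the build whenever `configMismatches` returns a non-empty list, which turns a silent relevance regression into a loud red check.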
Metadata size limits differ. Pinecone caps metadata payload per record. Weaviate property types constrain what you store. Postgres accepts large text but you should not store megabyte chunks in the vector row—store pointers, hydrate after retrieval.
Migration paths when you outgrow your first choice
Teams outgrow pgvector when single-node CPU, index build times, or cross-region read patterns hurt more than migration cost. The usual path: dual-write new chunks to Pinecone or Weaviate, backfill historical vectors in batches, flip a feature flag on query path, drain the old index.
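The dual-write and feature-flag steps can be expressed as a wrapper around two stores. This is a sketch, assuming a `VectorStore` interface like the one earlier in this article (repeated here so the block stands alone); `DualWriteStore` and the `readFromNew` flag callback are hypothetical names.

```typescript
type VectorRecord = {
  id: string;
  embedding: number[];
  metadata: Record<string, string | number | boolean>;
};
type QueryParams = { embedding: number[]; topK: number; filter?: Record<string, unknown> };
type Hit = { id: string; score: number };

interface VectorStore {
  upsert(records: VectorRecord[]): Promise<void>;
  query(params: QueryParams): Promise<Hit[]>;
  delete(ids: string[]): Promise<void>;
}

class DualWriteStore implements VectorStore {
  private readonly oldStore: VectorStore;
  private readonly newStore: VectorStore;
  private readonly readFromNew: () => boolean;

  constructor(oldStore: VectorStore, newStore: VectorStore, readFromNew: () => boolean) {
    this.oldStore = oldStore;
    this.newStore = newStore;
    this.readFromNew = readFromNew;
  }

  async upsert(records: VectorRecord[]): Promise<void> {
    // Write to the old store first: it is still the one serving traffic.
    await this.oldStore.upsert(records);
    await this.newStore.upsert(records);
  }

  async query(params: QueryParams): Promise<Hit[]> {
    // The flag is read per request, so cutover and rollback need no deploy.
    return this.readFromNew()
      ? this.newStore.query(params)
      : this.oldStore.query(params);
  }

  async delete(ids: string[]): Promise<void> {
    await this.oldStore.delete(ids);
    await this.newStore.delete(ids);
  }
}
```

The wrapper does not handle a write that succeeds on one side and fails on the other. In practice you would pair it with a retry queue or a reconciliation job that compares IDs between stores.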
Moving off Pinecone means exporting vectors and metadata, then upserting elsewhere. Stable chunk IDs make that boring. Changing IDs creates duplicate retrieval ghosts—avoid that.
Weaviate migrations often hinge on schema mapping. Property names and class structures must translate cleanly. Script the transform in TypeScript, validate record counts, and run shadow queries comparing top-k overlap before cutover.
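Top-k overlap for shadow queries is simple to compute once both stores return ordered ID lists. The definition below (shared IDs divided by the larger list length, ignoring rank order) is one reasonable choice rather than a standard; `topKOverlap` and `meanOverlap` are hypothetical names.

```typescript
// Fraction of IDs shared between the top-k results of two stores.
// 1 means identical sets, 0 means nothing in common. Order is ignored.
function topKOverlap(oldIds: string[], newIds: string[], k: number): number {
  const oldTop = new Set(oldIds.slice(0, k));
  const newTop = new Set(newIds.slice(0, k));
  const size = Math.max(oldTop.size, newTop.size);
  if (size === 0) return 1; // both empty: trivially in agreement
  let shared = 0;
  newTop.forEach((id) => {
    if (oldTop.has(id)) shared++;
  });
  return shared / size;
}

// Average overlap across a batch of shadow queries.
function meanOverlap(pairs: { oldIds: string[]; newIds: string[] }[], k: number): number {
  if (pairs.length === 0) return 1;
  const total = pairs.reduce((sum, p) => sum + topKOverlap(p.oldIds, p.newIds, k), 0);
  return total / pairs.length;
}
```

Two approximate indexes rarely agree perfectly, so set the cutover threshold from a baseline (for example, the old store compared against itself with different index settings) instead of expecting 1.0.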
There is no shame in starting with pgvector and migrating later. Many production RAG systems do exactly that. The interface pattern above confines the change to one adapter, which can shorten the migration considerably.
Summary
Pinecone, Weaviate, and pgvector are not interchangeable badges—they solve overlapping problems with different ops contracts. Pinecone optimizes for managed scale and fast Node.js integration. Weaviate optimizes for hybrid search and rich object modeling. pgvector optimizes for teams already living in Postgres who want vectors beside relational truth.
For a TypeScript RAG backend, pick based on search behavior, existing infra, and who wakes up when the index blinks, not on which logo appeared most in conference talks. Wrap a thin VectorStore interface, log retrieval traces, and measure hit rate on real user questions. The right choice is the one that still feels correct six months after launch.
FAQ
Which vector database is best for a small Node.js RAG MVP?
If you already run Postgres, start with pgvector on the same instance and keep chunks in SQL. If you do not run Postgres and want zero ops, Pinecone serverless gets you querying fastest. Weaviate is worth the setup when you know hybrid keyword search will matter on day one.
Can I use pgvector and Pinecone together?
Yes, often during migration. Postgres remains the source of truth for text and permissions while Pinecone serves low-latency search at scale. Dual-write ingest with stable chunk IDs until you cut over query traffic completely.
Does Weaviate replace Elasticsearch for text search?
For many RAG products, yes—especially when hybrid vector plus BM25 covers your needs. If you rely on heavy aggregations, complex analyzers, or log-scale observability search, keep Elasticsearch for that workload and use Weaviate for document chunks.
How many vectors can pgvector handle before I need a dedicated store?
There is no fixed number. Hundreds of thousands to a few million 1536-dimension vectors on a well-sized instance with HNSW is common. Past that, watch p99 query latency, index rebuild duration, and vacuum pressure. Let metrics push you out, not blog post thresholds.
What embedding dimensions do these databases support?
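All three support common embedding sizes such as 768, 1024, and 1536, subject to version and index type. Check limits before going larger: pgvector, for example, indexes the vector type only up to 2,000 dimensions, so 3072-dimension embeddings need halfvec, dimension reduction, or exact search. Your index must be created with the same dimension as your model output. Changing models usually means a new index and a full re-embed job.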
Should I store full chunk text in the vector database?
Usually no for large chunks. Store a snippet or pointer in metadata, keep full text in Postgres or object storage, and hydrate after retrieval. That keeps metadata limits happy and avoids duplicating large strings across systems you pay to store.