Cognee + LanceDB: Simplifying RAG for Developers
We love building with Retrieval-Augmented Generation (RAG), but if you’ve worked with it, you’re likely familiar with the challenges it brings. While RAG enhances Large Language Models (LLMs) by integrating external data sources, the data preparation process, which involves ETL pipelines, metadata management, and vector database integration, is often anything but smooth. Even experienced developers can find production-grade RAG systems tricky to orchestrate. And that complexity only grows when you’re running code in parallel.
We discovered this first-hand while building cognee. Our automated tests kept stepping on each other’s toes whenever they hit the same vector database collections. LanceDB gave us something specific we needed: the ability to run a local instance for each test environment. We didn’t need a specialized cluster - just a simple, easy-to-destroy local store for vectors. Today, LanceDB powers all our automated tests, and we also use it as the default vector database during development for the same reason.
By pairing cognee’s graph-driven approach with LanceDB - a multimodal AI database built on top of the Lance columnar data format - developers can build robust RAG systems without the usual operational headaches. In this article, we’ll explore the limitations of traditional RAG implementations, how cognee redefines workflows, and how LanceDB simplifies the process. Plus, we’ll do a walkthrough of a practical example to show how it all fits together.
Why RAG Falls Short
LLMs promise intuitive access to data, yet conventional approaches like RAG, which are supposed to make their outputs more accurate and relevant, often stumble due to:
- Static Representations: RAG methods depend on precomputed vectors, which fail to adapt dynamically as data changes.
- Limited Context Understanding: Relationships between data points are often lost in vector-only systems.
- Inefficiency at Scale: Large-scale retrieval from unstructured data stores becomes computationally expensive.
How Do Graphs Tackle These Problems?
Graphs serve as powerful tools for representing relationships between data points. Unlike static vector systems, graphs map dynamic, interconnected data structures, making them especially useful for:
- Dynamic Data Representation: Graphs adapt as data changes, preserving the validity of data. Unlike RAG’s reliance on precomputed (and potentially stale) vectors, graph-based approaches stay up-to-date, keeping context and relationships current.
- Enhanced Semantic Understanding: Graph-based structures align naturally with human reasoning, creating relationships between relevant information.
- Scalable Data Handling: Graphs enable efficient querying and retrieval even in complex datasets.
Cognee leverages these advantages by marrying semantic graph models with vector databases for real-time interaction.
How Cognee Transforms RAG
Cognee is a memory engine that provides a unified semantic layer for data management. At its core, cognee combines graph semantics (contextual relationships) with modular VectorDB adapters (fast similarity search) to deliver an accurate and efficient data interaction experience for AI apps and agents.
Key Features of Cognee
- Unified Ingestion: One of cognee’s standout features is its ability to process diverse data formats. By integrating structured, semi-structured, and unstructured data seamlessly, cognee creates a consistent memory layer that supports efficient querying, analysis, and application development without format-specific preprocessing.
- DataPoint Model: In cognee, every piece of data - whether it’s a file, database record, or chunk of information - is represented as a “DataPoint.” Each DataPoint is stored as a node in a graph and enriched with semantic metadata and vector embeddings. This approach preserves the context and relationships among DataPoints, making it easier to perform accurate semantic searches, navigate interconnected data, and retrieve information efficiently.
- Graph + Vector Hybrid: By merging the scalability of vectors with the contextual depth of graphs, cognee delivers both performance and semantic richness for real-time data interactions.
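To make the DataPoint idea concrete, here is a minimal, stdlib-only sketch of how data points and the edges between them could be modeled. The class and field names are illustrative stand-ins, not cognee’s actual definitions:

```python
from dataclasses import dataclass, field

# Illustrative sketch only -- not cognee's actual class definitions.
@dataclass
class DataPoint:
    id: str
    text: str
    vector: list = field(default_factory=list)   # embedding, filled in by a VectorDB adapter
    payload: dict = field(default_factory=dict)  # semantic metadata attached to the graph node

# Edges between DataPoints carry the graph's contextual relationships.
@dataclass
class Edge:
    source_id: str
    target_id: str
    relation: str

doc = DataPoint(id="doc-1", text="LanceDB stores vectors in Arrow tables.",
                payload={"type": "document"})
chunk = DataPoint(id="chunk-1", text="Arrow enables zero-copy reads.",
                  payload={"type": "chunk"})
edge = Edge(source_id="doc-1", target_id="chunk-1", relation="contains")
```

The key property is that the same record carries both graph context (edges, payload) and a vector slot, so semantic search and graph traversal operate over one representation.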
Database Adapters
Cognee integrates with multiple VectorDBs through its modular adapter design. Adapters act as bridges between cognee’s internal systems (data ingestion, embedding, and querying) and the storage infrastructure it runs on, letting users work with various backends without dealing with database-specific complexities.
VectorDB Adapters and LanceDB
Cognee’s LanceDB adapter leverages Apache Arrow’s in-memory columnar storage format for lightning-fast queries and analytics.
LanceDB innovates on traditional VectorDB design by:
- Using Apache Arrow for efficient, memory-optimized data handling. Apache Arrow provides:
  - Columnar Data Storage: Optimizes memory usage and speeds up analytical queries by aligning data contiguously in memory.
  - Interoperability: Arrow’s format supports seamless data sharing between different systems and programming languages.
  - Zero-Copy Reads: Data operations in Arrow avoid serialization/deserialization overhead, making it ideal for high-performance systems like LanceDB.

  In LanceDB, vectors and metadata are stored as Arrow tables, enabling rapid access and updates while maintaining scalability.
- Supporting dynamic schema evolution to accommodate changing data models.
- Providing seamless query integration with graph-based systems.
Cognee also embraces emerging data formats like Iceberg, an open table format designed for data lakes. Iceberg simplifies the management of large-scale datasets by supporting features like schema evolution, time travel queries, and efficient partitioning. Unlike traditional table formats, Iceberg maintains high performance even when working with petabyte-scale data.
For developers, Iceberg’s time travel capabilities allow for analysis of historical data states, enabling auditing and debugging without impacting current workflows. Schema evolution in Iceberg allows data structures to adapt to new requirements seamlessly, providing the flexibility needed for dynamic and scalable RAG systems. Read more about it in this blog post.
LanceDB Integration with Cognee
Cognee interacts with LanceDB through its LanceDBAdapter, which covers the following responsibilities:
- Setup: LanceDB can store data either in a local file database or on a remote instance. When no api_key is provided, cognee defaults to using the local file database. The LanceDBAdapter uses an EmbeddingEngine to convert textual or raw data into vector representations; the engine to use is specified when the adapter is instantiated.
- Connection Management: The LanceDBAdapter establishes an asynchronous connection to LanceDB via lancedb.AsyncConnection, ensuring non-blocking interactions. Connections are lazily initialized, so no overhead is incurred until a connection is actually required.
- Vectorization: Data in cognee is represented as DataPoint objects, which define what information needs to be vectorized before storage. The embedding engine supplied at instantiation performs this conversion, making raw data compatible with LanceDB’s vector search capabilities.
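As a stand-in for a real EmbeddingEngine (which would call an actual embedding model such as a hosted API or a local sentence-transformer), here is a toy, deterministic engine that illustrates the interface shape. Everything in it is hypothetical:

```python
import hashlib

class HashEmbeddingEngine:
    """Toy, deterministic stand-in for an embedding model -- illustrative only."""

    def __init__(self, dimensions: int = 8):
        self.dimensions = dimensions

    def embed_text(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Map the first `dimensions` bytes of the hash into [0, 1] floats.
            vectors.append([b / 255.0 for b in digest[: self.dimensions]])
        return vectors

engine = HashEmbeddingEngine()
vecs = engine.embed_text(["hello", "world"])
```

A real engine would produce semantically meaningful vectors; the point here is only the contract: text in, fixed-dimension float vectors out.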
- Schema Definition with Arrow: LanceDB requires a schema for its collections. The LanceDBAdapter dynamically defines schemas based on the DataPoint structure using LanceModel:
  - id: A unique identifier for the data point, stored as a string.
  - vector: The embedded vector representation, stored as an Arrow vector column.
  - payload: Additional metadata, stored as Arrow-compatible fields for efficient querying.
- Data Insertion: The create_data_points method embeds raw data, converts it into the LanceDB schema, and inserts it into the database using Arrow’s optimized storage. Key features:
  - Batch Insertion: Groups data points for efficient insertion.
  - Merge Insert: Updates existing records and inserts new ones seamlessly.
  - Arrow Table: Data points are converted into Arrow-compatible rows for optimized storage.
- Querying: LanceDB supports fast, vector-based querying. Search happens in three steps:
  - Use vector_search to find similar vectors in the collection.
  - Convert the results to Pandas DataFrames for easy manipulation.
  - Normalize and return scores for relevance ranking.
- Data Management: The adapter includes robust support for data management:
  - Deletion: removing stored data points from a collection.
  - Pruning Graph Data: clearing out graph-derived vector data entirely when a clean slate is needed.
See the full implementation of the LanceDBAdapter here.
Example notebook with cognee - Multimedia files as input
Using cognee is as straightforward as the multimedia example demonstrates. As mentioned earlier, each piece of data in cognee is captured as a DataPoint and placed in a semantic graph, while its vector representation is stored through a VectorDB adapter. When you run cognify(), cognee analyzes these DataPoints and links them into a graph structure based on contextual cues and relationships. Under the hood, the adapter handles embedding and indexing, allowing cognee to blend graph-based relationships with efficient vector retrieval. This architecture lets developers navigate data semantically while still leveraging high-performance vector queries, without having to manage the complexities of low-level storage.
For a detailed, interactive walkthrough of this implementation, you can go over the Colab notebook here.
How Cognee Utilizes LanceDB
While building the knowledge graph, cognee needs a way to index graph entities and relationships. These indices serve as shortcuts to nodes in the graph, enabling fast retrieval later. This is especially useful in search scenarios where entities and relationships extracted from the query don’t perfectly match those in the graph: LanceDB’s similarity search matches them against existing ones, allowing the search to focus on the right nodes.
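The matching idea can be sketched with plain cosine similarity. The node names and vectors below are invented, and in cognee the index lookup itself is delegated to LanceDB:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Index of graph-node embeddings (in cognee this lives in LanceDB).
node_index = {
    "Apache Arrow": [0.9, 0.1, 0.0],
    "Iceberg": [0.1, 0.8, 0.2],
}

# Embedding of an entity extracted from the user's query (illustrative values).
query_entity_vec = [0.85, 0.2, 0.05]

# The most similar indexed node is the graph entry point for this entity.
best = max(node_index, key=lambda name: cosine(query_entity_vec, node_index[name]))
```

Even though the query entity’s vector matches no node exactly, similarity search still lands on the closest graph node, which then anchors the traversal.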
Why Cognee Chose LanceDB
LanceDB solves a very real developer pain: external infrastructure. We discovered LanceDB while developing our first automated tests. We needed an environment that could be easily built and destroyed for each test run. A key challenge was preventing data interference when running multiple tests in parallel—tests accessing the same vector database collections could affect each other's results.
Our solution was to use a separate LanceDB local vector database for each test run. This approach provides a clean environment for every test, ensuring complete data isolation.
Today, LanceDB powers all our automated tests, and several team members use it as their primary vector database during development.
The Future of Scalable Data Workflows Starts Here
Backed by LanceDB’s vector storage features, its innovative use of Arrow, and its flexibility to embrace emerging formats like Iceberg, cognee is transforming the way we handle data. This approach lets developers efficiently build scalable, dynamic RAG workflows that adapt to changing data and enhance application performance.
If you’re curious to see what cognee and LanceDB can do, try it in cognee’s repo or visit cognee.ai to learn more - then be sure to join the vibrant community of data enthusiasts who are shaping this exciting frontier together.