Memoria: A Technical Overview of Venice's Memory System

A deep dive into how Venice remembers your conversations while keeping your data private.

Venice.ai

Venice is proud to introduce Memoria, our privacy-preserving memory system that enables AI to remember context from your past conversations. Unlike traditional cloud-based memory systems, Memoria stores all data locally in your browser, ensuring that your memories never leave your device.

You can toggle Memory in Venice's settings.

This technical overview explains how Memoria works, what data it stores, and answers common questions about the feature.

How Memoria Works

Memoria uses vector embeddings to understand and retrieve relevant memories. When you chat with Venice:

1. During conversation: Your messages are converted into mathematical representations (vectors) that capture their meaning

2. Automatic extraction: Every few messages, Venice extracts key information and insights from the conversation

3. Intelligent retrieval: When you start a new conversation, Memoria searches for relevant past memories and provides them as context to the AI

This creates a more personalized experience where the AI can reference things you've discussed before, remember your preferences, and build on previous conversations.
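The retrieval step can be sketched as a similarity search over stored vectors. This is a minimal TypeScript illustration; the `Memory` shape and `retrieveRelevant` name are hypothetical, and Venice's actual pipeline adds the salting, quantization, and hybrid keyword search covered later in this post.

```typescript
// Hypothetical shape of a stored memory (illustrative, not Venice's schema).
interface Memory {
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Rank stored memories against the query embedding and return the top k.
function retrieveRelevant(query: number[], memories: Memory[], k = 3): Memory[] {
  return [...memories]
    .sort((m1, m2) => cosine(query, m2.embedding) - cosine(query, m1.embedding))
    .slice(0, k);
}
```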

What Data Is Stored?

Memoria stores the following data locally in your browser:

| Data Type | Description | Example |
| --- | --- | --- |
| Memory Text | Summary of conversation insights | "User is learning Python programming" |
| Vector Embedding | 1024-dimensional mathematical representation | Compressed to ~1.4KB per memory |
| Sparse Tokens | Keywords for hybrid search | ["python", "programming", "learning"] |
| Source | Where the memory came from | "venice" (auto-extracted), "manual", or file identifier |
| Memory Type | Classification of the memory | "extracted_summary", "user" |
| Importance Score | 1-10 rating of relevance | Lower = more important |
| Timestamps | Creation and access times | ISO date strings |

Privacy Guarantees

1. Local-only storage: All memory data is stored in IndexedDB within your browser. It never leaves your device unless explicitly shared.

2. Vector salting: Even the mathematical representations of your memories are transformed using a user-specific cryptographic salt derived from your encryption key. This means:

  • Your embeddings are unique to you

  • They cannot be correlated with other users' data

  • Even if intercepted, they cannot be reverse-engineered

3. No server-side storage: Venice servers only generate embeddings transiently. They do not store your memories or the content used to create them.

4. Anonymized model protection: You can disable memory sharing with third-party ("anonymized") models that may have different privacy guarantees than Venice's private models.
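One possible construction for salting, sketched below: a user-specific salt seeds a deterministic sign flip across dimensions. This is purely illustrative; Venice's actual transform is not specified in this post, and both function names are hypothetical.

```typescript
// Tiny deterministic PRNG seeded from a salt string (illustrative only;
// not a cryptographic primitive).
function seededRng(salt: string): () => number {
  let h = 2166136261;
  for (const c of salt) {
    h ^= c.charCodeAt(0);
    h = Math.imul(h, 16777619);
  }
  return () => {
    h = Math.imul(h ^ (h >>> 15), 2246822519);
    h = Math.imul(h ^ (h >>> 13), 3266489917);
    return ((h ^= h >>> 16) >>> 0) / 4294967296;
  };
}

// Flip the sign of each dimension based on the user's salt, so the same
// text produces different vectors for different users.
function saltEmbedding(vec: number[], salt: string): number[] {
  const rng = seededRng(salt);
  return vec.map((v) => (rng() < 0.5 ? -v : v));
}
```

A nice property of sign-flip transforms: two vectors salted with the same salt keep their original dot product (the ±1 factors cancel), so search quality within one user's store is unaffected, while vectors cannot be matched across users with different salts.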

Chat Memory vs. Character Memory

Memoria provides two separate memory pools:

Chat Memory

  • Used in regular conversations (non-character chats)

  • Enabled via Settings → Memory → Chat Memory toggle

  • Memories are scoped to a special "chat_memory" identifier

  • Ideal for general knowledge about you, your preferences, and ongoing projects

Character Memory

  • Used when chatting with AI characters

  • Each character has its own isolated memory pool

  • Memories are scoped by character ID

  • Enables characters to "remember" your relationship and past conversations

Important: Chat Memory and Character Memory are completely separate. Memories from regular chats won't appear in character conversations, and vice versa.
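The scoping described above can be sketched as memory pools keyed by a scope identifier. The names here are illustrative; only the "chat_memory" identifier and per-character scoping come from the text.

```typescript
// Regular chats use the "chat_memory" scope; each character gets its own id.
type Scope = "chat_memory" | `character:${string}`;

const pools = new Map<Scope, string[]>();

// Store a memory under one scope only.
function remember(scope: Scope, memory: string): void {
  const pool = pools.get(scope) ?? [];
  pool.push(memory);
  pools.set(scope, pool);
}

// Retrieval never crosses scopes: other pools are simply not visible here.
function recall(scope: Scope): string[] {
  return pools.get(scope) ?? [];
}
```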

Document Uploads

You can enhance memory by uploading documents (PDF or text files):

How Document Processing Works

1. Text extraction: Documents are parsed to extract readable text

2. Chunking: Large documents are split into overlapping segments (~1200 characters with 200-character overlap)

3. Embedding: Each chunk is converted to a vector embedding

4. Storage: Chunks are stored with a unique source identifier derived from the filename
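The chunking step above can be sketched as a sliding window: ~1200-character segments advancing by 1000 characters, so consecutive chunks share a 200-character overlap. The function name is illustrative.

```typescript
// Split text into overlapping chunks (defaults from the post: ~1200 chars
// per chunk with a 200-char overlap between neighbors).
function chunkText(text: string, size = 1200, overlap = 200): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance 1000 chars per chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap ensures that a fact straddling a chunk boundary still appears intact in at least one chunk.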

Using Document Memories

  • Enable/Disable per document: You can toggle individual documents on/off without deleting them

  • Source filtering: Disabled documents are excluded from memory searches

  • File limits: Up to 50 documents for Chat Memory, 15 per character

Best Practices for Documents

  • Upload reference material you want the AI to remember

  • Use clear, descriptive filenames (the filename is included in the first chunk)

  • Supported formats: PDF, plain text, and other text-based files

  • Password-protected PDFs are not supported

Frequently Asked Questions

Does disabling Chat Memory delete all my memories?

No. Disabling Chat Memory only stops the AI from accessing and creating new memories. Your existing memories remain stored in your browser and will be available again if you re-enable the feature.

To actually delete memories, you must explicitly delete them through:

Settings → Memory → Interactions → Delete individual memories

Does clearing browser data / logging out delete my memories?

Yes, potentially. Since memories are stored in IndexedDB (browser local storage):

- Clearing site data for venice.ai will delete all memories

- Clearing all browser data will delete all memories

- Logging out does not delete memories (they remain in IndexedDB)

- Using a different browser or device means you won't have access to memories stored elsewhere

Recommendation: Consider your memories as browser-specific. If you use multiple devices, each will have its own independent memory store.

Does restoring from a backup restore my memories?

No, not currently. The backup/restore system backs up your conversations, characters, and settings, but memories are not included in backups.

This is a known limitation. Memories are stored in a separate database structure optimized for vector search, and backup integration is planned for a future release.

How do I best use this feature?

For Chat Memory:

1. Enable Chat Memory in Settings → Memory

2. Optionally disable "Auto-generate memories" if you prefer manual control

3. Add important facts manually using the "Add Memory" button

4. Upload reference documents you want the AI to remember

5. Periodically review and clean up irrelevant memories

For Character Memory:

1. Enable Character Memory in the character's settings

2. Use the "Extraction prompt" field to customize what the AI remembers about conversations

3. Upload character-specific documents (lore, backstory, reference material)

4. Review memories in the character's Memory tab

Pro tips:

- Memories work best when they're concise and factual

- The AI retrieves the most relevant memories based on your current message

- You can edit memories to correct or refine them

- Toggle off documents temporarily rather than deleting if you might need them later

What data is shared with anonymized (third-party) models?

When using models not hosted directly by Venice (marked as "Anonymized"):

- By default: Memories are NOT shared with these models

- If enabled: The "Share memories with anonymized models" toggle allows memory context to be sent

- Privacy note: Third-party providers may have different data retention policies than Venice

We recommend keeping this toggle off unless you specifically need memory context with a particular third-party model.

Technical Deep Dive

Hybrid Search Algorithm

Memoria uses a sophisticated hybrid search combining:

1. Dense vector search: FAISS-based similarity search using compressed int8 vectors

2. Sparse BM25-style search: Keyword matching for precise term recall

3. Reciprocal Rank Fusion (RRF): Combines both approaches with adaptive weighting

The search dynamically adjusts its strategy based on your query:

- Short queries (≤4 words) favor keyword matching

- Longer queries favor semantic similarity

- Strong keyword matches boost the sparse component
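The fusion step can be sketched as follows: each result's fused score is a weighted sum of 1/(k + rank) across the dense and sparse rankings. The constant k = 60 is RRF's common default, and the fixed weighting here is a simplification of the adaptive scheme described above, not Venice's exact values.

```typescript
// Reciprocal Rank Fusion over two ranked id lists.
function rrfFuse(
  denseRanking: string[],   // ids ordered by vector similarity
  sparseRanking: string[],  // ids ordered by BM25-style keyword score
  denseWeight = 0.5,
  k = 60,
): string[] {
  const scores = new Map<string, number>();
  const add = (ids: string[], w: number) =>
    ids.forEach((id, rank) =>
      scores.set(id, (scores.get(id) ?? 0) + w / (k + rank + 1)),
    );
  add(denseRanking, denseWeight);
  add(sparseRanking, 1 - denseWeight);
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

The adaptive behavior in the list above would correspond to raising `denseWeight` for long queries and lowering it for short, keyword-heavy ones.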

Memory Extraction

Every 3rd assistant response, Memoria automatically extracts insights:

1. Collects the last 5 user messages and 5 assistant messages

2. Sends to an extraction model with a specialized prompt

3. Receives a summary and importance score

4. Filters out unimportant memories (those scored above 8; lower scores mean higher importance on this scale)

5. Stores the memory with a salted embedding
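The extraction gate described in the steps above can be sketched like this. The function names are illustrative; the cadence (every 3rd response), the 5+5 message window, and the importance cutoff come from the text.

```typescript
interface Msg {
  role: "user" | "assistant";
  text: string;
}

// Trigger extraction on every 3rd assistant response.
function shouldExtract(assistantResponseCount: number): boolean {
  return assistantResponseCount > 0 && assistantResponseCount % 3 === 0;
}

// Collect the last 5 user and last 5 assistant messages as extraction input.
function extractionWindow(history: Msg[]): Msg[] {
  const users = history.filter((m) => m.role === "user").slice(-5);
  const assistants = history.filter((m) => m.role === "assistant").slice(-5);
  return [...users, ...assistants];
}

// Keep a memory only if it scores 8 or lower (lower = more important).
function keepMemory(importance: number): boolean {
  return importance <= 8;
}
```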

Compression & Efficiency

To minimize storage impact:

- Embeddings are quantized from float32 to int8 (~75% reduction)

- Quantized embeddings are base64-encoded for storage; even after base64's ~33% size overhead, the net saving versus raw float32 is ~67%

- Each memory uses approximately 1.4KB of storage

- The browser quota for IndexedDB is typically 100MB-1GB, supporting thousands of memories
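The int8 quantization step can be sketched as scaling each float32 value into the [-127, 127] range by the vector's maximum magnitude, storing one byte per dimension plus a scale factor. The exact scheme Venice uses is not published; this is one common approach.

```typescript
// Quantize a float vector to int8 with a per-vector scale.
function quantize(vec: number[]): { data: Int8Array; scale: number } {
  const maxAbs = Math.max(...vec.map(Math.abs), 1e-12);
  const scale = maxAbs / 127;
  const data = new Int8Array(vec.map((v) => Math.round(v / scale)));
  return { data, scale };
}

// Recover approximate float values for similarity search.
function dequantize(q: { data: Int8Array; scale: number }): number[] {
  return [...q.data].map((v) => v * q.scale);
}
```

For a 1024-dimensional vector this is 1,024 bytes before encoding (vs. 4,096 as float32); base64 expands that to roughly 1,366 bytes, consistent with the ~1.4KB-per-memory figure above.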

Privacy Summary

| Aspect | Status |
| --- | --- |
| Memory storage location | Your browser only (IndexedDB) |
| Server-side memory storage | None |
| Cross-device sync | Not supported |
| Encryption | Salted embeddings with user-specific keys |
| Third-party model access | Off by default, user-controlled |
| Backup inclusion | Not yet supported |
| Data portability | Browser-specific |

Memory is now on Venice with Memoria

Memoria represents Venice's commitment to advancing AI capabilities without compromising privacy. By keeping your memories local and using cryptographic techniques to protect even the mathematical representations of your data, we've built a system that gives you the benefits of persistent AI memory while maintaining the privacy principles Venice was founded on.

Have questions or feedback about Memoria? Join the conversation in our Discord.
