
Add LangCache API integration for semantic caching #321


Closed
wants to merge 14 commits into 0.6.0

Conversation

abrookins
Collaborator

@abrookins abrookins commented Apr 17, 2025

Description

This PR adds integration with the LangCache SDK to provide enhanced LLM response
caching in RedisVL. The new LangCache class extends our existing LLM caching
functionality with a RedisVL-compatible interface backed by the LangCache API
for semantic caching.

Features

  • New LangCache class that implements the BaseLLMCache interface
  • Semantic similarity search for finding relevant cached responses
  • Configurable distance thresholds for controlling cache hit precision
  • Optional TTL support for cache entries
  • Support for entry scopes to organize and manage cache entries
  • Full async support for all operations
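
A rough usage sketch follows. The check method and its prompt/vector parameters appear in the diff below; the constructor parameters, store, and the async variants are assumptions based on RedisVL's existing SemanticCache:

from redisvl import LangCache  # top-level import, per the note under Gaps below

cache = LangCache(
    redis_url="redis://localhost:6379",  # parameter names assumed
    distance_threshold=0.2,              # controls cache hit precision
    ttl=300,                             # optional TTL for entries, in seconds
)

# Store a prompt/response pair, then look it up by semantic similarity.
cache.store(prompt="What is semantic caching?", response="Caching keyed on meaning.")
hits = cache.check(prompt="Explain semantic caching")

# Async variants (names assumed): await cache.astore(...), await cache.acheck(...)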

Implementation Details

  • Integrates with the LangCache SDK for core functionality
  • Maintains compatibility with existing RedisVL patterns and interfaces
  • Includes comprehensive test coverage for all features
  • Supports both Redis client and URL-based initialization
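
For example, the two initialization styles might look like this (parameter names are assumptions based on patterns used elsewhere in RedisVL):

import redis

# URL-based initialization (continuing the sketch above):
cache = LangCache(redis_url="redis://localhost:6379")

# Or reuse an existing Redis client:
client = redis.Redis(host="localhost", port=6379)
cache = LangCache(redis_client=client)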

Dependencies

  • Vendors the langcache package in the repository (temporary, until SDK is on PyPI)

Gaps / Open Questions

  • No integration tests against the live API yet; I may leave those as a TODO
  • This is available as a top-level import: should it be in an experimental module or otherwise indicate that it's experimental?

@abrookins abrookins changed the base branch from main to 0.6.0 April 18, 2025 00:56
@abrookins abrookins changed the title WIP on LangCache integration Add LangCache API integration for semantic caching Apr 23, 2025
@abrookins abrookins marked this pull request as ready for review April 23, 2025 17:34
@@ -160,7 +160,7 @@ def test_empty_list_to_bytes():
     assert array_to_buffer(array, dtype="float32") == expected


-@pytest.mark.parametrize("dtype", ["float64", "float32", "float16", "bfloat16"])
+@pytest.mark.parametrize("dtype", ["float64", "float32", "float16"])
Collaborator Author

Will restore

# In xdist, the config has "workerid" in workerinput
workerinput = getattr(request.config, "workerinput", {})
worker_id = workerinput.get("workerid", "master")

# construct a search index from the schema
index = AsyncSearchIndex.from_dict(
    {
        "index": {
            "name": "user_index",
Collaborator

We should update both the index name AND the prefix to do proper isolation here
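
A sketch of that isolation, continuing the fixture above by suffixing both fields with the xdist worker id (the prefix value is an assumption; only the name appears in this hunk):

index = AsyncSearchIndex.from_dict(
    {
        "index": {
            "name": f"user_index_{worker_id}",
            "prefix": f"user_index_{worker_id}",
        },
        # ... remaining schema fields unchanged ...
    }
)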

# In xdist, the config has "workerid" in workerinput
workerinput = getattr(request.config, "workerinput", {})
worker_id = workerinput.get("workerid", "master")

index = SearchIndex.from_dict(
    {
        "index": {
            "name": "user_index",
Collaborator

Ditto, index name

    return AsyncSearchIndex.from_yaml("schemas/test_json_schema.yaml")

def async_index_from_yaml(request):
    # In xdist, the config has "workerid" in workerinput
    workerinput = getattr(request.config, "workerinput", {})
Collaborator

Maybe we can make worker_id a fixture in conftest ?
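
A sketch of that fixture (note that recent pytest-xdist versions also ship a built-in worker_id fixture):

# conftest.py
import pytest

@pytest.fixture
def worker_id(request):
    # Under pytest-xdist, each worker's config carries "workerinput";
    # fall back to "master" when tests run without xdist.
    workerinput = getattr(request.config, "workerinput", {})
    return workerinput.get("workerid", "master")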

import json
from typing import Any, Dict, List, Optional, Union

from langcache import LangCache as LangCacheSDK
Collaborator

Should these be treated as optionals and lazy loaded??

Collaborator Author

Yeah, definitely, and especially because you can't actually try the service yet.
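
A sketch of the lazy-import pattern (the import paths and error message are illustrative; the package is vendored for now rather than on PyPI):

from redisvl.extensions.llmcache.base import BaseLLMCache  # import path assumed

class LangCache(BaseLLMCache):
    def __init__(self, **kwargs):
        # Defer the import so redisvl still works without langcache installed.
        try:
            from langcache import LangCache as LangCacheSDK
        except ImportError as e:
            raise ImportError(
                "The langcache package is required to use LangCache."
            ) from e
        self._sdk = LangCacheSDK(**kwargs)  # constructor args assumed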

def check(
    self,
    prompt: Optional[str] = None,
    vector: Optional[List[float]] = None,
Collaborator

should we make vector an optional **kwarg on the base model?

Collaborator Author

Hmm. Isn't it already optional?

raise TypeError("return_fields must be a list of field names")

# Use the provided threshold or fall back to the instance default
threshold = (
Collaborator

are we sure we use the same metric for the threshold? I thought LangCache used similarity 0-1

Collaborator Author

SemanticCache does, but the 0-2 range doesn't seem to be encoded anywhere else (in the base classes) as an assumption or validation. 🤔 Hmm! I guess we need to treat thresholds coming into this class as 0-2 if it's a swappable component with SemanticCache, and hopefully one day soon normalize the value everywhere.

Collaborator Author

I normalized and denormalized. We'll have to return to this later, but hoping that does the trick.
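
For reference, the conversion between the two scales (assuming RedisVL thresholds are cosine distances in [0, 2] and LangCache expects similarity in [0, 1]):

def distance_to_similarity(distance: float) -> float:
    # Map a RedisVL cosine-distance threshold (0-2) to a 0-1 similarity.
    return 1.0 - (distance / 2.0)

def similarity_to_distance(similarity: float) -> float:
    # Map a similarity score back to a cosine distance.
    return 2.0 * (1.0 - similarity)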

@abrookins
Collaborator Author

We're going to pull this from 0.6.0 for now, but I'll roll the test fixes into a separate PR.

@abrookins
Collaborator Author

Pushed my latest changes up before closing (for now). This PR will ride again. 🐴

@abrookins abrookins closed this Apr 26, 2025