-
Notifications
You must be signed in to change notification settings - Fork 45
Add LangCache API integration for semantic caching #321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
tests/unit/test_utils.py
Outdated
@@ -160,7 +160,7 @@ def test_empty_list_to_bytes(): | |||
assert array_to_buffer(array, dtype="float32") == expected | |||
|
|||
|
|||
@pytest.mark.parametrize("dtype", ["float64", "float32", "float16", "bfloat16"]) | |||
@pytest.mark.parametrize("dtype", ["float64", "float32", "float16"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will restore
tests/conftest.py
Outdated
# In xdist, the config has "workerid" in workerinput | ||
workerinput = getattr(request.config, "workerinput", {}) | ||
worker_id = workerinput.get("workerid", "master") | ||
|
||
# construct a search index from the schema | ||
index = AsyncSearchIndex.from_dict( | ||
{ | ||
"index": { | ||
"name": "user_index", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should update both the index name AND the prefix to do proper isolation here
# In xdist, the config has "workerid" in workerinput | ||
workerinput = getattr(request.config, "workerinput", {}) | ||
worker_id = workerinput.get("workerid", "master") | ||
|
||
index = SearchIndex.from_dict( | ||
{ | ||
"index": { | ||
"name": "user_index", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, index name
return AsyncSearchIndex.from_yaml("schemas/test_json_schema.yaml") | ||
def async_index_from_yaml(request): | ||
# In xdist, the config has "workerid" in workerinput | ||
workerinput = getattr(request.config, "workerinput", {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can make worker_id
a fixture in conftest ?
import json | ||
from typing import Any, Dict, List, Optional, Union | ||
|
||
from langcache import LangCache as LangCacheSDK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be treated as optionals and lazy loaded??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, definitely, and especially because you can't actually try the service yet.
def check( | ||
self, | ||
prompt: Optional[str] = None, | ||
vector: Optional[List[float]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we make vector an optional **kwarg on the base model?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Isn't it already optional?
raise TypeError("return_fields must be a list of field names") | ||
|
||
# Use the provided threshold or fall back to the instance default | ||
threshold = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we sure we use the same metric for the threshold? I thought LangCache used similarity 0-1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SemanticCache does, but the 0-2 range doesn't seem to be encoded anywhere else (in the base classes) as an assumption or validation. 🤔 Hmm! I guess we need to treat thresholds coming to this class as 0-2 if it's a swappable component with SemanticCache, and hopefully one day soon normalize the value everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I normalized and denormalized. We'll have to return to this later, but hoping that does the trick.
We're going to pull this from 0.6.0 for now, but I'll roll the text fixes into a separate PR. |
Pushed my latest changes up before closing (for now). This PR will ride again. 🐴 |
Description
This PR adds integration with the LangCache SDK to provide enhanced LLM response
caching capabilities in RedisVL. The new
LangCache
class extends our existingLLM caching functionality to provide a RedisVL-compatible interface that backed by
the LangCache API for semantic caching.
Features
LangCache
class that implements theBaseLLMCache
interfaceImplementation Details
Dependencies
langcache
package in the repository (temporary, until SDK is on PyPI)Gaps / Open Questions
experimental
module or otherwise indicate that it's experimental?