implement GetAllDocuments() #110
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -477,6 +477,40 @@ | |
| } | ||
| } | ||
|
|
||
| func TestCollection_ListDocuments(t *testing.T) { | ||
| // Create collection | ||
| db := NewDB() | ||
| name := "test" | ||
| metadata := map[string]string{"foo": "bar"} | ||
| vectors := []float32{-0.40824828, 0.40824828, 0.81649655} // normalized version of `{-0.1, 0.1, 0.2}` | ||
| embeddingFunc := func(_ context.Context, _ string) ([]float32, error) { | ||
| return vectors, nil | ||
| } | ||
| c, _ := db.CreateCollection(name, metadata, embeddingFunc) | ||
|
|
||
| // Add documents | ||
| ids := []string{"1", "2", "3", "4"} | ||
| metadatas := []map[string]string{{"foo": "bar"}, {"a": "b"}, {"foo": "bar"}, {"e": "f"}} | ||
| contents := []string{"hello world", "hallo welt", "bonjour le monde", "hola mundo"} | ||
| c.Add(context.Background(), ids, nil, metadatas, contents) | ||
|
Check failure on line 495 in collection_test.go
|
||
|
Owner
Here the returned error should be checked.
||
|
|
||
| // Get all documents | ||
| docs, err := c.ListDocuments() | ||
| if err != nil { | ||
| t.Fatal("expected nil, got", err) | ||
| } | ||
| if len(docs) != 4 { | ||
| t.Fatal("expected 4 documents, got", len(docs)) | ||
| } | ||
|
|
||
| // since the documents are returned in a random order, we just check if there is a document with the content "hello world" | ||
| for _, doc := range docs { | ||
| if doc.Content == "hello world" { | ||
| break | ||
| } | ||
| } | ||
|
Comment on lines +507 to +511
Owner
Here the test doesn't assert whether the content was found or not. You can introduce a new variable
||
| } | ||
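The review comment on lines +507 to +511 points out that the loop breaks on a match but never fails when there is none. One way to make the check meaningful is sketched below. This is only an illustration: the `Document` type is a reduced stand-in, and the names `hasContent` and `found` are hypothetical, not taken from the PR.

```go
package main

import "fmt"

// Document is a reduced, hypothetical stand-in for the library's document type.
type Document struct {
	ID      string
	Content string
}

// hasContent reports whether any document in docs has exactly the given content.
// The order of docs does not matter, which suits a result with random ordering.
func hasContent(docs []Document, want string) bool {
	found := false
	for _, doc := range docs {
		if doc.Content == want {
			found = true
			break
		}
	}
	return found
}

func main() {
	docs := []Document{
		{ID: "2", Content: "hallo welt"},
		{ID: "1", Content: "hello world"},
	}

	// The result of the search is actually used, so a missing document is reported.
	if !hasContent(docs, "hello world") {
		fmt.Println(`expected a document with content "hello world", found none`)
		return
	}
	fmt.Println("found")
}
```

Inside the test itself, the same idea would end with something like `if !found { t.Fatal(...) }` after the loop, so the test fails when no document matches.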
|
|
||
| func TestCollection_Delete(t *testing.T) { | ||
| // Create persistent collection | ||
| tmpdir, err := os.MkdirTemp(os.TempDir(), "chromem-test-*") | ||
|
|
||
The slice is new and can be modified without affecting the internal `c.documents`, but the documents themselves are not full copies. When you do `x := y`, then `y`'s simple fields (int, string) are entirely new, but maps and slices are only _shallow_ copies.

Demonstration: https://go.dev/play/p/0OccI4ibtS2

See the above `GetByID` where the `Metadata` and `Embedding` fields are cloned separately to create an entirely new document.

So here we have two options:

1. [...] `c.Documents` map, or delete them, and it doesn't affect the returned slice. Here's an example in chromem-go where something similar is done: chromem-go/db.go, lines 517 to 522 in 8311eb0.
2. [...] `GetByID` for each document, or by copying the code from that method. The former leads to less code, but one extra operation per document (the `c.Documents` lookup).
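To make the shallow-copy concern concrete, here is a minimal sketch of the difference between copying each struct by value and cloning its map and slice fields. It rests on assumptions: the `Document` type, the `map[string]*Document` store, and the helpers `cloneDocument`, `listShallow`, and `listDeep` are hypothetical and are not chromem-go's actual code. `maps.Clone` and `slices.Clone` are from the Go standard library (Go 1.21 and later).

```go
package main

import (
	"fmt"
	"maps"
	"slices"
)

// Document is a hypothetical stand-in for the library's document type.
type Document struct {
	ID        string
	Metadata  map[string]string
	Embedding []float32
	Content   string
}

// cloneDocument returns a copy that shares no map or slice with d.
func cloneDocument(d Document) Document {
	return Document{
		ID:        d.ID,
		Metadata:  maps.Clone(d.Metadata),
		Embedding: slices.Clone(d.Embedding),
		Content:   d.Content,
	}
}

// listShallow copies each struct by value. The returned slice is new,
// but every Metadata map and Embedding slice is still shared with the store.
func listShallow(store map[string]*Document) []Document {
	out := make([]Document, 0, len(store))
	for _, d := range store {
		out = append(out, *d)
	}
	return out
}

// listDeep clones each document, so callers cannot reach the store's data.
func listDeep(store map[string]*Document) []Document {
	out := make([]Document, 0, len(store))
	for _, d := range store {
		out = append(out, cloneDocument(*d))
	}
	return out
}

func main() {
	store := map[string]*Document{
		"1": {
			ID:        "1",
			Metadata:  map[string]string{"foo": "bar"},
			Embedding: []float32{0.1, 0.2},
			Content:   "hello world",
		},
	}

	// Mutating a shallow copy leaks into the store.
	shallow := listShallow(store)
	shallow[0].Metadata["foo"] = "changed"
	fmt.Println(store["1"].Metadata["foo"]) // prints "changed"

	// Reset, then mutate a deep copy: the store is unaffected.
	store["1"].Metadata["foo"] = "bar"
	deep := listDeep(store)
	deep[0].Metadata["foo"] = "changed"
	deep[0].Embedding[0] = 9
	fmt.Println(store["1"].Metadata["foo"], store["1"].Embedding[0]) // prints "bar 0.1"
}
```

The deep variant costs one map and one slice allocation per document, which is the price of handing callers data they can modify freely.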