Fix: Ensure max_sim Function Persists After MultiVectorStore Initialization (#184) #185

Closed
non-cpu wants to merge 1 commit into morphik-org:main from non-cpu:fix/184-max-sim-undefinedfunction

non-cpu (Contributor) commented Jun 4, 2025

Summary

This PR resolves the issue where the max_sim function and other database schema changes did not persist after MultiVectorStore initialization, which led to UndefinedFunction errors during vector queries.

Context

See #184. The DDL operations in core/vector_store/multi_vector_store.py were never followed by an explicit commit() call, so they were implicitly rolled back even though the logs initially reported success.

Fix

MultiVectorStore.initialize now calls conn.commit() immediately after each DDL statement (e.g., CREATE EXTENSION, CREATE TABLE, CREATE FUNCTION max_sim).
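
For readers who want to see the pattern in isolation, here is a minimal sketch of "commit after each DDL statement". It is not the code from this PR: the helper name run_ddl_with_commit is hypothetical, and conn is assumed to be any PEP 249 (DB-API 2.0) connection, not necessarily the one MultiVectorStore uses.

```python
from typing import Any, Sequence


def run_ddl_with_commit(conn: Any, statements: Sequence[str]) -> int:
    """Execute each DDL statement and commit right after it.

    Hypothetical helper for illustration. `conn` is assumed to follow
    PEP 249: it provides cursor(), commit(), and cursors with execute()
    and close(). Returns the number of statements committed.
    """
    committed = 0
    for sql in statements:
        cur = conn.cursor()
        try:
            cur.execute(sql)
        finally:
            cur.close()
        # Commit before moving on, so this statement survives even if a
        # later one fails or the connection is closed without a commit.
        conn.commit()
        committed += 1
    return committed
```

If execute() raises, the exception propagates before commit() is reached, so the failing statement is never committed and the caller sees the error.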

Verification

Direct database inspection confirmed that the max_sim function and the idx_multi_vector_document_id index now persist. Queries also run without UndefinedFunction errors.

morphik=# \df max_sim
                            List of functions
 Schema |  Name   | Result data type |     Argument data types     | Type 
--------+---------+------------------+-----------------------------+------
 public | max_sim | double precision | document bit[], query bit[] | func
(1 row)

morphik=# \di idx_multi_vector_document_id
                                 List of relations
 Schema |             Name             | Type  |  Owner  |          Table          
--------+------------------------------+-------+---------+-------------------------
 public | idx_multi_vector_document_id | index | morphik | multi_vector_embeddings
(1 row)
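
The \df check above can also be scripted, for example in a setup test. The sketch below is an assumption-laden illustration: function_exists is a hypothetical helper, conn is assumed to be a PEP 249 connection to PostgreSQL, and the %s placeholder assumes a driver that uses the format/pyformat parameter style (psycopg does). It matches on name only and ignores schema and argument types.

```python
from typing import Any


def function_exists(conn: Any, name: str) -> bool:
    """Return True if any function called `name` is in the pg_proc catalog.

    Hypothetical helper for illustration; it does not distinguish between
    schemas or overloads.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT 1 FROM pg_proc WHERE proname = %s LIMIT 1", (name,))
        return cur.fetchone() is not None
    finally:
        cur.close()
```

Calling function_exists(conn, "max_sim") on a fresh connection after initialization would then be expected to return True once the DDL is committed.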

Related Issue

Closes #184

jazzberry-ai (Bot) commented Jun 4, 2025

Bug Report

Name: Missing Transaction Rollback Protection
Severity: High
Example test case: Inject an error into the SQL code for creating the max_sim function in MultiVectorStore.initialize. Run the quick_setup.py script.
Description: The conn.commit() call is within the try block. If an error occurs during the function creation (e.g., a syntax error in the SQL), the conn.commit() is not called, and the function is not created. However, because the transaction is not explicitly rolled back, the program proceeds as if the initialization was successful, potentially leading to errors later on when the missing function is called.
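
One way to address the concern in this report is to roll back explicitly and re-raise, so the caller cannot continue as if initialization had succeeded. The following is a sketch only: run_ddl_atomically is a hypothetical name, conn is assumed to be a PEP 249 connection, and unlike the PR it commits once at the end instead of after every statement. Whether initialize should raise or return a failure value is a decision for the maintainers.

```python
from typing import Any, Sequence


def run_ddl_atomically(conn: Any, statements: Sequence[str]) -> None:
    """Run all statements in one transaction; roll back and re-raise on error.

    Hypothetical helper for illustration. `conn` is assumed to follow
    PEP 249 (cursor(), commit(), rollback()).
    """
    cur = conn.cursor()
    try:
        for sql in statements:
            cur.execute(sql)
        conn.commit()  # reached only if every statement succeeded
    except Exception:
        # Leave the connection in a clean state, then surface the error
        # instead of letting the caller assume success.
        conn.rollback()
        raise
    finally:
        cur.close()
```

Because most PostgreSQL DDL is transactional, the rollback also undoes any statements that ran before the failing one, so the schema is either fully set up or untouched.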

ArnavAgrawal03 (Collaborator) commented

taken care of in #183 iirc