[#124] Allow indexing rules to be invoked manually (main) #141
korydraughn wants to merge 9 commits into irods:main
Attempting to do full text indexing on a data object without appropriate permissions results in a
To add to that, this is for the case where the rule is invoked as rods (a rodsadmin) on another user's data object via
Yes, rodsadmins-only seems good
Agreed, 'all' seems good
I've added logic that allows rodsadmins to invoke full text indexing on other users' data objects without needing explicit permission. I've confirmed the drop to the C API works as intended. This was done manually using
    // Take the max to avoid passing an integer that's less than zero to
    // the string_view constructor.
    {"data", std::string_view(buffer.data(), std::max(0, bytes_read))}
Add a statement explaining how bytes_read can result in an integer less than zero.
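For illustration, here is a minimal sketch of the clamping pattern in isolation. It assumes `bytes_read` is a signed `int` that may carry a negative value (for example, an error indicator from a read call); the helper name `view_of_bytes_read` is hypothetical and not part of this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of the PR: builds a view over the bytes
// that were actually read. Assumes a negative bytes_read signals an error.
inline std::string_view view_of_bytes_read(const char* buffer, int bytes_read)
{
    assert(buffer != nullptr);

    // A negative int converted to std::size_t becomes a huge length, so
    // clamp to zero before handing it to the string_view constructor.
    const auto length = static_cast<std::size_t>(std::max(0, bytes_read));

    return std::string_view(buffer, length);
}
```

With the clamp in place, a negative `bytes_read` produces an empty view rather than a view with an enormous length.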
    #define IRODS_IO_TRANSPORT_ENABLE_SERVER_SIDE_API
    #include <irods/dstream.hpp>
    #include <irods/transport/default_transport.hpp>
    #ifdef IRODS_HAS_FEATURE_ADMIN_MODE_FOR_DSTREAM_LIBRARIES
This feature test macro name is just a placeholder until irods/irods#7530 is resolved.
This PR isn't blocked by that work. Once the irods/irods issue is resolved, these preprocessor macros can be updated to match the real feature test macro.
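As a rough sketch of how a feature test macro like this typically gates behavior: the macro name is the placeholder from the diff, while the constant `admin_mode_for_dstream_available`, the function `read_strategy`, and the strategy labels are hypothetical and only illustrate the pattern.

```cpp
#include <string_view>

// The macro name below is the placeholder used in this PR. Everything
// else in this sketch is hypothetical and only illustrates the pattern.
#ifdef IRODS_HAS_FEATURE_ADMIN_MODE_FOR_DSTREAM_LIBRARIES
inline constexpr bool admin_mode_for_dstream_available = true;
#else
inline constexpr bool admin_mode_for_dstream_available = false;
#endif

// Reports which code path a build would take.
constexpr std::string_view read_strategy() noexcept
{
    return admin_mode_for_dstream_available ? "dstream with admin mode"
                                            : "fallback path";
}
```

Keeping the check in one place means only a single `#ifdef` needs to change once the real macro name is known.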
    // irule <text>
    if (rule_text.find("@external rule {") != std::string_view::npos) {
        const auto start = rule_text.find_first_of('{') + 1;
        const auto end = rule_text.rfind(" }");

        if (end == std::string_view::npos) {
            auto msg = fmt::format("Received malformed rule text. "
                                   "Expected closing curly brace following rule text [{}].",
                                   rule_text);
            log_re::error(msg);
            return ERROR(SYS_INVALID_INPUT_PARAM, std::move(msg));
        }

        rule_text = rule_text.substr(start, end - start);
    }
    // irule -F <script>
    else if (const auto external_pos = rule_text.find("@external\n"); external_pos != std::string_view::npos) {
        // If there are opening and closing curly braces following the "@external\n" prefix, then we
        // can assume that the rule text most likely represents a JSON string.
        if (const auto start = rule_text.find_first_of('{'); start != std::string_view::npos) {
            const auto end = rule_text.rfind(" }");

            if (end == std::string_view::npos) {
                auto msg = fmt::format("Received malformed rule text. "
                                       "Expected closing curly brace following rule text [{}].",
                                       rule_text);
                log_re::error(msg);
                return ERROR(SYS_INVALID_INPUT_PARAM, std::move(msg));
            }

            rule_text = rule_text.substr(start, end - start);
        }
        // Otherwise, the rule text must represent something else. In this case, simply strip the
        // "@external\n" prefix from the rule text and let the JSON parser throw an exception if the
        // rule text cannot be parsed. This allows the REP to fail without causing the agent to crash.
        else {
            rule_text = rule_text.substr(external_pos + 10);
        }
    }
I've used this code in two plugins now. It probably needs to be provided by the irods-dev package so we avoid copying it everywhere.
please make an issue - that seems good.
I'm starting to think maybe this should be a documentation exercise. That code makes a few assumptions about the input which isn't code for general purpose use.
Will think on it a little more.
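To make the discussion concrete, here is a rough sketch of what a shared helper might look like if the logic were factored out. It is not the PR's code: the name `extract_rule_text` and the `std::optional` return type are hypothetical, it drops the iRODS logging and error types so it stands alone, and the `end < start` guard is an addition. It keeps the same input assumptions discussed above.

```cpp
#include <optional>
#include <string_view>

// Hypothetical standalone helper. Returns the payload of the rule text,
// or std::nullopt when a closing " }" is expected but missing.
inline std::optional<std::string_view> extract_rule_text(std::string_view rule_text)
{
    constexpr std::string_view inline_prefix = "@external rule {";
    constexpr std::string_view script_prefix = "@external\n";
    constexpr auto npos = std::string_view::npos;

    // irule <text>
    if (rule_text.find(inline_prefix) != npos) {
        const auto start = rule_text.find_first_of('{') + 1;
        const auto end = rule_text.rfind(" }");
        if (end == npos || end < start) { // "end < start" guard is an addition
            return std::nullopt;
        }
        return rule_text.substr(start, end - start);
    }

    // irule -F <script>
    if (const auto pos = rule_text.find(script_prefix); pos != npos) {
        if (const auto start = rule_text.find_first_of('{'); start != npos) {
            const auto end = rule_text.rfind(" }");
            if (end == npos || end < start) {
                return std::nullopt;
            }
            return rule_text.substr(start, end - start);
        }
        // No braces: strip the prefix and let the caller's parser decide.
        return rule_text.substr(pos + script_prefix.size());
    }

    // No "@external" marker: leave the text unchanged.
    return rule_text;
}
```

Using `script_prefix.size()` in place of the literal `10` keeps the offset tied to the prefix it strips.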
    json delay_obj;
    delay_obj["rule-engine-operation"] = irods::indexing::policy::indexing;
    if (irods::indexing::policy::object::index == op) {
irods::indexing::policy::object::index expands to the string, "irods_policy_indexing_object_index".
This is the rule name that must be used to do full text indexing of a single data object. It shares the same name as the rules which are fired as a result of triggering PEPs.
You can see the rest of the rule names here.
irods_capability_indexing/configuration.hpp
Lines 189 to 205 in 3f3529d
Are those the rule names we want admins to use or do we want to change them for manual execution contexts?
I can see value in the names staying as they are. It makes it easy for admins to know what happens because they will start to remember the names. However, admins may not be able to distinguish who/what invoked the rule.
All of that to say, perhaps the names should be changed to something like ...
- indexing_index_data_object
- indexing_index_collection
- indexing_purge_data_object
Note: This PR doesn't support invoking indexing rules via the NREP yet. It should be doable, but that may require changing the rule names for correct behavior.
don't feel too strongly either way yet. consistency across our plugins is where i think i'd find the most value/good.
also noting that it's interesting our namespacing is not in the same order as the rule names...
'policy' and 'indexing'... switched places...
I agree on the consistency thing.
As for the namespacing, that's not surprising to me. If the C++ code used irods::policy, the possibility of symbol collision rises since other plugins would likely follow suit and define things in the irods::policy namespace.
I don't know that "policy" is a term that's needed in the rule names since everything iRODS does is about policy.
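A small sketch of the namespacing point, for anyone following along. The first constant mirrors the rule name quoted earlier in this thread; the `example_plugin` namespace and its rule name are hypothetical, and `std::string_view` is used here only to keep the sketch self-contained.

```cpp
#include <string_view>

// Nesting "policy" under the plugin's own namespace keeps each plugin's
// symbols separate even when the inner names are identical.
namespace irods::indexing::policy::object
{
    inline constexpr std::string_view index = "irods_policy_indexing_object_index";
} // namespace irods::indexing::policy::object

// Hypothetical second plugin following the same convention.
namespace irods::example_plugin::policy::object
{
    inline constexpr std::string_view index = "irods_policy_example_plugin_object_index";
} // namespace irods::example_plugin::policy::object
```

If both plugins instead defined `irods::policy::object::index`, linking the two into one program would risk a redefinition or a one-definition-rule violation.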
i think that convention started with policy composition... and hasn't really been codified/hardened yet. TBD...
Still need to implement tests.
I've verified that full text indexing on data objects works. I'm pretty sure the other rules work since they were lifted directly from exec_rule_expression.
The rules can only be invoked via the main indexing plugin, not the elasticsearch plugin.
Some questions ...
As implemented, the rules fire as the user who invoked them. Should any rules result in changing the identity of the RsComm to the user who owns the object(s)? Consider full text indexing being invoked on a data object which the rodsadmin doesn't have permissions on. I'll test this to find out what happens.
Thoughts?