Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 8 additions & 14 deletions imgsrv/apps/download.psgi
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ use Plack::Request;
use Plack::Util;
use Utils;

use SRV::Utils;

use Utils::Settings;
our $settings = Utils::Settings::load('imgsrv', 'download');

Expand All @@ -33,26 +35,22 @@ my $loader = sub {
}
};

sub under_server {
return ( ! defined $ENV{PSGI_COMMAND} );
}

builder {

if ( under_server() ) {
if ( SRV::Utils::under_server() ) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question is, do we even need these at all? Applies to this file and imgsrv.psgi as well.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how I feel about URLFixer in general. If there are percent-escaped ';' or '=' characters in the URL, then it will unescape the entire URL and redirect to that. That's a problem we see with the ChatGPT-created queries to ssd.

I could imagine some kind of redirection attack though -- basically tricking imgsrv to redirecting to something unexpected - but I'm not sure exactly what the threat model would be.

It may be worthwhile seeing if there's evidence of legit requests to imgsrv that would trigger this (i.e. requests to imgsrv containing %3B or %3D), and if we can't find any, we just disable URLFixer entirely.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly can't tell. It appears that the redirect is sticking the URL with escaped values into the http_referer, so I can't see what the upstream source is, that is, whether it's one of ours or something else that might hold some meaning. I will rely on some more guidance to answer this question, perhaps, and expect to return to this question.

Copy link
Copy Markdown
Contributor Author

@moseshll moseshll Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: we can't completely remove under_server or URLFixer as things stand now. So here are two sample requests from the playwright tests. The second is called with Plack::App::Command so the actual download action is not under_server(). URLFixer is only enabled when under_server(), the (second, below) request that comes back from the frontend contains %3d and... for some reason it's bad to fix it??? It looks like doing so would result in decoding a part of the URL (download_url) that is meant to be left encoded. Enabling URLFixer across the board kills the playwright tests for some reason. Not sure why. I'm not sure if this is in scope for the current ticket. Why on earth are these URLs so bloody convoluted?

 - /cgi/imgsrv/download/epub?id=test.pd_open&callback=tunnelCallback&_=1774990829729
 - /epub?progress_filepath=/htapps/babel/cache/download/2K16.118abf7093dae5452990834aa55749f1430a1454497252f63dab189fcec3707780/2K16.118abf7093dae5452990834aa55749f1430a1454497252f63dab189fcec3707780__progress&download_url=/cgi/imgsrv/download/epub?id=test.pd_open%3Bmarker=2K16.118abf7093dae5452990834aa55749f1430a1454497252f63dab189fcec3707780%3Battachment=1&id=test.pd_open

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some tracing I understand better what is happening. image.psgi and download.psgi can both be called via Plack::App::Command->run and thus with under_server FALSE. I didn't realize that background processing was done by the bucket chain of

SRV::Volume::Base->_background (or subclass)
SRV::Utils::run_command
IPC::Run::run
bin/start.sh
bin/pdf (or similar)
Plack::App::Command->run(..., apps/download.psgi) (or imgsrv.psgi)

and by the time this gets to URLFixer (yes ultimately we're talking about that) the REQUEST_URI in the environment is munged all to heck. So I would guess that's why that particular middleware is selectively enabled.

Copy link
Copy Markdown
Contributor Author

@moseshll moseshll Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To help answer the original question, however, %3B and %3D only seem to be showing up in unauthenticated requests from outside the US. I would tentatively recommend disabling URLFixer, though perhaps not removing it right away.

enable 'URLFixer';
}

enable "PopulateENV", app_name => 'imgsrv';

enable_if { (under_server() && $ENV{HT_DEV}) } 'StackTrace';
enable_if { (SRV::Utils::under_server() && $ENV{HT_DEV}) } 'StackTrace';
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we think of any reason not to enable solely based on HT_DEV or (maybe better) the plack environment?


enable_if { (under_server() && ! $ENV{HT_DEV}) }
enable_if { (SRV::Utils::under_server() && ! $ENV{HT_DEV}) }
"HTErrorDocument", 500 => "/mdp-web/production_error.html";

enable_if { (under_server() && ! $ENV{HT_DEV}) } "HTTPExceptions", rethrow => 0;
enable_if { (SRV::Utils::under_server() && ! $ENV{HT_DEV}) } "HTTPExceptions", rethrow => 0;

if ( under_server() ) {
if ( SRV::Utils::under_server() ) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be inclined to just always enable these -- given we do local development in docker, I can't think of a case where we wouldn't be under_server.


enable 'Choke::Cache::Filesystem';

Expand Down Expand Up @@ -87,9 +85,5 @@ builder {
mount "/image" => $loader->('SRV::Volume::Image::Bundle');
mount "/remediated" => $loader->('SRV::Volume::Remediated::Bundle');
};
mount "/article" => builder {
mount "/pdf" => $loader->('SRV::Article::PDF');
mount "/epub" => $loader->('SRV::Article::EPUB');
}

};
};
24 changes: 7 additions & 17 deletions imgsrv/apps/imgsrv.psgi
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,8 @@ use Utils;

use SRV::Image;
use SRV::Cover;
use SRV::Article::HTML;
use SRV::Metrics;
use SRV::Utils;
use SRV::Volume::Metadata;
use SRV::Volume::HTML;

Expand All @@ -41,31 +41,27 @@ my $covers_app = SRV::Cover->new(restricted => 0)->to_app;

umask 0002;

sub under_server {
return ( ! defined $ENV{PSGI_COMMAND} );
}

builder {

if ( under_server() ) {
if ( SRV::Utils::under_server() ) {
enable 'URLFixer';
}

enable "SSO";

enable "PopulateENV", app_name => 'imgsrv';

enable_if { (under_server() && $ENV{HT_DEV}) } 'StackTrace';
enable_if { (SRV::Utils::under_server() && $ENV{HT_DEV}) } 'StackTrace';

enable_if { (under_server() && ! $ENV{HT_DEV}) }
enable_if { (SRV::Utils::under_server() && ! $ENV{HT_DEV}) }
"HTErrorDocument", 500 => "/mdp-web/production_500.html";

enable_if { (under_server()) }
enable_if { (SRV::Utils::under_server()) }
"HTErrorDocument", 404 => "/mdp-web/graphics/404_image.jpg";

enable_if { (under_server() && ! $ENV{HT_DEV}) } "HTHTTPExceptions", rethrow => 0;
enable_if { (SRV::Utils::under_server() && ! $ENV{HT_DEV}) } "HTHTTPExceptions", rethrow => 0;

if ( under_server() ) {
if ( SRV::Utils::under_server() ) {
# choke policies
enable 'Choke::Cache::Filesystem';

Expand Down Expand Up @@ -154,12 +150,6 @@ builder {
mount "/meta" => $metadata_app;
mount "/cover" => $covers_app;
};
mount "/article" => builder {
mount "/image" => SRV::Image->new(watermark => 0);
mount "/cover" => $covers_app;
mount "/html" => SRV::Article::HTML->new;
# mount "/file" => ...;
};
mount "/metrics" => sub {
return $metrics->render;
};
Expand Down
136 changes: 0 additions & 136 deletions imgsrv/lib/Process/Article/Base.pm

This file was deleted.

46 changes: 0 additions & 46 deletions imgsrv/lib/Process/Article/EPUB.pm

This file was deleted.

Loading
Loading