Small refactor of is_valid_function_name by j-t-1 · Pull Request #440 · erocarrera/pefile

j-t-1 · 2024-11-24T22:37:02Z

Slightly increases readability.

j-t-1 · 2024-12-16T08:20:08Z

string.punctuation (string of ASCII characters which are considered punctuation characters in the C locale) adds ";" and "=". If these should not be there I can change to something like (string.punctuation - ";=").encode(), and add a reference.

nightlark · 2025-11-20T22:52:50Z

-    allowed_extra = b"._?@$()<>"
+    allowed_extra = b"$().<>?@_"
    if relax_allowed_characters:
-        allowed_extra = b"!\"#$%&'()*+,-./:<>?[\\]^_`{|}~@"


I'd be curious to know where these lists of characters originally came from. I'm not seeing many good references for the allowed set of characters in mangled function names, other than looking at the name mangling functions implemented in compilers and figuring out what set of characters can appear in them based on their code.

I think the first set are mangled / decorated characters. The dollar sign, question mark, @ sign and underscore are used by the Microsoft linker.

The less-than sign and greater-than sign were empirically added in issue #61.

I do not know where the parentheses and period are from, are they decoration from other compilers?

IDA has support for demangling names; @skochinsky would you be able to help us?
Doesb"._?@$()<>" correspond to mangled characters in PE files?

We use this in functions parse_export_directory and parse_imports.

@j-t-1
From ida.cfg:

// the following characters are allowed in mangled names. // they will be substituted with underscore during output if names // are output in a mangled form. MangleChars = "$:?([.)]" // watcom "@$%?" // microsoft "@$%&"; // borland

P.S. the PE file format does not impose any limitation on the characters used, and GetProcAddress should in theory work with any null-terminated byte string. So, I would suggest to not perform any explicit validation, just maybe invalid character replacement on output.

@skochinsky thank you.

I have updated this PR to include more characters, referencing ida.cfg (hope okay).

Changing the code to not perform explicit validation is a useful suggestion.

pefile/pefile.py

Lines 2316 to 2318 in 4b3b1e2

def is_valid_function_name(

s: Union[str, bytes, bytearray], relax_allowed_characters: bool = False

) -> bool:

This function is used by parse_export_directory (uses the extra characters) and parse_imports (does not use the extra characters):

pefile/pefile.py

Lines 5515 to 5517 in 4b3b1e2

if not is_valid_function_name(symbol_name, relax_allowed_characters=True):

export_parsing_loop_completed_normally = False

break

pefile/pefile.py

Lines 6064 to 6065 in 4b3b1e2

if not is_valid_function_name(imp_name):

imp_name = b"*invalid*"

pefile is using this for imports that are difficult to parse:

pefile/pefile.py

Lines 6091 to 6103 in 4b3b1e2

# The file with hashes:

#

# MD5: bfe97192e8107d52dd7b4010d12b2924

# SHA256: 3d22f8b001423cb460811ab4f4789f277b35838d45c62ec0454c877e7c82c7f5

#

# has an invalid table built in a way that it's parseable but contains

# invalid entries that lead pefile to take extremely long amounts of time to

# parse. It also leads to extreme memory consumption.

# To prevent similar cases, if invalid entries are found in the middle of a

# table the parsing will be aborted

#

if imp_ord is None and imp_name is None:

raise PEFormatError("Invalid entries, aborting parsing.")

So pefile could be modified to handle this kind of case differently.

I will keep this PR as is, but something to consider @erocarrera and @nightlark.

Also add docstring.

j-t-1 added 2 commits November 24, 2024 22:34

Small refactor of is_valid_function_name

3598bdc

Slightly increases readability.

Small change of is_valid_function_name

ea45151

Slightly increases readability.

nightlark reviewed Nov 20, 2025

View reviewed changes

Add characters of mangled names

6a14415

Also add docstring.

nightlark mentioned this pull request Jan 2, 2026

Combined PR #470

Closed

Add comment to line

82befd4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Small refactor of is_valid_function_name#440

Small refactor of is_valid_function_name#440
j-t-1 wants to merge 4 commits intoerocarrera:masterfrom
j-t-1:is_valid_function_name

j-t-1 commented Nov 24, 2024

Uh oh!

j-t-1 commented Dec 16, 2024

Uh oh!

nightlark Nov 20, 2025 •

edited

Loading

Uh oh!

j-t-1 Nov 29, 2025

Uh oh!

skochinsky Dec 12, 2025

Uh oh!

skochinsky Dec 12, 2025

Uh oh!

j-t-1 Dec 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	def is_valid_function_name(
	s: Union[str, bytes, bytearray], relax_allowed_characters: bool = False
	) -> bool:

	if not is_valid_function_name(symbol_name, relax_allowed_characters=True):
	export_parsing_loop_completed_normally = False
	break

	if not is_valid_function_name(imp_name):
	imp_name = b"invalid"

	# The file with hashes:
	#
	# MD5: bfe97192e8107d52dd7b4010d12b2924
	# SHA256: 3d22f8b001423cb460811ab4f4789f277b35838d45c62ec0454c877e7c82c7f5
	#
	# has an invalid table built in a way that it's parseable but contains
	# invalid entries that lead pefile to take extremely long amounts of time to
	# parse. It also leads to extreme memory consumption.
	# To prevent similar cases, if invalid entries are found in the middle of a
	# table the parsing will be aborted
	#
	if imp_ord is None and imp_name is None:
	raise PEFormatError("Invalid entries, aborting parsing.")

Conversation

j-t-1 commented Nov 24, 2024

Uh oh!

j-t-1 commented Dec 16, 2024

Uh oh!

nightlark Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

j-t-1 Nov 29, 2025

Choose a reason for hiding this comment

Uh oh!

skochinsky Dec 12, 2025

Choose a reason for hiding this comment

Uh oh!

skochinsky Dec 12, 2025

Choose a reason for hiding this comment

Uh oh!

j-t-1 Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nightlark Nov 20, 2025 •

edited

Loading