Skip to content

fix: add filename length safety check with random suffix #1515

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 2, 2025

Conversation

Baraff24
Copy link
Contributor

@Baraff24 Baraff24 commented Feb 28, 2025

Description

This pull request enhances the file upload process by introducing a new helper function, _ensure_safe_length, which ensures that filenames do not exceed the maximum allowed length (255 characters by default). If a filename is too long, it is truncated and a random hexadecimal suffix is appended to maintain uniqueness and comply with database constraints.

Details
Safe Filename Truncation:
The new _ensure_safe_length function checks the length of the filename and, when necessary, truncates it by reserving space for a random suffix. This suffix (default length of 16 characters) is generated using uuid.uuid4().hex and appended to the truncated filename.

Integration with get_valid_filename:
The get_valid_filename function has been updated to utilize _ensure_safe_length. In addition to slugifying the filename (to remove umlauts and special characters), it now guarantees that the resulting filename complies with the maximum length requirement.

Robustness in Upload Handling:
The changes further strengthen error handling during file uploads by ensuring that invalid content lengths or mismatches between MIME types and file extensions trigger a clear UploadException.

Reference:
This update addresses Issue #1270, ensuring that filenames generated during uploads remain safe for storage.

Related resources

issue: #1270

Checklist

  • I have opened this pull request against master
  • I have added or modified the tests when changing logic
  • I have followed the conventional commits guidelines to add meaningful information into the changelog
  • I have read the contribution guidelines and I have joined #workgroup-pr-review on
    Slack to find a “pr review buddy” who is going to review my pull request.

Summary by Sourcery

Implements a filename length safety check to ensure filenames do not exceed the maximum allowed length. It truncates the filename and appends a random hexadecimal suffix to maintain uniqueness and comply with database constraints.

Bug Fixes:

  • Fixes issue where long filenames could cause errors during file uploads by truncating them and adding a random suffix.
  • Ensures filenames comply with the maximum length requirement by integrating _ensure_safe_length into get_valid_filename.
  • Strengthens error handling during file uploads by ensuring that invalid content lengths or MIME type mismatches trigger a clear UploadException.
  • Addresses issue Bug when source file name is too long #1270, ensuring that filenames generated during uploads remain safe for storage.

Copy link
Contributor

sourcery-ai bot commented Feb 28, 2025

Reviewer's Guide by Sourcery

This pull request introduces a mechanism to ensure that filenames do not exceed the maximum allowed length. It implements a new helper function, _ensure_safe_length, which truncates filenames and appends a random suffix if they are too long. The get_valid_filename function is updated to use this new helper, and comprehensive unit tests are added to verify the new functionality.

Sequence diagram for filename validation and truncation

sequenceDiagram
  participant User
  participant FilerUtils
  participant EnsureSafeLength

  User->>FilerUtils: Uploads file with long name
  FilerUtils->>FilerUtils: get_valid_filename(filename)
  FilerUtils->>EnsureSafeLength: _ensure_safe_length(filename, max_length)
  alt filename length > max_length
    EnsureSafeLength->>EnsureSafeLength: Truncate filename
    EnsureSafeLength->>EnsureSafeLength: Append random suffix
  end
  EnsureSafeLength-->>FilerUtils: safe_filename
  FilerUtils-->>User: Returns safe_filename
Loading

Updated class diagram for filer.utils.files

classDiagram
  class filer.utils.files {
    +slugify(string)
    +get_valid_filename(s)
  }
  class _ensure_safe_length {
    +filename: str
    +max_length: int
    +random_suffix_length: int
    +__init__(filename: str, max_length: int, random_suffix_length: int)
    +truncate_and_add_suffix(filename: str, max_length: int, random_suffix_length: int)
  }
  filer.utils.files -- _ensure_safe_length : uses
  note for filer.utils.files "get_valid_filename now uses _ensure_safe_length to limit filename length"
Loading

File-Level Changes

Change Details Files
Introduces a new utility function to ensure filenames are within the maximum allowed length.
  • Added _ensure_safe_length function to truncate filenames exceeding the maximum length and append a random suffix.
  • The function truncates the filename, reserving space for a 16-character hexadecimal suffix.
  • The suffix is generated using uuid.uuid4().hex.
filer/utils/files.py
Integrates the new length-checking utility into the filename validation process.
  • Modified get_valid_filename to use _ensure_safe_length to enforce maximum filename length.
  • The function now ensures that filenames are slugified and within the length limit.
filer/utils/files.py
Adds comprehensive unit tests for the filename validation logic.
  • Added tests to verify that short filenames remain unchanged.
  • Added tests to verify that long filenames are truncated and a suffix is appended.
  • Added tests to verify that the file extension is preserved.
  • Added tests to verify that unicode characters are handled correctly.
  • Added tests to verify that filenames of exact length are handled correctly.
tests/test_files.py

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Baraff24 - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider adding a setting to control the length of the random suffix.
Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟡 Testing: 4 issues found
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Member

@fsbraun fsbraun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I've added one comment to the tests. Also, can you go through the sourcery-ai comments on the tests. I think they make sense!

@Baraff24
Copy link
Contributor Author

This looks good. I've added one comment to the tests. Also, can you go through the sourcery-ai comments on the tests. I think they make sense!

I have integrated all the tests suggested by sourcery-ai. Tell me if you think I should change anything else, thanks :)

@fsbraun
Copy link
Member

fsbraun commented Mar 1, 2025

@Baraff24 I came across another related issue: #1093

The connection is this: Filer generates image thumbnails, these thumbnails have a fixed(see #1093) name being the original name plus some scaling info plus the original suffix. We need to account for space the the thumbnail scaling info. Maybe you can figure out what the longest possible scaling info is, otherwise I suggest to at least keep space for, say, 100 extra characters for the thumbnails.

@Baraff24
Copy link
Contributor Author

Baraff24 commented Mar 1, 2025

@Baraff24 I came across another related issue: #1093

The connection is this: Filer generates image thumbnails, these thumbnails have a fixed(see #1093) name being the original name plus some scaling info plus the original suffix. We need to account for space the the thumbnail scaling info. Maybe you can figure out what the longest possible scaling info is, otherwise I suggest to at least keep space for, say, 100 extra characters for the thumbnails.

@fsbraun The first solution that came to mind was to truncate to 155 characters with the same system in order to leave room for any other characters for all the image sizing settings. What do you think?

@fsbraun
Copy link
Member

fsbraun commented Mar 2, 2025

@Baraff24 Yes, I think that's necessary!

@Baraff24
Copy link
Contributor Author

Baraff24 commented Mar 2, 2025

@Baraff24 Yes, I think that's necessary!

@fsbraun in the last commit, I added two variables in the settings. Is this enough or do you want me to set up the variables in a static way?

@fsbraun
Copy link
Member

fsbraun commented Mar 2, 2025

Honestly, I prefer fixed values in the code (some constant). But it would need to be correct, e.g., 155, but not 255.

@Baraff24
Copy link
Contributor Author

Baraff24 commented Mar 2, 2025

Honestly, I prefer fixed values in the code (some constant). But it would need to be correct, e.g., 155, but not 255.

@fsbraun I applied your advices into my code. Check it now if you want

Copy link
Member

@fsbraun fsbraun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Baraff24 Thanks! Great work!

Copy link
Member

@fsbraun fsbraun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only thing left: Remove the unused imports... ;-)

@Baraff24
Copy link
Contributor Author

Baraff24 commented Mar 2, 2025

Only thing left: Remove the unused imports... ;-)

@fsbraun already done. Now can we merge?

@fsbraun fsbraun merged commit 7822a51 into django-cms:master Mar 2, 2025
28 of 34 checks passed
Comment on lines +27 to +35
hexadecimal suffix of length 16 is appended, resulting in exactly 255 characters.
"""
base = "a" * 300 # 300 characters base
original = f"{base}.jpg"
result = get_valid_filename(original)
self.assertEqual(
len(result),
self.max_length,
"Filename length should be exactly 255 characters."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 255s here should probably be 155.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants