TIFF image loading EXIF corruption #8559
Comments
Could you be more specific about which commit you're building from? Does this happen with the released Pillow 11.0.0? |
I can repro on all of the following on Linux
PIP:
However, on windows (now |
I'm unable to reproduce your problem. I've created https://github.com/radarhere/docker-images/tree/refs/heads/8559 containing a Dockerfile that runs
from PIL import Image
print(Image.__version__)
im = Image.open('IMG_0282.CR2')
im.save('out.jpg')
on Debian Bookworm amd64. You can see that it runs successfully at https://github.com/radarhere/docker-images/actions/runs/11884545565/job/33112887235#step:4:2119 |
Can you try with this?
from PIL import Image
print(Image.__version__)
im = Image.open(open('IMG_0282.CR2', 'rb', buffering=1048576))
im.save('out.jpg')
That's the default buffering value on my Linux system. This test case now repros the issue even on Windows for me |
Yes, that change does trigger the error - https://github.com/radarhere/docker-images/actions/runs/11908096108/job/33182903863#step:4:2134 |
May I suggest this as a fix? Working on all my machines now:
diff --git a/src/PIL/TiffImagePlugin.py b/src/PIL/TiffImagePlugin.py
index 6bf39b75a..7f1d46fca 100644
--- a/src/PIL/TiffImagePlugin.py
+++ b/src/PIL/TiffImagePlugin.py
@@ -1216,10 +1216,6 @@ class TiffImageFile(ImageFile.ImageFile):
     def _seek(self, frame: int) -> None:
         self.fp = self._fp
 
-        # reset buffered io handle in case fp
-        # was passed to libtiff, invalidating the buffer
-        self.fp.tell()
-
         while len(self._frame_pos) <= frame:
             if not self.__next:
                 msg = "no more images in TIFF file"
@@ -1303,10 +1299,6 @@ class TiffImageFile(ImageFile.ImageFile):
         if not self.is_animated:
             self._close_exclusive_fp_after_loading = True
 
-            # reset buffered io handle in case fp
-            # was passed to libtiff, invalidating the buffer
-            self.fp.tell()
-
             # load IFD data from fp before it is closed
             exif = self.getexif()
             for key in TiffTags.TAGS_V2_GROUPS:
@@ -1381,8 +1373,16 @@ class TiffImageFile(ImageFile.ImageFile):
                 logger.debug("have fileno, calling fileno version of the decoder.")
                 if not close_self_fp:
                     self.fp.seek(0)
+                # Save and restore the file position, because libtiff will move it
+                # outside of the python runtime, and that will confuse
+                # io.BufferedReader and possible others.
+                # NOTE: This must use os.lseek, and not fp.seek(), because
+                # fp.seek() may just adjust it's internal buffer pointer and not
+                # actually move the OS file handle.
+                pos = os.lseek(fp, 0, os.SEEK_CUR)
                 # 4 bytes, otherwise the trace might error out
                 n, err = decoder.decode(b"fpfp")
+                os.lseek(fp, pos, os.SEEK_SET)
             else:
                 # we have something else.
                 logger.debug("don't have fileno or getvalue. just reading") |
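The save-and-restore idea in the patch can be sketched outside Pillow. This is only an illustration, not the code from the PR: `preserve_fd_offset` is a hypothetical helper, the two-block sample file is made up, and a plain `os.lseek` call stands in for libtiff moving the descriptor.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def preserve_fd_offset(fd):
    """Hypothetical helper: remember the OS-level offset of fd, restore it on exit."""
    pos = os.lseek(fd, 0, os.SEEK_CUR)
    try:
        yield
    finally:
        os.lseek(fd, pos, os.SEEK_SET)


# Made-up sample data: one 4096-byte block of "A" followed by one of "B".
tmp_fd, sample_path = tempfile.mkstemp()
with os.fdopen(tmp_fd, "wb") as out:
    out.write(b"A" * 4096 + b"B" * 4096)

try:
    with open(sample_path, "rb", buffering=4096) as f:
        f.read(4096)  # consume the "A" block through the buffered reader
        with preserve_fd_offset(f.fileno()):
            # Stand-in for native code (such as libtiff) seeking on the descriptor.
            os.lseek(f.fileno(), 0, os.SEEK_SET)
        protected_byte = f.read(1)  # next read should continue in the "B" block
finally:
    os.remove(sample_path)

print(protected_byte)
```

Because the offset is captured and restored with `os.lseek` on the descriptor itself, the buffered reader's cached idea of the raw position stays valid and it never needs to know the descriptor was touched.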
The removed code was added in #5443 and #5575. The test suite now passes without it (even without your new code). Would you like to create a PR with your suggestion? A test using a significantly smaller test file (or even reusing one of our existing ones) would be helpful. |
Sure, I can make a PR. I'll see what I can do about a test image but mine are all pretty big |
Pushed #8560 |
Thanks.
I'm guessing this is a different value to normal? Could you expand on why you have a different setting, so that we could get a better picture of who this might affect? |
From the docs
So it looks like this could depend on the filesystem or even the individual file. I've been using ZFS or a CIFS mount, so maybe that explains the large block size. |
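To see what a given filesystem reports, something like the following may help. It is a rough sketch that assumes the docs in question are those for the built-in `open()`, whose default buffer size is derived from the device block size with `io.DEFAULT_BUFFER_SIZE` as the fallback. `buffer_size_hint` is a hypothetical helper, and the exact heuristic differs between Python versions, so treat the result as a hint rather than the value `open()` will actually pick.

```python
import io
import os
import tempfile


def buffer_size_hint(path):
    """Hypothetical helper: block size reported for path, else io.DEFAULT_BUFFER_SIZE."""
    st = os.stat(path)
    # st_blksize is not available on every platform (for example Windows).
    return getattr(st, "st_blksize", 0) or io.DEFAULT_BUFFER_SIZE


tmp_fd, probe_path = tempfile.mkstemp()
os.close(tmp_fd)
try:
    hint = buffer_size_hint(probe_path)
finally:
    os.remove(probe_path)

print("block size hint:", hint, "io.DEFAULT_BUFFER_SIZE:", io.DEFAULT_BUFFER_SIZE)
```

Running this against a file on the affected ZFS or CIFS mount, instead of a temporary file, would show whether the large value comes from the filesystem.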
What did you do?
Load a TIFF with exif rotation info and save as JPG
What did you expect to happen?
Save image
What actually happened?
save() crashes
What are your OS, Python and Pillow versions?
Linux dev 6.8.12-2-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-2 (2024-09-05T10:03Z) x86_64 GNU/Linux
Debugging Notes
I have debugged this, and determined it's caused by:
self.fp: _io.BufferedReader is corrupted after giving it to libtiff decode.decode()
load_end() calls exif_transpose() calls getexif() calls ImageFileDirectory_v2.load
fp is corrupt, no valid tags are read. You can see this in the logs of many unknown tags
Image.transpose() is not called in load_end()
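The first step of that chain can be reproduced without Pillow or libtiff. The sketch below is an illustration under assumptions: the sample file is made up, and a direct `os.lseek` on the descriptor stands in for whatever libtiff does with it. On CPython, the buffered reader keeps a cached raw position, so it does not notice the move and its next refill comes from the wrong place.

```python
import os
import tempfile

# Made-up sample data: one 4096-byte block of "A" followed by one of "B".
tmp_fd, sample_path = tempfile.mkstemp()
with os.fdopen(tmp_fd, "wb") as out:
    out.write(b"A" * 4096 + b"B" * 4096)

try:
    with open(sample_path, "rb", buffering=4096) as f:
        first_block = f.read(4096)  # logical position is now 4096
        # Stand-in for native code moving the descriptor behind Python's back.
        os.lseek(f.fileno(), 0, os.SEEK_SET)
        stale_byte = f.read(1)  # should be "B", but the refill starts at offset 0
finally:
    os.remove(sample_path)

print(first_block[:1], stale_byte)
```

In the TIFF case the same mismatch would mean the IFD reader parses bytes from an unintended offset, which is consistent with the many unknown tags in the logs.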
Workarounds
I've come up with a handful of workarounds:
The buffer can't be corrupted if there isn't one.
This calls ImageFileDirectory_v2.load on an untouched fp, so it's not corrupt. Subsequent calls in Image.load are cached.
Add after here https://github.com/python-pillow/Pillow/blob/main/src/PIL/TiffImagePlugin.py#L1385
This gets the buffer to drop its cache.
I got the idea from comments and similar code here: fd299e3
However, I don't think this is a safe idea. This is not documented behavior, and the buffering/caching behavior could change at any time. When I tried with just seek(0), it did not help. Looking at the CPython code, it looks like seek() might be optimized when the target location is already buffered, in which case it won't actually move the file handle.
pillow.zip
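The seek() observation can be checked with a small experiment. This sketch assumes CPython's `io.BufferedReader` and uses a made-up 10,000-byte file that fits entirely in the buffer; `os_offset` is a hypothetical helper that asks the OS for the descriptor's real offset.

```python
import os
import tempfile


def os_offset(f):
    """Hypothetical helper: the descriptor's real offset, bypassing the buffer."""
    return os.lseek(f.fileno(), 0, os.SEEK_CUR)


tmp_fd, sample_path = tempfile.mkstemp()
with os.fdopen(tmp_fd, "wb") as out:
    out.write(bytes(10_000))  # smaller than the buffer used below

try:
    with open(sample_path, "rb", buffering=65536) as f:
        f.read(10)  # the reader fills its buffer, reading ahead in the file
        offset_before = os_offset(f)
        f.seek(0)  # target is already buffered
        offset_after = os_offset(f)
        logical_pos = f.tell()
finally:
    os.remove(sample_path)

print(offset_before, offset_after, logical_pos)
```

If the two OS offsets match while the logical position is 0, then seek(0) only moved the pointer inside the buffer, which fits the behavior described above and is why the patch restores the position with os.lseek instead.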