Python PIL: _getexif, applying TAGS, AttributeError, and the differences between JPEG and TIFF

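We start with the most surface-level part, the applied technique, and work down from there.

1. PIL's _getexif and TAGS

The following code extracts an image file's metadata (capture date, GPS location of the shot, image dimensions, and so on).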

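from PIL import Image, ExifTags

image_file = Image.open("./path/to/image.jpg")  # path of the image you want to open
exif = image_file._getexif()  # private helper from the JPEG plugin; None if there is no EXIF data
if exif:
    # Map numeric tag IDs to readable names and keep each tag's value.
    exif_tag = {ExifTags.TAGS[k]: v for k, v in exif.items() if k in ExifTags.TAGS}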

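PIL is an image-processing library for Python (today it is distributed as the Pillow fork). Its Image module provides the Image class, which is used to open and process images.

ExifTags.TAGS is a dictionary that maps the numeric tag IDs used in an image's metadata to readable key names. For example, if the metadata conceptually holds {"DateTimeOriginal": "2022:12:31 12:31:00"}, the raw entry is keyed by the ID 36867, and ExifTags.TAGS[36867] is "DateTimeOriginal". That is why the dictionary returned by _getexif() can be converted into a name-keyed dictionary with a comprehension.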
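As a small illustration of that ID-to-name mapping, the sketch below uses a hand-copied three-entry subset of the EXIF tag table instead of importing Pillow. TAGS_SUBSET, raw_exif, and the sample values are made up for this example; the three IDs are standard EXIF tag numbers. It also shows one way to keep tags the table does not know, which the `if k in ...` filter would drop.

```python
# Hand-copied subset of the EXIF tag table (assumption: stands in for ExifTags.TAGS).
TAGS_SUBSET = {
    271: "Make",
    306: "DateTime",
    36867: "DateTimeOriginal",
}

# Hypothetical raw metadata keyed by numeric tag ID, shaped like what _getexif() returns.
raw_exif = {
    271: "ExampleCam",
    36867: "2022:12:31 12:31:00",
    59999: b"vendor-specific",  # an ID that is not in the subset table
}

# Filtered version: IDs missing from the table are silently dropped.
known_only = {TAGS_SUBSET[k]: v for k, v in raw_exif.items() if k in TAGS_SUBSET}

# Alternative: keep unknown IDs under their numeric key.
keep_all = {TAGS_SUBSET.get(k, k): v for k, v in raw_exif.items()}

print(known_only)
print(keep_all)
```

Whether to drop or keep unknown IDs depends on the use case; keeping them makes it easier to notice vendor-specific tags while debugging.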

2. AttributeError with TIFF Image

However, calling _getexif() on an image file with a TIFF extension makes the PIL library raise an AttributeError.

(class Image, __getattr__ method)

def __getattr__(self, name):
    if name == "category":
        warnings.warn(
            "Image categories are deprecated and will be removed in Pillow 10 "
            "(2023-07-01). Use is_animated instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._category
    raise AttributeError(name)

Looking into why, the explanation I found is that TIFF metadata is not compressed, so instead of extracting it with _getexif() you can simply look the tags up with TAGS.

https://stackoverflow.com/questions/46477712/reading-tiff-image-metadata-in-python

In other words, for a TIFF file the metadata can be loaded with code like the following.

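from PIL import Image
from PIL.TiffTags import TAGS

im1 = Image.open("path/to/image.tiff")  # path of the TIFF image
# im1.tag is the legacy tag directory; fall back to the numeric ID for tags missing from TAGS.
meta_dict = {TAGS.get(key, key): im1.tag[key] for key in im1.tag.keys()}
print(meta_dict)

''' partial output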
{'ImageWidth': (1600,), 'ImageLength': (1300,), 'BitsPerSample': (16,), 
'Compression': (1,), 'PhotometricInterpretation': (1,), 
...
'''
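For a quick self-contained check, the sketch below builds a tiny TIFF in memory so that no sample file is needed (the 16x8 size and the variable names are arbitrary). It reads the newer tag_v2 directory, which returns plain values rather than the tuples shown above, and resolves names with TiffTags.lookup(). It assumes a reasonably recent Pillow is installed.

```python
import io

from PIL import Image, TiffTags

# Write a small grayscale TIFF into memory instead of touching the disk.
buffer = io.BytesIO()
Image.new("L", (16, 8)).save(buffer, format="TIFF")
buffer.seek(0)

with Image.open(buffer) as im:
    # tag_v2 maps numeric tag IDs to values; lookup() gives each tag's name.
    named_tags = {TiffTags.lookup(tag).name: value for tag, value in im.tag_v2.items()}
    has_getexif = hasattr(im, "_getexif")

print(named_tags["ImageWidth"], named_tags["ImageLength"])
print("has _getexif:", has_getexif)
```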

Coming back to the cause of the AttributeError: when a TIFF file is opened with Image.open, the class of the resulting object has no attribute (method, variable, or anything else) named _getexif, so the lookup fails.
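The mechanism can be reproduced without Pillow. In the minimal sketch below, BaseImage, JpegLike, and TiffLike are hypothetical stand-ins for Image, JpegImageFile, and TiffImageFile, and read_exif_safely is a helper invented for this example. Python only calls __getattr__ after normal attribute lookup has failed, so the subclass that defines _getexif never reaches it.

```python
class BaseImage:
    """Stand-in for PIL's Image class (hypothetical, for illustration only)."""

    def __getattr__(self, name):
        # Reached only when normal lookup (instance, class, base classes) fails.
        raise AttributeError(name)


class JpegLike(BaseImage):
    def _getexif(self):
        return {36867: "2022:12:31 12:31:00"}


class TiffLike(BaseImage):
    def __init__(self):
        self.tag = {256: (1600,)}  # tags live in an attribute, not behind _getexif()


def read_exif_safely(image):
    """Return the EXIF dict if the object offers _getexif(), else None."""
    getter = getattr(image, "_getexif", None)  # the default absorbs the AttributeError
    return getter() if getter is not None else None


print(read_exif_safely(JpegLike()))
print(read_exif_safely(TiffLike()))

try:
    TiffLike()._getexif()
except AttributeError as err:
    print("AttributeError:", err)
```

Checking with getattr and a default is one defensive option when a function may receive images of several formats.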

Looking at the source code of Image.open, you can see that it calls preinit(), which imports the classes that handle each image file format.

def open(fp, mode="r", formats=None):
    ### (omitted)

    prefix = fp.read(16)

    preinit()

    accept_warnings = []

preinit()

def preinit():
    """Explicitly load standard file format drivers."""

    global _initialized
    if _initialized >= 1:
        return

    try:
        from . import BmpImagePlugin

        assert BmpImagePlugin
    except ImportError:
        pass
    try:
        from . import GifImagePlugin

        assert GifImagePlugin
    except ImportError:
        pass
    try:
        from . import JpegImagePlugin

        assert JpegImagePlugin
    except ImportError:
        pass
    try:
        from . import PpmImagePlugin

        assert PpmImagePlugin
    except ImportError:
        pass
    try:
        from . import PngImagePlugin

        assert PngImagePlugin
    except ImportError:
        pass
    # try:
    #     import TiffImagePlugin
    #     assert TiffImagePlugin
    # except ImportError:
    #     pass

    _initialized = 1

If you go into JpegImagePlugin here, you can see that it has a _getexif() method! In TiffImagePlugin's TiffImageFile, however, there is no _getexif() method; instead, the constructor sets up tag and tag_v2.

class TiffImageFile(ImageFile.ImageFile):

    format = "TIFF"
    format_description = "Adobe TIFF"
    _close_exclusive_fp_after_loading = False

    def __init__(self, fp=None, filename=None):
        self.tag_v2 = None
        """ Image file directory (tag dictionary) """

        self.tag = None
        """ Legacy tag entries """

        super().__init__(fp, filename)
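Putting the pieces of this section together: which class Image.open returns decides which attributes the object has. The sketch below is a simplified model of that dispatch, not Pillow's actual code. REGISTRY, open_image, and the two classes are hypothetical names; only the JPEG and TIFF magic bytes are real.

```python
import io


class JpegFile:
    def _getexif(self):
        return {}  # placeholder: a real plugin would parse the EXIF block


class TiffFile:
    def __init__(self):
        self.tag_v2 = {}  # placeholder tag directory
        self.tag = {}     # placeholder legacy tag directory


def is_jpeg(prefix):
    return prefix.startswith(b"\xff\xd8")


def is_tiff(prefix):
    # Little-endian ("II") and big-endian ("MM") TIFF headers.
    return prefix[:4] in (b"II*\x00", b"MM\x00*")


# (format name, accept function, class) entries, checked in order.
REGISTRY = [("JPEG", is_jpeg, JpegFile), ("TIFF", is_tiff, TiffFile)]


def open_image(fp):
    prefix = fp.read(16)  # same idea as `prefix = fp.read(16)` in Image.open
    for name, accept, factory in REGISTRY:
        if accept(prefix):
            return name, factory()
    raise OSError("cannot identify image file")


for data in (b"\xff\xd8\xff\xe1" + b"\x00" * 12, b"II*\x00" + b"\x00" * 12):
    name, image = open_image(io.BytesIO(data))
    print(name, hasattr(image, "_getexif"), hasattr(image, "tag_v2"))
```

The point of the model is that the format is detected from the file's leading bytes, and the object you get back only carries the methods its own format class defines.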

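3. Differences between JPEG and TIFF

RAW, raster, vector

Images shot with a DSLR or similar camera can be saved as RAW files, which hold the sensor data with little or no processing or compression. A raster image represents the picture as a collection of a great many pixels, each with its own color value such as RGB. A vector image instead describes the picture mathematically, as shapes such as lines and curves, rather than pixel by pixel. JPEG and TIFF are both raster formats.

JPEG, TIFF

Both are raster formats, but JPEG compresses the data more heavily (its compression is lossy), while TIFF keeps more of the original data. As a result, TIFF files are usually larger. For flexible image editing or further data processing, TIFF is the better choice. The link below covers the differences in more detail. Because the formats store and compress their data differently, each one needs to be handled with the classes and methods that match it.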

https://www.adobe.com/kr/creativecloud/file-types/image/comparison/jpeg-vs-tiff.html
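To make the lossless versus lossy distinction concrete, here is a toy sketch using only the standard library. It is not how JPEG or TIFF actually encode images: zlib stands in for a lossless codec, and dropping the low 4 bits of each pixel stands in for a lossy step. The synthetic gradient and all names are made up for this example.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic 64x64 grayscale image: a horizontal gradient plus a little noise.
pixels = bytes(
    min(255, max(0, x * 4 + rng.randint(-8, 8)))
    for _y in range(64)
    for x in range(64)
)

# Lossless: decompressing gives back every byte exactly.
lossless = zlib.compress(pixels)
restored = zlib.decompress(lossless)

# Lossy stand-in: discard the low 4 bits of each pixel, then compress.
quantized = bytes(p & 0xF0 for p in pixels)
lossy = zlib.compress(quantized)

print("raw bytes:     ", len(pixels))
print("lossless bytes:", len(lossless), "exact:", restored == pixels)
print("lossy bytes:   ", len(lossy), "exact:", quantized == pixels)
```

The lossy version comes out smaller because detail has been thrown away for good, which is the same trade-off that makes heavily compressed images less suitable for repeated editing.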

컴퓨터 탐험가 찰리
