
to_pickle explained: a Python pandas function

WorldSeeker 2022. 3. 20. 21:56

1. Purpose of the function
A pandas method that pickles (serializes) an object such as a DataFrame and writes it to a file.

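* What is pickle?
A module in the Python standard library that serializes objects such as lists and dictionaries into a binary byte stream, so the object itself can be saved to a file and restored later. It is often used for large data.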
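
To make the idea concrete, here is a minimal sketch using only the standard-library pickle module. The dictionary contents are made up for illustration; the point is that the object goes to bytes and comes back equal to the original.

```python
import pickle

# A nested object built from plain Python containers (illustrative data).
record = {"name": "sample", "values": [1, 2, 3], "nested": {"ok": True}}

# dumps() serializes the object into a bytes string (the binary pickle).
payload = pickle.dumps(record)

# loads() rebuilds an equal, but separate, object from those bytes.
restored = pickle.loads(payload)

print(type(payload).__name__)
print(restored == record)
```

Only unpickle data from a source you trust, because loading a pickle can execute arbitrary code.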


2. Quick view of the concept through a sample
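
A minimal sketch of the round trip, assuming a small made-up DataFrame and a temporary directory so nothing is left on disk. The file name sample.pkl is hypothetical; pandas.read_pickle is the matching reader.

```python
import os
import tempfile

import pandas as pd

# Small illustrative DataFrame.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.pkl")  # hypothetical file name

    # Write the pickled DataFrame to disk.
    df.to_pickle(path)

    # Read it back with the matching pandas reader.
    restored = pd.read_pickle(path)

# The restored frame has the same values, columns, and dtypes.
print(restored.equals(df))
```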


3. Usage

pandas.DataFrame.to_pickle(path, compression='infer', protocol=5, storage_options=None)



4. Parameter descriptions

[path] : str
File path where the pickled object will be stored.

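[compression] : str or dict, default 'infer'
Compression to apply to the output. With 'infer', the method is detected from the path extension, such as '.gz', '.bz2', '.zip', '.xz', or '.zst'.
Set to None for no compression. A dict can also be passed: the 'method' key names the compression and the remaining keys are forwarded to the compressor, e.g. compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}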
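
The three ways of setting compression can be sketched as follows. The file names and the sample data are assumptions for illustration. Because the second file does not end in '.gz', the reader is told the method explicitly.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"value": range(1000)})  # illustrative data

with tempfile.TemporaryDirectory() as tmp_dir:
    inferred_path = os.path.join(tmp_dir, "data.pkl.gz")        # hypothetical names
    explicit_path = os.path.join(tmp_dir, "data_explicit.pkl")
    plain_path = os.path.join(tmp_dir, "data_plain.pkl")

    # Default compression='infer': gzip is chosen from the '.gz' extension.
    df.to_pickle(inferred_path)

    # Dict form: 'method' picks gzip, the other keys go to the gzip writer.
    df.to_pickle(
        explicit_path,
        compression={"method": "gzip", "compresslevel": 1, "mtime": 1},
    )

    # None: write the pickle without compression.
    df.to_pickle(plain_path, compression=None)

    # gzip files start with the two magic bytes 0x1f 0x8b.
    with open(inferred_path, "rb") as f:
        inferred_magic = f.read(2)
    with open(explicit_path, "rb") as f:
        explicit_magic = f.read(2)

    # The extension gives no hint here, so name the method when reading.
    restored = pd.read_pickle(explicit_path, compression="gzip")

print(inferred_magic == b"\x1f\x8b", explicit_magic == b"\x1f\x8b")
print(restored.equals(df))
```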

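[protocol] : int
Integer indicating which protocol the pickler should use; the default is HIGHEST_PROTOCOL.
Possible values are 0, 1, 2, 3, 4, 5. A negative value is equivalent to HIGHEST_PROTOCOL.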

  • Protocol version 0 is the original “human-readable” protocol and is backwards compatible with earlier versions of Python.
  • Protocol version 1 is an old binary format which is also compatible with earlier versions of Python.
  • Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. Refer to PEP 307 for information about improvements brought by protocol 2.
  • Protocol version 3 was added in Python 3.0. It has explicit support for bytes objects and cannot be unpickled by Python 2.x. This was the default protocol in Python 3.0–3.7.
  • Protocol version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations. It is the default protocol starting with Python 3.8. Refer to PEP 3154 for information about improvements brought by protocol 4.
  • Protocol version 5 was added in Python 3.8. It adds support for out-of-band data and speedup for in-band data. Refer to PEP 574 for information about improvements brought by protocol 5.
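
As a rough illustration of the protocol parameter, the sketch below writes an uncompressed pickle with protocol 4 and inspects the file header. It relies on the fact that pickles written with protocol 2 or later begin with the byte 0x80 followed by the protocol number. The file name is hypothetical.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})  # illustrative data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "proto4.pkl")  # hypothetical file name

    # Ask for protocol 4, e.g. so that Python 3.4 to 3.7 can read the file.
    df.to_pickle(path, compression=None, protocol=4)

    # First byte is the PROTO opcode (0x80), second is the protocol number.
    with open(path, "rb") as f:
        header = f.read(2)

    restored = pd.read_pickle(path)

protocol_used = header[1]
print(protocol_used)
print(restored.equals(df))
```

Choosing a lower protocol is mainly useful when the file has to be read by an older Python version; otherwise the default is usually fine.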

[storage_options] : dict
Extra options used when connecting to a particular storage backend (such as S3 or GCS); see fsspec and urllib for details.
New in version 1.2.0.

5. Various sample usages
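
A few more variations, sketched under assumptions: the data, column names, and file names are invented for illustration. The example shows that Series also has to_pickle, and that a categorical column and a datetime index come back with their types intact.

```python
import os
import tempfile

import pandas as pd

# DataFrame with a categorical column and a datetime index (illustrative).
df = pd.DataFrame(
    {"price": [1.5, 2.5, 3.5], "label": pd.Categorical(["a", "b", "a"])},
    index=pd.date_range("2022-03-18", periods=3, freq="D"),
)
series = df["price"]

with tempfile.TemporaryDirectory() as tmp_dir:
    df_path = os.path.join(tmp_dir, "frame.pkl.bz2")   # bz2 inferred from extension
    series_path = os.path.join(tmp_dir, "series.pkl")  # no compression

    df.to_pickle(df_path)
    series.to_pickle(series_path)  # Series offers the same method

    df_back = pd.read_pickle(df_path)
    series_back = pd.read_pickle(series_path)

print(df_back.dtypes)
print(type(df_back.index).__name__)
print(series_back.equals(series))
```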


End.