xlsx_to_csv

pyhelpers.store.xlsx_to_csv(path_to_xlsx, path_to_csv=None, engine=None, if_exists='replace', vbscript=None, sheet_name='1', ret_null=False, verbose=False, **kwargs)[source]

Convert a Microsoft Excel spreadsheet to a CSV file.

See also [STORE-XTC-1].

Parameters:
  • path_to_xlsx (str | os.PathLike) – The path of the Microsoft Excel spreadsheet (in .xlsx format).

  • path_to_csv (str | os.PathLike | None) –

    The path of the CSV file:

    • When path_to_csv=None (default), a temporary file is generated using tempfile.NamedTemporaryFile().

    • When path_to_csv="", the CSV file is generated in the same directory as the source Microsoft Excel spreadsheet.

    • Otherwise, it specifies a specific path.

  • engine (str | None) –

    The engine used for converting .xlsx/.xls to .csv:

    • When engine=None (default), a VBScript (Visual Basic Script) is used.

    • When engine='xlsx2csv', the function relies on xlsx2csv.

  • if_exists (str) – The action to take if the target CSV file exists; defaults to 'replace'.

  • vbscript (str | None) – The path of the VBScript used for converting .xlsx/.xls to .csv; defaults to None.

  • sheet_name (str) – The name of the target worksheet in the given Excel file; defaults to '1'.

  • ret_null (bool) – Whether to return a value depending on the specified engine; defaults to False.

  • verbose (bool | int) – Whether to print relevant information to the console; defaults to False.

  • kwargs – [Optional] Additional parameters for the function subprocess.run().

Returns:

The path of the generated CSV file or None when engine=None; an io.StringIO() buffer when engine='xlsx2csv'.

Return type:

str | _io.StringIO | None

Examples:

>>> from pyhelpers.store import xlsx_to_csv, load_csv
>>> from pyhelpers.dirs import cd
>>> import os
>>> path_to_test_xlsx = cd("tests\data", "dat.xlsx")
>>> path_to_temp_csv = xlsx_to_csv(path_to_test_xlsx, verbose=True)
Converting "tests\data\dat.xlsx" to a (temporary) CSV file ... Done.
>>> os.path.isfile(path_to_temp_csv)
True
>>> data = load_csv(path_to_temp_csv, index=0)
>>> data
             Longitude    Latitude
City
London      -0.1276474  51.5073219
Birmingham  -1.9026911  52.4796992
Manchester  -2.2451148  53.4794892
Leeds       -1.5437941  53.7974185
>>> # Set `engine='xlsx2csv'`
>>> temp_csv_buffer = xlsx_to_csv(path_to_test_xlsx, engine='xlsx2csv', verbose=True)
Converting "tests\data\dat.xlsx" to a (temporary) CSV file ... Done.
>>> # import pandas as pd; data_ = pandas.read_csv(io_buffer, index_col=0)
>>> data_ = load_csv(temp_csv_buffer, index=0)
>>> data_
            Longitude   Latitude
City
London      -0.127647  51.507322
Birmingham  -1.902691  52.479699
Manchester  -2.245115  53.479489
Leeds       -1.543794  53.797418
>>> data.astype('float16').equals(data_.astype('float16'))
True
>>> # Remove the temporary CSV file
>>> os.remove(path_to_temp_csv)