xls2xlsx
Features
- Convert .xls files to .xlsx using xlrd and openpyxl.
- Convert .htm and .mht files containing tables or Excel contents to .xlsx using beautifulsoup4 and openpyxl.
We attempt to support anything that the underlying packages used will support. For example, the following are supported for both input types:
- Multiple worksheets
- Text, Numbers, Dates/Times, Unicode
- Fonts, text color, bold, italic, underline, double underline, strikeout
- Solid and Pattern Fills with color
- Borders: Solid, Hair, Thin, Thick, Double, Dashed, Dotted; with color
- Alignment: Horizontal, Vertical, Rotated, Indent, Shrink To Fit
- Number Formats, including unicode currency symbols
- Hidden Rows and Columns
- Merged Cells
- Hyperlinks (only 1 per cell)
- Comments
These features are additionally supported by the .xls input format:
- Freeze panes
These features are additionally supported by the .htm and .mht input formats:
- Images
Not supported by either format:
- Conditional Formatting (the current stylings are preserved)
- Formulas (the calculated values are preserved)
- Charts (the image of the chart is handled by .htm and .mht input formats)
- Pivot tables (the current data is preserved)
- Text boxes (converted to an image by .htm and .mht input formats)
- Shapes and Clip Art (converted to an image by .htm and .mht input formats)
- Autofilter (the current filtered out rows are preserved)
- Rich text in cells (openpyxl doesn’t support this: only styles applied to the entire cell are preserved)
Installation
To install xls2xlsx, run this command in your terminal:
$ pip install xls2xlsx
This is the preferred method to install xls2xlsx, as it will always install the most recent stable release.
Usage
To use xls2xlsx from the command line:
$ xls2xlsx [-v] file.xls ...
This will create file.xlsx in the current folder. file.xls can be any .xls, .htm, or .mht file and can also be a URL. The -v flag will print the input and output filename.
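The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the stated behaviour (same base name, .xlsx extension, written to the current folder), not the tool's own code. The function name expected_output_name is hypothetical, and the handling of URLs (taking the last path segment and ignoring any query string) is an assumption.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def expected_output_name(source):
    """Guess the output filename for a given input path or URL.

    Hypothetical helper: assumes the output keeps the input's base name
    and swaps the extension for .xlsx.
    """
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        # Assumption: for a URL, use the last segment of the URL path.
        name = PurePosixPath(parsed.path).name
    else:
        name = Path(source).name
    return str(Path(name).with_suffix(".xlsx"))


print(expected_output_name("reports/file.xls"))
print(expected_output_name("https://example.com/data/sheet.mht?x=1"))
```

Running with the -v flag is the reliable way to see the actual output filename the tool chose.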
To use xls2xlsx in a project:
from xls2xlsx import XLS2XLSX
x2x = XLS2XLSX("spreadsheet.xls")
x2x.to_xlsx("spreadsheet.xlsx")
Alternatively:
from xls2xlsx import XLS2XLSX
x2x = XLS2XLSX("spreadsheet.xls")
wb = x2x.to_xlsx()
The XLS2XLSX.to_xlsx method returns the filename given. If no filename is provided, the method returns the openpyxl workbook instead.
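Because to_xlsx returns the filename when one is given, a small batch helper can collect the written files directly. The sketch below is a minimal illustration: convert_many and its converter_cls parameter are hypothetical names introduced here, and the output naming (input stem plus .xlsx inside out_dir) is a choice made for the example. Only the XLS2XLSX constructor and to_xlsx call shown above are taken from the library.

```python
from pathlib import Path


def convert_many(sources, out_dir, converter_cls=None):
    """Convert each source into out_dir and return what to_xlsx returned.

    Hypothetical helper; converter_cls exists only so a stand-in class
    can be supplied when xls2xlsx is not installed.
    """
    if converter_cls is None:
        # Imported lazily so the helper itself has no hard dependency.
        from xls2xlsx import XLS2XLSX
        converter_cls = XLS2XLSX

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for src in sources:
        dest = out_dir / (Path(src).stem + ".xlsx")
        # Per the text above, to_xlsx(filename) returns the filename given.
        written.append(converter_cls(str(src)).to_xlsx(str(dest)))
    return written
```

Calling convert_many(["a.xls", "b.htm"], "converted") would then be expected to return the two output paths under the converted folder.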
The input file can be in any of the following formats:
- Excel 97-2003 workbook (.xls)
- Web page (.htm, .html), optionally including a _Files folder
- Single file web page (.mht, .mhtml)
The input specified can also be any of the following:
- A filename / pathname
- A url
- A file-like object (opened in Binary mode for .xls and either Binary or Text mode otherwise)
- The contents of a .xls file as a bytes object
- The contents of a .htm or .mht file as a str object
Note: The file format is determined by examining the file contents, not by looking at the file extension.
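To illustrate what detecting the format from the contents can look like, here is a minimal sketch. It is not xls2xlsx's actual detection logic; the function guess_format and its heuristics are assumptions made for this example. It relies on the OLE2 compound-file signature that Excel 97-2003 workbooks normally start with, and on typical MIME and HTML markers for the other two formats.

```python
# Signature of the OLE2 compound file container used by Excel 97-2003 files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def guess_format(data):
    """Return 'xls', 'mht', 'htm', or 'unknown' based on the content.

    Hypothetical heuristic for illustration only.
    """
    if isinstance(data, str):
        data = data.encode("utf-8", errors="replace")
    if data.startswith(OLE2_MAGIC):
        return "xls"
    # Only inspect the start of the content, case-insensitively.
    head = data[:2048].lstrip().lower()
    if head.startswith(b"mime-version:") or b"multipart/related" in head:
        return "mht"
    if b"<html" in head or b"<table" in head:
        return "htm"
    return "unknown"


print(guess_format(b"<html><body><table></table></body></html>"))
```

A check like this explains why a file saved by a web application with an .xls extension but HTML contents can still be handled: the extension never enters the decision.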
- License: MIT
- Author: Joe Cool