About table image data generation #13

Open
xiuboom opened this issue Sep 20, 2024 · 1 comment

Comments

xiuboom commented Sep 20, 2024

Could the pipeline for generating the table images be open-sourced? Thanks!

@SpursGoZmy
Owner

The rendered table images come in three styles, and each style uses different rendering code.

Markdown Style: Read the markdown table data into a DataFrame object with the pandas Python package, then use the dataframe_image Python package to convert the DataFrame into an image.
HTML Style: Get the HTML code of the original table, or build HTML code based on the table data, and use the html2image Python package to render the HTML into a screenshot image. You then need to write a script to crop the extra white space around the table.
Excel Style: Write the table data into an Excel file. Use the xlwings package to read the Excel file and convert it to an image.
Data Augmentation: While rendering the table image, data augmentations such as changing the font type or cell color can be applied.
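
To make the "build HTML code based on the table data" step and the font/color augmentation more concrete, here is a minimal sketch using only the Python standard library. It is not the project's actual rendering code: the function names `parse_markdown_table` and `table_to_html`, the `FONTS` and `HEADER_COLORS` lists, and the CSS are all hypothetical choices made for illustration.

```python
import html
import random

# Hypothetical augmentation pools; the real pipeline may vary other properties.
FONTS = ["Arial", "Times New Roman", "Courier New"]
HEADER_COLORS = ["#f2f2f2", "#dce6f1", "#fde9d9"]


def parse_markdown_table(md):
    """Split a simple pipe-delimited markdown table into rows of cell strings.

    Assumes cells contain no escaped pipes and the second line is the
    header separator (e.g. |---|---|).
    """
    rows = []
    for i, line in enumerate(md.strip().splitlines()):
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Drop the separator row that follows the header.
        if i == 1 and all(c and set(c) <= set("-:") for c in cells):
            continue
        rows.append(cells)
    return rows


def table_to_html(rows, rng):
    """Build a standalone HTML page for the table with a random font and header color."""
    font = rng.choice(FONTS)
    color = rng.choice(HEADER_COLORS)
    style = (
        f"table{{border-collapse:collapse;font-family:'{font}';}}"
        "th,td{border:1px solid #333;padding:4px 8px;}"
        f"th{{background:{color};}}"
    )
    header, *body = rows
    # Escape cell text so characters like & or < do not break the markup.
    head_html = "".join(f"<th>{html.escape(c)}</th>" for c in header)
    body_html = "".join(
        "<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in r) + "</tr>"
        for r in body
    )
    return (
        f"<html><head><style>{style}</style></head><body>"
        f"<table><thead><tr>{head_html}</tr></thead>"
        f"<tbody>{body_html}</tbody></table></body></html>"
    )


md = """
| Team | Wins |
|------|------|
| A & B | 3 |
| C | 5 |
"""
rows = parse_markdown_table(md)
page = table_to_html(rows, random.Random(0))  # seeded so the style is reproducible
```

The resulting `page` string could then be handed to a renderer such as html2image to produce the screenshot, followed by the whitespace-cropping step described above. Passing a seeded `random.Random` keeps each augmented style reproducible per table.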

The difficulty of rendering tables into images varies across datasets, and it does take quite a lot of careful work. We will try to clean up our rendering code for open-sourcing, but we really cannot guarantee a deadline anytime soon. Thanks for your understanding.

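We will find time to tidy up the image-rendering scripts and open-source them together once they are ready, but we cannot promise that this will be finished soon.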
