欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

项目中Excel导出大数据量记录解决方案及实战(POI,Hutools,EasyExcel)

程序员文章站 2022-07-13 15:39:24
...

目录

1、使用场景

2、代码实现

2.1 引入的jar maven配置

2.2 使用POI高级版功能(SXSSFWorkbook) V3.8版本以上

2.3 使用国内开源工具包Hutools实现

2.4 使用EasyExcel进行测试代码

3、成果展现

4、总结

5、参考文章


1、使用场景

        因为最近项目需要,有一个培训系统之中,培训任务下有3000多用户;以前开发这个导出方式使用POI,基础功能创建的导出功能函数。最后导致这个导出功能导出3000多条记录需要耗时26s(秒)。于是产品经理要求改善一下这块内容。因为我们项目之中针对Excel导出这块,由两位同事有两种实现方式。方式一:使用常规的POI创建Excel信息,最后执行导出。方式二:另外一个同事使用的自己封装一个模板方式导出Excel。另外同事感觉可以通过模板导出方式能够解决此问题。最后在他要求下修改为模板实现方式。最后虽然解决可以导出数据问题但是导出速度还是极其慢,耗时长达26s。最近寻求网上解决方案;于是候选一下三种方案:

  1. 使用POI高级版功能(SXSSFWorkbook) V3.8版本以上
  2. 使用国内开源工具包Hutools
  3. 使用EasyExcel进行测试

2、代码实现

     针对以上三种方案我选择使用同样的导出数据结构和记录数量;分别采样数据是1w、10w、100w数据进行导出测测试发现三种实现效果差别不大,这部分数据在成果展现部分进行说明。下面分别针对以上三种方案代码实现。

2.1 引入的jar maven配置

<!--XSSFSheet相关-->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>4.1.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>4.1.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml-schemas</artifactId>
            <version>4.1.2</version>
        </dependency>
        <dependency>
            <groupId>xerces</groupId>
            <artifactId>xercesImpl</artifactId>
            <version>2.12.0</version>
        </dependency>
        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>5.4.0</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>easyexcel</artifactId>
            <version>2.2.6</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.18.12</version>
            <scope>provided</scope>
        </dependency>

2.2 使用POI高级版功能(SXSSFWorkbook) V3.8版本以上

import org.apache.poi.hssf.usermodel.HSSFRichTextString;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URLEncoder;
import java.util.List;
import java.util.Map;

/**
 * Excel导出数据读取大数据量测试
 */
public class ExcelPoiUtil {

    private static String generateFileNameByDataTotal(int dataTotal){
        String excelFileName="导出数据总记录";
        int recordCount=0;
        if(dataTotal<=1000){
            excelFileName+="1K条";
        }else if(dataTotal>1000&&dataTotal<=10000){
            recordCount=dataTotal/1000;
            excelFileName+=recordCount+"K条";
        }else if(dataTotal>10000&&dataTotal<=100000){
            recordCount=dataTotal/10000;
            excelFileName+=recordCount+"W条";
        }else if(dataTotal>100000&&dataTotal<=1000000){
            recordCount=dataTotal/10000;
            excelFileName+=recordCount+"W条";
        }
        return excelFileName;
    }

    /**
     * 使用POI 3.8 以上版本 SXSSFWorkbook 导出大数据量数据到Excel之中
     * @param dataTotal
     * @throws IOException
     */
   public static  void exportExcelBySXSSFWorkbook(HttpServletRequest request, HttpServletResponse response, int dataTotal) throws IOException {
        long startTime =System.currentTimeMillis();
       String recordFileName="SXSSF_POI导出Excel_"+generateFileNameByDataTotal(dataTotal);
        Workbook wb = new SXSSFWorkbook(100); // keep 100 rows in memory, exceeding rows will be flushed to disk
        Sheet sh = wb.createSheet();
       List<Map<String, Object>> dataList=ExcelHuToolsUtil.getBigRowsDataMap(dataTotal);
       Row rowHeader = sh.createRow(0);
       String[] headers=new String[]{"姓名","年龄","成绩","是否合格","考试日期"};
       for (int i = 0; i < headers.length; i++) {
           Cell cell = rowHeader.createCell(i);
           HSSFRichTextString text = new HSSFRichTextString(headers[i]);
           cell.setCellValue(text);
       }
        for(int rownum = 0; rownum < dataList.size(); rownum++){
            Row row = sh.createRow(rownum+1);
            Map<String, Object> cellMap=dataList.get(rownum);
            int cellIndex=0;
            for (Map.Entry<String, Object> m : cellMap.entrySet()) {
                //System.out.println("key:" + m.getKey() + " value:" + m.getValue());
                Cell cell = row.createCell(cellIndex);
                cell.setCellValue(m.getValue().toString());
                cellIndex++;
            }
        }
        OutputStream os = null; //文件输出流
       String fileName = recordFileName+".xlsx";
       try {
           os = response.getOutputStream(); //重点突出输出到浏览器
           //解决浏览器兼容问题
           if (request.getHeader("User-Agent").toLowerCase().indexOf("firefox") > 0) {
               fileName = new String(fileName.getBytes("GB2312"),"ISO-8859-1");
           } else {
               // 对文件名进行编码处理中文问题
               fileName = java.net.URLEncoder.encode(fileName, "UTF-8");// 处理中文文件名的问题
               fileName= new String(fileName.getBytes("UTF-8"), "GBK");// 处理中文文件名的问题
           }
           response.reset();
           response.setContentType("application/vnd.ms-excel;charset=utf-8");
           response.setHeader("Content-Disposition", "attachment; filename="+ URLEncoder.encode(recordFileName, "UTF-8")+".xlsx");//中文名称需要特殊处理
           wb.write(os);
       } catch (Exception e) {
       }finally {
           try {
               if(os != null){
                   os.flush();
                   os.close();
                   os = null;
               }
           } catch (Exception e) {
           }
       }
        long endTime=System.currentTimeMillis();
       String exportRecord="";
       int recordCount=0;
       if(dataTotal<=1000){
           exportRecord+="1K条";
       }else if(dataTotal>1000&&dataTotal<=10000){
           recordCount=dataTotal/1000;
           exportRecord+=recordCount+"K条";
       }else if(dataTotal>10000&&dataTotal<=1000000){
           recordCount=dataTotal/10000;
           exportRecord+=recordCount+"W条";
       }
        System.out.println("导出"+exportRecord+" 数据 耗时: "+(endTime-startTime)/1000 +" 秒");
    }
}

2.3 使用国内开源工具包Hutools实现

import cn.hutool.core.date.DateUtil;
import cn.hutool.core.util.RandomUtil;
import cn.hutool.poi.excel.ExcelUtil;
import cn.hutool.poi.excel.ExcelWriter;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.ByteArrayOutputStream;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * 使用hutool工具导出大数据量验证
 */
public class ExcelHuToolsUtil {
    private static Logger logger = LoggerFactory.getLogger(ExcelHuToolsUtil.class);

    public  static List<Map<String, Object>> getBigRowsDataMap(int dataTotal){
        List<Map<String, Object>> dataList=new ArrayList<>();
        for(int i=0;i<dataTotal;i++){
            Map<String, Object> rowData = new LinkedHashMap<>();
            rowData.put("name", "name_"+(i+1));
            rowData.put("age", RandomUtil.randomInt(22,100));
            rowData.put("score", RandomUtil.randomDouble(85,100));
            rowData.put("pass", i/2==0?true:false);
            rowData.put("testDate", DateUtil.date());
            dataList.add(rowData);
        }
        return dataList;
    }

    public static void exportExcelByHuTools(HttpServletRequest request, HttpServletResponse response, int dataTotal){
        long startTime=System.currentTimeMillis();
        String recordFileName="HuTools导出Excel_"+generateFileNameByDataTotal(dataTotal);
        //通过工具类创建writer
        try {
            ExcelWriter writer =ExcelUtil.getBigWriter();
            writer.renameSheet("HuTools导出大数据");     //甚至sheet的名称
            //设置head的名称, 此时的顺寻就是导出的顺序, key就是RecordInfoDetailsDTO的属性名称, value就是别名
            writer.addHeaderAlias("name", "姓名");
            writer.addHeaderAlias("age", "年龄");
            writer.addHeaderAlias("score", "成绩");
            writer.addHeaderAlias("pass", "是否合格");
            writer.addHeaderAlias("testDate", "考试日期");
            //跳过当前行,既第一行,非必须,在此演示用
            //writer.passCurrentRow();
            List<Map<String, Object>> rowList= getBigRowsDataMap(dataTotal);
            writer.write(rowList, true);
            writer.setOnlyAlias(true);
            response.reset();
            response.setContentType("application/vnd.ms-excel;charset=utf-8");
            response.setHeader("Content-Disposition", "attachment; filename="+ URLEncoder.encode(recordFileName, "UTF-8")+".xlsx");
            //中文名称需要特殊处理
            writer.flush(response.getOutputStream());
            writer.close();
            String exportRecord="";
            int recordCount=0;
            if(dataTotal<=1000){
                exportRecord+="1K条";
            }else if(dataTotal>1000&&dataTotal<=10000){
                recordCount=dataTotal/1000;
                exportRecord+=recordCount+"K条";
            }else if(dataTotal>10000&&dataTotal<=1000000){
                recordCount=dataTotal/10000;
                exportRecord+=recordCount+"W条";
            }
            long endTime=System.currentTimeMillis();
            System.out.println("hutool 写入 "+exportRecord+" 记录耗时 "+(endTime-startTime)/1000 +"秒");
        }catch (Exception e) {
            //如果导出异常,则生成一个空的文件
            logger.info("######导出  excel异常  :{}",e.getMessage());
            try (ByteArrayOutputStream outputStream2 = new ByteArrayOutputStream()) {
                Workbook workbook = new XSSFWorkbook();
                workbook.createSheet("生成数据异常");
                StringBuilder fileName = new StringBuilder(recordFileName).append(".xlsx");
                workbook.write(outputStream2);
                outputStream2.flush();
                //AasExcelUtils.export(request, response, outputStream2.toByteArray(), fileName.toString());
            }catch (Exception e2) {}
        }

    }

    private static String generateFileNameByDataTotal(int dataTotal){
        String excelFileName="导出数据总记录";
        int recordCount=0;
        if(dataTotal<=1000){
            excelFileName+="1K条";
        }else if(dataTotal>1000&&dataTotal<=10000){
            recordCount=dataTotal/1000;
            excelFileName+=recordCount+"K条";
        }else if(dataTotal>10000&&dataTotal<=100000){
            recordCount=dataTotal/10000;
            excelFileName+=recordCount+"W条";
        }else if(dataTotal>100000&&dataTotal<=1000000){
            recordCount=dataTotal/10000;
            excelFileName+=recordCount+"W条";
        }
        return excelFileName;
    }
}

2.4 使用EasyExcel进行测试代码

import com.alibaba.excel.annotation.ExcelIgnore;
import com.alibaba.excel.annotation.ExcelProperty;
import lombok.Data;

import java.util.Date;

/**
 * 基础数据类
 *
 * @author Jiaju Zhuang
 **/
@Data
public class DownloadData {
    @ExcelProperty("姓名")
    private String name;
    @ExcelProperty("年龄")
    private Integer age;
    @ExcelProperty("成绩")
    private Double score;
    @ExcelProperty("考试日期")
    private Date testDate;
    @ExcelProperty("是否合格")
    private Boolean pass;
     /**
     * 忽略这个字段
     */
    @ExcelIgnore
    private String ignore;
}


import cn.hutool.core.date.DateUtil;
import cn.hutool.core.util.RandomUtil;
import com.alibaba.excel.EasyExcel;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

/**
 * 使用EasyExcel导出数据
 */
public class ExcelEasyExcelUtil {

    public  static List<DownloadData> getBigRowsDataMap(int dataTotal){
        List<DownloadData> dataList=new ArrayList<DownloadData>();
        for(int i=0;i<dataTotal;i++){
            DownloadData dataObj = new DownloadData();
            dataObj.setName("name_"+(i+1));
            dataObj.setAge(RandomUtil.randomInt(22,100));
            dataObj.setScore(RandomUtil.randomDouble(85,100));
            dataObj.setPass(i/2==0?true:false);
            dataObj.setTestDate(DateUtil.date());
            dataList.add(dataObj);
        }
        return dataList;
    }

    private static String generateFileNameByDataTotal(int dataTotal){
        String excelFileName="导出数据总记录";
        int recordCount=0;
        if(dataTotal<=1000){
            excelFileName+="1K条";
        }else if(dataTotal>1000&&dataTotal<=10000){
            recordCount=dataTotal/1000;
            excelFileName+=recordCount+"K条";
        }else if(dataTotal>10000&&dataTotal<=100000){
            recordCount=dataTotal/10000;
            excelFileName+=recordCount+"W条";
        }else if(dataTotal>100000&&dataTotal<=1000000){
            recordCount=dataTotal/10000;
            excelFileName+=recordCount+"W条";
        }
        return excelFileName;
    }

    public static void exportExcelByEasyExcel(HttpServletRequest request, HttpServletResponse response, int dataTotal) throws IOException {
        long startTime=System.currentTimeMillis();
        String recordFileName="EasyExcel导出Excel_"+generateFileNameByDataTotal(dataTotal);
        // 这里注意 有同学反应使用swagger 会导致各种问题,请直接用浏览器或者用postman
        response.setContentType("application/vnd.ms-excel");
        response.setCharacterEncoding("utf-8");
        // 这里URLEncoder.encode可以防止中文乱码 当然和easyexcel没有关系
        String fileName = URLEncoder.encode(recordFileName, "UTF-8");
        response.setHeader("Content-disposition", "attachment;filename=" + fileName + ".xlsx");
        EasyExcel.write(response.getOutputStream(), DownloadData.class).sheet("EasyExcel导出").doWrite(getBigRowsDataMap(dataTotal));
        long endTime=System.currentTimeMillis();
        String exportRecord="";
        int recordCount=0;
        if(dataTotal<=1000){
            exportRecord+="1K条";
        }else if(dataTotal>1000&&dataTotal<=10000){
            recordCount=dataTotal/1000;
            exportRecord+=recordCount+"K条";
        }else if(dataTotal>10000&&dataTotal<=1000000){
            recordCount=dataTotal/10000;
            exportRecord+=recordCount+"W条";
        }
        System.out.println("EasyExcel导出"+exportRecord+" 数据 耗时: "+(endTime-startTime)/1000 +" 秒");
    }

}

3、成果展现

项目中Excel导出大数据量记录解决方案及实战(POI,Hutools,EasyExcel)

项目中Excel导出大数据量记录解决方案及实战(POI,Hutools,EasyExcel)

项目中Excel导出大数据量记录解决方案及实战(POI,Hutools,EasyExcel)

4、总结

        最后因为系统以前引入的POI版本相对较低是3.9版本的;受限于其他功能之中用到低版本一些方法和属性,没有使用Hutool进行实现,刚好高于3.8版可以使用HSSFWorkbook以及SXSSFWorkbook,HSSFWorkbook用来处理较少的数据量,SXSSFWorkbook用来处理大数据量以及超大数据量的导出。于是选择使用第1种方式实现导出功能。最后效果很惊人。实现3000多条数据1秒导出。以下截图是调整前和调整后的效果。

项目中Excel导出大数据量记录解决方案及实战(POI,Hutools,EasyExcel)

项目中Excel导出大数据量记录解决方案及实战(POI,Hutools,EasyExcel)

       若导出数据量确实多大,已经试过数据模型不太多情况下100w条记录,Excel文件大小已达33M,建议可以采取分隔数据模式导出为多个Excel文件。可见参考文章

     我们在实际开发实践之中;虽然可以有多种实现方式;但是最后选择实现方式受限于环境因素;如同生活之中有很多选择,单最后能够达到不破不立的情况太少。所以需要选择一种适合具体场景的方式。

    通过本次实践,这种导入导出功能十分常见,可以在实际项目之中采取抽取出可以导出大数据量实现工具类,统一一下导入导出部分功能工具,实现功能单一职责原则。国人这块能够思考和实践太少啦,导致每天都是堆码

5、参考文章

JavaWEB_POI导出大量数据excel(50万左右)史上最全的Excel导入导出之easyexcel