读取pdf中的内容

时间：2019-12-10 17:17:49 阅读：171 评论：0 收藏：0 [点我收藏+]

标签：tco flush rac import exce static close 获取 int

import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import java.io.*;

public class Extract_Text {

       public static void main(String[] args) {

           //创建PdfDocument实例
           PdfDocument doc= new PdfDocument();

           //加载PDF文件
           doc.loadFromFile("test.pdf");

           StringBuilder sb= new StringBuilder();

           PdfPageBase page;

           //遍历PDF页面，获取文本
           for(int i=0;i<doc.getPages().getCount();i++){
               page=doc.getPages().get(i);
               sb.append(page.extractText(true));
           }

           FileWriter writer;

           try {
              //将文本写入文本文件
               writer = new FileWriter("ExtractText.txt");
               writer.write(sb.toString());
               writer.flush();
           } catch (IOException e) {
              e.printStackTrace();
}

doc.close();
}

读取pdf中的内容

标签：tco flush rac import exce static close 获取 int

原文地址：https://www.cnblogs.com/xianz666/p/12017366.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行