C# PDF Operations: Converting PDF to TXT

程序员文章站 2022-04-11 10:30:03
Note: the project must reference Aspose.PDF.dll.
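
One way to add that reference is directly in the project file. The fragment below is only a sketch: the `libs\Aspose.PDF.dll` location is a hypothetical path, not something the original post specifies, so adjust it to wherever the DLL actually lives. If you prefer NuGet, the library is also published there under the package ID Aspose.PDF, and a `PackageReference` would replace the `Reference` element.

```xml
<!-- Goes inside the <Project> element of the .csproj file -->
<ItemGroup>
  <!-- Assumption: the DLL was copied to a "libs" folder next to the project file -->
  <Reference Include="Aspose.PDF">
    <HintPath>libs\Aspose.PDF.dll</HintPath>
  </Reference>
</ItemGroup>
```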

Code example:

using System;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

namespace Aspose.Pdf.Examples.CSharp.AsposePDF.Text
{
    public class ExtractTextAll
    {
        public static void Run()
        {
            // ExStart:ExtractTextAll
            // Path to the documents directory (RunExamples is a helper
            // defined elsewhere in the example project, not in this file)
            string dataDir = RunExamples.GetDataDir_AsposePdf_Text();

            // Open document
            Document pdfDocument = new Document(dataDir + "ExtractTextAll.pdf");

            // Create TextAbsorber object to extract text
            TextAbsorber textAbsorber = new TextAbsorber();
            // Accept the absorber for all the pages
            pdfDocument.Pages.Accept(textAbsorber);
            // Get the extracted text
            string extractedText = textAbsorber.Text;
            // Write the extracted text to a file; the using block
            // closes the writer even if the write throws
            using (TextWriter tw = new StreamWriter(dataDir + "extracted-text.txt"))
            {
                tw.WriteLine(extractedText);
            }
            // ExEnd:ExtractTextAll
        }
    }
}