How to Remove Blank Page in Word Using C#

This quick tutorial explains how to remove blank pages from Word documents (DOCX, DOC, etc.) using C#. The process involves loading a Word file, analyzing individual pages, identifying empty pages, and finally creating a new document without the blank pages.

Benefits of Removing Blank Pages in Word Documents

  1. Cleaner Document:
    • Improves readability and professionalism.
  2. Reduced File Size:
    • Efficient storage by eliminating unnecessary pages.
  3. Automation Capability:
    • Ideal for cleaning large documents automatically.

Prerequisites: Preparing the Environment

  1. Visual Studio or another .NET IDE.
  2. Aspose.Words added via NuGet Package Manager.

Step-by-Step Guide to Remove Blank Pages in Word Using C#

Step 1: Configure Environment

Install the Aspose.Words library from NuGet, for example by running the following command in the Package Manager Console.

Install-Package Aspose.Words

Step 2: Load the Word Document

Load your original Word file using the Document class object.

Document originalDoc = new Document("WordFileWithBlankPages.docx");

Step 3: Extract Each Page Separately

Loop over the pages and extract each one into a separate Document for analysis. The page index passed to ExtractPages is zero-based.

int totalPages = originalDoc.PageCount;

for (int i = 0; i < totalPages; i++)
{
    Document singlePageDoc = originalDoc.ExtractPages(i, 1);
    // Analyze singlePageDoc in next steps
}

Step 4: Analyze Single-Page Documents

Check if the single-page document contains text or shapes.

int shapesCounter = 0;
string pageText = "";

foreach (Section docSection in singlePageDoc.Sections)
{
    pageText += docSection.Body.ToString(SaveFormat.Text);
    shapesCounter += docSection.Body.GetChildNodes(NodeType.Shape, true).Count;
}

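Step 5: Maintain Blank Pages List

Keep track of the indexes of blank pages. Declare the list once before the page loop and seed it with -1, a sentinel marking the position just before the first page. Then run the check inside the loop for each page.

ArrayList blankPages = new ArrayList(); // requires: using System.Collections;
blankPages.Add(-1); // sentinel: position before the first page

// Inside the page loop: a page is blank when it has no text and no shapes
if (string.IsNullOrEmpty(pageText.Trim()) && shapesCounter == 0)
    blankPages.Add(i); // i is the zero-based page index in the loop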
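
The blank-page condition can be isolated into a small helper so that it is easy to check on its own. The sketch below is only an illustration and not part of Aspose.Words: the names `BlankPageCheck` and `IsBlankPage` are hypothetical, and it uses `string.IsNullOrWhiteSpace`, which should give the same result as the `Trim()` check while also tolerating `null`.

```csharp
using System;

// Hypothetical helper, not part of Aspose.Words: mirrors the Step 5 condition.
public static class BlankPageCheck
{
    // A page counts as blank when its exported text is only whitespace
    // (spaces, line breaks, form feeds) and it contains no shapes.
    public static bool IsBlankPage(string pageText, int shapeCount)
    {
        return string.IsNullOrWhiteSpace(pageText) && shapeCount == 0;
    }
}
```

For example, `BlankPageCheck.IsBlankPage("\r\n\f", 0)` is treated as blank, while a page with empty text and one shape is not.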

Step 6: Append Non-Empty Pages to New Document

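Create a new, empty document and append only the runs of pages that lie between consecutive blank page indexes. Adding totalPages as a closing sentinel lets the same loop handle the pages after the last blank one.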

Document finalDoc = (Document)originalDoc.Clone(false);
finalDoc.RemoveAllChildren();

blankPages.Add(totalPages);

for (int i = 1; i < blankPages.Count; i++)
{
    int index = (int)blankPages[i - 1] + 1;
    int count = (int)blankPages[i] - index;

    if (count > 0)
        finalDoc.AppendDocument(originalDoc.ExtractPages(index, count), ImportFormatMode.KeepSourceFormatting);
}
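
To see how the two sentinel entries (-1 at the start, `totalPages` at the end) turn a list of blank page indexes into runs of pages to keep, the range arithmetic can be run without any document at all. This is a minimal sketch in plain C#. `PageRanges` and `KeepRanges` are hypothetical names introduced for illustration, and the returned `(Index, Count)` pairs correspond to the arguments passed to `ExtractPages` in the loop above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; it reproduces only the index arithmetic of Step 6.
public static class PageRanges
{
    // blankPages: zero-based indexes of blank pages, in ascending order.
    // Returns the (Index, Count) runs of consecutive non-blank pages.
    public static List<(int Index, int Count)> KeepRanges(IEnumerable<int> blankPages, int totalPages)
    {
        // Surround the blank indexes with sentinels so the first and last runs
        // are handled by the same loop as the runs in between.
        var marks = new List<int> { -1 };
        marks.AddRange(blankPages);
        marks.Add(totalPages);

        var ranges = new List<(int Index, int Count)>();
        for (int i = 1; i < marks.Count; i++)
        {
            int index = marks[i - 1] + 1;   // first page after the previous blank
            int count = marks[i] - index;   // pages up to (not including) the next blank
            if (count > 0)                  // adjacent blanks give an empty run; skip it
                ranges.Add((index, count));
        }
        return ranges;
    }
}
```

For an 8-page document whose pages 2 and 5 (zero-based) are blank, `PageRanges.KeepRanges(new[] { 2, 5 }, 8)` yields the runs (0, 2), (3, 2) and (6, 2). Two adjacent blank pages produce a zero-length run between them, which the `count > 0` check skips.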

Step 7: Save Modified Document

Save the new document with blank pages removed.

finalDoc.Save(@"NonEmptyPages.docx");

Complete Code Example to Delete Blank Pages in Word Using C#

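Below is the complete code example demonstrating the above steps. It is written with top-level statements (C# 9 or later); in older projects, place the statements inside a Main method.

using System.Collections;
using Aspose.Words;

Document originalDoc = new Document("WordFileWithBlankPages.docx");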

ArrayList blankPages = new ArrayList();
blankPages.Add(-1);

int totalPages = originalDoc.PageCount;

for (int i = 0; i < totalPages; i++)
{
    Document singlePageDoc = originalDoc.ExtractPages(i, 1);
    int shapesCounter = 0;
    string pageText = "";

    foreach (Section docSection in singlePageDoc.Sections)
    {
        pageText += docSection.Body.ToString(SaveFormat.Text);
        shapesCounter += docSection.Body.GetChildNodes(NodeType.Shape, true).Count;
    }

    if (string.IsNullOrEmpty(pageText.Trim()) && shapesCounter == 0)
        blankPages.Add(i);
}

blankPages.Add(totalPages);

Document finalDoc = (Document)originalDoc.Clone(false);
finalDoc.RemoveAllChildren();

for (int i = 1; i < blankPages.Count; i++)
{
    int index = (int)blankPages[i - 1] + 1;
    int count = (int)blankPages[i] - index;

    if (count > 0)
        finalDoc.AppendDocument(originalDoc.ExtractPages(index, count), ImportFormatMode.KeepSourceFormatting);
}

finalDoc.Save(@"NonEmptyPages.docx");
System.Console.WriteLine("Blank pages removed successfully.");

Conclusion

This article explained how to remove blank pages in Word files using C#. By following the provided steps, you can programmatically detect empty pages and remove them, resulting in a cleaner document. You may further explore Aspose.Words for more Word document manipulation tasks.
