How to Convert HTML to JSON using C#

Converting HTML to JSON allows developers to extract structured data from web formats and use it in data-driven applications. Aspose.Cells for .NET enables developers to load HTML files and export their contents as JSON efficiently and programmatically.

Why Convert HTML to JSON?

  1. Data Portability:
    • Transfer tabular HTML data into backend services or APIs as JSON.
  2. Web-to-App Integration:
    • Extract table or structured web content for further processing in apps.
  3. Automation Ready:
    • Ideal for automating web scraping or content extraction processes.

Step-by-Step Guide to Convert HTML to JSON

Step 1: Install Aspose.Cells via NuGet

Install Aspose.Cells for .NET:

dotnet add package Aspose.Cells

Step 2: Set Up License

Apply a metered license to enable full functionality, replacing the placeholder keys with your own:

Metered metered = new Metered();
metered.SetMeteredKey("PublicKey", "PrivateKey");

Step 3: Load HTML File

Create a new workbook by loading the HTML input:

Workbook workbook = new Workbook("Sample.html");

Step 4: Access the Last Cell

Identify the last cell in the first worksheet to define the export boundaries:

Cell lastCell = workbook.Worksheets[0].Cells.LastCell;

Step 5: Define Range for Export

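Create a range that spans the worksheet data. CreateRange takes the first row, the first column, the number of rows, and the number of columns; because Row and Column are zero-based indexes, adding 1 turns them into counts. The type is written as Aspose.Cells.Range so that it does not clash with System.Range when the System namespace is also imported:

Aspose.Cells.Range range = workbook.Worksheets[0].Cells.CreateRange(0, 0, lastCell.Row + 1, lastCell.Column + 1);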

Step 6: Configure JsonSaveOptions

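Create a JsonSaveOptions instance. The defaults are enough for a basic export, and its properties can be adjusted when the output needs tuning: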

JsonSaveOptions options = new JsonSaveOptions();
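
As an illustration of what configuring the options might look like, the sketch below sets a few JsonSaveOptions properties. The property names reflect the Aspose.Cells API reference as I understand it; check them against the version you have installed, since availability and defaults can differ between releases.

```csharp
using Aspose.Cells;

// A sketch only: property names should be verified against your Aspose.Cells version.
JsonSaveOptions options = new JsonSaveOptions
{
    HasHeaderRow = true,    // treat the first row of the range as property names
    SkipEmptyRows = true,   // leave out rows that contain no data
    ExportAsString = false, // keep numbers and booleans as JSON values, not strings
    Indent = "  "           // pretty-print the output with two spaces
};
```

Leaving every property at its default is a reasonable starting point; change one setting at a time and compare the output to see its effect.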

Step 7: Export to JSON

Serialize the defined range to JSON:

string jsonData = Aspose.Cells.Utility.JsonUtility.ExportRangeToJson(range, options);

Step 8: Save JSON to File

Write the output to disk:

System.IO.File.WriteAllText("htmltojson.json", jsonData);
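
For reference, the steps above can be combined into one small program. This is a minimal sketch rather than a definitive implementation: it omits the metered license call from Step 2, reuses the placeholder file names from the steps, and adds a null check on the assumption that LastCell is null when the worksheet holds no data.

```csharp
using System;
using System.IO;
using Aspose.Cells;
using Aspose.Cells.Utility;

// Load the HTML file; "Sample.html" is the placeholder name used in the steps above.
Workbook workbook = new Workbook("Sample.html");
Cells cells = workbook.Worksheets[0].Cells;

// Guard against a sheet with no data (assumption: LastCell is null in that case).
Cell lastCell = cells.LastCell;
if (lastCell == null)
{
    Console.Error.WriteLine("No cell data found in Sample.html; nothing to export.");
    return;
}

// Row and Column are zero-based indexes, so add 1 to turn them into counts.
// Fully qualified to avoid ambiguity with System.Range.
Aspose.Cells.Range range = cells.CreateRange(0, 0, lastCell.Row + 1, lastCell.Column + 1);

// Default options; adjust as needed.
JsonSaveOptions options = new JsonSaveOptions();
string jsonData = JsonUtility.ExportRangeToJson(range, options);

// Write the JSON text to disk and report its size.
File.WriteAllText("htmltojson.json", jsonData);
Console.WriteLine($"Wrote {jsonData.Length} characters to htmltojson.json");
```

The program uses top-level statements, so it needs C# 9 or later; in older projects the same statements can be placed inside a Main method.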

Common Issues and Fixes

1. Empty Output

  • Solution: Ensure the HTML file contains table-based structured content for valid data recognition.

2. Incorrect Range

  • Solution: Double-check that the range includes all relevant cells from the worksheet.

3. Export Formatting

  • Solution: Use JsonSaveOptions to control sheet indexing, skip empty rows, or customize hyperlinks.
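
To detect the empty-output case in code, one option is to parse the exported string and count its top-level records before saving it. The sketch below uses System.Text.Json from the .NET base library; CountRecords is a hypothetical helper name, and the sample string only stands in for real export output, whose exact shape depends on the HTML input and the options used.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: counts top-level records in a JSON string.
// Array -> number of elements, object -> 1, anything else (or blank) -> 0.
static int CountRecords(string json)
{
    if (string.IsNullOrWhiteSpace(json)) return 0;

    // JsonDocument.Parse throws JsonException if the text is not valid JSON.
    using JsonDocument doc = JsonDocument.Parse(json);
    JsonElement root = doc.RootElement;
    return root.ValueKind switch
    {
        JsonValueKind.Array => root.GetArrayLength(),
        JsonValueKind.Object => 1,
        _ => 0
    };
}

// Sample input standing in for the exported string (assumed shape).
string jsonData = "[{\"Name\":\"Widget\",\"Price\":9.5},{\"Name\":\"Gadget\",\"Price\":12}]";

int count = CountRecords(jsonData);
Console.WriteLine(count == 0
    ? "No records exported; check that the HTML contains table data."
    : $"Exported {count} record(s).");
```

A count of zero suggests revisiting the HTML input or the export range before writing the file.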