Reading Large excel file using SAX Approach.

Category: open xml format sdk


meKrishna on Fri, 21 Oct 2011 14:04:44

I  have large excel file (approx 500,000 rows) and i have to validate each column of the excel file. I want to use SAX approach to read excel file. The problem i am facing is , my excel file contains int, float, char, string value and i don't know how to read different data Types using SAX approach. 

Any suggestion is appreciated.


Bruce Song on Mon, 24 Oct 2011 03:54:15

Hi Krishna,

Thank you for posting. We are doing the research about the problem, there might be some delay about the response. Appreciate your patience.

Best Regards,

Will Buffington [MSFT] on Mon, 24 Oct 2011 21:47:54

Here is a C# code sample using the SAX approach for reach an Excel file:

WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();

OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
string text;
while (reader.Read())
    if (reader.ElementType == typeof(CellValue))
        text = reader.GetText();
        Console.Write(text + " ");

And here is a VB approach for doing the same:

Dim workbookPart As WorkbookPart = spreadsheetDocument.WorkbookPart
Dim worksheetPart As WorksheetPart = workbookPart.WorksheetParts.First()

Dim reader As OpenXmlReader = OpenXmlReader.Create(worksheetPart)
Dim text As String
While reader.Read()
    If reader.ElementType = GetType(CellValue) Then
        text = reader.GetText()
        Console.Write(text & " ")
    End If
End While

jimsurf on Thu, 14 Dec 2017 23:27:43

GetText doesn't return the value of strings it returns an index to the shared string table.

This is what I am doing, but I am not sure about the performance of it:

var c = (Cell)reader.LoadCurrentElement();
                        var cellValue = GetCellValue(c, workbookPart);

        private string GetCellValue(Cell c, WorkbookPart workbookPart)
            if (c.DataType != null && c.DataType == CellValues.SharedString)
                return workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>()


            return c.CellValue != null ? c.CellValue.InnerText : string.Empty;