Thursday 8 December 2011

Excel Output Libraries

In previous posts I've discuss using the Carlos AG Excel Writer library which outputs in XML format and have found it to be very usable. I have been finding a few issues which were bugging me and that was largely the difficulty in working with large datasets due to the file size and the fact that the library looks like it is no longer under active development.

I then looked at ExcelLibrary which had some recommendations but has almost no documentation and the defects page gives the indication that it is no longer being actively developed. This isn't as much of a problem as it was with the Carlos AG library but it was worth looking further afield.

A post on StackOverflow linked to ClosedXML which had plenty of documentation and the source was available for modification, even better it appeared to be under active development.

To do a basic test of the library I wrote a small program to do something I've been meaning to do for a while.

Given a Datalink file and a file containing a list of SQL Statements generate a Excel file with a worksheet for each statement. I also used this as an opportunity to improve how I handle command line arguments by using the CommandLineParser utility for XSD2DB.

The full code of the program I wrote to test this can be found on GitHub and a zip of the binaries can be downloaded.

At this point in time I've gone for the very simplest model and just created a new Worksheet using a DataTable as a source and left the formatting as a manual operation.
As you can see it's incredibly simple


   1:      class DataTableToExcel
   2:      {
   3:   
   4:          private DataTable dt;
   5:          private XLWorkbook wb;
   6:          private string filename;
   7:          
   8:   
   9:          public DataTableToExcel(DataTable dt, string filename)
  10:          {
  11:              this.dt = dt;
  12:              this.filename = filename;
  13:              if (File.Exists(filename))
  14:              {
  15:                  wb = new XLWorkbook(filename);
  16:              }
  17:              else
  18:              {
  19:                  wb = new XLWorkbook();
  20:              }
  21:          }
  22:   
  23:          public void Process()
  24:          {
  25:              var ws = wb.Worksheets.Add(dt);
  26:              ws.Name = "Sheet " + wb.Worksheets.Count;
  27:              wb.SaveAs(filename);
  28:          }
  29:      }

Planned Updates
In time I want to be able to add the following features by having a more complex configuration file.

  • Associating a name with a query so that sheets are correctly named
  • Running the queries in parallel and passing the datatables to the generator when necessary
  • Better error handling of bad statements
  • Support for auto inserting date and timestamp in the output file name