Tuesday, February 28, 2012

DataReader vs DataSet Performance

When accessing data with ADO.NET, you can use either a DataReader or a DataSet. For SqlClient provider, provider-specific object for DataReader is SqlDataReader which comes in
System.Data.SqlClient namespace. DataSet comes in System.Data namespace.

I will start with DataReader first. DataReader is part of ADO.NET connected architecture. I am not going to explain what is ADO.NET connected architecture and disconnected architecture. Because sometimes back I have blogged about Disconnected Data Access Architecture in ADO.NET. In DataReaders what hapens is, the DataReader loads one row at a time from the database. Each time the DataReader's Read() method is called, the DataReader discards the current row and advances to the next row. If there is a row it will return true and if there is not, it will return false. A DataReader is limited to being read-only and forward-only. That is, the information retrieved from the database cannot be modified by the DataReader, nor can the DataReader retrieve records in a random order. Instead, a DataReader is limited to accessing the records in sequential order, from the first one to the last one, one record at a time. So DataReaders are connected data objects because they require an active connection to the database.

DataSet is part of disconnected architecture in ADO.NET. Actually disconnected architecture is based on DataSets. DataSets can be thought of as in-memory databases. First get connected to the database, DataSet get filled up with data and connection to the database gets disconnected. Now you have all the data in the DataSet and you can manipulate them and write back changes to the database. Just like a database, DataSet is comprised of a set of tables, a DataSet is made up of a collection of DataTable objects. Whereas a database can have relationships among its tables, along with various data integrity constraints on the fields of the tables, so too can a DataSet have relationships among its DataTables and constraints on its DataTables' fields.

So, if we talk about the performance, there is a huge gap in these two. It's said that the DataReader is roughly thirty times more performant than the DataSet. In DataSets, since all the data will be loaded into memory, there is a overhead in memory.

Here is a nice article where you can see from the results, it doesn't seem to matter whether we are retrieving 10 or 10,000 records, the performance degradation of the DataSet is dramatic when compared to the DataReader.
     A Speed Freak's Guide to Retrieving Data in ADO.NET

Happy Coding.

Regards,
Jaliya

No comments:

Post a Comment