LINQ with VB.NET: Join SQL with a CSV file or List of objects

One of the big advantages with LINQ is that you can do operations on sets in the data access layer instead of doing them in the database.

An example of this is to join a table from a CSV-file with a table in a database. In this example the CSV-file has been loaded into a List(Of CsvRecord), where CsvRecord is an entity object representing the content of the file. Code for that class contains properties for each column. The code is not shown here. For this example it just contains the property CsvId, which is used for joining with the SQL server.

Update of the resultset back to SQL server is possible and easy (Wow), but not shown in this post.

Since the SQL server in this example is only used for read (and not write) then I prefer to write a dedicated view to use as source in SQL server. Lets define it as:

CREATE View VIEW_MYSQLDATA4LINQ AS
SELECT field1 FROM table1 --extend this select with joins, wheres, etc. if you need that
GO

Now switch to VisualStudio with a project open.
– Add new item – select “LINQ to SQL Classes” – Name the file MySqlData4Linq.dbml
=> this opens a DSL GUI called “Object Ralational Designer” (ORD)
– From “Server Explorer” make a connection to you SQL server – drag the view VIEW_MYSQLDATA4LINQ to ORD.
=> This will add the connectionstring to app.config, Settings.settings (and Settings.Designer.vb) and MySqlData4Linq.dbml.
If you don’t want a new connection string added then you can “undo checkout” on app.config, Settings.settings and change the connection string in MySqlData4Linq.dbml. Do that in the property explorer. Below set to “MyBaseConnectionString”.
In MySqlData4Linq.dbml there is written a connection string like this one:

<Connection Mode="AppSettings"
ConnectionString="Data Source=mysvr;Initial Catalog=MYBASE;Persist Security Info=True;User ID=myuser"
SettingsObjectName="My.MySettings" SettingsPropertyName="MyBaseConnectionString" Provider="System.Data.SqlClient" />

When you later on want to change/extend the view in the SQL server, you just have to delete the view in MySqlData4Linq.dbml and drag’n’drop the view once again from the server explorer.

In the “solution explorer” you can expand MySqlData4Linq.dbml. Press “Show All Files” in the top of the explorer. this will reveal two autogenerated files – one of them being MySqlData4Linq.designer.vb.
This defines a class that is used for getting data:

Partial Public Class MySqlData4LinqDataContext
	Inherits System.Data.Linq.DataContext

Linq.DataContext implements a method GetTable(), which retrieves the data from SQL server.

An add-in for Visual studio (VLinq) can help generate a query using Linq.
http://blogs.msdn.com/mitsu/archive/2008/04/02/visual-linq-query-builder-for-linq-to-sql-vlinq.aspx
In the first example we just “select *” in the VLinq generated code in MySqlData4LinqQueries.Designer.vb:

Imports System.Linq
Imports System.Data.Linq

Public Partial Class MySqlData4LinqQueries
	Public Shared Function GetVwMySqlData4LinqQuery(ByVal context As System.Data.Linq.DataContext ) As System.Linq.IQueryable(Of VIEW_MySqlData4Linq)
		Return (	From v In context.GetTable(Of VIEW_MySqlData4Linq)()  _
			Select v)
	End Function
End Class

'Using above query, data can be read from SQL server:
Public Sub SomeSub()
	Using db As New MySqlData4LinqDataContext
		'read from db
		Dim qResListVwMySqlData4Linq As IQueryable(Of VIEW_MySqlData4Linq) = MySqlData4Queries.GetVwMySqlData4Query(db)
		'Example of traversing the result
		For Each qResRowVwMySqlData4Linq As VIEW_MySqlData4Linq In qResListVwMySqlData4Linq
			Console.WriteLine(qResRowVwMySqlData4Linq.field1)
		Next
	End Using 'close db connection

Above query will probably send a select to SQL server each time the for each loop is entered. To avoid that and get all data at once you can call qResListVwMySqlData4Linq.ToList(). This will return a list of VIEW_MySqlData4Linq, which you then can traverse or use in another query.

If you do a join with a List(of CsvRecord), then also the total table from SQL server will be fetched. The join will happen in-memory and only return a subset of rows (the joined rows). Like in this example:

        Private Function GetJoinedCsv(ByVal db As DataContext, ByRef queryableList As IEnumerable(Of CsvRecord)) _
        As List(Of CsvRecord)
            Dim query As IEnumerable(Of CsvRecord)
            Dim sqlTable As Table(Of VIEW_MYSQLDATA4LINQ) = db.GetTable(Of VIEW_MYSQLDATA4LINQ)()

            query = From csvrow In queryableList _
                    Join dbrow In sqlTable On csvrow.CsvId Equals dbrow.Id _
                    Select csvrow

            'execute the query:
            'if the query implements IEnumerable, then it can be executed immedeately with ToList()
            'ToList() will select * from the view and do the join in memory
            Return query.ToList()
        End Function

 'Using above query, data can be read from SQL server:
 Public Sub SomeSub()
            Dim CsvList As List(Of CsvRecord) = LoadCsvList() 'load the list from a sub
            Dim joinedData As List(Of CsvRecord)
            Using db As New MySqlData4LinqDataContext
                'note that the CsvList could be a ADO DataTable, since that type implements AsEnumerable()
                joinedData = GetJoinedCsv(db, CsvList.AsEnumerable())
            End Using 'close db connection

            For Each o As CsvRecord In joinedData
                Debug.WriteLine(o.ToString())
            Next

So when you need joins with no update back to the SQL server, then you can just as well start with reading the SQL view, store it in-memory and then do the join(s) afterwards. Like in this example:

Public Partial Class MySqlData4LinqQueries
 Public Shared Function GetVwMySqlData4LinqQuery(ByVal context As System.Data.Linq.DataContext) As System.Linq.IQueryable(Of VIEW_MySqlData4Linq)
  Return (From v In context.GetTable(Of VIEW_MySqlData4Linq)() _
   Select v)
 End Function

 Public Function GetJoinedData(ByRef sqlList As IEnumerable(Of VIEW_MySqlData4Linq), ByRef queryableList As IEnumerable(Of CsvRecord)) _
 As List(Of CsvRecord)
  Dim query As IEnumerable(Of CsvRecord)
  query = From csvrow In queryableList _
   Join dbrow In sqlList On csvrow.CsvId Equals dbrow.Id _
   Select csvrow
  'execute the query:
  'if the query implements IEnumerable, then it can be executed immedeately with ToList()
  Return query.ToList()
 End Function
End Class

Public Sub SomeSub()
 Dim listVwMySqlData As List(Of VIEW_MySqlData4Linq)
 Using db As New MySqlData4LinqDataContext
  Dim qResListVwMySqlData As IQueryable(Of VIEW_MySqlData4Linq) = MySqlData4LinqQueries.GetVwMySqlData4LinqQuery(db)
  'Save Linq result in a list, so the vw will be in memory and the connection can be released
  listVwMySqlData = qResListVwMySqlData.ToList()
 End Using 'close db connection

 'Do the first join with in-memory sqldata
 Dim joinedData1 As List(Of CsvRecord)
 joinedData1 = q.GetJoinedData(listVwMySqlData.AsEnumerable(), CsvList.AsEnumerable())

 For Each o As CsvRecord In joinedData1
  Debug.WriteLine(o.ToString())
 Next

 'Do another join with in-memory sqldata
 joinedData1 = q.GetOtherJoinedData(listVwMySqlData.AsEnumerable(), OtherCsvList.AsEnumerable())

The next to do would probably be to select an object of another type than the left table in the join – a joined type.
That is done by selecting parts of each object using select New{csvrow.Id, csvrow.field1, dbrow.someProperty}
The end.

Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: