Given the list of items on the left, how to divide them into three groups as equally as possible by weight? It doesn’t have to be the same number of items in each group. Just the closest match for weight across the three columns. The image is from Excel but show a solution in whatever tool you think is the best.
SQL (Structured Query Language) is a useful tool for Excel (not to mention a great technology for working with data in general).
Consider the table of data below. It’s a list of 20,000 people on a sheet named “person”.
The demonstration file below retrieves unique values for the title column. Download the spreadsheet and press the SQL button to see it in action.
Option Explicit
Sub load() 'Demonstration of an SQL statement
Dim sFile, sConnectionString, oConnection, oRecordSet, sSQL
sFile = ThisWorkbook.FullName 'Create a connection string that points to the data
sConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & sFile & ";Extended Properties=""Excel 12.0;HDR=Yes;IMEX=1"";"
Set oConnection = CreateObject("adodb.connection") 'Load an ADO connection object
Set oRecordSet = CreateObject("adodb.recordset") 'Load an ADO recordset object
oConnection.Open sConnectionString 'Connect the connection to the data source
sSQL = "SELECT DISTINCT title FROM [person$]" 'Prepare an SQL statement to retrieve the data
oRecordSet.Open sSQL, oConnection 'Run the SQL statement, the results go to the recordset
Range("A1").CopyFromRecordset oRecordSet 'Pour the result into Excel
oRecordSet.Close 'Close the recordset and connection objects
oConnection.Close
Set oRecordSet = Nothing 'Make sure the objects are gone
Set oConnection = Nothing
End Sub
Sub clearSheet() 'Clear the sheet
Cells.Select
Selection.ClearContents
Range("A1").Select
End Sub
'Copyright Darcy Whyte, 2020
'darcy@generalBI.com
We are using ADO (ActiveX Data Objects) in order to work with SQL. This requires a few lines of code to set this up. To use this more conveniently and abstractly we can create an SQL Class to encapsulate those lines. Then the code reduces to this.
Sub LoadTitle()
Dim s As New SQL
s.start "SELECT DISTINCT title FROM [person$]"
Range("A1").CopyFromRecordset s.oRS
End Sub
Now it’s much easier to leverage the convenience and power of SQL. Here are more examples.
Sub LoadTitle() 'Unique values of the title column
Dim s As New SQL
s.start "SELECT DISTINCT title FROM [person$]"
Range("A1").CopyFromRecordset s.oRS
End Sub
Sub LoadGender() 'Unique values of gender
Dim s As New SQL
s.start "SELECT DISTINCT gender FROM [person$]"
Range("A6").CopyFromRecordset s.oRS
End Sub
Sub LoadStateCounts() 'Counts by state ordered by count descending
Dim s As New SQL
s.start "SELECT state, count(*) as stateCount FROM [person$] group by state order by count(*) desc"
Range("C1").CopyFromRecordset s.oRS
End Sub
Sub LoadKapuskasing() 'All the people in Kapuskasing
Dim s As New SQL
s.start "SELECT * FROM [person$] where city = 'kapuskasing'"
Range("A15").CopyFromRecordset s.oRS
End Sub
You can also retrieve data from other Excel Workbooks or Access. This can be done just by changing the connection string. A connection string can be made for any database that supports ODBC (Open Database Connectivity).
Option Explicit
Sub goGetItLocal()
logNote "starting local"
Dim s As New SQL
s.start "select top 20 * from " & s.TableRef("person") & " order by givenname"
logNote "data loaded"
Sheet1.Range("A1").CopyFromRecordset s.oRS
logNote "data loaded to spreadsheet"
End Sub
Sub goGetItOtherExcel()
logNote "starting other excel"
Dim s As New SQL
s.setFile "B.xlsx"
s.start "select top 20 * from " & s.TableRef("person") & " order by givenname"
logNote "data loaded"
Sheet1.Range("A1").CopyFromRecordset s.oRS
logNote "data loaded to spreadsheet"
End Sub
Sub goGetItAccess()
logNote "starting access"
Dim s As New SQL
s.setFile "C.accdb"
s.start "select top 20 * from " & s.TableRef("person") & " order by givenname"
logNote "data loaded"
Sheet1.Range("A1").CopyFromRecordset s.oRS
logNote "data loaded to spreadsheet"
End Sub
Ever wonder what Power Query can do for you? Here’s a quick demonstration of using it to load data from the City of Ottawa’s Open Data repository. Also a quick demonstration how Power Query and Power Pivot can be used to easily gain insight into the data.
One of the issues that is solved in this demo is that of inconsistent vocabulary. The data refers to the city wards differently at different times. The technique demonstrated involves making a conversion table to normalize the ward vocabulary. Then Power Pivot can make the connection using a relationship or Power Query can merge the normalized vocabulary into the main file.
Another issue being demonstrated is loading and aggregating the data directly from the Web. You can then stay abreast of city updates using a refresh function.