Search for String containing spelling mistakes in SQL Server
1. Have MS SQL Server with data. Having use case to search across data but data could contain spelling mistakes.
2. Have basic knowledge of writing SQL Select statements.
3. Permission to add SQLCLR objects to SQL Server. You may contact your DBA to seek access to import SQL CLR into your SQL Server.
4. You have Visual Studio 2015 Community edition (Free) or better in order to compile provided source code to generate SQL CLR object.
1. The idea of this exercise is to create a tool inside SQL server to be used within your SQL query which can be called via SSMS or from another program via ADO.NET/JDBC. It addresses only spelling mistakes and does not address abbreviations, word swaps etc. Ex: “McDonald” against “MacDonald” can give a better percentage match. But expecting a match of “Sgt John” against “Sergeant John” may not be right criteria.
A query as below in SSMS may return “71.43” stating that the passed two strings match 71.43%
select dbo.DamLevDisCLR('raja venkatesh','raju vankudesh',1)
The third parameter may be 1 or 0. 1 is to Ignore Case. 0 is to Match Case.
Getting it work
You may download the DLL attached with this project and use it within SQL or copy the source code given here, compile using Visual Studio 2015 Community or better edition, create the DLL and use it.
Using the downloaded DLL.
Once you download the DLL, you may have to right click it in Windows explorer, go to properties screen and unblock the security warning.
In SSMS, Goto Database\Programmability\Assemblies
Right click Assemblies node, and click “New Assembly”. You’ll be prompted with a window as below.
Use Browse button and choose the downloaded DLL, key in appropriate permissions, etc. and click OK.
If everything is fine, you should see newly added DLL as in below screen in SSMS.
At start (in SQL statement), it is expected to throw error as below.
Msg 6263, Level 16, State 1, Line 1
Execution of user code in the .NET Framework is disabled. Enable "clr enabled" configuration option.
In SSMS run below SQL commands to enable CLR. You may have to get this done by DBA and she is not expected to enable CLR either. You may have to show the source code and relevant documentation to get approvals from relevant authorities in your organization.
sp_configure 'show advanced options', 1;
sp_configure 'clr enabled', 1;
Configuration option 'show advanced options' changed from 0 to 1. Run the RECONFIGURE statement to install.
Configuration option 'clr enabled' changed from 0 to 1. Run the RECONFIGURE statement to install.
Use the function
In SSMS, run the below SQL queries.
select dbo.DamLevDisCLR('raja venkatesh','raju vankudesh',1) as MatchRel
select dbo.DamLevDisCLR('Cat shot bat', 'The Bat kills Cat',1) as MatchRel
Match Relevance: 71.43 %
Match Relevance: 35.29%
1. Let us say we have a table [BioData] with one of the column called “LName” as String datatype (VARCHAR). The table is Full Text Indexed and the column “LName” is part of the FTS. You may want to search for a person whose spelling is partially known to you or partially correct in another table. A simple FTS query in such case can be
select * from BioData where dbo.DamLevDisCLR(LName,'schwarznagger',1)>=80
In this case names matching with spelling tolerance of 20% (match with 80%) is listed.
In next topic, we shall do a “Unstructured” Data storing in SQL server, perform
1. Full Text queries against the unstructured data.
2. Store Non-Text data like Images, Audio etc. and perform Full text query against the same data.
3. Combine the current “DamLevDisCLR” function with the same and perform a FTS query against unstructured data with spelling mistakes.