Selection with Fuzziness
Fuzzy Search
A fuzzy search is a technique that uses search algorithms to find strings that match patterns approximately.
It's particularly useful when looking for data without knowing exactly what they're looking for or how a word is spelled.
This Tutorial's Database: spicy.db
spicy.db
Table Name: hotpeppers; Showing only first 3 peppers
Jalapeno
8000
12
Serrano
5000
7
Habanero
350000
8
There are 23 named hot peppers in the dataset
name
-> TEXT column of hot pepper namesmaxSHU
-> INTEGER column of each pepper's maximum level of Scoville Heat Unit rangeinventory
-> INTEGER column of random counts of how many peppers are available
Exact Matching and Case-Insensitive Matching
Query
Output
Although the 2nd row contains a pepper called Serrano
since the S
is capitalized our query could not select the pepper we wanted because the =
needs an exact match
LIKE
operation
LIKE
operationWe can use the LIKE
operator to do case-insensitive comparison
Query
Output
Finding all pepper names that end in "no"
"no"
Query
Output
The %
wildcard character is used with the LIKE
operator to perform pattern matching in queries. It represents zero or more characters in a string.
%
at the Beginning:Matches any sequence of characters before the specified pattern.
Example:
LIKE '%apple'
matches strings ending with "apple" (e.g., "green apple", "red apple").
%
at the End:Matches any sequence of characters following the specified pattern.
Example:
LIKE 'apple%'
matches strings starting with "apple" (e.g., "apple pie", "apple tart").
%
in the Middle:Matches any sequence of characters surrounding the specified pattern.
Example:
LIKE '%apple%'
matches strings containing "apple" anywhere (e.g., "green apple pie", "pineapple").
Multiple
%
Wildcards:You can use multiple
%
wildcards to match complex patterns.Example:
LIKE '%a%ple%'
matches strings that have "a" followed by "ple" somewhere in between (e.g., "apple pie", "pineapple").
Finding all peppers in the database that are composed of two words
We are assuming that there will be a whitespace in between each word
Query
Output
Finding pepper data with names that don't have "Pepper"
as the second word of the name
"Pepper"
as the second word of the nameThis is when the NOT
operator can be handy
Query
Output
Exercises
Write a query that returns
maxSHU
values of all peppers with names that are not two wordedWrite a query that finds a pepper that has 5 characters only and has the letters
nch
in the middleWhich pepper is the spiciest and which pepper is not the spiciest?
Which pepper do we have the most of?
Connected Reading
Fuzzy Search (Wikipedia Link)
Last updated