fuzzy-logic-search (fls) brings fuzzy logic to document querying. Unlike traditional Boolean search, which returns binary relevant/not-relevant results, fls assigns each document a degree-of-membership score in [0, 1] that indicates how well it matches your query.
The Core Insight
Boolean search is rigid: a document either matches or it doesn’t. Fuzzy logic captures nuance through gradation:
from fuzzy_logic_search.fuzzy_query import FuzzyQuery
from fuzzy_logic_search.fuzzy_set import FuzzySet
# Construct a query
query = FuzzyQuery("(and python machine-learning)")
# Or use Python operators
q1 = FuzzyQuery("python")
q2 = FuzzyQuery("machine-learning")
query = q1 & q2 # Equivalent to (and python machine-learning)
Query Language
Queries use a Lisp-like syntax that maps to an AST:
; Simple conjunction
(and cat dog)
; With negation
(and cat dog (not fish))
; With fuzzy modifiers
(very (and cat dog))
; Complex nested query
(or (and python ml) (very (not java)))
Or construct directly with Python:
# Using operators
query = FuzzyQuery("cat") & FuzzyQuery("dog") & ~FuzzyQuery("fish")
# Using AST directly
query = FuzzyQuery(['and', 'cat', 'dog', ['not', 'fish']])
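To make the mapping from the Lisp-like syntax to a nested-list AST concrete, here is a minimal sketch of a parser. It is not the library's own parser: `tokenize`, `parse`, and `parse_query` are hypothetical names, and the sketch assumes bare terms only (no quoted strings such as "New York").

```python
import re

def tokenize(text):
    """Split a query into parentheses and bare terms, dropping ';' comments."""
    text = re.sub(r";[^\n]*", "", text)
    return re.findall(r"[()]|[^\s()]+", text)

def parse(tokens):
    """Consume tokens from the front of the list and return a nested-list AST."""
    if not tokens:
        raise SyntaxError("unexpected end of query")
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens and tokens[0] != ")":
            node.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ")"
        return node
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return token  # a bare term

def parse_query(text):
    """Parse a whole query string (hypothetical helper, not part of fls)."""
    tokens = tokenize(text)
    ast = parse(tokens)
    if tokens:
        raise SyntaxError("trailing tokens after query")
    return ast

print(parse_query("(and cat dog (not fish))"))
# ['and', 'cat', 'dog', ['not', 'fish']]
```

The result has the same shape as the list passed to FuzzyQuery in the AST example above.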
Fuzzy Modifiers
Linguistic hedges transform membership values:
# "Very" squares the membership (emphasizes strong matches)
very_query = FuzzyQuery("python").very()
# 0.9 → 0.81, 0.5 → 0.25
# "Somewhat" takes square root (broadens tolerance)
somewhat_query = FuzzyQuery("python").somewhat()
# 0.9 → ≈0.95, 0.25 → 0.5
# "Extremely" cubes the membership
extremely_query = FuzzyQuery("python").extremely()
# "Slightly" takes 10th root
slightly_query = FuzzyQuery("python").slightly()
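The hedges described above are plain power functions on a membership value. The following sketch shows them as standalone functions; the function names are illustrative and are not the library's internals.

```python
def very(mu):
    """Square the membership: strong matches stay high, weak ones drop."""
    return mu ** 2

def extremely(mu):
    """Cube the membership."""
    return mu ** 3

def somewhat(mu):
    """Square root of the membership: broadens tolerance."""
    return mu ** 0.5

def slightly(mu):
    """10th root of the membership: almost any nonzero match scores high."""
    return mu ** 0.1

print(round(very(0.9), 3))       # 0.81
print(round(extremely(0.9), 3))  # 0.729
print(round(somewhat(0.25), 3))  # 0.5
print(round(slightly(0.5), 3))   # 0.933
```

Exponents above 1 concentrate the score on strong matches, while exponents below 1 lift weak matches toward 1.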
Evaluating Queries
Evaluate queries against a document corpus:
# Documents as lists of terms
docs = [
    ["python", "machine-learning", "tensorflow"],
    ["java", "spring", "microservices"],
    ["python", "web", "flask"],
    ["machine-learning", "neural-networks", "pytorch"],
]
# Evaluate query
query = FuzzyQuery("python") & FuzzyQuery("machine-learning")
result = query.eval(docs) # Returns FuzzySet
# result.memberships = [1.0, 0.0, 0.0, 0.0]
# Only first document has both terms
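The memberships above can be reproduced by hand. This sketch assumes the default membership is crisp (1.0 if the term is in the document, 0.0 otherwise) and that `and` takes the minimum; `crisp_membership` and `eval_and` are hypothetical helpers, not fls functions.

```python
def crisp_membership(term, doc):
    """Assumed default: 1.0 if the term occurs in the document, else 0.0."""
    return 1.0 if term in doc else 0.0

def eval_and(terms, docs, membership_fn=crisp_membership):
    """Score each document as the minimum membership over all query terms."""
    return [min(membership_fn(term, doc) for term in terms) for doc in docs]

docs = [
    ["python", "machine-learning", "tensorflow"],
    ["java", "spring", "microservices"],
    ["python", "web", "flask"],
    ["machine-learning", "neural-networks", "pytorch"],
]

print(eval_and(["python", "machine-learning"], docs))
# [1.0, 0.0, 0.0, 0.0]
```

The third and fourth documents each contain one of the two terms, but the minimum of 1.0 and 0.0 is 0.0, so they score zero under crisp membership.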
Custom Membership Functions
Provide custom functions for nuanced matching:
def tf_idf_membership(term, doc):
    """Use TF-IDF instead of crisp membership."""
    if term not in doc:
        return 0.0
    tf = doc.count(term) / len(doc)
    # ... compute IDF from corpus
    idf = 1.0  # placeholder (assumption): substitute the IDF computed from your corpus
    return min(tf * idf, 1.0)
result = query.eval(docs, membership_fn=tf_idf_membership)
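One way to supply the corpus-level IDF is to build the membership function with a closure. The sketch below is an assumption about how you might do this, not part of fls: `make_tf_idf_membership` is a hypothetical factory, and the smoothed IDF formula is one common variant among several.

```python
import math

def make_tf_idf_membership(corpus):
    """Return a (term, doc) -> [0, 1] function with IDF taken from `corpus`."""
    n_docs = len(corpus)

    def membership(term, doc):
        if not doc or term not in doc:
            return 0.0
        tf = doc.count(term) / len(doc)
        df = sum(1 for d in corpus if term in d)
        # Smoothed IDF (assumed variant): never divides by zero, never negative.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        return min(tf * idf, 1.0)  # clip so the score stays a valid membership

    return membership

docs = [
    ["python", "machine-learning", "tensorflow"],
    ["java", "spring", "microservices"],
    ["python", "web", "flask"],
    ["machine-learning", "neural-networks", "pytorch"],
]
tf_idf = make_tf_idf_membership(docs)
print(tf_idf("java", docs[0]))             # 0.0
print(0.0 < tf_idf("python", docs[0]) <= 1.0)  # True
```

The returned function has the same (term, doc) signature as tf_idf_membership above, so it could be passed as membership_fn in the same way.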
Fuzzy Set Operations
Results are FuzzySet objects with set-theoretic operations:
# Evaluate two queries
result1 = query1.eval(docs) # FuzzySet
result2 = query2.eval(docs) # FuzzySet
# Fuzzy intersection (AND) - element-wise min
combined = result1 & result2
# Fuzzy union (OR) - element-wise max
either = result1 | result2
# Fuzzy complement (NOT)
opposite = ~result1
Logical Operators
| Operator | Fuzzy Operation | Effect |
|---|---|---|
| and | minimum | Both conditions must match |
| or | maximum | Either condition can match |
| not | 1 - x | Inverts membership |
| sym-diff | max - min | Symmetric difference |
| diff | max(a - b, 0) | Set difference |
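The table translates directly into element-wise arithmetic on membership lists. This is a plain-Python sketch of the formulas, with illustrative function names that do not come from the library.

```python
def f_and(a, b):
    """Fuzzy AND: element-wise minimum."""
    return [min(x, y) for x, y in zip(a, b)]

def f_or(a, b):
    """Fuzzy OR: element-wise maximum."""
    return [max(x, y) for x, y in zip(a, b)]

def f_not(a):
    """Fuzzy NOT: 1 - x."""
    return [1.0 - x for x in a]

def f_sym_diff(a, b):
    """Symmetric difference: max - min."""
    return [max(x, y) - min(x, y) for x, y in zip(a, b)]

def f_diff(a, b):
    """Set difference: max(a - b, 0)."""
    return [max(x - y, 0.0) for x, y in zip(a, b)]

a = [1.0, 0.5, 0.0]
b = [0.25, 0.5, 1.0]

print(f_and(a, b))       # [0.25, 0.5, 0.0]
print(f_or(a, b))        # [1.0, 0.5, 1.0]
print(f_not(a))          # [0.0, 0.5, 1.0]
print(f_sym_diff(a, b))  # [0.75, 0.0, 1.0]
print(f_diff(a, b))      # [0.75, 0.0, 0.0]
```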
Homomorphism Property
A key mathematical property: the mapping from queries to results is a homomorphism. Operations on queries translate directly to operations on their result sets:
# These produce identical results:
result1 = (q1 & q2).eval(docs)
result2 = q1.eval(docs) & q2.eval(docs)
# For any operation op:
# (q1 op q2).eval(D) = q1.eval(D) op q2.eval(D)
This ensures consistency: whether you combine fuzzy sets at the query level or result level, you arrive at the same final degrees of membership.
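The property can be checked on a toy evaluator. The sketch below assumes crisp term membership and the min/max/complement operators from the table; `evaluate` is a hypothetical stand-in for the library's evaluation, working on nested-list ASTs.

```python
def evaluate(ast, docs):
    """Evaluate a nested-list AST to one membership value per document."""
    if isinstance(ast, str):  # a bare term: crisp membership (assumed default)
        return [1.0 if ast in doc else 0.0 for doc in docs]
    op, *args = ast
    results = [evaluate(arg, docs) for arg in args]
    if op == "and":
        return [min(vals) for vals in zip(*results)]
    if op == "or":
        return [max(vals) for vals in zip(*results)]
    if op == "not":
        return [1.0 - v for v in results[0]]
    raise ValueError(f"unknown operator: {op}")

docs = [
    ["python", "machine-learning", "tensorflow"],
    ["java", "spring", "microservices"],
    ["python", "web", "flask"],
    ["machine-learning", "neural-networks", "pytorch"],
]

# Combine at the query level...
query_level = evaluate(["or", "python", "machine-learning"], docs)
# ...or combine the two result sets afterwards.
left = evaluate("python", docs)
right = evaluate("machine-learning", docs)
result_level = [max(x, y) for x, y in zip(left, right)]

print(query_level)                  # [1.0, 0.0, 1.0, 1.0]
print(query_level == result_level)  # True
```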
JSON Path Queries
For structured JSON documents, use path expressions:
; Query nested fields
(> :user.age 25)
; String predicates
(starts-with? :name "John")
; Combine with logic
(and
  (== :address.city "New York")
  (not (< :age 25)))
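As an illustration of what a path expression like :address.city has to do, here is a sketch that resolves a dotted path against a nested dict and then scores the combined query by hand. `resolve_path` is a hypothetical helper, and crisp 0/1 predicate scores are an assumption; fls may resolve paths and grade comparisons differently.

```python
def resolve_path(doc, path):
    """Follow a ':a.b.c' path through nested dicts; return None if a key is missing."""
    current = doc
    for key in path.lstrip(":").split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

doc = {"name": "John Smith", "age": 31, "address": {"city": "New York"}}

# (== :address.city "New York")
city_matches = 1.0 if resolve_path(doc, ":address.city") == "New York" else 0.0

# (not (< :age 25))
age = resolve_path(doc, ":age")
under_25 = 1.0 if age is not None and age < 25 else 0.0
not_under_25 = 1.0 - under_25

# (and ...) takes the minimum of its arguments
print(min(city_matches, not_under_25))  # 1.0
```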
Supported Predicates
- ==, >, <, >=, <= - Numeric comparisons with fuzzy tolerance
- starts-with?, ends-with?, contains? - String matching
- in? - Membership in set/range
- regex? - Regular expression matching
- jaccard? - Jaccard similarity
- tf-idf? - TF-IDF scoring
- lev? - Levenshtein distance-based matching
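Two of these predicates rest on standard similarity measures that already fall in [0, 1]. The sketch below shows textbook versions of Jaccard similarity and a Levenshtein-based similarity; normalizing the distance by the longer string is an assumption, and the function names are illustrative rather than the library's implementation.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)

def levenshtein(s, t):
    """Edit distance via the classic two-row dynamic programme."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def lev_similarity(s, t):
    """Map edit distance to [0, 1]: 1.0 for identical strings (assumed scaling)."""
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - levenshtein(s, t) / longest

print(round(jaccard(["python", "ml"], ["python", "web"]), 3))  # 0.333
print(levenshtein("kitten", "sitting"))                        # 3
print(round(lev_similarity("kitten", "sitting"), 3))           # 0.571
```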
Use Cases
- Search Engines: Graded results reflecting partial matches
- Recommendation Systems: Combine multiple preferences fuzzily
- Data Analysis: Query JSON datasets with flexible, human-like reasoning
- Information Retrieval: Beyond binary keyword matching
Installation
pip install fuzzy-logic-search
fuzzy-logic-search: Because real-world queries aren’t black and white.