fuzzy-logic-search (fls) brings fuzzy logic to document querying. Unlike traditional Boolean search, which returns a binary relevant/not-relevant verdict, fls assigns each document a degree-of-membership score in [0, 1] that indicates how well it matches your query.
The Problem with Boolean Search
Boolean search is rigid: a document either matches or it does not. If you search for “python AND machine-learning,” you get a binary split. A document about Python ML that never uses the exact term “machine-learning” gets zero, same as a document about medieval pottery.
Fuzzy logic captures the gradation that Boolean search throws away.
from fuzzy_logic_search.fuzzy_query import FuzzyQuery
from fuzzy_logic_search.fuzzy_set import FuzzySet
# Construct a query
query = FuzzyQuery("(and python machine-learning)")
# Or use Python operators
q1 = FuzzyQuery("python")
q2 = FuzzyQuery("machine-learning")
query = q1 & q2 # Equivalent to (and python machine-learning)
Query Language
Queries use a Lisp-like syntax that maps to an AST:
; Simple conjunction
(and cat dog)
; With negation
(and cat dog (not fish))
; With fuzzy modifiers
(very (and cat dog))
; Complex nested query
(or (and python ml) (very (not java)))
Or construct directly with Python:
# Using operators
query = FuzzyQuery("cat") & FuzzyQuery("dog") & ~FuzzyQuery("fish")
# Using AST directly
query = FuzzyQuery(['and', 'cat', 'dog', ['not', 'fish']])
I went with S-expressions for the query language because they map directly to the AST. No parsing ambiguity, trivial to serialize, and anyone who has written a Lisp evaluator can understand the implementation in about ten minutes.
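To make the "maps directly to the AST" claim concrete, here is a minimal sketch of an S-expression reader that produces the same nested-list shape shown above. The names tokenize and parse are hypothetical and this is not the library's own parser; it ignores `;` comments and does no error handling for unbalanced parentheses.

```python
def tokenize(text):
    """Split an S-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return a nested-list AST."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return node
    return token  # a bare atom such as "cat"


ast = parse(tokenize("(and cat dog (not fish))"))
print(ast)  # ['and', 'cat', 'dog', ['not', 'fish']]
```

The whole reader is two small functions, which is most of the argument for S-expressions: the text and the tree have the same shape.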
Fuzzy Modifiers
Linguistic hedges transform membership values:
# "Very" squares the membership (emphasizes strong matches)
very_query = FuzzyQuery("python").very()
# 0.9 -> 0.81, 0.5 -> 0.25
# "Somewhat" takes square root (broadens tolerance)
somewhat_query = FuzzyQuery("python").somewhat()
# 0.9 -> ~0.95, 0.25 -> 0.5
# "Extremely" cubes the membership
extremely_query = FuzzyQuery("python").extremely()
# "Slightly" takes 10th root
slightly_query = FuzzyQuery("python").slightly()
These come from Zadeh’s original fuzzy logic work. “Very” is concentration (squaring), “somewhat” is dilation (square root). They are mathematically clean and semantically intuitive: “very python” means “only documents that are strongly about Python.”
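The arithmetic behind the four hedges can be checked with plain functions. This is a sketch of the exponents described above (2, 1/2, 3, 1/10), not the library's implementation; the HEDGES dictionary is a name made up for illustration.

```python
# Each hedge is a power function on a membership value in [0, 1].
HEDGES = {
    "very": lambda x: x ** 2,        # concentration
    "somewhat": lambda x: x ** 0.5,  # dilation
    "extremely": lambda x: x ** 3,
    "slightly": lambda x: x ** 0.1,
}

for name, hedge in HEDGES.items():
    # Print each hedge applied to a weak, middling, and strong match.
    print(name, [round(hedge(x), 3) for x in (0.25, 0.5, 0.9)])
```

Exponents above 1 pull intermediate values toward 0, exponents below 1 push them toward 1, and all four leave the endpoints 0 and 1 unchanged, so crisp memberships pass through untouched.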
Evaluating Queries
Evaluate queries against a document corpus:
# Documents as lists of terms
docs = [
    ["python", "machine-learning", "tensorflow"],
    ["java", "spring", "microservices"],
    ["python", "web", "flask"],
    ["machine-learning", "neural-networks", "pytorch"],
]
# Evaluate query
query = FuzzyQuery("python") & FuzzyQuery("machine-learning")
result = query.evaluate(docs) # Returns FuzzySet
# result.memberships = [1.0, 0.0, 0.0, 0.0]
# Only first document has both terms
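The result above can be reproduced by hand, which helps when debugging a query. The sketch below assumes crisp membership (1.0 if the term is in the document, else 0.0) and takes the minimum across terms for a conjunction; crisp_membership and evaluate_and are hypothetical helpers, not part of the fls API.

```python
docs = [
    ["python", "machine-learning", "tensorflow"],
    ["java", "spring", "microservices"],
    ["python", "web", "flask"],
    ["machine-learning", "neural-networks", "pytorch"],
]


def crisp_membership(term, doc):
    """1.0 if the term occurs in the document, otherwise 0.0."""
    return 1.0 if term in doc else 0.0


def evaluate_and(terms, docs, membership_fn=crisp_membership):
    """Fuzzy AND: per document, the minimum membership over all terms."""
    return [min(membership_fn(t, doc) for t in terms) for doc in docs]


memberships = evaluate_and(["python", "machine-learning"], docs)
print(memberships)  # [1.0, 0.0, 0.0, 0.0]
```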
Custom Membership Functions
The default membership is crisp (term present or not), but you can provide custom functions for more nuanced matching:
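import math

def tf_idf_membership(term, doc):
    """Use TF-IDF instead of crisp membership."""
    if term not in doc:
        return 0.0
    tf = doc.count(term) / len(doc)
    # IDF from the corpus (`docs` above); log(N / df) is one common choice, assumed here
    df = sum(1 for d in docs if term in d)
    idf = math.log(len(docs) / max(df, 1))
    return min(tf * idf, 1.0)

result = query.evaluate(docs, membership_fn=tf_idf_membership)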
This is where it becomes useful for real applications. Swap in TF-IDF, BM25, or embedding cosine similarity as your membership function, and you get fuzzy set operations on top of whatever relevance model you prefer.
Fuzzy Set Operations
Results are FuzzySet objects with set-theoretic operations:
# Evaluate two queries
result1 = query1.evaluate(docs) # FuzzySet
result2 = query2.evaluate(docs) # FuzzySet
# Fuzzy intersection (AND) - element-wise min
combined = result1 & result2
# Fuzzy union (OR) - element-wise max
either = result1 | result2
# Fuzzy complement (NOT)
opposite = ~result1
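For readers curious how the `&`, `|`, and `~` operators can be wired up, here is a minimal stand-in built on Python's operator-overloading hooks. MiniFuzzySet is a hypothetical class written for illustration; the real FuzzySet may be structured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniFuzzySet:
    """Hypothetical stand-in for FuzzySet: one membership value per document."""
    memberships: tuple

    def __and__(self, other):
        # Fuzzy intersection: element-wise minimum.
        return MiniFuzzySet(tuple(min(a, b) for a, b in zip(self.memberships, other.memberships)))

    def __or__(self, other):
        # Fuzzy union: element-wise maximum.
        return MiniFuzzySet(tuple(max(a, b) for a, b in zip(self.memberships, other.memberships)))

    def __invert__(self):
        # Fuzzy complement: 1 - x for every element.
        return MiniFuzzySet(tuple(1.0 - a for a in self.memberships))


r1 = MiniFuzzySet((1.0, 0.25, 0.0))
r2 = MiniFuzzySet((0.5, 0.75, 0.0))
print((r1 & r2).memberships)  # (0.5, 0.25, 0.0)
print((r1 | r2).memberships)  # (1.0, 0.75, 0.0)
print((~r1).memberships)      # (0.0, 0.75, 1.0)
```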
Logical Operators
| Operator | Fuzzy Operation | Effect |
|---|---|---|
| and | minimum | Both conditions must match |
| or | maximum | Either condition can match |
| not | 1 - x | Inverts membership |
| sym-diff | max - min | Symmetric difference |
| diff | max(a - b, 0) | Set difference |
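The last two rows are the least familiar, so here they are as plain functions on a pair of membership values. This is a sketch of the formulas in the table; sym_diff and diff are illustrative names, not the library's.

```python
def sym_diff(a, b):
    """Symmetric difference: max - min, which equals |a - b|."""
    return max(a, b) - min(a, b)


def diff(a, b):
    """Set difference: how much a exceeds b, floored at zero."""
    return max(a - b, 0.0)


a, b = 0.75, 0.25
print(sym_diff(a, b))  # 0.5
print(diff(a, b))      # 0.5
print(diff(b, a))      # 0.0
```

On crisp values (0.0 or 1.0) these reduce to the usual XOR and set difference, which is a quick sanity check that the fuzzy versions generalize the Boolean ones.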
Homomorphism Property
A key mathematical property: the mapping from queries to results is a homomorphism. Operations on queries translate directly to operations on their result sets:
# These produce identical results:
result1 = (q1 & q2).evaluate(docs)
result2 = q1.evaluate(docs) & q2.evaluate(docs)
# For any operation op:
# (q1 op q2).evaluate(D) = q1.evaluate(D) op q2.evaluate(D)
This is not just a nice-to-have. It means you can optimize query processing by pushing operations down to the result level, or pulling them up to the query level, without changing semantics. The algebra is consistent across the abstraction boundary.
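The property can be checked on a toy evaluator. The sketch below assumes a nested-list AST like the one used earlier and a graded membership function (term frequency, chosen only so the values are not all 0 or 1); evaluate and term_frequency are hypothetical and not the fls implementation.

```python
docs = [
    ["python", "python", "ml", "numpy"],
    ["java", "spring"],
    ["python", "web", "flask", "jinja"],
    ["ml", "ml", "pytorch"],
]


def term_frequency(term, doc):
    """Graded membership: fraction of the document's tokens equal to `term`."""
    return doc.count(term) / len(doc)


def evaluate(ast, docs, membership_fn=term_frequency):
    """Evaluate a nested-list AST to one membership value per document."""
    if isinstance(ast, str):
        return [membership_fn(ast, doc) for doc in docs]
    op, *args = ast
    results = [evaluate(arg, docs, membership_fn) for arg in args]
    if op == "and":
        return [min(vals) for vals in zip(*results)]
    if op == "or":
        return [max(vals) for vals in zip(*results)]
    if op == "not":
        return [1.0 - v for v in results[0]]
    raise ValueError(f"unknown operator: {op}")


q1, q2 = "python", "ml"
r1, r2 = evaluate(q1, docs), evaluate(q2, docs)

# Combine at the query level, then evaluate ...
query_level = evaluate(["and", q1, q2], docs)
# ... versus evaluate each query, then combine at the result level.
result_level = [min(a, b) for a, b in zip(r1, r2)]

print(query_level == result_level)  # True
```

In this sketch the equality holds by construction: every operator is evaluated pointwise on the results of its arguments, so there is no way for the two routes to disagree.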
JSON Path Queries
For structured JSON documents, use path expressions:
; Query nested fields
(> :user.age 25)
; String predicates
(starts-with? :name "John")
; Combine with logic
(and
(== :address.city "New York")
(not (< :age 25)))
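A path such as `:user.age` is a dotted walk through nested objects. Here is a rough sketch of how that lookup and a crisp `>` comparison could work on Python dicts; resolve and greater_than are hypothetical helpers, and the fuzzy tolerance the library applies to numeric comparisons is not modeled.

```python
def resolve(doc, path):
    """Follow a ':a.b.c' path through nested dicts; return None if any key is missing."""
    node = doc
    for key in path.lstrip(":").split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def greater_than(doc, path, threshold):
    """Crisp version of (> :path threshold): 1.0 if true, else 0.0."""
    value = resolve(doc, path)
    return 1.0 if value is not None and value > threshold else 0.0


doc = {"user": {"age": 31}, "name": "John Smith"}
print(resolve(doc, ":user.age"))               # 31
print(greater_than(doc, ":user.age", 25))      # 1.0
print(greater_than(doc, ":user.height", 25))   # 0.0
```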
Supported Predicates
- ==, >, <, >=, <=: Numeric comparisons with fuzzy tolerance
- starts-with?, ends-with?, contains?: String matching
- in?: Membership in set/range
- regex?: Regular expression matching
- jaccard?: Jaccard similarity
- tf-idf?: TF-IDF scoring
- lev?: Levenshtein distance-based matching
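Several of these predicates are naturally graded rather than true/false. As one example, Jaccard similarity already lands in [0, 1], so it can serve directly as a membership value. The function below is the textbook definition, written as a sketch; how jaccard? tokenizes its arguments inside the library is not specified here.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty sets count as identical
    return len(a & b) / len(a | b)


query_terms = ["python", "machine-learning"]
doc = ["python", "machine-learning", "tensorflow", "keras"]
print(jaccard(query_terms, doc))  # 0.5
```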
Use Cases
- Search Engines: Graded results reflecting partial matches
- Recommendation Systems: Combine multiple preferences fuzzily
- Data Analysis: Query JSON datasets with flexible, human-like reasoning
- Information Retrieval: Beyond binary keyword matching
Installation
pip install fuzzy-logic-search