find_similar_str
- pyhelpers.text.find_similar_str(x, lookup_list, n=1, ignore_punctuation=True, method='difflib', **kwargs)
From among a sequence of strings, find the n strings that are most similar to x.
- Parameters
x (str) – a string-type variable
lookup_list (List[str] or Tuple[str] or Sequence[str]) – a sequence of strings for lookup
n (int or None) – number of similar strings to return, defaults to 1; if n=None, the function returns a sorted lookup_list (in descending order of similarity)
method (str or None) – options include 'difflib' (default) and 'fuzzywuzzy':
    if method='difflib', the function relies on difflib.get_close_matches
    if method='fuzzywuzzy', the function relies on fuzzywuzzy.fuzz.token_set_ratio
ignore_punctuation (bool) – whether to ignore punctuation in the search for similar texts
kwargs – [optional] parameters of difflib.get_close_matches or fuzzywuzzy.fuzz.token_set_ratio, depending on method
- Returns
the string(s) from lookup_list that should be similar to (or the same as) x, or None if no similar string is found
- Return type
str or list or None
Examples:
>>> from pyhelpers.text import find_similar_str

>>> lookup_lst = ['Anglia',
...               'East Coast',
...               'East Midlands',
...               'North and East',
...               'London North Western',
...               'Scotland',
...               'South East',
...               'Wales',
...               'Wessex',
...               'Western']

>>> str_similar = find_similar_str(x='angle', lookup_list=lookup_lst)
>>> str_similar
'Anglia'

>>> str_similar = find_similar_str(x='angle', lookup_list=lookup_lst, method='fuzzywuzzy')
>>> str_similar
'Anglia'

>>> str_similar = find_similar_str(x='x', lookup_list=lookup_lst)
>>> str_similar  # None

>>> str_similar = find_similar_str(x='x', lookup_list=lookup_lst, method='fuzzywuzzy')
>>> str_similar
'Wessex'
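The method='difflib' path described under Parameters can be approximated with the standard library alone. The following is a minimal sketch, not the pyhelpers implementation: the names normalise and sketch_find_similar are hypothetical, and the lower-casing and punctuation handling are assumptions about how the inputs might be prepared before they reach difflib.get_close_matches.

```python
import difflib
import string


def normalise(text, ignore_punctuation=True):
    """Lower-case the text and, optionally, replace punctuation with spaces.

    Assumption: case-insensitive matching; pyhelpers may normalise differently.
    """
    text = text.lower()
    if ignore_punctuation:
        table = str.maketrans(string.punctuation, " " * len(string.punctuation))
        text = " ".join(text.translate(table).split())
    return text


def sketch_find_similar(x, lookup_list, n=1, ignore_punctuation=True, **kwargs):
    """Hypothetical stand-in for the 'difflib' method of find_similar_str."""
    # Map each normalised candidate back to its original spelling.
    lookup = {normalise(s, ignore_punctuation): s for s in lookup_list}
    # n=None means "rank every candidate that passes the cutoff".
    n_ = len(lookup) if n is None else n
    matches = difflib.get_close_matches(
        normalise(x, ignore_punctuation), list(lookup), n=n_, **kwargs)
    found = [lookup[m] for m in matches]
    if not found:
        return None
    return found[0] if n == 1 else found


lookup_lst = ['Anglia', 'East Coast', 'East Midlands', 'North and East',
              'London North Western', 'Scotland', 'South East', 'Wales',
              'Wessex', 'Western']

print(sketch_find_similar('angle', lookup_lst))
print(sketch_find_similar('x', lookup_lst))
```

Note that difflib.get_close_matches discards candidates scoring below its cutoff (0.6 by default), which is why a query such as 'x' yields None here. Passing cutoff=0.0 through kwargs, together with n=None, ranks every candidate in this sketch.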