The following demonstrates how a histogram with regex-defined bins may be used. Clearly, it is better to “be” than “not”. The Hamlet text is included in this package for convenient testing. You are welcome.
Python source code: plot_regex_1D_example.py
from pyhistogram import Hist
import re
from pyhistogram.testdata import get_file # Hamlet testdata
# Bin patterns are lowercase because the text is lowercased before filling
hist = Hist(['to', 'be', 'or', 'not'])
# Split the text into individual lowercase words
words = re.findall(r'\w+', get_file().read().lower())
for w in words:
    hist.fill(w)
hist.plot()
Total running time of the example: 0.35 seconds
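For readers who want to see the idea without the package, here is a minimal standard-library sketch of counting words into regex-defined bins. It is not pyhistogram's implementation: the helper name `regex_bin_counts` is hypothetical, and requiring a full match of each word against its pattern is an assumption about how such bins might behave.

```python
import re


def regex_bin_counts(words, patterns):
    """Count, per pattern, how many words fully match it (one bin per pattern)."""
    # Compile each pattern once; the pattern string doubles as the bin label.
    compiled = [(p, re.compile(p)) for p in patterns]
    counts = {p: 0 for p in patterns}
    for w in words:
        for label, rx in compiled:
            # Assumption: a word falls into a bin only if the whole word matches.
            if rx.fullmatch(w):
                counts[label] += 1
    return counts


# A short stand-in for the Hamlet text shipped with the package.
text = "To be, or not to be, that is the question"
words = re.findall(r"\w+", text.lower())
counts = regex_bin_counts(words, ["to", "be", "or", "not"])
print(counts)
```

Using `fullmatch` keeps a pattern such as `or` from also counting words like `for`. With `re.search` instead, every word containing the pattern would land in the bin.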