Some Python libraries and snippets

Published Feb 01, 2017

There are a bunch of standard and third-party Python libraries that are useful, but rarely used by beginners and even intermediate users of Python. This post highlights a few of my favourite libraries and gives practical demonstrations of some functionality. I intend this post to remain 'alive' -- i.e. I'll keep adding libraries and code snippets as I discover them.

inspect (standard library)

I developed in Python for many years before I first used the inspect library, but now I often use it. I find it useful to access the stack and to get the source code of functions.
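
As a small illustration of the second use, the sketch below fetches the source of a standard-library function with inspect.getsource(). The choice of json.dumps is arbitrary; the same call should work on your own functions as long as their source file is available.

```python
import inspect
import json

# Fetch the source text of json.dumps as a single string
source = inspect.getsource(json.dumps)

# Show only the first line, which is the start of the function signature
first_line = source.splitlines()[0]
print(first_line)
```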

A basic Python logger

The Python logging library is really nice and full-featured, and in nearly all cases you should be using it (or another logging library). As a quick-and-dirty alternative, let's see how easy it is to log an error message along with the date and time and the function name that threw the error.

import inspect  # gives us access to the stack
from datetime import datetime

LOG = True  # set to False to suppress output


def log(message):
    """Log message with datetime and function name"""
    if LOG:
        n = datetime.utcnow()
        # stack()[0] is log() itself; stack()[1] is its caller,
        # and index 3 of that record is the caller's function name
        f = inspect.stack()[1][3]
        print("{} - {}: {}".format(n, f, message))


def problem_function():
    """Trigger Exception and log result"""
    lst = [1, 2, 3]
    idx = 4
    try:
        return lst[idx]
    except IndexError:
        log("Couldn't get element {} from {}".format(idx, lst))


def call_problem_function():
    """Call problem function"""
    return problem_function()


def main():
    call_problem_function()


if __name__ == "__main__":
    main()

In the above contrived example, we have a problem_function that needs to log output of what went wrong. Instead of simply printing the result, we define a simple log function. This:

  • Checks to see if we want output (if the LOG global is set to True)
  • Prints out the custom message along with the name of the function that caused it and the current time

We can access the stack of function calls using inspect.stack(). We grab the second element off it (inspect.stack()[1]) because the first one will always be the log function itself, while the next one back will be the function that called log(). We then take the fourth item of that record (inspect.stack()[1][3]), as this is the name of the function.
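
On Python 3.5 and later, each entry returned by inspect.stack() is a named tuple, so the function name can also be read by attribute instead of by index. A minimal sketch; the helper names caller_name, call_chain, inner and outer are made up for illustration:

```python
import inspect


def caller_name():
    """Return the name of the function that called this one."""
    # stack()[0] is caller_name itself, stack()[1] is whoever called it
    return inspect.stack()[1].function


def call_chain():
    """Return function names from the innermost frame outwards."""
    return [frame.function for frame in inspect.stack()]


def inner():
    return caller_name(), call_chain()


def outer():
    return inner()


name, chain = outer()
print(name)       # inner
print(chain[:3])  # ['call_chain', 'inner', 'outer']
```

Reading .function is equivalent to indexing with [3], but it is easier to follow when you come back to the code later.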

So-called "print debugging" is definitely not best practice, and there are normally better ways, but most people are guilty of using it and it's often useful. Sometimes adding some information from the stack makes it much easier to work out exactly what's going wrong with your code.
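
For comparison, the standard logging library mentioned earlier can produce similar output without touching the stack by hand, because its format strings support %(asctime)s and %(funcName)s. This is only a rough sketch: the logger name "demo" is arbitrary, and the in-memory buffer is there just to make the result easy to inspect.

```python
import io
import logging

# Send log records to an in-memory buffer so the output can be examined;
# omit the stream argument to write to stderr instead
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s - %(funcName)s: %(message)s")
)

logger = logging.getLogger("demo")  # "demo" is an arbitrary name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep records away from the root logger


def problem_function():
    """Trigger an IndexError and log it"""
    lst = [1, 2, 3]
    idx = 4
    try:
        return lst[idx]
    except IndexError:
        logger.error("Couldn't get element %s from %s", idx, lst)


problem_function()
logged = stream.getvalue()
print(logged, end="")
```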

TextBlob (third party library)

Perhaps the best-known library for Natural Language Processing (NLP) in Python is NLTK. I am not a huge fan of NLTK. Another nice library is spaCy, which addresses many of the issues with NLTK. However, spaCy is much newer and still lacks some functionality. It's also pretty resource intensive, and slow to initialise (which is frustrating for prototyping).

Another nice library for NLP is Pattern, but this is only available for Python 2. TextBlob is built on top of both NLTK and Pattern (but it works for Python 3). I find it is sometimes a nice compromise between simplicity and functionality. It turns strings of language into 'blobs', and you can easily perform common operations on these. For example:

from textblob import TextBlob

negative_sentence = "I hated today"
positive_sentence = "Had a wonderful time with my goose, my child, and my fish"
neutral_sentence = "Today was a day"

# sentiment analysis
neg_blob = TextBlob(negative_sentence)
pos_blob = TextBlob(positive_sentence)
neu_blob = TextBlob(neutral_sentence)

print(neg_blob.sentiment)
print(pos_blob.sentiment)
print(neu_blob.sentiment)

# output
# >>>Sentiment(polarity=-0.9, subjectivity=0.7)
# >>>Sentiment(polarity=1.0, subjectivity=1.0)
# >>>Sentiment(polarity=0.0, subjectivity=0.0)

# pluralization
print("{}->{}".format(pos_blob.words[6], pos_blob.words[6].pluralize()))
print("{}->{}".format(pos_blob.words[7], pos_blob.words[7].pluralize()))
print("{}->{}".format(pos_blob.words[8], pos_blob.words[8].pluralize()))
print("{}->{}".format(pos_blob.words[-1], pos_blob.words[-1].pluralize()))

# output
# >>>goose->geese
# >>>my->our
# >>>child->children
# >>>fish->fish

# Parts of Speech (POS) tagging
print(pos_blob.tags)

# output
# >>>[('Had', 'VBD'), ('a', 'DT'), ('wonderful', 'JJ'), ('time', 'NN'), ('with', 'IN'), ('my', 'PRP$'), ('goose', 'NN'), ('my', 'PRP$'), ('child', 'NN'), ('and', 'CC'), ('my', 'PRP$'), ('fish', 'NN')]
nouns = ' '.join([tag[0] for tag in pos_blob.tags if tag[1] == "NN"])
print(nouns)

# output
# >>>time goose child fish

We can see that it can identify positive and negative sentiment pretty well. In TextBlob, sentiment analysis consists of two parts: polarity (which is how positive or negative the sentiment is), and subjectivity (which is how opinionated the sentence is). I find the subjectivity score less useful and less accurate, and usually use only the polarity. The sentiment is simply a named tuple, so you can access only the polarity with neg_blob.sentiment[0], for example.
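
Because the result is a named tuple, the scores can be read by position, by name, or by unpacking. The sketch below uses a stand-in Sentiment type built with collections.namedtuple, with values copied from the negative example's output, so that it runs without TextBlob installed. It is not TextBlob's own class.

```python
from collections import namedtuple

# Stand-in for the value of blob.sentiment (hypothetical, for illustration only)
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])
neg = Sentiment(polarity=-0.9, subjectivity=0.7)

print(neg[0])        # by position: -0.9
print(neg.polarity)  # by name: -0.9

# Unpack both scores at once
polarity, subjectivity = neg
```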

It also handles pluralisation of nouns pretty well, and can change goose to geese, etc. It isn't perfect (e.g. pants becomes pantss) but it's the simplest solution I found to pluralisation that works most of the time.

To get POS tags, we access the tags attribute, so we can easily extract all the nouns from a sentence, for example.

Conclusion

I've only added a couple of libraries and examples for now, but this post will keep growing. If there's anything that you feel should be included, feel free to comment below or tweet @sixhobbits and I'll consider adding it.

Discover and read more posts from Gareth Dwyer