Find Rhymes in Text with Rhyme-Grep

Let’s see how to implement (and have fun with!) a command-line tool to find rhyming words.

Quick Tutorial

1) Get the rhyme-grep/ folder and enter it:

$ git clone https://github.com/massimo-nazaria/rhyme-grep.git
$ cd rhyme-grep/

2) Make rhymp.sh executable:

$ chmod +x rhymp.sh

3) Find words that rhyme with “leaves” in Leaves of Grass by Walt Whitman:

$ ./rhymp.sh "leaves" leaves-of-grass.txt

Output:

4) Let’s print only the rhyming words with the -o option and, at the same time, remove duplicates (ignoring case) with sort -fu:

$ cat leaves-of-grass.txt | ./rhymp.sh "leaves" -o | sort -fu
believes
heaves
perceives
receives
recitatives
sleeves
thieves
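
To see what the sort -fu stage contributes on its own, here is a small self-contained sketch. The input words are made up and merely stand in for the output of rhymp.sh -o, and dedupe_words is a hypothetical helper name, not part of Rhyme-Grep.

```shell
# dedupe_words: sort lines ignoring case (-f) and keep a single line
# for each group of case-insensitive duplicates (-u).
# Hypothetical helper, for illustration only.
dedupe_words() {
  sort -fu
}

# Made-up sample: "heaves" appears three times with different capitalization.
# Three lines come out: believes, one variant of heaves, and thieves.
printf '%s\n' thieves heaves Heaves believes HEAVES | dedupe_words
```

Which capitalization of a duplicate survives may depend on the sort implementation, so lowercase the output first (for example with tr 'A-Z' 'a-z') if that matters.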

That’s quite fun (and useful)!

How it Works

Rhyme-Grep extracts the following data from the CMU Pronouncing Dictionary:

  1. The English pronunciation (namely a list of phonemes) of the input word; as well as
  2. The list of dictionary words whose pronunciation, from the primary accent onward, has the same phonemes as the input word.

Then it runs Grep to search the input text for those rhyming words.

ALGORITHM OVERVIEW

Let’s see how to search for words that rhyme with leaves in leaves-of-grass.txt.

Step 1: Input word pronunciation

Extract from the CMU dictionary the pronunciation of the word leaves, which is denoted by the list of phonemes L IY1 V Z.

Note that the primary accent falls on the phoneme IY, which is marked by 1 in the list.
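
As a rough illustration of this step, the sketch below looks up a word and cuts its pronunciation at the primary accent. It is not taken from rhymp.sh. It assumes dictionary lines of the form "WORD, whitespace, phonemes" with comment lines starting with ";;;", and it uses a tiny inline sample instead of the real dictionary file. The names sample_dict, pronunciation and rhyme_key are hypothetical.

```shell
# Hypothetical inline sample in the assumed dictionary layout.
sample_dict() {
cat <<'EOF'
;;; comment lines start with three semicolons
EAVES  IY1 V Z
LEAVES  L IY1 V Z
PERCEIVES  P ER0 S IY1 V Z
GRASS  G R AE1 S
EOF
}

# pronunciation WORD: read dictionary lines from standard input and print
# the phonemes of the first entry whose word equals WORD (ignoring case).
pronunciation() {
  awk -v w="$1" '
    /^;;;/ { next }                       # skip comment lines
    toupper($1) == toupper(w) {           # the first field is the word
      $1 = ""; sub(/^ +/, "")             # drop the word, keep the phonemes
      print; exit
    }'
}

# rhyme_key PHONEMES: print the phonemes from the primary accent onward.
# Assumption: if several phonemes carry the marker 1, the last one is used.
rhyme_key() {
  printf '%s\n' "$1" | awk '{
    start = 0
    for (i = 1; i <= NF; i++) if ($i ~ /1$/) start = i
    if (start == 0) exit 1                # no primary accent found
    out = $start
    for (i = start + 1; i <= NF; i++) out = out " " $i
    print out
  }'
}

phonemes=$(sample_dict | pronunciation leaves)
echo "$phonemes"        # L IY1 V Z
rhyme_key "$phonemes"   # IY1 V Z
```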

Step 2: List of rhyming dictionary words

Extract from the dictionary the list of words that rhyme with leaves, namely the words whose pronunciation ends with IY1 V Z.
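
One possible way to do this lookup is sketched below. It is not the code from rhymp.sh. It assumes the same "WORD, whitespace, phonemes" layout, treats a trailing "(1)", "(2)" and so on as an alternate-pronunciation marker, and works on a small inline sample. The names sample_dict and rhymes_for are hypothetical.

```shell
# Hypothetical inline sample in the assumed dictionary layout.
sample_dict() {
cat <<'EOF'
;;; comment lines start with three semicolons
EAVES  IY1 V Z
LEAVES  L IY1 V Z
PERCEIVES  P ER0 S IY1 V Z
GRASS  G R AE1 S
EOF
}

# rhymes_for KEY: read dictionary lines from standard input and print, in
# lowercase, every word whose pronunciation ends with the phonemes in KEY.
rhymes_for() {
  awk -v key="$1" '
    /^;;;/ { next }                       # skip comment lines
    {
      word = $1
      sub(/\([0-9]+\)$/, "", word)        # strip markers such as WORD(1)
      $1 = ""; sub(/^ +/, "")             # $0 now holds only the phonemes
      n = length($0); k = length(key)
      # Match whole phonemes only: either the full pronunciation equals KEY,
      # or it ends with a space followed by KEY.
      if ($0 == key || (n > k && substr($0, n - k) == " " key))
        print tolower(word)
    }'
}

sample_dict | rhymes_for "IY1 V Z"
# eaves
# leaves
# perceives
```

The input word itself is part of the result here; a caller that does not want it in the final list would need to filter it out.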

Step 3: Rhyming words from the input text

Search for the rhyming words within leaves-of-grass.txt.

Let’s say the rhyming words are:

  • eaves,
  • perceives.

Run Grep as follows:

$ cat leaves-of-grass.txt | grep -E -wi --color "eaves|perceives"
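
The alternation pattern in the command above can be assembled from a word list rather than typed by hand. The following is a minimal sketch of that idea, not the way rhymp.sh necessarily does it. join_alternation and sample_text are hypothetical names, and the sample lines are made up so the example runs without leaves-of-grass.txt.

```shell
# join_alternation: read one word per line and join the words with "|",
# producing an extended-regex alternation such as "eaves|perceives".
# Assumption: the words contain no regex metacharacters (no escaping is done).
join_alternation() {
  paste -s -d '|' -
}

# Made-up sample text, standing in for the input file.
sample_text() {
cat <<'EOF'
Rain drips from the eaves all night,
and whoever perceives it lies awake.
The grass is green.
EOF
}

pattern=$(printf '%s\n' eaves perceives | join_alternation)
echo "$pattern"                             # eaves|perceives

# -w matches whole words only, -i ignores case, -o prints just the matches.
sample_text | grep -E -wi -o "$pattern"
# eaves
# perceives
```

The -w option matters here: without it, "eaves" would also match inside longer words such as "leaves".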

Implementation

For additional info, please get the code from GitHub and have fun playing with it!

Rhyme-Grep was initially inspired by Semantic Grep: a word2vec-based tool that searches for words with similar meanings to a given word.


© 2025 Massimo Nazaria

Licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.