Topic: detect syllables (english)
i like words and language, so i decided to make a script that builds English sentences (ok, a bit free-form sometimes, call it poetry) out of random words. a part of this script would be a way of detecting syllables, in order to arrange words in some metrically desirable way, or something.
anyway, here is the 'detect_syllables' script i just finished. it is a bit messy, and if people have suggestions for improvements etc, that is very welcome indeed. i did all this through loads of Google and experimentation, so that explains the mess a bit i think.
#!/bin/bash
# script to detect syllables in a word
# count the vowels in the word.
# subtract any silent vowels (like the silent e at the end of a word, or the second vowel when two vowels are together in a syllable)
# subtract one vowel from every diphthong (diphthongs only count as one vowel sound.)
# the number of vowel sounds left is the same as the number of syllables.
# usage: detect_syllables [word] ([word] [word] ...)
# exit when no argument is given
if [ $# -lt 1 ]; then
echo "$(basename "$0"): no argument given." >&2
exit 1
fi
# continue if there is an argument
full_count=""
for arg in "$@"; do
# convert to lowercase
word="$( tr '[:upper:]' '[:lower:]' <<< "$arg" )"
# cleanup of the word for syllable-matching:
# remove silent vowel 'e' at the end, unless it is the only vowel in the word
# remove stuff ending in '...ened', which has 2 vowels but is 1 syllable
# collapse diphthongs (ou, ai, ei) into a single vowel marker ('!')
# include 'y' when it follows a consonant (which makes it a vowel), e.g. in 'dyke'; it also becomes '!'
# '!' is a special character, translates into a single vowel.
# '#' is a special character, translates into double vowels.
# '_' is a special character, translates into nothing.
clean=$(sed 's/^...$/!/ ; s/^e[^aeiou!#_]/!/ ; s/coax\|ua\|ire\|ove/#/g ; s/i[eao][rt]/#/g ; s/ce$\|se$\|ve$\|mes$\|fe$/_/ ; s/[a-z][oiu].e/!/g ; s/ou/!/g ; s/theater/th#t!r/g ; s/[^aeiou#!_]le$/!/g ; s/e[rt]e/!/ ; s/[aeiou][aeiou]/!/g ; s/[bgt][aeiou][aeiou]/!/g ; s/[^aeiou#!_]y/!/g ; s/#/ii/g ; s/!/i/g' <<< "$word")
# 1] 3 letter words are always one syllable
# 2] 'ua' must stay as 2 vowels
# 3] 'ou' must always become 1 vowel, even after 'th'
# 4] to catch 'theater' (and hopefully more..?)
# 5] consonant l-e syllables: consonants followed by 'le'
# 6] catch double vowels (unless preceded by 'th')
# 7] catch double vowels a 2nd time, unless preceded by certain consonants
# 8] to catch the cases where the 'y' becomes a vowel
# debug:
echo -n "$clean, "
# --- PROBLEM WORDS:
# bake
# above
# immediately
# reactions
# cautious
# closer
# sixes
# strangely
# powerful
# meander
# equally
# headquarters
# dozens
# custodians
# count the syllables
syll_count=$(grep -io '[aeiou]' <<< "$clean" | wc -w)
# when syllable count returns 0, it must be 1
if [ "$syll_count" -eq 0 ]; then
syll_count=1
fi
# add the syllable count to the full count
full_count="$full_count $syll_count"
done
# print all counts (the unquoted expansion already drops the leading space)
echo $full_count
exit

Last edited by rhowaldt (2011-09-18 04:27:00)
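
For comparison, the plain idea from the header comments (count the vowel sounds, drop a silent final 'e', treat adjacent vowels as one sound) can be sketched without the long sed chain. This is only a rough sketch under stated assumptions, not a replacement for the script above: the function name `count_syllables_simple` and the reference counts in the loop are made up for illustration, and the heuristic will undercount words where adjacent vowels are pronounced separately, such as 'meander'.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Estimates syllables by counting runs of adjacent vowels.
count_syllables_simple() {
    local word groups count
    local le_ending='[^aeiouy]le$'          # consonant + 'le', as in 'table'
    local silent_e='[aeiouy].*[^aeiouy]e$'  # earlier vowel ... consonant + final 'e'

    # lowercase the word and keep letters only
    word=$(tr '[:upper:]' '[:lower:]' <<< "$1" | tr -cd 'a-z')

    # drop a final 'e' that follows a consonant (usually silent), but keep it
    # in consonant + 'le' endings and when it is the only vowel ('the')
    if [[ ! $word =~ $le_ending && $word =~ $silent_e ]]; then
        word=${word%e}
    fi

    # replace each run of adjacent vowels with one marker 'V', then count markers
    groups=$(sed 's/[aeiouy][aeiouy]*/V/g' <<< "$word")
    groups=${groups//[^V]/}
    count=${#groups}

    # every word has at least one syllable
    if [ "$count" -lt 1 ]; then
        count=1
    fi
    echo "$count"
}

# Compare the estimate with reference counts for a few of the problem words.
# The reference counts are an assumption (ordinary dictionary syllabification),
# not something taken from the original post.
for pair in bake:1 above:2 closer:2 powerful:3 meander:3; do
    w=${pair%%:*}
    expected=${pair##*:}
    printf '%s: estimated %s, expected %s\n' "$w" "$(count_syllables_simple "$w")" "$expected"
done
```

A loop like this over the whole PROBLEM WORDS list makes it easy to see which words a new sed rule fixes and which ones it breaks.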