08 April 2007

kata 6.

prag dave's anagrams kata resonated with me, since i'm working on hashing text down.

so follow along in irb, if you have /usr/share/dict/words:

class Symbol
  # :join[','] builds a lambda that sends join(',') to whatever it is handed
  def to_proc(*args)
    lambda {|*a| a.first.send self, *(args + a[1..-1])}
  end
  alias [] to_proc
end # for faux currying

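w = File.readlines('/usr/share/dict/words').map {|line| line.strip.downcase}.uniq ;:done
# (the trailing ;:done just keeps irb from echoing the whole result)
h = Hash.new([]) # the shared default is safe here: += below assigns a fresh array each time
w.each {|word| nu = word.split('').sort.join; h[nu]+=[word]} ;:done
anas = h.values.find_all {|v| v.size > 1} ;:done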
puts anas.map(&:join[',']) # all anagram n-tuples
puts "---"
puts anas.sort_by(&:size)[-30..-1].map(&:join[',']) # the 30 largest anagram sets
puts "---"
puts anas.sort_by {|a| a.first.size}[-30..-1].map(&:join[',']) # the 30 sets with the longest words
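
the core trick above is the sorted-letter signature: two words are anagrams exactly when their letters, sorted, come out the same. here's a minimal standalone sketch of that step, assuming a tiny made-up word list in place of /usr/share/dict/words, and swapping in Enumerable#group_by (ruby 1.8.7 and later) for the Hash.new([]) accumulation. the `signature` helper and the sample words are mine, just for illustration:

```ruby
# hypothetical stand-in for /usr/share/dict/words
words = %w[listen silent enlist tinsel inlets google stale least steal tales slate cat act dog]

# signature = the word's letters, lowercased and sorted; anagrams share one
def signature(word)
  word.downcase.chars.sort.join
end

# bucket every word under its signature, then keep buckets with 2+ words
groups = words.group_by { |word| signature(word) }
anas = groups.values.select { |set| set.size > 1 }

anas.each { |set| puts set.join(',') }

# same idea as the "top by set size" line, scaled down to the top 2
biggest = anas.sort_by { |set| set.size }.last(2)
```

pointed at the real dictionary, `anas` here should hold the same sets as the irb session builds, since both key on the same sorted-letter string.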

in fairness, the symbol-currying lives in its own file ("sym2proc.r"), so the kata itself is really just 10 lines of code, but that's the general idea.
(lots of the library functions look haskelly, but ruby just felt better for string processing)
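
to make the faux currying a bit less magical: `:join[',']` just builds a lambda that remembers `','` and later sends `join` to whatever it gets handed. a rough sketch of the same idea as a plain method, so nothing on Symbol gets patched (the name `sym_call` is made up for this example, it's not from sym2proc):

```ruby
# hypothetical helper: remember a method name plus some leading arguments,
# and return a lambda that sends them to whatever receiver shows up later
def sym_call(sym, *args)
  lambda do |*yielded|
    receiver, *rest = yielded          # first yielded value is the receiver
    receiver.send(sym, *(args + rest)) # remembered args first, then any extras
  end
end

joiner = sym_call(:join, ',')
sets = [%w[cat act], %w[listen silent enlist]]

# &joiner turns the lambda into the block for map, like &:join[','] does
puts sets.map(&joiner)
```

the Symbol patch buys the shorter `&:join[',']` spelling; the plain method is the same closure without reopening a core class.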

1 comment:

Eric Mertens said...

Here's a Haskell solution to go along with the Ruby solution. (Since you mentioned Haskell, of course ;) )

Everything under main needs to be indented, but I'm not sure how to cleanly accomplish that using this comment system.

import Data.Char
import Data.List
import Data.Map (Map)
import qualified Data.Map as Map

comparing f x y = f x `compare` f y

main = do
  table <- (Map.toList
          . Map.filter (\e -> length e > 1)
          . Map.fromListWith (++)
          . map (\x -> (sort (map toLower x), [x]))
          . words) `fmap` readFile "..\\wordlist.txt"

  putStrLn "All anagrams:"
  putStr . unlines . map (unwords . snd) $ table

  putStrLn "Most letters:"
  putStrLn . unwords . snd . maximumBy (comparing (length . fst)) $ table

  putStrLn "Most anagrams:"
  putStrLn . unwords . maximumBy (comparing length) . map snd $ table