Here is an interesting, not entirely academic problem that a colleague
and I are "wrestling" with. Say there is a file containing
entries like this:
foo 5
bar 20
baz 4
foo 6
foobar 23
foobar 3
...
There are a lot of lines in the file (~10000), but many of the words
repeat (there are ~500 unique words). We have endeavored to write a
program that would sum the occurrences of each word, and display them
sorted alphabetically, e.g.:
bar 20
baz 4
foo 11
foobar 26
...
The file contains:
foo 5
bar 20
baz 4
foo 6
foobar 23
foobar 3
bar 68
baz 33
Gauche Scheme
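(use gauche.generator)  ; provides generator-for-each

;; Note: ainc! is not defined in this post; it appears to be a helper
;; that adds a value to the alist entry for a key.
(define (process file)
  (let1 result '()
    (with-input-from-file file
      (cut generator-for-each
           (lambda (item)
             (ainc! result (symbol->string item) (read)))
           read))
    (sort result string<? car)))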
(process "output.dat")
(("bar" . 88) ("baz" . 37) ("foo" . 11) ("foobar" . 26))
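The `ainc!` form used above is not defined in the post and does not appear to be a Gauche built-in, so here is a minimal self-contained sketch of the same idea using a hash table instead. The name `sum-counts` is hypothetical, and the sketch assumes well-formed input (every word is followed by a number) and Gauche's built-in `hash-table-update!`, which takes a plain default value.

```scheme
;; Sketch only: sum-counts is a hypothetical name, not from the original post.
;; Reads alternating word/number datums from the current input port and
;; returns an alist of (word . total), sorted alphabetically by word.
(define (sum-counts)
  (let ((totals (make-hash-table 'string=?)))
    (let loop ((word (read)))
      (unless (eof-object? word)
        (let ((n (read)))
          ;; Add n to the running total; a new word starts from 0.
          (hash-table-update! totals
                              (symbol->string word)
                              (lambda (total) (+ total n))
                              0))
        (loop (read))))
    ;; Sort the (word . total) pairs by their string key.
    (sort (hash-table->alist totals)
          (lambda (a b) (string<? (car a) (car b))))))

;; Usage, assuming the data lives in "output.dat" as in the post:
;; (with-input-from-file "output.dat" sum-counts)
```

On the sample file above, this should give the same alist as `process`. A hash table keeps each update roughly constant-time, which matters little for ~10000 lines but avoids rescanning an alist for every entry.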
(flow "file"file-get-objects