Matt Mahoney's Home Page
Dissertation proposal:
The Complexity of Natural Language
(compressed PostScript)
Text Compression as a Test for Artificial Intelligence
Abstract:
The Turing test for artificial intelligence is
widely accepted, but is subjective, qualitative,
non-repeatable, and difficult to implement.
An alternative test without these drawbacks is
to insert a machine’s language model into a
predictive encoder and compress a corpus of
natural language text. A compressed size of 1.3 bits per
character or less indicates that the machine has
AI. Three pieces of evidence support this claim.
First, text compression is shown to be more
stringent than the Turing test under reasonable
assumptions. Second, humans use high-level
knowledge in character prediction tests. Third,
compression, like AI, is unsolved: under conditions
in which human text-prediction tests show an
entropy of 1.3 bits per character or less, the
best compression algorithm known achieves 1.87
bits per character.
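
As a rough illustration of the measurement described in the abstract, the sketch below scores a text in bits per character under a simple adaptive character model. It is not the model or encoder used in the paper. The function name `bits_per_character`, the order-2 context, the add-one smoothing, and the 256-symbol alphabet are all assumptions chosen for brevity. The sketch also leaves out the encoder and sums -log2(p) for each character instead, which is the code length an ideal arithmetic coder would approach.

```python
import math
from collections import defaultdict


def bits_per_character(text, order=2, alphabet_size=256):
    """Ideal code length of `text`, in bits per character, under an
    adaptive order-`order` character model (hypothetical stand-in for
    a machine's language model)."""
    if not text:
        return 0.0
    counts = defaultdict(lambda: defaultdict(int))  # context -> char -> count
    totals = defaultdict(int)                       # context -> total count
    total_bits = 0.0
    for i, ch in enumerate(text):
        context = text[max(0, i - order):i]
        # Probability assigned to ch BEFORE it is seen (add-one smoothing,
        # an assumption; any predictive model could be substituted here).
        p = (counts[context][ch] + 1) / (totals[context] + alphabet_size)
        total_bits += -math.log2(p)
        # Update the model only after the character has been charged for,
        # so a decoder could reproduce the same predictions.
        counts[context][ch] += 1
        totals[context] += 1
    return total_bits / len(text)
```

Calling `bits_per_character(corpus)` on a natural language corpus gives a figure that can be compared with the 1.3 bits per character threshold. A model this simple should be expected to score well above that threshold.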
Full text:
PostScript
RTF (Word 6.0)
This paper is still in progress. Last updated 10/20/98.
Everything else is on my other home page.
matmahoney@aol.com