Actually, the 32-bit kernel is quite capable of handling more than 4GB of memory, as it has for years on Mac Pros and Xserves.
The issue is how much memory the kernel itself has to work in. A 32-bit kernel can only use 4GB of address space for itself and all the memory it manages. Keeping track of system memory itself takes up RAM, as the kernel has to keep its page tables and related bookkeeping in memory. That means that as you install memory beyond 4GB for applications, the 32-bit kernel has to use ever more of its 4GB limit just to keep track of all that extra memory. That gets inefficient as total installed RAM gets big (like 64GB or more).
The 64-bit kernel itself can, in theory, address up to 16 exabytes of memory, so it has effectively unlimited room to handle large memory installations.
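For anyone who wants to check where the 4GB and 16 exabyte figures come from, here is a quick sketch of the arithmetic. It only shows the theoretical size of a flat address space for a given pointer width; the helper name `address_space_bytes` is made up for illustration, and real hardware usually implements fewer address bits than the full 64.

```python
GIB = 1024 ** 3  # bytes in one gigabyte (binary)
EIB = 1024 ** 6  # bytes in one exabyte (binary)

def address_space_bytes(pointer_bits):
    """Theoretical number of bytes addressable with pointers of this width."""
    return 2 ** pointer_bits

print(address_space_bytes(32) // GIB, "GB")  # prints "4 GB"
print(address_space_bytes(64) // EIB, "EB")  # prints "16 EB"
```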
At my work, we have a Dell Xeon machine (4 x quad-core) with 128GB of RAM. If we ran a 32-bit kernel on it, the kernel would be using half of its available 4GB just to map the memory pages for applications. Running a 64-bit kernel (Linux, in this case) means that the kernel can manage all that RAM without getting cramped for everything else the kernel needs memory for.
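To put rough numbers on that, here is a back-of-the-envelope sketch of how the bookkeeping grows with installed RAM. The 4KB page size and the 64 bytes of kernel bookkeeping per page are my assumptions for illustration, not measured values from any particular kernel, and the function names are made up. With those assumptions, 128GB works out to half of the 4GB limit.

```python
PAGE_SIZE = 4 * 1024           # assumed page size: 4KB
BYTES_PER_PAGE_ENTRY = 64      # assumed bookkeeping cost per page (illustrative)
KERNEL_LIMIT = 4 * 1024 ** 3   # the 4GB limit of a 32-bit kernel
GIB = 1024 ** 3

def bookkeeping_bytes(installed_ram_bytes):
    """Bytes the kernel needs to track every page of installed RAM."""
    pages = installed_ram_bytes // PAGE_SIZE
    return pages * BYTES_PER_PAGE_ENTRY

def fraction_of_kernel_limit(installed_ram_bytes):
    """Share of the 4GB limit consumed by page bookkeeping alone."""
    return bookkeeping_bytes(installed_ram_bytes) / KERNEL_LIMIT

for gb in (4, 16, 64, 128):
    ram = gb * GIB
    used_mb = bookkeeping_bytes(ram) // (1024 ** 2)
    share = fraction_of_kernel_limit(ram)
    print(f"{gb:>3}GB installed: {used_mb}MB of bookkeeping, {share:.1%} of 4GB")
```

The cost scales linearly with installed RAM, so under these assumptions a 4GB machine spends under 2% of the limit on tracking memory, while a 128GB machine spends 50%.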
Right now, Apple machines really do not gain a whole lot from the 64-bit kernel, because so few of them have more than 4GB of RAM in the first place, and so much of the software we all typically use is not written to take advantage of a 64-bit memory space. That's changing for sure, and OS X 10.6 is, IMO, really Apple's attempt to make sure that their OS is positioned to take advantage of that hardware and software before it becomes mainstream (as opposed to playing catch-up afterwards).