Many current implementations of communication subsystems on workstation-class computers transfer communication data to and from primary memory several times. These repeated transfers result from software copying between user and operating system address spaces, presentation-layer data conversion, and other data manipulation functions. As a consequence, memory bandwidth is one of the major performance bottlenecks limiting high-speed communication on these systems. We propose a communication subsystem architecture with a minimal-copy data path to widen this bottleneck. The architecture is tailored for protocol implementations that use Integrated Layer Processing (ILP) and Application Layer Framing (ALF). We choose to implement these protocols in the address space of the application program. We present a new application program interface (API) between the protocols and the communication service in the operating system kernel. The API does not copy data; instead, it passes pointers to page-size data buffers. We analyze and discuss the requirements that the ILP loop and cache memory place on these buffers. Initial experiments show that the API can increase communication performance by 50% compared to a standard BSD Unix socket interface.
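
To make the idea of an ILP loop concrete, the following minimal C sketch contrasts a layered data path, in which each data manipulation function makes its own pass over the buffer, with an integrated loop that loads each word from memory once and applies all functions to it. It is an illustration only, not the implementation described here: the function names (`layered_pass`, `integrated_pass`, `swap32`, `fold16`) are hypothetical, the byte swap merely stands in for presentation-layer conversion, and the 16-bit ones'-complement sum stands in for a transport checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for presentation-layer conversion:
   reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t w) {
    return (w >> 24) | ((w >> 8) & 0x0000FF00u) |
           ((w << 8) & 0x00FF0000u) | (w << 24);
}

/* Fold a wide accumulator into a 16-bit ones'-complement sum. */
static uint16_t fold16(uint64_t sum) {
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Layered path: every function reads the whole buffer again,
   so the data crosses the memory bus once per function. */
uint16_t layered_pass(const uint32_t *in, uint32_t *out, size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)      /* pass 1: checksum */
        sum += (in[i] >> 16) + (in[i] & 0xFFFFu);
    for (size_t i = 0; i < n; i++)      /* pass 2: conversion */
        out[i] = swap32(in[i]);
    return fold16(sum);
}

/* Integrated path (ILP): each word is loaded once, and both
   functions are applied while it is held in a register. */
uint16_t integrated_pass(const uint32_t *in, uint32_t *out, size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t w = in[i];             /* single load per word */
        sum += (w >> 16) + (w & 0xFFFFu);
        out[i] = swap32(w);
    }
    return fold16(sum);
}
```

Both functions produce the same checksum and the same converted output; the difference lies in how often the input is read from memory. In this sketch the buffer would be a page-size region handed over by pointer, so the integrated loop is the only place where the data is touched.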