Communication traffic processing architectures and methods are disclosed. Processing load on main Central Processing Units (CPUs) can be alleviated by offloading data processing tasks to separate hardware. In one implementation, a processing architecture includes: a main processor configured to execute a first portion of driver software to perform protocol control and management tasks associated with control or management packets in a packet-based protocol according to which packets are received from a device; an offload processor configured to execute a second portion of the driver software to perform data processing tasks for data packets received according to the packet-based protocol; an interface to enable communication with the device; and an interconnect coupled to the main processor, to the offload processor, and to the interface.
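
The division of driver work described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the names `PacketKind`, `Packet`, `Processor`, and `dispatch` are hypothetical, and the sketch assumes each received packet can already be classified as a control, management, or data packet. Control and management packets go to the main processor, and data packets go to the offload processor.

```python
from dataclasses import dataclass, field
from enum import Enum


class PacketKind(Enum):
    # Hypothetical classification of packets in the packet-based protocol.
    CONTROL = "control"
    MANAGEMENT = "management"
    DATA = "data"


@dataclass
class Packet:
    kind: PacketKind
    payload: bytes


@dataclass
class Processor:
    # Stand-in for a processor running one portion of the driver software.
    name: str
    handled: list = field(default_factory=list)

    def handle(self, packet: Packet) -> None:
        # Record the packet; a real driver portion would process it here.
        self.handled.append(packet)


def dispatch(packet: Packet, main: Processor, offload: Processor) -> str:
    """Route a packet and return the name of the processor that handled it.

    Data packets go to the offload processor; control and management
    packets stay on the main processor.
    """
    target = offload if packet.kind is PacketKind.DATA else main
    target.handle(packet)
    return target.name
```

In this sketch the routing decision stands in for the role of the interface and interconnect, which in the described architecture carry packets from the device to whichever processor executes the relevant portion of the driver software.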